PRF: Optimize 2d take operations. #6759
| yep...I tried to do the same but on the cython level (and using memoryviews...but was complicated)... this could be a simpler soln. pls add a couple of vbenches that exercise this if you can by doing a take then other operations. e.g. reindex, then do stuff with that (on both axes). |
| Is there an optimal way to run Maybe set n to like 100 :/ |
| @jreback are we able to use memoryviews? Not sure which version of cython introduced them. |
| yes |
| FYI pandas already requires cython 0.17.1 or higher. |
| can you post another vbench run after your latest changes |
reran the bottom |
| ok looks fine |
| @dalejung this was ok...pls put in a release note and rebase...good to go otherwise |
| @dalejung just needed a release note on this |
| merged via 36357a0 (I added a release note as well) thanks! |
Wanted to get eyes on this. The original axis=1 2d methods were optimized for Fortran order. I changed them to be friendly to C order, and to transpose the array when it is F ordered.
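A rough NumPy sketch of the idea described above, for readers who want to see the layout handling in isolation. `take_2d_axis1` is a hypothetical helper written for illustration, not the Cython code in this PR, and it ignores the `-1` fill-value handling that the pandas take routines perform.

```python
import numpy as np


def take_2d_axis1(values, indexer):
    """Select columns of a 2-D array while respecting its memory order.

    Hypothetical helper for illustration only; not the pandas implementation.
    """
    values = np.asarray(values)
    indexer = np.asarray(indexer, dtype=np.intp)
    if values.flags.f_contiguous and not values.flags.c_contiguous:
        # values.T is a C-contiguous view of an F-ordered array, so taking
        # columns of `values` equals taking rows of `values.T`.
        return values.T.take(indexer, axis=0).T
    # C-ordered (or non-contiguous) input: take the columns directly.
    return values.take(indexer, axis=1)


c_arr = np.arange(12, dtype=np.float64).reshape(3, 4)  # C order
f_arr = np.asfortranarray(c_arr)                       # same data, F order
idx = [3, 0, 2]

# Both layouts should give the same result as plain fancy indexing.
expected = c_arr[:, idx]
assert np.array_equal(take_2d_axis1(c_arr, idx), expected)
assert np.array_equal(take_2d_axis1(f_arr, idx), expected)
```

Transposing an F-ordered array is free (it only returns a view), so the transposed path reuses the same row-wise access pattern as the C-ordered path instead of needing a separate F-specific loop.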
These timings are for master.
cdf is just the consolidated dataframe to switch from C to F. http://nbviewer.ipython.org/gist/dalejung/9914956/pandas%20perf_baseline.ipynb
The PR timings are:
http://nbviewer.ipython.org/gist/dalejung/9914956/pandas%20perf.ipynb
It's similar behavior to the shift perf update.
For a numpy baseline I wrote:
Which are similar timings.