-
- Notifications
You must be signed in to change notification settings - Fork 19.4k
BUG: GH11206 where pd.isnull did not consider numpy NaT null #11212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
d4a2acc to e937ebc Compare | lgtm. can you run a perf check on the null benchmarks. just confirm nothing has substantially changed. |
597d9d4 to d6b80ce Compare | Updated. Timing for I also changed the code slightly so that the logic is further short-circuited short-circuited. |
| the benchmark u are showing is 3x increase? |
| Yes.That's the with lots of nulls. I will dig a little to see if that can be avoided. |
| these routines are hit everywhere, so i am for special casing certain things. IOW, that is why |
d6b80ce to 97a5c6a Compare | Updated. |
pandas/lib.pyx Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot to remove this. I will remove in the next update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what are you fixing here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you mean the _check_all_nulls? Only object arrays hit this path so I thought it'd be safe to have the most generic version of checknull here?
For the dtype thing, I forgot to remove it after some experiment, but will remove it in the next update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep that is fine
97a5c6a to 24d174b Compare | udpated |
doc/source/whatsnew/v0.17.0.txt Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's put this in 0.17.1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls move to 0.17.1
| ok. I'm traveling and will get to this next week. |
| @kawochen can you update (move whatsnew to 0.17.1) |
| Will do (still on vacation)
|
24d174b to a0ca190 Compare | updated |
| can u rebase and run a perf check |
| Saw some horrifying perf hit (30X). I will look at the compiled code later. |
| ok gr8! |
| so using the perf hit is 4X. But if I that put into an |
| are you using |
| The perf hit is different for numpy 1.10 vs 1.9 (and asv probably uses the newest version) |
| @kawochen pushing, but if you finish in the next few days, pls ping |
a0ca190 to d38bd8a Compare There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe blow away _checknull_with_nat(and replace with _check_all_nulls and see if any timings are affected?
| Replacing wouldn't work in all cases, because sometimes you don't want But I see that So maybe when others or I clean up this module, we can see how to organize or deal with all the lower level scalar check null functions |
BUG: GH11206 where pd.isnull did not consider numpy NaT null
| @kawochen thanks! always nice PR's from you! keep it up! |
closes #11206