BUG: NaN should have pct rank of NaN #22600
Hello @gfyoung! Thanks for submitting the PR.
if pct:
    for i in range(N):
        out[i, 0] = out[i, 0] / grp_sizes[i, 0]
        if out[i, 0] != out[i, 0] or out[i, 0] == NAN:
Can this all be simplified to just if out[i, 0] != NAN:?
I don't think those are semantically equivalent. Did you mean out[i, 0] == NAN ?
What I meant was only do the division if out[i, 0] != NAN otherwise leave as is
Ah, gotcha! 🙂 Let's try it and see what happens.
@WillAyd : Good idea, but unfortunately, the tests I added disagree with it. You need both conditionals when checking. Thus, this code needs to stay as is.
Hmm OK I got you. Is it a particular type that's failing?
For some reason I thought some of the work @realead was doing was supposed to remove the need for comparisons like out[i, 0] != out[i, 0] to figure out whether a value is NA, though it's entirely possible I have misunderstood that.
The "right" handling of NA would only apply to algorithms that use a hash map, which is not the case here, if I see it correctly.
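For readers following this thread, the two comparisons can be checked in plain Python. This is only a sketch: it assumes `NAN` is an ordinary IEEE 754 float NaN, which the thread does not confirm for every dtype the templated code handles, and `is_nan_self_compare` is a hypothetical helper name.

```python
import math

NAN = float("nan")  # assumption: stands in for the NAN constant in the diff


def is_nan_self_compare(x):
    """NaN is the only float value that compares unequal to itself."""
    return x != x


# For a float NaN, equality with NAN is always False and inequality is
# always True, so `x != NAN` alone cannot be used to skip NaN values.
assert (NAN == NAN) is False
assert (NAN != NAN) is True
assert (1.5 != NAN) is True

# The self-comparison agrees with math.isnan for ordinary floats.
for value in (0.0, 1.5, float("inf"), NAN):
    assert is_nan_self_compare(value) == math.isnan(value)
```

Under that assumption, the `== NAN` half of the condition never fires for floats. The thread reports that the tests need both halves, but it does not say which case requires the second one.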
can you add a comment on what is going on here
Yep, done.
0ac5ae9 to 3f3f30b Compare
WillAyd left a comment
minor comment on whatsnew but otherwise lgtm
doc/source/whatsnew/v0.23.5.txt Outdated
instance of :class:`~pandas.core.Index` was broken in `4efb39f
<https://github.com/pandas-dev/pandas/commit/4efb39f01f5880122fa38d91e12d217ef70fad9e>`_ (:issue:`22227`).
- Calling :meth:`DataFrameGroupBy.rank` and :meth:`SeriesGroupBy.rank` with empty groups
  and ``pct=True`` was broken in `c1068d9
Do we typically reference commits where regressions occurred in whatsnew notes? Would think it better to just call out the ZeroDivisionError instead of the commit
Just following what was done above. This is a relatively new addition to whatsnew, but I don't see any reason to buck this trend.
Gotcha. Ultimately I'm indifferent on the commit reference, though I think calling out the ZeroDivisionError is a much more useful indicator when either googling or looking at the whatsnew to see what has actually changed (rather than clicking through to the issue or commit).
That's fair. I'll let the CI run its course and then make the addition.
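The behaviour under discussion (NaN ranks stay NaN, and an empty group must not trigger a division by zero) can be sketched in pure Python. This is an illustration, not the Cython code from the PR: `pct_normalize`, its list-based inputs, and the choice to leave the rank untouched for a zero-size group are all assumptions.

```python
import math


def pct_normalize(ranks, group_sizes):
    """Convert per-row ranks to percentage ranks (hypothetical helper).

    ranks[i] is the rank of row i within its group and group_sizes[i]
    is the number of non-missing values in that group.
    """
    out = []
    for rank, size in zip(ranks, group_sizes):
        if rank != rank:
            # Missing value: keep it as NaN instead of assigning a percentage.
            out.append(math.nan)
        elif size != 0:
            out.append(rank / size)
        else:
            # Assumption: with no valid values in the group, leave the rank
            # as is rather than dividing by zero.
            out.append(rank)
    return out


result = pct_normalize([1.0, 2.0, math.nan, 1.0], [2, 2, 2, 0])
assert result[:2] == [0.5, 1.0]
assert math.isnan(result[2])
assert result[3] == 1.0
```

Checking for NaN before dividing keeps the missing-value case separate from the empty-group case, so each can be described on its own in the whatsnew note.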
3f3f30b to ad4166f Compare
Codecov Report
@@           Coverage Diff           @@
##           master   #22600   +/-   ##
=======================================
  Coverage   92.05%   92.05%
=======================================
  Files         169      169
  Lines       50783    50783
=======================================
  Hits        46749    46749
  Misses       4034     4034
ad4166f to 5cffe39 Compare
Circle failure is related to Hypothesis timeout, which is not related to my PR. cc @jreback
5cffe39 to b904ec2 Compare
@jreback : Made the requested change and all is green. PTAL.
thanks @gfyoung I think this will backport cleanly.
Owee, I'm MrMeeseeks, Look at me. There seems to be a conflict, please backport manually. Here are approximate instructions:
And apply the correct labels and milestones. Congratulations, you did some good work! Hopefully your backport PR will be tested by the continuous integration and merged soon! If these instructions are inaccurate, feel free to suggest an improvement.
@jreback : The
Closes #22519.