-
- Notifications
You must be signed in to change notification settings - Fork 19.4k
BUG: silent overflow in DateTimeArray subtraction #22508
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| Hi @jbrockmendel , thanks for the follow-up.
|
Codecov Report
@@ Coverage Diff @@ ## master #22508 +/- ## ========================================= Coverage ? 92.03% ========================================= Files ? 169 Lines ? 50780 Branches ? 0 ========================================= Hits ? 46737 Misses ? 4043 Partials ? 0
Continue to review full report at Codecov.
|
| @shengpu1126 : You should make the changes so that it gets the expected output that you reported in the original issue. And yes, that does seem like a good place to put the test. Don't forget a |
| @gfyoung it isn’t that easy. If you look at the scalar op in the original example, it returns a stdlib timedelta, not a pd.Timedelta. @shengpu1126 i think raising overflowerror is correct, will doublecheck. |
@jbrockmendel : Ah, I see. My response was motivated by the fact that you didn't comment on @shengpu1126 expected output in the original issue, which I took to be a tacit agreement of the expected behavior for patching this bug. |
| Test will go in tests.arithmetic.test_datetime64 |
| Hello @shengpu1126! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on August 30, 2018 at 15:56 Hours UTC |
doc/source/whatsnew/v0.24.0.txt Outdated
| - Bug in :class:`DataFrame` comparisons against ``Timestamp``-like objects failing to raise ``TypeError`` for inequality checks with mismatched types (:issue:`8932`,:issue:`22163`) | ||
| - Bug in :class:`DataFrame` with mixed dtypes including ``datetime64[ns]`` incorrectly raising ``TypeError`` on equality comparisons (:issue:`13128`,:issue:`22163`) | ||
| - Bug in :meth:`DataFrame.eq` comparison against ``NaT`` incorrectly returning ``True`` or ``NaN`` (:issue:`15697`,:issue:`22163`) | ||
| - Bug in :class:`DatetimeIndex` subtraction that incorrectly failed to raise `OverflowError` (:issue:22492, :issue:22508) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a little more here, e.g. when this fails to raise
| dtimin - variant | ||
| | ||
| def test_datetimeindex_sub_datetimeindex_overflow(self): | ||
| dtimax = pd.to_datetime(['now', pd.Timestamp.max]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a comment with the issue number
jreback left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor comments. ping on green (or at least except for 3.6/3.5 builds are currently failing to an unrelated error). so might want to wait a little bit to rebase.
doc/source/whatsnew/v0.24.0.txt Outdated
| - Bug in :class:`DataFrame` comparisons against ``Timestamp``-like objects failing to raise ``TypeError`` for inequality checks with mismatched types (:issue:`8932`,:issue:`22163`) | ||
| - Bug in :class:`DataFrame` with mixed dtypes including ``datetime64[ns]`` incorrectly raising ``TypeError`` on equality comparisons (:issue:`13128`,:issue:`22163`) | ||
| - Bug in :meth:`DataFrame.eq` comparison against ``NaT`` incorrectly returning ``True`` or ``NaN`` (:issue:`15697`,:issue:`22163`) | ||
| - Bug in :class:`DatetimeIndex` subtraction that incorrectly failed to raise `OverflowError` when the difference between :class:`Timestamp`s (a :class:`Timedelta` object) exceeds `int64` limit (:issue:`22492`, :issue:`22508`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is now confusing. let's go back to the prior one;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry about that!
for the record, it was mentioned here that pd.Timedeltas are represented as nanoseconds using 64 bit integers, and I believe that's the source of potential overflow.
| dtimax - ts_neg | ||
| | ||
| expected = pd.Timestamp.max.value - ts_pos[1].value | ||
| res = dtimax - ts_pos |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use result (rather than res) in all occurrences
| assert res[1].value == expected | ||
| | ||
| with pytest.raises(OverflowError): | ||
| dtimin - ts_pos |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a comment or 2 to delineate the various cases
| thanks! |
git diff upstream/master -u -- "*.py" | flake8 --diff