-
- Notifications
You must be signed in to change notification settings - Fork 19.4k
BUG: numpy.split on non-UTC changes original time (#14042) #17255
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Root cause: When a DatetimeIndex is created by __array_wrap__, round-trip of timezone occured Solution: Remove the timezone of the original index when creating a DatetimeIndex via __array_wrap__ and apply the timezone later
Codecov Report
@@ Coverage Diff @@ ## master #17255 +/- ## ========================================== - Coverage 91.03% 90.99% -0.05% ========================================== Files 162 162 Lines 49527 49533 +6 ========================================== - Hits 45086 45071 -15 - Misses 4441 4462 +21
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@ ## master #17255 +/- ## ========================================== - Coverage 91.24% 90.99% -0.25% ========================================== Files 163 162 -1 Lines 50091 49536 -555 ========================================== - Hits 45704 45074 -630 - Misses 4387 4462 +75
Continue to review full report at Codecov.
|
doc/source/whatsnew/v0.21.0.txt Outdated
| - Fixes ``DataFrame.loc`` for setting with alignment and tz-aware ``DatetimeIndex`` (:issue:`16889`) | ||
| - Avoids ``IndexError`` when passing an Index or Series to ``.iloc`` with older numpy (:issue:`17193`) | ||
| - Allow unicode empty strings as placeholders in multilevel columns in Python 2 (:issue:`17099`) | ||
| - Fixes bug for ``numpy.split`` on pandas dataframe with non-utc timezone (:issue:`14042`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bug in numpy array_wrap with a pandas Series/Index.
pandas/core/indexes/base.py Outdated
| attrs = self._get_attributes_dict() | ||
| attrs = self._maybe_update_attributes(attrs) | ||
| return Index(result, **attrs) | ||
| from pandas.core.indexes.datetimes import DatetimeIndex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add this
diff --git a/pandas/core/indexes/base.py b/pandas/core/indexes/base.py index de6221987..865e92201 100644 --- a/pandas/core/indexes/base.py +++ b/pandas/core/indexes/base.py @@ -99,6 +99,9 @@ def _new_Index(cls, d): if issubclass(cls, ABCPeriodIndex): from pandas.core.indexes.period import _new_PeriodIndex return _new_PeriodIndex(cls, **d) + elif issubclass(cls, ABCDatetimeIndex): + from pandas.core.indexes.datetimes import _new_DatetimeIndex + return _new_DatetimeIndex(cls, **d) return cls.__new__(cls, **d) then you can simply call
return _new_Index(result attrs)
| assert result == expected | ||
| | ||
| def test_split_date_range_with_timezone(self): | ||
| # https://github.com/pandas-dev/pandas/issues/14042 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a 1-liner about what you are testing here
| # https://github.com/pandas-dev/pandas/issues/14042 | ||
| idx = DatetimeIndex(['2016-01-01 00:00:00', '2016-01-02 00:00:00'], | ||
| tz='Asia/Seoul') | ||
| split = np.split(idx, indices_or_sections=[]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use an tm.assert_index_equal and test for each piece.
Also test using a Series (can be here is ok)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ideally we could exercise more numpy ufuncs for 1d reshaping ops here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found that Series does not suffer from this issue. But I'll leave the testcase.
| Hello @wooyekim! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on October 29, 2017 at 14:37 Hours UTC |
| attrs = self._get_attributes_dict() | ||
| attrs = self._maybe_update_attributes(attrs) | ||
| return Index(result, **attrs) | ||
| from pandas.core.dtypes.generic import ABCDatetimeIndex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if you saw my comment. pls move this logic to _new_Index (and the import can be at the top of the file)
| can you rebase / update |
jreback left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you rebase
| - Bug in reindexing on an empty ``CategoricalIndex`` (:issue:`16770`) | ||
| - Fixes ``DataFrame.loc`` for setting with alignment and tz-aware ``DatetimeIndex`` (:issue:`16889`) | ||
| - Avoids ``IndexError`` when passing an Index or Series to ``.iloc`` with older numpy (:issue:`17193`) | ||
| - Allow unicode empty strings as placeholders in multilevel columns in Python 2 (:issue:`17099`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to 0.21.1
Root cause: When a DatetimeIndex is created by array_wrap,
round-trip of timezone occured
Solution: Remove the timezone of the original index when creating
a DatetimeIndex via array_wrap and apply the timezone later
git diff upstream/master -u -- "*.py" | flake8 --diff