-
- Notifications
You must be signed in to change notification settings - Fork 19.4k
ENH: Implement EA unary ops #23313
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Implement EA unary ops #23313
Conversation
| Hello @sinhrks! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
Comment last updated at 2019-06-30 19:12:01 UTC |
pandas/core/internals/blocks.py Outdated
| regex=regex, convert=convert) | ||
| | ||
| def apply(self, func, **kwargs): | ||
| # neg should be treated as inv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want this? NumPy seemed to regret that decision, at least for boolean ndarrays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is commonly used in internal also. Should be deprecated in 0.24?
pandas/core/arrays/base.py Outdated
| `op` cannot be stored in the ExtensionArray. The dtype of the | ||
| ndarray uses NumPy's normal inference rules. | ||
| Example |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be Examples with an s
pandas/core/generic.py Outdated
| raise TypeError("Unary negative expects numeric dtype, not {}" | ||
| .format(values.dtype)) | ||
| return self.__array_wrap__(arr) | ||
| new_data = self._data.apply(operator.neg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @jreback @jbrockmendel do we want to use Block.apply here? I don't recall what the current guidelines are for ops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've gotten arithmetic/comparison ops out of internals. Haven't done anything to unary ops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx, I'm going to move the logic to ops.py immitating binary ops manner.
| # GH#23087 | ||
| ser = Series([1.1, 3.1, 2.1], index=range(3), dtype=typ) | ||
| | ||
| if op is operator.inv: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
simpler to do operator.inv in a separate test and leave it out of the parameterize
pandas/core/ops.py Outdated
| return result.__finalize__(self) | ||
| else: | ||
| with np.errstate(all='ignore'): | ||
| new_data = op(self.values) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be self._values to make sure it is called on the ExtensionArray ?
pandas/tests/arrays/test_integer.py Outdated
| class TestUnaryOps(object): | ||
| | ||
| @pytest.mark.parametrize('typ', ['Int64', 'Int32', 'Int16', 'Int8', | ||
| 'UInt64', 'UInt32', 'UInt16', 'UInt8']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a fixture defined for this at the top of the file you can reuse
6500c9b to 6f64b96 Compare pandas/core/ops.py Outdated
| return operator.inv(values) | ||
| except Exception: | ||
| # inv fails with 0 len | ||
| if not np.prod(values.shape): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
values.size? Any more specific exception?
| @sinhrks can you rebase and will see where this PR is. |
a9fdbd8 to 31ae035 Compare | can you merge master and upate |
Codecov Report
@@ Coverage Diff @@ ## master #23313 +/- ## ========================================== - Coverage 92.38% 92.2% -0.18% ========================================== Files 166 162 -4 Lines 52321 51738 -583 ========================================== - Hits 48337 47706 -631 - Misses 3984 4032 +48
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@ ## master #23313 +/- ## ========================================= Coverage ? 91.73% ========================================= Files ? 178 Lines ? 50810 Branches ? 0 ========================================= Hits ? 46609 Misses ? 4201 Partials ? 0
Continue to review full report at Codecov.
|
346db01 to 5f1f1df Compare | Sorry to take a long. Updated tests to meet NumPy changes. One remaining error in Azure is caused by network connection which looks unrelated to the change. |
jreback left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks really good @sinhrks
@TomAugspurger @jbrockmendel if you'd have a look. I am inclined for 0.24.0.
| # Unary | ||
| class TestFrameUnary(object): | ||
| | ||
| @pytest.mark.parametrize('typ', ['int64', 'int32', 'int16', 'int8', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can import these from pandas.conftest (we only have fixtures for the current int types, not the EA ones, though you might want to create one for combined int/EA types and use it here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done. Please suggest if there is better arg name.
| tm.assert_series_equal(-(ser < 0), ~(ser < 0)) | ||
| | ||
| @pytest.mark.parametrize('typ', ['int64', 'int32', 'int16', 'int8', | ||
| 'uint64', 'uint32', 'uint16', 'uint8', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as above
| ser = Series([pd.Timestamp('2011-01-01'), pd.Timestamp('2011-01-02'), | ||
| pd.Timestamp('2011-01-03')], index=range(3), dtype=typ) | ||
| # pandas all raises | ||
| pytest.raises(TypeError, op, ser) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you use the other form
with pytest.raises(TypeError): op(ser) | @sinhrks if you can move the whatsnew to 0.25.0 |
| can you merge master and update |
| @pandas-dev/pandas-core if anyone would like to rebase this, very close to being ready |
WillAyd left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK @jreback got this to green. Definitely a few rough edges but lmk what you think needs to happen to get this merged
| op(ser.values) | ||
| else: | ||
| result = op(ser) | ||
| if op is operator.pos and pd.compat.numpy._is_numpy_dev: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how this is going to hold up. w.r.t the linked issue:
That was milestoned for numpy 1.16 but interestingly only appears in dev with this change.
| | ||
| @pytest.fixture(params=['__pos__', '__neg__', '__inv__', '__invert__', | ||
| '__abs__']) | ||
| def all_unary_operators(request): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this have a name reflecting the fact that these are the strings and not operator.pos, operator.neg, etc?
| | ||
| # Note: TimedeltaIndex overrides this in call to cls._add_numeric_methods | ||
| def __pos__(self): | ||
| return self.copy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be a copy or view?
| def __pos__(self): | ||
| raise TypeError("Unary plus expects numeric dtype, not {}" | ||
| .format(self.dtype)) | ||
| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same for PeriodArray?
| | ||
| | ||
| # ---------------------------------------------------------------------- | ||
| # Unary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
id be on board with an ops.unary file
| @sinhrks can you rebase |
| Nice PR but I think has gone stale so closing for now - ping if you'd like to pick this back up |
git diff upstream/master -u -- "*.py" | flake8 --diffChanged to apply unary funcs per blocks. As a result, unexpected dtype changes are also fixed (as below).