Skip to content

Conversation

@sinhrks
Copy link
Member

@sinhrks sinhrks commented Oct 24, 2018

Changed to apply unary funcs per blocks. As a result, unexpected dtype changes are also fixed (as below).

# Current master (-pd.DataFrame({'a': [1], 'b': [1.1]})).dtypes # a float64 # b float64 # dtype: object # After the PR (-pd.DataFrame({'a': [1], 'b': [1.1]})).dtypes # a int64 # b float64 # dtype: object 
@pep8speaks
Copy link

pep8speaks commented Oct 24, 2018

Hello @sinhrks! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 53:80: E501 line too long (83 > 79 characters)

Comment last updated at 2019-06-30 19:12:01 UTC
regex=regex, convert=convert)

def apply(self, func, **kwargs):
# neg should be treated as inv
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want this? NumPy seemed to regret that decision, at least for boolean ndarrays.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is commonly used in internal also. Should be deprecated in 0.24?

`op` cannot be stored in the ExtensionArray. The dtype of the
ndarray uses NumPy's normal inference rules.
Example
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be Examples with an s

raise TypeError("Unary negative expects numeric dtype, not {}"
.format(values.dtype))
return self.__array_wrap__(arr)
new_data = self._data.apply(operator.neg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @jreback @jbrockmendel do we want to use Block.apply here? I don't recall what the current guidelines are for ops.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've gotten arithmetic/comparison ops out of internals. Haven't done anything to unary ops.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx, I'm going to move the logic to ops.py immitating binary ops manner.

@TomAugspurger TomAugspurger added Numeric Operations Arithmetic, Comparison, and Logical operations ExtensionArray Extending pandas with custom dtypes or arrays. labels Oct 25, 2018
# GH#23087
ser = Series([1.1, 3.1, 2.1], index=range(3), dtype=typ)

if op is operator.inv:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simpler to do operator.inv in a separate test and leave it out of the parameterize

return result.__finalize__(self)
else:
with np.errstate(all='ignore'):
new_data = op(self.values)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be self._values to make sure it is called on the ExtensionArray ?

class TestUnaryOps(object):

@pytest.mark.parametrize('typ', ['Int64', 'Int32', 'Int16', 'Int8',
'UInt64', 'UInt32', 'UInt16', 'UInt8'])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a fixture defined for this at the top of the file you can reuse

@sinhrks sinhrks force-pushed the unary branch 3 times, most recently from 6500c9b to 6f64b96 Compare October 30, 2018 10:05
return operator.inv(values)
except Exception:
# inv fails with 0 len
if not np.prod(values.shape):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

values.size? Any more specific exception?

@jreback
Copy link
Contributor

jreback commented Dec 3, 2018

@sinhrks can you rebase and will see where this PR is.

@sinhrks sinhrks force-pushed the unary branch 2 times, most recently from a9fdbd8 to 31ae035 Compare December 6, 2018 05:31
@jreback
Copy link
Contributor

jreback commented Dec 27, 2018

can you merge master and upate

@codecov
Copy link

codecov bot commented Jan 11, 2019

Codecov Report

Merging #23313 into master will decrease coverage by 0.17%.
The diff coverage is 98.7%.

Impacted file tree graph

@@ Coverage Diff @@ ## master #23313 +/- ## ========================================== - Coverage 92.38% 92.2% -0.18%  ========================================== Files 166 162 -4 Lines 52321 51738 -583 ========================================== - Hits 48337 47706 -631  - Misses 3984 4032 +48
Flag Coverage Δ
#multiple 90.6% <98.7%> (-0.21%) ⬇️
#single 43.05% <57.14%> (-0.02%) ⬇️
Impacted Files Coverage Δ
pandas/core/arrays/base.py 97.04% <100%> (-1.22%) ⬇️
pandas/core/frame.py 96.91% <100%> (-0.02%) ⬇️
pandas/core/generic.py 96.57% <100%> (-0.06%) ⬇️
pandas/core/arrays/integer.py 95.66% <100%> (-0.66%) ⬇️
pandas/core/arrays/sparse.py 91.84% <100%> (-0.33%) ⬇️
pandas/core/ops.py 94.4% <97.29%> (+0.11%) ⬆️
pandas/io/s3.py 0% <0%> (-86.37%) ⬇️
pandas/io/sas/sasreader.py 86.2% <0%> (-9.95%) ⬇️
pandas/io/parquet.py 76.92% <0%> (-7.7%) ⬇️
... and 77 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d46d2a5...15fef47. Read the comment docs.

@codecov
Copy link

codecov bot commented Jan 11, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@734b6d1). Click here to learn what that means.
The diff coverage is 98.73%.

Impacted file tree graph

@@ Coverage Diff @@ ## master #23313 +/- ## ========================================= Coverage ? 91.73% ========================================= Files ? 178 Lines ? 50810 Branches ? 0 ========================================= Hits ? 46609 Misses ? 4201 Partials ? 0
Flag Coverage Δ
#multiple 90.32% <98.73%> (?)
#single 41.25% <60.75%> (?)
Impacted Files Coverage Δ
pandas/core/arrays/base.py 99.47% <100%> (ø)
pandas/core/arrays/timedeltas.py 88.86% <100%> (ø)
pandas/core/frame.py 96.88% <100%> (ø)
pandas/core/arrays/datetimes.py 97.8% <100%> (ø)
pandas/core/generic.py 93.98% <100%> (ø)
pandas/core/arrays/integer.py 96.4% <100%> (ø)
pandas/core/arrays/sparse.py 93.48% <100%> (ø)
pandas/core/ops.py 94.8% <97.14%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 734b6d1...536d002. Read the comment docs.

@sinhrks sinhrks force-pushed the unary branch 5 times, most recently from 346db01 to 5f1f1df Compare January 15, 2019 08:14
@sinhrks
Copy link
Member Author

sinhrks commented Jan 15, 2019

Sorry to take a long. Updated tests to meet NumPy changes.

One remaining error in Azure is caused by network connection which looks unrelated to the change.

@sinhrks sinhrks changed the title WIP: ENH: Implement EA unary ops ENH: Implement EA unary ops Jan 15, 2019
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks really good @sinhrks

@TomAugspurger @jbrockmendel if you'd have a look. I am inclined for 0.24.0.

# Unary
class TestFrameUnary(object):

@pytest.mark.parametrize('typ', ['int64', 'int32', 'int16', 'int8',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can import these from pandas.conftest (we only have fixtures for the current int types, not the EA ones, though you might want to create one for combined int/EA types and use it here)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. Please suggest if there is better arg name.

tm.assert_series_equal(-(ser < 0), ~(ser < 0))

@pytest.mark.parametrize('typ', ['int64', 'int32', 'int16', 'int8',
'uint64', 'uint32', 'uint16', 'uint8',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above

ser = Series([pd.Timestamp('2011-01-01'), pd.Timestamp('2011-01-02'),
pd.Timestamp('2011-01-03')], index=range(3), dtype=typ)
# pandas all raises
pytest.raises(TypeError, op, ser)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you use the other form

with pytest.raises(TypeError): op(ser) 
@jreback jreback added this to the 0.24.0 milestone Jan 16, 2019
@jreback
Copy link
Contributor

jreback commented Jan 26, 2019

@sinhrks if you can move the whatsnew to 0.25.0

@jreback
Copy link
Contributor

jreback commented Apr 5, 2019

can you merge master and update

@jreback
Copy link
Contributor

jreback commented Jun 8, 2019

@pandas-dev/pandas-core if anyone would like to rebase this, very close to being ready

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK @jreback got this to green. Definitely a few rough edges but lmk what you think needs to happen to get this merged

op(ser.values)
else:
result = op(ser)
if op is operator.pos and pd.compat.numpy._is_numpy_dev:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how this is going to hold up. w.r.t the linked issue:

numpy/numpy#11450

That was milestoned for numpy 1.16 but interestingly only appears in dev with this change.


@pytest.fixture(params=['__pos__', '__neg__', '__inv__', '__invert__',
'__abs__'])
def all_unary_operators(request):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this have a name reflecting the fact that these are the strings and not operator.pos, operator.neg, etc?


# Note: TimedeltaIndex overrides this in call to cls._add_numeric_methods
def __pos__(self):
return self.copy()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be a copy or view?

def __pos__(self):
raise TypeError("Unary plus expects numeric dtype, not {}"
.format(self.dtype))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same for PeriodArray?



# ----------------------------------------------------------------------
# Unary
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

id be on board with an ops.unary file

@jbrockmendel
Copy link
Member

@sinhrks can you rebase

@WillAyd
Copy link
Member

WillAyd commented Aug 28, 2019

Nice PR but I think has gone stale so closing for now - ping if you'd like to pick this back up

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ExtensionArray Extending pandas with custom dtypes or arrays. Numeric Operations Arithmetic, Comparison, and Logical operations

7 participants