Skip to content

Conversation

@jbrockmendel
Copy link
Member

This is in preparation for implementing the direct-pinning discussed here. The main roadblock to that is SparseSeries which calls ops.add_special_arithmetic_methods twice. This PR gets rid of the double-call, so the follow-up will be straight-forward.

A handful of cleanups are done here too:

  • unrelated docstring fix for TimedeltaIndex.replace
  • minor formatting improvements
  • remove an unused kwarg for DataFrame._compare_frame
  • remove a loop in add_flex_arithmetic_methods, replace with an assertion that it is unnecessary
  • fix names for some SparseArray method names, e.g. ATM SparseArray.__eq__.__name__ == 'eq'
is_self_int_dtype = is_integer_dtype(self.dtype)

fill_int = lambda x: x.fillna(0)
fill_bool = lambda x: x.fillna(False).astype(bool)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving these outside of the function scope to clarify they don't need to be re-defined on each call. _bool_method_SERIES is the topic of a separate PR, but this is sufficiently small to include in assorted-cleanups.

@codecov
Copy link

codecov bot commented Feb 20, 2018

Codecov Report

Merging #19782 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@ Coverage Diff @@ ## master #19782 +/- ## ========================================== + Coverage 91.61% 91.61% +<.01%  ========================================== Files 150 150 Lines 48887 48882 -5 ========================================== - Hits 44786 44782 -4  + Misses 4101 4100 -1
Flag Coverage Δ
#multiple 89.98% <100%> (ø) ⬆️
#single 41.79% <28.57%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/indexes/timedeltas.py 90.56% <ø> (ø) ⬆️
pandas/core/sparse/series.py 95.25% <ø> (-0.02%) ⬇️
pandas/core/ops.py 96.86% <100%> (+0.12%) ⬆️
pandas/core/sparse/frame.py 94.81% <100%> (ø) ⬆️
pandas/core/frame.py 97.22% <100%> (ø) ⬆️
pandas/core/sparse/array.py 91.46% <100%> (+0.08%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update af5e8ec...2b81a5c. Read the comment docs.

other = SparseArray(other, fill_value=self.fill_value,
dtype=dtype)
return _sparse_array_op(self, other, op, name)
return _sparse_array_op(self, other, op, name, series=False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this whole sparse interaction is pretty hacky (I know not addressing this now), but....

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yah, there are a couple of improvements in store for the pass after next (which is just implementing the single-call ops.add_whatever_methods(cls))

sparse_series_special_funcs = dict(arith_method=_arith_method_SPARSE_SERIES,
comp_method=_arith_method_SPARSE_SERIES,
bool_method=None)
bool_method=_bool_method_SERIES)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we hold off adding till then

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is necessary to get rid of the double-call in sparse.series.

@jreback jreback added Sparse Sparse Data Type Clean labels Feb 20, 2018
@jreback jreback added this to the 0.23.0 milestone Feb 21, 2018
@jreback jreback merged commit 8382067 into pandas-dev:master Feb 21, 2018
@jreback
Copy link
Contributor

jreback commented Feb 21, 2018

thanks

@jbrockmendel jbrockmendel deleted the spops3 branch February 21, 2018 00:18
harisbal pushed a commit to harisbal/pandas that referenced this pull request Feb 28, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Clean Sparse Sparse Data Type

2 participants