-
- Notifications
You must be signed in to change notification settings - Fork 19.4k
ENH: add quantile method to resample #22304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@ ## master #22304 +/- ## ========================================= Coverage ? 92.05% ========================================= Files ? 169 Lines ? 50716 Branches ? 0 ========================================= Hits ? 46685 Misses ? 4031 Partials ? 0
Continue to review full report at Codecov.
|
doc/source/whatsnew/v0.24.0.txt Outdated
| ``SeriesGroupBy`` when the grouping variable only contains NaNs and numpy version < 1.13 (:issue:`21956`). | ||
| - Multiple bugs in :func:`pandas.core.Rolling.min` with ``closed='left'` and a | ||
| datetime-like index leading to incorrect results and also segfault. (:issue:`21704`) | ||
| - Added :meth:`Resampler.quantile` (:issue:`15023`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to other enhancements
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this needs to be added in api.rst as well
| def quantile(self, q=0.5, **kwargs): | ||
| """ | ||
| Return value at the given quantile. | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a versionadded tag
| | ||
| # The various methods we support | ||
| downsample_methods = ['min', 'max', 'first', 'last', 'sum', 'mean', 'sem', | ||
| 'median', 'prod', 'var', 'ohlc'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add the test from the OP which compares 2 methods of getting the same result
c0e8601 to b64bd34 Compare | Hello @discort! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on August 22, 2018 at 10:44 Hours UTC |
b64bd34 to 07ed27f Compare | Parameters | ||
| ---------- | ||
| q : float or array-like, default 0.5 (50% quantile) | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add some See Also and point to the Series / groupby versions
pandas/tests/test_resample.py Outdated
| res = df['timestamp'].resample('2D').first() | ||
| tm.assert_series_equal(res, exp) | ||
| | ||
| def test_resample_quantile(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you test on timedelta / period as well. you might want to move to the Base class to make this easier.
doc/source/whatsnew/v0.24.0.txt Outdated
| - :func:`to_timedelta` now supports iso-formated timedelta strings (:issue:`21877`) | ||
| - :class:`Series` and :class:`DataFrame` now support :class:`Iterable` in constructor (:issue:`2193`) | ||
| - :class:`DatetimeIndex` gained :attr:`DatetimeIndex.timetz` attribute. Returns local time with timezone information. (:issue:`21358`) | ||
| - Added :meth:`Resampler.quantile` (:issue:`15023`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you say that Series.resample and DataFrame.resample have gained the Resample.quantile() method (use references for .resample)
pandas/core/resample.py Outdated
| how = self._is_cython_func(how) or how | ||
| ax = self.ax | ||
| | ||
| # Empty groupby objs are accepted only callable funcs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what triggers this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm this is odd. can you remove this fix, and instead: create a new issue which demonstrates this (is this only with this method?) and xfail the test. I think something bigger is going on here. We can merge this then come back with a battery of tests for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to more precisely, this block doesn't pass the check and raises exception.
It became to fail after adding kwargs to _groupby_and_aggregate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, this is old logic. can you try calling
def _try_aggregate_string_function(self, arg, *args, **kwargs) (its defined in the base class) and see if this works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it doesn't
.../pandas/pandas/core/groupby/generic.py in _cython_agg_general(self, how, alt, numeric_only, min_count) 74 min_count=-1): 75 new_items, new_blocks = self._cython_agg_blocks( ---> 76 how, alt=alt, numeric_only=numeric_only, min_count=min_count) 77 return self._wrap_agged_blocks(new_items, new_blocks) 78 .../pandas/pandas/core/groupby/generic.py in _cython_agg_blocks(self, how, alt, numeric_only, min_count) 147 148 if len(new_blocks) == 0: --> 149 raise DataError('No numeric types to aggregate') 150 151 # reset the locs in the blocks to correspond to our DataError: No numeric types to aggregate There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, ok, then let's just xfail this test and pls open another issue. I think this is a bigger issues and needs to be addressed separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pandas/core/resample.py Outdated
| how = self._is_cython_func(how) or how | ||
| ax = self.ax | ||
| | ||
| # Empty groupby objs are accepted only callable funcs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm this is odd. can you remove this fix, and instead: create a new issue which demonstrates this (is this only with this method?) and xfail the test. I think something bigger is going on here. We can merge this then come back with a battery of tests for this.
pandas/core/resample.py Outdated
| ax = self.ax | ||
| | ||
| # Empty groupby objs are accepted only callable funcs | ||
| # with arg or kwargs currently |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok so remove this fix here
| expected = series.resample(freq).apply(np.sum) | ||
| | ||
| assert_series_equal(result, expected, check_dtype=False) | ||
| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then xfail this test for Period, in your followup you can unxfail it.
OR you can do everything here is ok too. (close the new PR / issue). I guess this was simpler that it looked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second way looks simpler. So I closed new PR/issue
3f831f2 to 6f4f96f Compare
jreback left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor comment on the doc-string, ping on green.
pandas/core/resample.py Outdated
| DataFrame.quantile | ||
| DataFrameGroupBy.quantile | ||
| .. versionadded:: 0.24.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the tag goes right below the summary line (the first line of the doc-string, with a blank line before and after)
6f4f96f to 92b5b2d Compare | thanks @discort keep em coming! |
git diff upstream/master -u -- "*.py" | flake8 --diff