F/rolling size #24111

bourbaki · 2018-12-05T10:58:30Z

closes DataFrame.rolling.size not implemented and errors in aggregation #24057
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

pep8speaks · 2018-12-05T10:58:33Z

Hello @Ma3aXaKa! Thanks for updating the PR.

There are no PEP8 issues in the file pandas/core/window.py !
In the file pandas/tests/test_window.py, following are the PEP8 issues :

Line 656:60: E261 at least two spaces before inline comment
Line 656:80: E501 line too long (88 > 79 characters)

Comment last updated on December 07, 2018 at 17:52 Hours UTC

bourbaki · 2018-12-05T11:00:22Z

The only question I have is how to properly document an alias like that?

codecov · 2018-12-05T11:33:14Z

Codecov Report

Merging #24111 into master will decrease coverage by <.01%.
The diff coverage is 75%.

@@            Coverage Diff             @@
##           master   #24111      +/-   ##
==========================================
- Coverage   92.21%    92.2%   -0.01%
==========================================
  Files         161      161
  Lines       51684    51688       +4
==========================================
+ Hits        47658    47660       +2
- Misses       4026     4028       +2

Flag	Coverage Δ
#multiple	`90.6% <75%> (-0.01%)`	⬇️
#single	`43% <75%> (-0.01%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/window.py	`96.29% <75%> (-0.11%)`	⬇️
pandas/io/json/json.py	`92.61% <0%> (-0.48%)`	⬇️
pandas/util/testing.py	`87.48% <0%> (+0.09%)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4b5f4d1...45f2bf8.

codecov · 2018-12-05T11:33:14Z

Codecov Report

Merging #24111 into master will decrease coverage by 0.01%.
The diff coverage is 36.36%.

@@            Coverage Diff             @@
##           master   #24111      +/-   ##
==========================================
- Coverage    92.2%   92.19%   -0.02%
==========================================
  Files         162      162
  Lines       51700    51711      +11
==========================================
+ Hits        47670    47674       +4
- Misses       4030     4037       +7

Flag	Coverage Δ
#multiple	`90.59% <36.36%> (-0.02%)`	⬇️
#single	`43.01% <18.18%> (-0.01%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/window.py	`95.61% <36.36%> (-0.79%)`	⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update f492be6...9a28452.

jreback

this is not the right way - this should be set up in the base class (which groupby and resample also inherit from)

.size is not the same as .count either - it’s the size of the window where
count is the size - min_count

bourbaki · 2018-12-05T12:15:19Z

Thanks for the feedback. Maybe this is a stupid question, but could you give an example where size and count should give different results?
And what base class do you mean exactly? Looking at the class diagram the most "promising" one is pandas.core.base.SelectionMixin

bourbaki · 2018-12-05T12:35:02Z

@TomAugspurger Do you think I need to implement a roll_size function? I am looking into _libs/window.pyx and there are Cython implementations of basic aggregate functions for rolling windows.

TomAugspurger · 2018-12-05T12:41:01Z

Perhaps... Though I think size will be equivalent to the length of each window, right? I'm not familiar with this section of the code base, but you may be able to find a way to get that information without having to write any Cython.


WillAyd · 2018-12-05T18:47:25Z

roll_size is only used when dealing with frequency windows; otherwise there's a count method in one of the base classes you could use for reference. Your test case should also take into account various combinations of arguments to the Window.

jreback · 2018-12-06T13:31:51Z

so looking at this again, I think that this is almost exactly count, but counts all values rather than notna ones. So prob easiest to move count to _count() and parameterize this to count NAs via a keyword (skipna=True), then both size and count can call this.
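The refactor suggested above could look roughly like the following pure-Python sketch. Only the names `_count` and `skipna` come from the comment; the list-based windows, the `_is_na` helper, and the function bodies are illustrative assumptions and not pandas internals. The sketch also ignores `min_periods` and frequency-based windows, which a real implementation would have to handle.

```python
import math
from typing import List, Optional, Sequence


def _is_na(value: Optional[float]) -> bool:
    """Hypothetical helper: treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def _count(values: Sequence[Optional[float]], window: int,
           skipna: bool = True) -> List[int]:
    """Tally entries in each trailing window of at most `window` rows.

    With skipna=True only non-missing entries are tallied (count semantics);
    with skipna=False every row in the window is tallied (size semantics).
    """
    result = []
    for end in range(1, len(values) + 1):
        chunk = values[max(0, end - window):end]
        if skipna:
            result.append(sum(1 for v in chunk if not _is_na(v)))
        else:
            result.append(len(chunk))
    return result


def count(values: Sequence[Optional[float]], window: int) -> List[int]:
    # Public count: delegates to the shared helper, skipping missing values.
    return _count(values, window, skipna=True)


def size(values: Sequence[Optional[float]], window: int) -> List[int]:
    # Public size: same helper, but missing values are included.
    return _count(values, window, skipna=False)
```

For `[1.0, nan, 3.0, 4.0, None]` with a window of 3, `count` yields `[1, 1, 2, 2, 2]` while `size` yields `[1, 2, 3, 3, 3]`, so the two only diverge once missing values enter a window.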

jorisvandenbossche · 2018-12-06T21:36:50Z

> Maybe this is a stupid question, but could you give an example where size and count should give different results?

size gives the full size of the group/window (the number of rows), while count is a count of the non-missing values. So once you have missing values, there will be different results.

To give a concrete example with groupby, as there both are implemented:

In [12]: df = pd.DataFrame({'a': [1, 1, 1, 2, 2, 2], 'b': [1, 2, 3, np.nan, 5, 6]})

In [13]: df.groupby('a')['b'].count()
Out[13]:
a
1    3
2    2
Name: b, dtype: int64

In [14]: df.groupby('a')['b'].size()
Out[14]:
a
1    3
2    3
Name: b, dtype: int64
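The same distinction can be illustrated for rolling windows using only existing pandas calls. This is a sketch of the intended semantics, not the `rolling.size` method this PR proposes: the size is emulated by summing a column of ones, and `min_periods=1` and the sample data are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0, 4.0, np.nan])

# Non-missing observations in each trailing window of 3 rows (count semantics).
window_count = s.notna().astype("float64").rolling(3, min_periods=1).sum()

# Rows in each window, missing or not (what a rolling "size" would report).
window_size = pd.Series(1.0, index=s.index).rolling(3, min_periods=1).sum()

count_list = window_count.round().astype(int).tolist()  # [1, 1, 2, 2, 2]
size_list = window_size.round().astype(int).tolist()    # [1, 2, 3, 3, 3]
```

The two results differ exactly in the windows that contain a NaN, mirroring the groupby output above.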

bourbaki · 2018-12-07T17:37:51Z

I have pushed the solution I came up with. Could roll_size be redundant?

bourbaki · 2018-12-10T10:06:05Z

@jreback Is this approach ok?

WillAyd · 2018-12-10T16:01:57Z

Would rather see something that integrates into the existing count function as noted above by myself and Jeff. roll_count is only used with certain arguments so emulating via roll_size isn't the right way to go about this

jreback · 2019-01-05T15:36:46Z

if you want to update according to comment, pls ping.

bourbaki added 2 commits December 5, 2018 11:52

Add size alias for count

25a0e7e

whatsnew + formatting

45f2bf8

jreback requested changes Dec 5, 2018

WillAyd added the Window rolling, ewma, expanding label Dec 5, 2018

jreback added the Enhancement label Dec 6, 2018

WIP: add size functions

9a28452

bourbaki force-pushed the f/rolling-size branch from 636c086 to 9a28452 Compare December 7, 2018 17:52

jreback closed this Jan 5, 2019


6 participants
