Skip to content

Conversation

@pandres
Copy link
Contributor

@pandres pandres commented Mar 6, 2018

After asking for feedback, this is the very minimal docstring that we interpret that would be acceptable.

This is in order to unlock the full list of functions we have assigned for the pandas documentation sprint.

If there is something to be improved here please tell us. Also we may make a new push of this documentation including the full scope of the args and kwargs parameters and more examples.

@pep8speaks
Copy link

pep8speaks commented Mar 6, 2018

Hello @pandres! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on March 09, 2018 at 16:01 Hours UTC
@jorisvandenbossche jorisvandenbossche changed the title Add scaffold docstring for groupby.DataFrameGroupBy.resample. DOC: improve groupby.DataFrameGroupBy.resample docstring Mar 6, 2018
Downsample the series into 3 minute bins and sum the values
of the timestamps falling into a bin.
>>> series.resample('3T').sum()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is resample on a groupby object, so the examples should also show that. See here for some explanation of this: https://pandas.pydata.org/pandas-docs/stable/groupby.html#new-syntax-to-window-and-resample-operations

value in the resampled bucket with the label ``2000-01-01 00:03:00``
does not include 3 (if it did, the summed value would be 6, not 3).
To include this value close the right side of the bin interval as
illustrated in the example below this one.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General feedback: best to try to keep this text shorter.
And related to my comment above: I don't think you need to explain the details here of the resample method (for that you can refer to DataFrame.resample and include a "See Also" section), but rather how it is used together with groupby.

@jorisvandenbossche
Copy link
Member

Ah, OK, I see you actually copied the examples from the Series/DataFrame.resample docstring, that's why they were off topic :)
(but, best to only open a PR when you actually have changes, work in progress is certainly fine, but best that they are already relevant)

@pandres
Copy link
Contributor Author

pandres commented Mar 6, 2018

@jorisvandenbossche, thank you for the feedback. Maybe I rushed here, we were just doing a warm-up for the sprint on Saturday.

We do have questions though, related with the Series and DataFrames examples. I'll check your comment in #20020 (comment)

@jorisvandenbossche
Copy link
Member

If something is unclear, I think it is best to ask on the gitter channel

@codecov
Copy link

codecov bot commented Mar 9, 2018

Codecov Report

Merging #20020 into master will increase coverage by <.01%.
The diff coverage is n/a.

Impacted file tree graph

@@ Coverage Diff @@ ## master #20020 +/- ## ========================================== + Coverage 91.67% 91.67% +<.01%  ========================================== Files 150 150 Lines 49096 49096 ========================================== + Hits 45007 45009 +2  + Misses 4089 4087 -2
Flag Coverage Δ
#multiple 90.05% <ø> (ø) ⬆️
#single 41.87% <ø> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/groupby.py 92.14% <ø> (ø) ⬆️
pandas/util/testing.py 83.85% <0%> (+0.2%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e02f737...d9331ff. Read the comment docs.

@pandres
Copy link
Contributor Author

pandres commented Mar 9, 2018

Not sure what's going on with the circleci failures.

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some comments, but please:

  • Make sure you read the documentation (it explains the next points there, among other things)
  • Make sure somebody else (not the author) understands what the method does, and find this page useful. The (local) reviewer should try to imagine that she/he is a pandas user who wants to use this method. If this page doesn't help, is confusing... Fix it until it's "perfect".
  • Run the validate_docstrings.py script, and make sure you don't have any of the obvious mistake the script detects.
"""Provide resampling when using a TimeGrouper.
Return a new grouper with our resampler appended
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is the original description, but the goal of the sprint is to make it better. After reading these texts, it's still difficult for me to understand what this method is about. In which cases it'll be useful for me.

def resample(self, rule, *args, **kwargs):
"""
Provide resampling when using a TimeGrouper
"""Provide resampling when using a TimeGrouper.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The short summary goes to the next line. Did you run the ./validate_docstrings.py script with it? It should let you detect formatting errors like this, and save time from reviewers, on things that a script can tell you.

--------
Start by creating a DataFrame with 9 one minute timestamps.
>>> index = pd.date_range('1/1/2000', periods=9, freq='T')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say that we leave a blank line between the text and the code. Does it render all right in the html?

The convention is to use idx instead of index.

2000-01-01 00:07:00 0 1 2 3
2000-01-01 00:08:00 0 1 2 3
>>> series = pd.Series(range(9), index=index) # delete this
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The convention is to use s instead of series, you can check these in the documentation guide. # delete this is because you forgot to delete this? it doesn't seem to add value this line.

Parameters
----------
rule : str
The offset string or object representing target conversion.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this a bit cryptic. I think we can find a better explanation. And I'm sure there is somewhere in the pandas documentation where it explains which are the different values, we could link it. Also we could provide an example.

function.
**kwargs
These parameters will be passed to the get_resampler_for_grouping
function.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be good to do some research on what get_resampler_for_grouping does with them, which can be used, and what's the impact on this method.

@pandres pandres closed this Mar 9, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

4 participants