@datapythonista
Member

@jreback
Contributor

jreback commented Mar 1, 2018

I thought we had an issue about this already, IIRC @mrocklin either made it or commented (was for a slightly different purpose though).

@jreback
Contributor

jreback commented Mar 1, 2018

We already have lots of data constructors in pandas.util.testing, though these are 'nicer' ones. These would need testing (e.g. do they run), and can be de-privatized (no leading _).

@jorisvandenbossche
Member

jorisvandenbossche commented Mar 1, 2018

> I thought we had an issue about this already, IIRC @mrocklin either made it or commented (was for a slightly different purpose though).

I don't know if there is an older issue, but this is related to what is recently discussed in #19710 as @datapythonista linked to.

> We already have lots of data constructors in pandas.util.testing, though these are 'nicer' ones.

Indeed, I don't think the ones in util.testing are suitable for this, as the exact purpose of those is to be 'nicer', relatable small dataframes.

@datapythonista
Member Author

Any more thoughts on this? Knowing which data to use for the examples in the docstrings is the main blocker for the sprint. Any feedback on how you think we can improve this first draft is highly appreciated. Thanks!

@jorisvandenbossche
Member

This needs to be imported in some __init__ files, as otherwise you cannot do pd.io.samples. .... Or what would be the intended use in the docs?
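To illustrate the point about `__init__` files, here is a minimal, standard-library-only sketch. It builds two throwaway packages on disk and checks whether `<package>.io.samples` resolves after a plain `import <package>`. The package names, the helper functions and the `countries()` stub are all hypothetical; none of this is pandas code.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_package(root, name, import_in_init):
    """Create <name>/io/samples.py; optionally import it in io/__init__.py."""
    io_dir = Path(root) / name / "io"
    io_dir.mkdir(parents=True)
    # The top-level __init__ imports the io subpackage, as a library might.
    (Path(root) / name / "__init__.py").write_text("from . import io\n")
    init_body = "from . import samples\n" if import_in_init else ""
    (io_dir / "__init__.py").write_text(init_body)
    # Hypothetical stand-in for a sample-data module.
    (io_dir / "samples.py").write_text("def countries():\n    return ['AO']\n")


def samples_reachable(name):
    """Return True if <name>.io.samples resolves after `import <name>`."""
    module = importlib.import_module(name)
    return hasattr(module.io, "samples")


with tempfile.TemporaryDirectory() as root:
    build_package(root, "demo_without_import", import_in_init=False)
    build_package(root, "demo_with_import", import_in_init=True)
    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()
        without_import = samples_reachable("demo_without_import")
        with_import = samples_reachable("demo_with_import")
    finally:
        sys.path.remove(root)

print(without_import, with_import)
```

In this sketch the submodule only becomes reachable as an attribute once the subpackage's `__init__.py` imports it, which is the situation the comment describes.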

import pandas


def _countries_with_penguins():
Contributor

these need to be de-privatized (no leading _)

    columns = ('Code', 'Name', 'Capital', 'Continent',
               'Penguin species', 'Avg. temperature')
    data = [
        ('AO', 'Angola', 'Luanda', 'AF', 1, 21.55),
Contributor

these need tests
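As a rough idea of what such a test might look like, here is a minimal smoke-test sketch. It assumes a de-privatized constructor named `countries_with_penguins` (a hypothetical name, not an existing pandas function) and uses only the single data row visible in the excerpt above; the remaining rows are elided there and are not reproduced here.

```python
import pandas as pd


def countries_with_penguins():
    """Hypothetical public version of the sample-data constructor."""
    columns = ('Code', 'Name', 'Capital', 'Continent',
               'Penguin species', 'Avg. temperature')
    # Only the first row is visible in the excerpt; the rest is elided.
    data = [
        ('AO', 'Angola', 'Luanda', 'AF', 1, 21.55),
    ]
    return pd.DataFrame(data, columns=list(columns))


def test_countries_with_penguins():
    """Smoke test: the constructor runs and returns a well-formed frame."""
    df = countries_with_penguins()
    assert isinstance(df, pd.DataFrame)
    assert not df.empty
    assert list(df.columns) == ['Code', 'Name', 'Capital', 'Continent',
                                'Penguin species', 'Avg. temperature']
    # Country codes should identify rows unambiguously.
    assert df['Code'].is_unique


test_countries_with_penguins()
```

A check like this only confirms that the constructor runs and has the expected shape; it does not validate the data values themselves.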

@datapythonista
Member Author

Given what has been discussed in #19710, it probably makes more sense to just collect some ideas for data to use, and then build custom datasets that are as simple and illustrative as possible for each case. So, closing this.
