ENH: Create dataframes using every combination of given values, like R's expand.grid() #7426

@onesandzeroes

I find R's expand.grid() function quite useful for quick creation of example datasets. For example:

    expand.grid(height = seq(60, 70, 5), weight = seq(100, 180, 40), sex = c("Male","Female"))

       height weight    sex
    1      60    100   Male
    2      65    100   Male
    3      70    100   Male
    4      60    140   Male
    5      65    140   Male
    6      70    140   Male
    7      60    180   Male
    8      65    180   Male
    9      70    180   Male
    10     60    100 Female
    11     65    100 Female
    12     70    100 Female
    13     60    140 Female
    14     65    140 Female
    15     70    140 Female
    16     60    180 Female
    17     65    180 Female
    18     70    180 Female

A simple implementation of this for pandas is easy to put together:

    import itertools

    import pandas as pd


    def expand_grid(dct):
        # Every combination of the supplied values (Cartesian product)
        rows = itertools.product(*dct.values())
        return pd.DataFrame.from_records(rows, columns=list(dct.keys()))


    df = expand_grid(
        {'height': range(60, 71, 5),
         'weight': range(100, 181, 40),
         'sex': ['Male', 'Female']}
    )
    print(df)
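
One difference worth noting: itertools.product varies the last key fastest, whereas R's expand.grid() varies the first argument fastest, so the rows come out in a different order from the R output above. Below is a minimal sketch of matching R's ordering. It assumes the dict preserves insertion order (Python 3.7+), and the name expand_grid_r_order is hypothetical, used only for illustration:

```python
import itertools

import pandas as pd


def expand_grid_r_order(dct):
    # Hypothetical helper, not part of pandas.
    keys = list(dct)
    # Take the product over the reversed key order so that the FIRST key
    # ends up varying fastest, as in R's expand.grid().
    rows = itertools.product(*(dct[k] for k in reversed(keys)))
    df = pd.DataFrame.from_records(list(rows), columns=keys[::-1])
    # Restore the original column order.
    return df[keys]


df_r = expand_grid_r_order(
    {'height': range(60, 71, 5),
     'weight': range(100, 181, 40),
     'sex': ['Male', 'Female']}
)
print(df_r.head(4))
```

Whether matching R's row order matters is a design choice; the set of rows is the same either way.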

Do people think this would be a useful addition?

If so, what kind of features should it have beyond the basics? A dtypes argument, specifying which column should be the index, etc.?
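
To make the question concrete, here is a rough sketch of what those two options might look like. The dtypes and index parameters are assumptions for the sake of discussion, not an existing pandas API; they simply forward to DataFrame.astype() and DataFrame.set_index():

```python
import itertools

import pandas as pd


def expand_grid(dct, dtypes=None, index=None):
    # Hypothetical signature: `dtypes` and `index` are illustrative only.
    rows = list(itertools.product(*dct.values()))
    df = pd.DataFrame.from_records(rows, columns=list(dct))
    if dtypes is not None:
        # Mapping of column name -> dtype, passed straight to astype()
        df = df.astype(dtypes)
    if index is not None:
        # Column name (or list of names) to move into the index
        df = df.set_index(index)
    return df


grid = expand_grid(
    {'height': range(60, 71, 5), 'sex': ['Male', 'Female']},
    dtypes={'height': 'float64', 'sex': 'category'},
    index='sex',
)
print(grid)
```

Both options can already be had by chaining .astype() and .set_index() on the result, so the arguments would be a convenience rather than new capability.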

I'm also not sure if expand_grid is the most intuitive name, but given that it's duplicating R functionality, maybe it's best just to leave it as is.
