Closed
I find R's expand.grid() function quite useful for quick creation of example datasets. For example:

    expand.grid(height = seq(60, 70, 5), weight = seq(100, 180, 40),
                sex = c("Male","Female"))

       height weight    sex
    1      60    100   Male
    2      65    100   Male
    3      70    100   Male
    4      60    140   Male
    5      65    140   Male
    6      70    140   Male
    7      60    180   Male
    8      65    180   Male
    9      70    180   Male
    10     60    100 Female
    11     65    100 Female
    12     70    100 Female
    13     60    140 Female
    14     65    140 Female
    15     70    140 Female
    16     60    180 Female
    17     65    180 Female
    18     70    180 Female

A simple implementation of this for pandas is easy to put together:

    import itertools

    import pandas as pd

    def expand_grid(dct):
        # Cartesian product of all value sequences, one tuple per output row
        rows = itertools.product(*dct.values())
        return pd.DataFrame.from_records(rows, columns=list(dct.keys()))

    df = expand_grid(
        {'height': range(60, 71, 5),
         'weight': range(100, 181, 40),
         'sex': ['Male', 'Female']}
    )
    print(df)

Do people think this would be a useful addition?
If so, what kind of features should it have beyond the basics? A dtypes argument, specifying which column should be the index, etc.?
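As a rough sketch of what those extras might look like: the `dtypes` and `index` parameter names below are placeholders I made up for illustration, not an existing pandas API, and the behaviour is only one possible reading of the idea.

```python
import itertools

import pandas as pd


def expand_grid(dct, dtypes=None, index=None):
    """Sketch only: `dtypes` and `index` are hypothetical parameters."""
    rows = list(itertools.product(*dct.values()))
    df = pd.DataFrame.from_records(rows, columns=list(dct.keys()))
    if dtypes is not None:
        # dtypes maps column name -> dtype, e.g. {"sex": "category"}
        df = df.astype(dtypes)
    if index is not None:
        # index is a column name (or list of names) to move into the index
        df = df.set_index(index)
    return df


df = expand_grid(
    {"height": range(60, 71, 5),
     "weight": range(100, 181, 40),
     "sex": ["Male", "Female"]},
    dtypes={"sex": "category"},
)
print(df.dtypes)
```

Casting `sex` to `category` is probably the closest analogue to the factor column that R produces, which may be an argument for having a `dtypes`-style option at all.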
I'm also not sure if expand_grid is the most intuitive name, but given that it's duplicating
R functionality, maybe it's best just to leave it as is.
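
One detail to settle if the goal really is to duplicate R: `expand.grid` varies the first factor fastest (height in the R output above), whereas `itertools.product` varies the last one fastest, so the simple implementation returns the same rows in a different order. A minimal sketch that reproduces R's row order, assuming that matching it is actually wanted (the name `expand_grid_r_order` is hypothetical):

```python
import itertools

import pandas as pd


def expand_grid_r_order(dct):
    names = list(dct.keys())
    # Feed the factors to product() in reverse so the first factor ends up
    # varying fastest, then flip each row back to the original column order.
    reversed_values = [list(dct[name]) for name in reversed(names)]
    rows = [row[::-1] for row in itertools.product(*reversed_values)]
    return pd.DataFrame.from_records(rows, columns=names)


df_r = expand_grid_r_order(
    {"height": range(60, 71, 5),
     "weight": range(100, 181, 40),
     "sex": ["Male", "Female"]}
)
print(df_r.head())
```

Whether the default should follow R or `itertools.product` is a design choice; sorting by the last column afterwards would give a similar result for this example, but only when the input values are already sorted.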
mayer79