Make new column based on groupby calculation

Question

I have the following pandas DataFrame:

import pandas as pd

df = pd.DataFrame({
    "category": ["one", "one", "one", "one", "two", "two", "two", "three", "three", "three"],
    "value": [2, 4, 3, 2, 5, 6, 5, 7, 8, 6]
})

>>> df
  category  value
0      one      2
1      one      4
2      one      3
3      one      2
4      two      5
5      two      6
6      two      5
7    three      7
8    three      8
9    three      6

I want to add a new column called normalized: compute the median of each category (or any other groupby aggregation) and subtract it (or apply any other simple operation) from the corresponding values in the original, non-grouped DataFrame. Written as an explicit loop, this is what I mean:

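new_column = []
# Groupby equivalent: loop over each category
# (the result only lines up with df because each category's rows are contiguous here)
for cat in df["category"].unique():
    curr_df = df[df["category"] == cat]
    # Median of the "value" column only, not of the whole frame
    curr_median = curr_df["value"].median()
    # Calculation on groupby components
    for val in curr_df["value"]:
        normalized = val - curr_median
        new_column.append(normalized)
df["normalized"] = new_column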

Which results in the following DataFrame:

df = pd.DataFrame({
    "category": ["one", "one", "one", "one", "two", "two", "two", "three", "three", "three"],
    "value": [2, 4, 3, 2, 5, 6, 5, 7, 8, 6],
    "normalized": [-0.5, 1.5, 0.5, -0.5, 0.0, 1.0, 0.0, 0.0, 1.0, -1.0]
})

>>> df
  category  value  normalized
0      one      2        -0.5
1      one      4         1.5
2      one      3         0.5
3      one      2        -0.5
4      two      5         0.0
5      two      6         1.0
6      two      5         0.0
7    three      7         0.0
8    three      8         1.0
9    three      6        -1.0

How could I write this in a nicer, pandas way? Thanks in advance :)

Does this answer your question? Pandas groupby and correct with median in new column
– Soumendra Mishra, Commented Aug 28, 2020 at 9:59

anon01 · Accepted Answer · 2020-08-28 09:14:21Z

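transform is your friend. I think of it as apply for when I want to keep the original DataFrame's shape: it computes the result per group and broadcasts it back onto every row of that group, aligned with the original index. You can use this: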

df["normalized"] = df.value - df.groupby("category").value.transform("median")

output:

  category  value  normalized
0      one      2        -0.5
1      one      4         1.5
2      one      3         0.5
3      one      2        -0.5
4      two      5         0.0
5      two      6         1.0
6      two      5         0.0
7    three      7         0.0
8    three      8         1.0
9    three      6        -1.0
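
To make the shape difference concrete, here is a minimal, self-contained sketch that puts a plain aggregation next to transform on the question's data. The variable names (grouped, per_group_median, row_median, row_mean) and the extra "centered" column are only illustrative and not part of the original post; the mean is used to show that another statistic drops into the same pattern.

```python
import pandas as pd

df = pd.DataFrame({
    "category": ["one", "one", "one", "one", "two", "two", "two", "three", "three", "three"],
    "value": [2, 4, 3, 2, 5, 6, 5, 7, 8, 6],
})

grouped = df.groupby("category")["value"]

# A plain aggregation collapses each group to a single row,
# indexed by category, so it has 3 entries here.
per_group_median = grouped.median()

# transform computes the same statistic but returns one entry per
# original row, aligned with df's index, so it has 10 entries here.
row_median = grouped.transform("median")
row_mean = grouped.transform("mean")

# Because the transformed Series share df's index, element-wise
# arithmetic with the original column lines up row by row.
df["normalized"] = df["value"] - row_median

# Hypothetical extra column: same pattern with a different statistic.
df["centered"] = df["value"] - row_mean

print(per_group_median)
print(df)
```

Since the result is aligned on the index rather than on row position, this should also work when the rows of a category are not stored next to each other, which the explicit loop in the question does not handle.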
