Pandas Dataframe grouping and summarizing

Question

I have the following pandas dataframe, df:

User   Docs  Pref
user1  doc1  m1
user1  doc2  m2
user1  doc3  m1
user1  doc4  m3
user2  doc1  m1
user2  doc2  m2
user3  doc1  m3
user4  doc1  m2

I need to get the following dataframe:

User   m1Count  m2Count  m3Count
user1        2        1        1
user2        1        1        0
user3        0        0        1
user4        0        1        0

I tried to use value_counts but couldn't get what I want. Any help will be appreciated.

import pandas as pd

df = pd.DataFrame(
    {
        "User": ["user1", "user1", "user1", "user1", "user2", "user2", "user3", "user4"],
        "Docs": ["doc1", "doc2", "doc3", "doc4", "doc1", "doc2", "doc1", "doc1"],
        "Pref": ["m1", "m2", "m1", "m3", "m1", "m2", "m3", "m2"],
    }
)

Erfan · Accepted Answer · 2022-08-20 14:50:18Z

You can use groupby with value_counts and unstack:

df.groupby("User")["Pref"].value_counts().unstack().fillna(0).astype(int)

Pref   m1  m2  m3
User
user1   2   1   1
user2   1   1   0
user3   0   0   1
user4   0   1   0

If you want to clean the column and index names:

(
    df.groupby("User")["Pref"]
    .value_counts()
    .unstack()
    .fillna(0)
    .astype(int)
    .rename_axis(None)
    .rename_axis(None, axis="columns")
    .add_suffix("Count")
)

       m1Count  m2Count  m3Count
user1        2        1        1
user2        1        1        0
user3        0        0        1
user4        0        1        0
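
As a possible shortening of the chain above, the fillna(0).astype(int) step can be folded into unstack through its fill_value parameter. This is a minimal sketch that rebuilds the question's sample data; the variable name counts is only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "User": ["user1", "user1", "user1", "user1", "user2", "user2", "user3", "user4"],
        "Docs": ["doc1", "doc2", "doc3", "doc4", "doc1", "doc2", "doc1", "doc1"],
        "Pref": ["m1", "m2", "m1", "m3", "m1", "m2", "m3", "m2"],
    }
)

# Count each Pref per User, then pivot Pref into columns.
# fill_value=0 fills the missing (User, Pref) combinations directly,
# so no separate fillna/astype step is needed.
counts = (
    df.groupby("User")["Pref"]
    .value_counts()
    .unstack(fill_value=0)
)
print(counts)
```

The renaming steps (rename_axis, add_suffix) can be appended to this chain unchanged.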

cottontail · Accepted Answer · 2022-08-21 15:39:10Z

This is a crosstab operation. Pandas has a built-in function for it. Add a suffix to column names, remove axis names and you're done. Read more about how it relates to groupby and pivot here (the other operations used in the other answers).

(
    pd.crosstab(df['User'], df['Pref'])
    .add_suffix('Count')
    .reset_index()
    .rename_axis(columns=None)
)
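
To illustrate how crosstab relates to the groupby approach, here is a small sketch that builds the count table both ways on the question's sample data and compares the values. The names via_crosstab, via_groupby and same are illustrative only.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "User": ["user1", "user1", "user1", "user1", "user2", "user2", "user3", "user4"],
        "Docs": ["doc1", "doc2", "doc3", "doc4", "doc1", "doc2", "doc1", "doc1"],
        "Pref": ["m1", "m2", "m1", "m3", "m1", "m2", "m3", "m2"],
    }
)

# crosstab counts co-occurrences of the two columns in one call.
via_crosstab = pd.crosstab(df["User"], df["Pref"])

# The same table built by counting per group and unstacking.
via_groupby = df.groupby("User")["Pref"].value_counts().unstack(fill_value=0)

# Compare the raw values cell by cell (rows and columns are sorted the same way).
same = bool((via_crosstab.to_numpy() == via_groupby.to_numpy()).all())
print(via_crosstab)
print("identical counts:", same)
```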

cottontail · Accepted Answer · 2022-08-21 15:57:09Z

You can use the pivot method of pandas, the inverse of melt. It turns the unique values of a column into column names; you can then group by User and count the non-null entries.

https://www.geeksforgeeks.org/python-pandas-melt/

df.pivot(index=['User', 'Docs'], columns='Pref', values='Pref').groupby('User').count()
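
The one-liner above can be split into two steps to show why counting works. This is a sketch on the question's sample data, assuming a pandas version that accepts a list for the index argument of pivot and that every (User, Docs) pair is unique, since pivot raises an error on duplicate index entries. The names wide and counts are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "User": ["user1", "user1", "user1", "user1", "user2", "user2", "user3", "user4"],
        "Docs": ["doc1", "doc2", "doc3", "doc4", "doc1", "doc2", "doc1", "doc1"],
        "Pref": ["m1", "m2", "m1", "m3", "m1", "m2", "m3", "m2"],
    }
)

# Step 1: one row per (User, Docs), one column per Pref value.
# Each row holds its Pref string in the matching column and NaN elsewhere.
wide = df.pivot(index=["User", "Docs"], columns="Pref", values="Pref")

# Step 2: count() ignores NaN, so grouping by the User index level
# yields the number of documents per preference for each user.
counts = wide.groupby(level="User").count()
print(counts)
```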
