$\begingroup$

I have the following pandas DataFrame (df):

    User   Docs  Pref
    user1  doc1  m1
    user1  doc2  m2
    user1  doc3  m1
    user1  doc4  m3
    user2  doc1  m1
    user2  doc2  m2
    user3  doc1  m3
    user4  doc1  m2

I need to get the following data frame, with one row per user and one count column per Pref value:

    User   m1Count  m2Count  m3Count
    user1  2        1        1
    user2  1        1        0
    user3  0        0        1
    user4  0        1        0

I tried to use value_counts but couldn't get what I want. Any help would be appreciated.

    import pandas as pd

    df = pd.DataFrame(
        {
            "User": ["user1", "user1", "user1", "user1", "user2", "user2", "user3", "user4"],
            "Docs": ["doc1", "doc2", "doc3", "doc4", "doc1", "doc2", "doc1", "doc1"],
            "Pref": ["m1", "m2", "m1", "m3", "m1", "m2", "m3", "m2"],
        }
    )
$\endgroup$

3 Answers

$\begingroup$

You can use groupby with value_counts and unstack:

    df.groupby("User")["Pref"].value_counts().unstack().fillna(0).astype(int)

    Pref   m1  m2  m3
    User
    user1   2   1   1
    user2   1   1   0
    user3   0   0   1
    user4   0   1   0

If you want to clean the column and index names:

    (
        df.groupby("User")["Pref"]
        .value_counts()
        .unstack()
        .fillna(0)
        .astype(int)
        .rename_axis(None)
        .rename_axis(None, axis="columns")
        .add_suffix("Count")
    )

           m1Count  m2Count  m3Count
    user1        2        1        1
    user2        1        1        0
    user3        0        0        1
    user4        0        1        0
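
As a possible shortening of the chain above, unstack accepts a fill_value argument, which should make the separate fillna(0).astype(int) step unnecessary. The following is a minimal sketch that assumes the df from the question; the variable name counts is only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "User": ["user1", "user1", "user1", "user1", "user2", "user2", "user3", "user4"],
        "Docs": ["doc1", "doc2", "doc3", "doc4", "doc1", "doc2", "doc1", "doc1"],
        "Pref": ["m1", "m2", "m1", "m3", "m1", "m2", "m3", "m2"],
    }
)

counts = (
    df.groupby("User")["Pref"]
    .value_counts()
    # Missing (User, Pref) pairs are filled with 0 while unstacking,
    # so no NaN values are created in the first place.
    .unstack(fill_value=0)
    .add_suffix("Count")
)
print(counts)
```

Here the index and column axes keep their names (User and Pref); the rename_axis calls shown earlier can still be appended if those names are unwanted.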
$\endgroup$
$\begingroup$

This is a crosstab operation. Pandas has a built-in function for it. Add a suffix to column names, remove axis names and you're done. Read more about how it relates to groupby and pivot here (the other operations used in the other answers).

    (
        pd.crosstab(df['User'], df['Pref'])
        .add_suffix('Count')
        .reset_index()
        .rename_axis(columns=None)
    )
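
To try the crosstab approach end to end, here is a self-contained sketch that rebuilds the question's df and stores the outcome in a variable (the name result is an assumption made for illustration). It should yield one row per user, with User as an ordinary column and zeros where a user never chose a given Pref.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "User": ["user1", "user1", "user1", "user1", "user2", "user2", "user3", "user4"],
        "Docs": ["doc1", "doc2", "doc3", "doc4", "doc1", "doc2", "doc1", "doc1"],
        "Pref": ["m1", "m2", "m1", "m3", "m1", "m2", "m3", "m2"],
    }
)

result = (
    pd.crosstab(df["User"], df["Pref"])  # frequency table: rows = User, columns = Pref
    .add_suffix("Count")                 # m1 -> m1Count, etc.
    .reset_index()                       # turn the User index into a regular column
    .rename_axis(columns=None)           # drop the leftover "Pref" columns name
)
print(result)
```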


$\endgroup$
$\begingroup$

You can use the pivot method of pandas, which is the inverse of melt. It converts the unique values of a column into column names; you can then group by User and count the non-null entries to get the counts.

https://www.geeksforgeeks.org/python-pandas-melt/

    # Pivot so each Pref value becomes a column, then count non-null cells per user.
    (
        df.pivot(index=['User', 'Docs'], columns='Pref', values='Pref')
        .groupby('User')
        .count()
    )
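
A related alternative, not part of the answer above, is pivot_table with aggfunc="size", which counts rows per (User, Pref) pair directly and does not need the Docs column in the index. This is only a sketch under the assumption that the question's df is used; the name table is illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "User": ["user1", "user1", "user1", "user1", "user2", "user2", "user3", "user4"],
        "Docs": ["doc1", "doc2", "doc3", "doc4", "doc1", "doc2", "doc1", "doc1"],
        "Pref": ["m1", "m2", "m1", "m3", "m1", "m2", "m3", "m2"],
    }
)

table = (
    # aggfunc="size" counts the rows in each (User, Pref) group;
    # fill_value=0 replaces combinations that never occur.
    df.pivot_table(index="User", columns="Pref", aggfunc="size", fill_value=0)
    .add_suffix("Count")
)
print(table)
```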
$\endgroup$