1

This is how my df looks like.

 A B C D E F xyz abc aa 100 qq brc,pqr,lmn xyz abc bb 150 qq lmn,brc,ppq xyz abc cc 80 qq lmn,pqr abc pqr cc 99 qq pqr,brc,lmn abc pqr aa 180 qq brc,lmn,pqr abc pqr bb 200 qq lmn,pqr,brc 

this is what i want. how can i achieve the same?

A B aa bb cc E F xyz abc 100 150 80 qq brc,pqr,lmn,ppq abc pqr 180 200 99 qq brc,lmn,pqr 

1 Answer 1

1

Use DataFrame.pivot_table with DataFrame.join for another DataFrame created by GroupBy.agg:

f = lambda x: ','.join(set(z for y in x for z in y.split(','))) df1 = df.groupby(['A','B']).agg({'E':'first', 'F': f}) df = df.pivot_table(index=['A','B'], columns='C', values='D', aggfunc='mean').join(df1).reset_index() print (df) A B aa bb cc E F 0 abc pqr 180 200 99 qq lmn,brc,pqr 1 xyz abc 100 150 80 qq lmn,ppq,brc,pqr 
Sign up to request clarification or add additional context in comments.

6 Comments

Thank you for responding but I am getting the error as 'No numeric types to aggregate' as the data type is string for values. what can i do about it?
@Prachi - You can use df.D = df.D.astype('int') or some another solumn from this like pd.to_numeric
ok, i did that. but my output doesnt look like how i want it. i actually got the column 'cc' twice and the values seperated too. i tried df.groupby(df.columns,axis=1).first() but it didnt work too.
@Prachi - There are not some space like columns are 'cc' and ' cc' ? What is print (df.columns.tolist()) after my solution?
@Prachi - Thank you, so problem is cc has space, so need before my solution df.C = df.C.str.strip() for convert ' cc' to 'cc'
|

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.