Using GroupBy to add new columns to data frame

Question

I have a data frame like so

    id val1 val2
    0  A    B
    1  C    D
    1  E    F
    2  G    H

and I'm trying to reshape it into:

    id val1 val2 val3 val4
    0  A    B
    1  C    D    E    F
    2  G    H

It doesn't matter what the additional column names are and I may not know how many duplicates there are of each id, so I may not know exactly how many columns to add.

Any advice for solving a problem like this? I've been trying to use pandas and groupBy, but I'm not constrained to either. Thanks!

cs95 · Accepted Answer · 2019-06-30 14:28:05Z


This is a pivot problem, but you'll need to convert your frame from wide to long before you can pivot it:

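    # wide -> long: one row per (id, original column, value)
    u = df.melt('id')
    # number the values within each id, then pivot those numbers into columns
    # (keyword arguments, since recent pandas no longer accepts pivot(*u))
    u.assign(variable=u.groupby('id').cumcount()).pivot(
        index='id', columns='variable', values='value')

    variable  0  1    2    3
    id
    0         A  B  NaN  NaN
    1         C  E    D    F
    2         G  H  NaN  NaN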
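For anyone who wants to run this end to end, here is a minimal sketch of the melt-then-pivot approach. It assumes the sample frame from the question; the name `wide` is only illustrative.

```python
import pandas as pd

# Sample frame from the question; ids may repeat.
df = pd.DataFrame({
    "id": [0, 1, 1, 2],
    "val1": ["A", "C", "E", "G"],
    "val2": ["B", "D", "F", "H"],
})

# Wide -> long: one row per (id, original column, value).
u = df.melt("id")

# Replace the original column names with a running counter per id,
# then pivot so each counter value becomes its own column.
wide = (
    u.assign(variable=u.groupby("id").cumcount())
     .pivot(index="id", columns="variable", values="value")
)

print(wide)
```

Note that values are numbered in melted order, so for a duplicated id all `val1` entries come before the `val2` entries (C, E, D, F for id 1). Ids with fewer values are padded with NaN.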


2 Comments

Frankie Over a year ago

Thanks! I wasn't able to quite make it work for my situation, can you take a look at my edit?

cs95 Over a year ago

@Frankie You've oversimplified your problem to the extent that a solution cannot scale. But try this: df.set_index(['image', df.groupby('image').cumcount()]).unstack()
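A rough sketch of that set_index/unstack suggestion, applied to the sample frame from the question. It assumes the key column is `id` rather than `image` (the `image` column comes from an edit that is not shown here), and the final column-flattening step is an optional addition, not part of the comment.

```python
import pandas as pd

# Sample frame from the question, keyed on "id" (assumption; the comment uses "image").
df = pd.DataFrame({
    "id": [0, 1, 1, 2],
    "val1": ["A", "C", "E", "G"],
    "val2": ["B", "D", "F", "H"],
})

# Position of each row within its id group: 0, 0, 1, 0.
pos = df.groupby("id").cumcount()

# Index by (id, position), then move the position level into the columns.
wide = df.set_index(["id", pos]).unstack()

# Optional: flatten the (column, position) MultiIndex into names like "val1_0".
wide.columns = [f"{col}_{i}" for col, i in wide.columns]

print(wide)
```

This variant keeps the original column names in the result, so it is easier to tell which source column each value came from.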
