Combine similar DataFrame columns and stack values

Question

Currently, I have a dataframe that looks as such:

abc	def	ghi	abc	def	ghi
2	4	78	56	7	45

Is there a way to combine the columns that have the same name and create a new row for each set of values? Example:

abc	def	ghi
2	4	78
56	7	45

Cameron Riddell · Accepted Answer · 2022-07-29 21:17:55Z

You can use .groupby(level=0, axis='columns') with cumcount to number each occurrence of a repeated column label, build a MultiIndex from the labels and those counts, and then stack on the count level.

    import pandas as pd

    new_cols = pd.MultiIndex.from_arrays([df.columns, df.groupby(level=0, axis=1).cumcount()])
    out = df.set_axis(new_cols, axis=1).stack().reset_index(level=0, drop=True)
    print(out)

       abc  def  ghi
    0    2    4   78
    1   56    7   45
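For anyone who wants to run this end to end, here is a minimal sketch of the same idea. It makes two assumptions that are not in the answer above: the sample frame is rebuilt by hand from the question, and the per-label counter is computed from df.columns.to_series() instead of groupby(..., axis=1), because grouping along the column axis is deprecated in recent pandas releases (2.1 and later). Depending on the pandas version, stack() may also emit a FutureWarning about its new implementation.

```python
import pandas as pd

# Sample frame rebuilt from the question (assumed construction).
df = pd.DataFrame(
    [[2, 4, 78, 56, 7, 45]],
    columns=["abc", "def", "ghi", "abc", "def", "ghi"],
)

# Number each occurrence of a column label: the first "abc" gets 0,
# the second "abc" gets 1, and so on. Grouping the labels themselves
# avoids groupby(axis=1).
counts = df.columns.to_series().groupby(level=0).cumcount().to_numpy()

# Pair every label with its occurrence number.
new_cols = pd.MultiIndex.from_arrays([df.columns, counts])

# Move the occurrence level from the columns into the rows, then drop
# the original row index so only the occurrence number remains.
out = df.set_axis(new_cols, axis=1).stack().droplevel(0)
print(out)
```

The occurrence number is what keeps the two sets of values apart: without it, stack has no way to tell the first abc from the second.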

mozway · Accepted Answer · 2022-07-29 21:21:06Z

You can set up a MultiIndex with the help of groupby.cumcount and stack:

    (df
     .set_axis(pd.MultiIndex.from_arrays([df.columns,
                                          df.groupby(level=0, axis=1).cumcount()]),
               axis=1)
     .stack()
     .droplevel(0)
    )

Output:

       abc  def  ghi
    0    2    4   78
    1   56    7   45

Corralien · Accepted Answer · 2022-07-29 21:31:44Z

As an alternative to the other answers, you can use melt:

    out = (df.melt(var_name='col', value_name='val')
             .assign(idx=lambda x: x.groupby('col').cumcount())
             .pivot(index='idx', columns='col', values='val')
             .rename_axis(index=None, columns=None))
    print(out)

    # Output
       abc  def  ghi
    0    2    4   78
    1   56    7   45
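To make the intermediate shapes easier to follow, here is a sketch of the same melt approach split into separate steps. The sample frame is rebuilt by hand from the question and the name long_df is only illustrative; neither is part of the answer above.

```python
import pandas as pd

# Sample frame rebuilt from the question (assumed construction).
df = pd.DataFrame(
    [[2, 4, 78, 56, 7, 45]],
    columns=["abc", "def", "ghi", "abc", "def", "ghi"],
)

# Step 1: long format, one row per original cell, with the column
# label stored in "col" and the cell value in "val".
long_df = df.melt(var_name="col", value_name="val")

# Step 2: number each occurrence of a label. This becomes the row
# index of the final result.
long_df["idx"] = long_df.groupby("col").cumcount()
print(long_df)

# Step 3: pivot back to wide format, one column per distinct label,
# and clear the axis names that pivot leaves behind.
out = (long_df.pivot(index="idx", columns="col", values="val")
              .rename_axis(index=None, columns=None))
print(out)
```

Passing index, columns and values by keyword keeps the pivot call valid on pandas 2.0 and later, where these arguments are keyword-only.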

sammywemmy · Accepted Answer · 2022-07-29 22:16:12Z

One option is with pivot_longer from pyjanitor:

    # pip install pyjanitor
    import pandas as pd
    import janitor

    df.pivot_longer(names_to = '.value', names_pattern = '(.+)')

       abc  def  ghi
    0    2    4   78
    1   56    7   45

In the above solution, .value determines which parts of the column labels remain as headers; those parts are the groups captured by the regular expression in names_pattern.

Another option would be to pass the names of the new columns to names_to, while passing a list of matching regular expressions to names_pattern:

    df.pivot_longer(names_to = ['abc', 'def', 'ghi'], names_pattern = ['abc', 'def', 'ghi'])

       abc  def  ghi
    0    2    4   78
    1   56    7   45


Suraj Rao · Accepted Answer · 2022-11-10 08:25:32Z

    df1.T.rename_axis('col1', axis=1).assign(group=lambda dd: (dd.index == 'abc').cumsum())\
        .pivot_table(index='group', columns='col1', values=0)

    col1  abc  def  ghi
    1       2    4   78
    2      56    7   45
