How to pivot pandas dataframe (with two headers)

Question

I have a Pandas dataframe which looks as follows:

Age	USA	USA	USA	UK	UK	UK
Age	male	female	total	male	female	total
2-year-old	2	3	5	1	1	2
3-year-old	8	8	16	7	9	16

In fact I have two row headers (USA + male; USA + female;..)

CSV-File (test.csv):

;USA;USA;USA;UK;UK;UK Age;male;female;total;male;female;total 2-year-old;2;3;5;1;1;2 3-year-old;8;8;16;7;9;16

My python-code:

df = pd.read_csv('test.csv', delimiter=";", header=[0,1]) df = df.rename(columns={'Unnamed: 0_level_0': 'Age'})

How to pivot pandas dataframe to get the following result?

Age	Country	Gender	frequency
2-year-old	USA	male	2
2-year-old	USA	female	3
2-year-old	UK	male	1
2-year-old	UK	female	1
3-year-old	USA	male	8
3-year-old	USA	female	8
3-year-old	UK	male	7
3-year-old	UK	female	9

EDIT:

Starting Table:

Kode	Country	Procedure	male	male	female	female
Kode	Country	Procedure	two-year-old	three-year-old	two-year-old	three-year-old
1a	US	proc_1	4	6	3	6
1a	UK	proc_1	2	3	5	1
1b	US	proc_2	15	3	5	2
1b	UK	proc_2	8	4	7	3

CSV:

Code;Country;Procedure;male;male;female;female Code;Country;Procedure;two-year-old;three-year-old;two-year-old;three-year-old 1a;US;proc_1;4;6;3;6 1a;UK;proc_1;2;3;5;1 1b;US;proc_2;15;3;5;2 1b;UK;proc_2;8;4;7;3

Resulting Table:

Code	Country	Procedure	Gender	Age	Frequency
1a	US	proc_1	male	two-year-old	4
1a	US	proc_1	male	three-year-old	6
1a	US	proc_1	female	two-year-old	3
1a	US	proc_1	female	three-year-old	6
1a	UK	proc_1	male	two-year-old	2
1a	UK	proc_1	male	three-year-old	3
1a	UK	proc_1	female	two-year-old	5
1a	UK	proc_1	female	three-year-old	1
1b	...

jezrael · Accepted Answer · 2021-11-05 10:48:06Z

Use DataFrame.set_index with DataFrame.stack, also if need remove total add drop:

df = (df.drop('total', axis=1, level=1) .set_index(df.columns[0]) .stack([0,1]) .rename_axis(['Age','Country','Gender']) .reset_index(name='frequency')) print (df) Age Country Gender frequency 0 2-year-old UK female 1 1 2-year-old UK male 1 2 2-year-old USA female 3 3 2-year-old USA male 2 4 3-year-old UK female 9 5 3-year-old UK male 7 6 3-year-old USA female 8 7 3-year-old USA male 8

Or:

df = (df.set_index(df.columns[0]) .stack([0,1]) .rename_axis(['Age','Country','Gender']) .reset_index(name='frequency')) print (df) Age Country Gender frequency 0 2-year-old UK female 1 1 2-year-old UK male 1 2 2-year-old UK total 2 3 2-year-old USA female 3 4 2-year-old USA male 2 5 2-year-old USA total 5 6 3-year-old UK female 9 7 3-year-old UK male 7 8 3-year-old UK total 16 9 3-year-old USA female 8 10 3-year-old USA male 8 11 3-year-old USA total 16

Hello jezreal. Could you pls have a look at my edited question. How would I get the required output in this case? Thanks.
@dublin123 - I will be offline now, can you post new question?

Collectives™ on Stack Overflow

How to pivot pandas dataframe (with two headers)

1 Answer 1

3 Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

3 Comments

Linked

Related