Merge multiple dataframes in pandas

Question

I have two data frames: df1 as follows:

 col0 col1 col1.1 col3 0 a d 1 6 1 b e 5 7

And df2 as follows:

 colx coly 0 a 10 1 b 20 2 d 50 3 e 40

How do I combine the dataframes in-place such that the final df looks like this?

 col0 col1 col1.1 col3 colx coly 0 a d 1 6 10 50 1 b e 5 7 20 40

As far as I understand, the merge method in pandas merges both dataframes based on a given axis. In this case, I want to merge them based on the value in another dataframe. What is the function I am looking for in this case?

You will not be able to get an "in place" solution.

lmo
– lmo

2019-08-26 14:13:16 +00:00
Commented Aug 26, 2019 at 14:13 — lmo
– lmo, Commented Aug 26, 2019 at 14:13

Erfan · Accepted Answer · 2019-08-26 14:11:19Z

4

Using merge chained twice:

mrg = df1.merge(df2, left_on='col0', right_on='colx')\ .merge(df2, left_on='col1', right_on='colx')\ .drop(columns=['colx_x', 'colx_y'])

Output

 col0 col1 col1.1 col3 coly_x coly_y 0 a d 1 6 10 50 1 b e 5 7 20 40

answered Aug 26, 2019 at 14:11

Erfan

43.3k10 gold badges75 silver badges86 bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

tandem Over a year ago

I get the following error: ValueError: You are trying to merge on object and float64 columns. If you wish to proceed you should use pd.concat

Erfan Over a year ago

Are you merging on the correct columns, so the arguments left_on and right_on. Seems like you are selecting an object/string column and a numeric/float column

tandem Over a year ago

I made a mistake. my bad. apologies

Scott Boston · Accepted Answer · 2019-08-26 14:16:27Z

You could use map:

mapper = df2.set_index('colx')['coly'] df1['colx'] = df1['col0'].map(mapper) df1['coly'] = df1['col1'].map(mapper) print(df1)

Output:

 col0 col1 col1.1 col3 colx coly 0 a d 1 6 10 50 1 b e 5 7 20 40

anky · Accepted Answer · 2019-08-26 14:13:50Z

Using applymap and get:

s=df2.set_index('colx')['coly'] df1[['colx','coly']]=df1.iloc[:,:2].applymap(lambda x: s.get(x)) print(df1)

 col0 col1 col1.1 col3 colx coly 0 a d 1 6 10 50 1 b e 5 7 20 40

Quang Hoang · Accepted Answer · 2019-08-26 14:16:47Z

Another solution with replace:

dct = df2.set_index('colx')['coly'].to_dict() df1['coly_x'] = df1['col0'].replace(dct) df1['coly_y'] = df1['col1'].replace(dct)

Output:

 col0 col1 col1.1 col3 coly_x coly_y 0 a d 1 6 10 50 1 b e 5 7 20 40

lmo · Accepted Answer · 2019-08-26 14:25:39Z

Here is a method to join on the indexes:

df1.join([df2.iloc[0:2,1].rename('colx'), df2.iloc[2:,1].rename('coly').reset_index(drop=True)])

Elements of the desired column are extracted, subset, and renamed. The index of the second series is reset to match. Then these are joined to df1.

this returns

 col0 col1 col1.1 col3 colx coly 0 a d 1 6 10 50 1 b e 5 7 20 40

Collectives™ on Stack Overflow

Merge multiple dataframes in pandas

5 Answers 5

3 Comments

Comments

Comments

Comments

Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

5 Answers 5

3 Comments

Comments

Comments

Comments

Comments

Linked

Related