2

I have two data frames: df1 as follows:

 col0 col1 col1.1 col3 0 a d 1 6 1 b e 5 7 

And df2 as follows:

 colx coly 0 a 10 1 b 20 2 d 50 3 e 40 

How do I combine the dataframes in-place such that the final df looks like this?

 col0 col1 col1.1 col3 colx coly 0 a d 1 6 10 50 1 b e 5 7 20 40 

As far as I understand, the merge method in pandas merges both dataframes based on a given axis. In this case, I want to merge them based on the value in another dataframe. What is the function I am looking for in this case?

1
  • You will not be able to get an "in place" solution. Commented Aug 26, 2019 at 14:13

5 Answers 5

4

Using merge chained twice:

mrg = df1.merge(df2, left_on='col0', right_on='colx')\ .merge(df2, left_on='col1', right_on='colx')\ .drop(columns=['colx_x', 'colx_y']) 

Output

 col0 col1 col1.1 col3 coly_x coly_y 0 a d 1 6 10 50 1 b e 5 7 20 40 
Sign up to request clarification or add additional context in comments.

3 Comments

I get the following error: ValueError: You are trying to merge on object and float64 columns. If you wish to proceed you should use pd.concat
Are you merging on the correct columns, so the arguments left_on and right_on. Seems like you are selecting an object/string column and a numeric/float column
I made a mistake. my bad. apologies
3

You could use map:

mapper = df2.set_index('colx')['coly'] df1['colx'] = df1['col0'].map(mapper) df1['coly'] = df1['col1'].map(mapper) print(df1) 

Output:

 col0 col1 col1.1 col3 colx coly 0 a d 1 6 10 50 1 b e 5 7 20 40 

Comments

2

Using applymap and get:

s=df2.set_index('colx')['coly'] df1[['colx','coly']]=df1.iloc[:,:2].applymap(lambda x: s.get(x)) print(df1) 

 col0 col1 col1.1 col3 colx coly 0 a d 1 6 10 50 1 b e 5 7 20 40 

Comments

1

Another solution with replace:

dct = df2.set_index('colx')['coly'].to_dict() df1['coly_x'] = df1['col0'].replace(dct) df1['coly_y'] = df1['col1'].replace(dct) 

Output:

 col0 col1 col1.1 col3 coly_x coly_y 0 a d 1 6 10 50 1 b e 5 7 20 40 

Comments

0

Here is a method to join on the indexes:

df1.join([df2.iloc[0:2,1].rename('colx'), df2.iloc[2:,1].rename('coly').reset_index(drop=True)]) 

Elements of the desired column are extracted, subset, and renamed. The index of the second series is reset to match. Then these are joined to df1.

this returns

 col0 col1 col1.1 col3 colx coly 0 a d 1 6 10 50 1 b e 5 7 20 40 

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.