
I think this is an easy question, and I know roughly where to look (merge, join, loc, iloc, or one of those functions), but I have not figured it out yet. Here is a simplified example of what I want to do. df1 and df2 have the same columns but a different number of rows. I want to find the rows where column "t1" has the same value in both dataframes, and then replace the values in column "c1" of df1 with the values of column "c1" of df2 for those rows. I also tried the where and replace functions, but I am pretty sure I need merge or join. Thank you.

    import pandas as pd

    df1 = pd.DataFrame()
    df2 = pd.DataFrame()

    # close price
    df1.at[0, "c1"] = 0
    df1.at[1, "c1"] = 0
    df1.at[2, "c1"] = 0
    df1.at[3, "c1"] = 0
    df1.at[4, "c1"] = 0
    df1.at[5, "c1"] = 0
    df1.at[6, "c1"] = 0
    df1.at[7, "c1"] = 0
    df2.at[0, "c1"] = 20
    df2.at[1, "c1"] = 26
    df2.at[3, "c1"] = 23
    df2.at[4, "c1"] = 21

    # time stamp
    df1.at[0, "t1"] = 3
    df1.at[1, "t1"] = 4
    df1.at[2, "t1"] = 5
    df1.at[3, "t1"] = 6
    df1.at[4, "t1"] = 7
    df1.at[5, "t1"] = 8
    df1.at[6, "t1"] = 9
    df1.at[7, "t1"] = 10
    df2.at[0, "t1"] = 5
    df2.at[1, "t1"] = 6
    df2.at[3, "t1"] = 7
    df2.at[4, "t1"] = 8

They look like:

    >>> df1
        c1    t1
    0  0.0   3.0
    1  0.0   4.0
    2  0.0   5.0
    3  0.0   6.0
    4  0.0   7.0
    5  0.0   8.0
    6  0.0   9.0
    7  0.0  10.0
    >>> df2
         c1   t1
    0  20.0  5.0
    1  26.0  6.0
    3  23.0  7.0
    4  21.0  8.0

So I want df1 to look like the frame shown below. At the rows where the value for "t1" is the same for both df1 and df2 I want to replace the values in column "c1" in df1 with the values from df2.

    >>> df1
         c1    t1
    0   0.0   3.0
    1   0.0   4.0
    2  20.0   5.0
    3  26.0   6.0
    4  23.0   7.0
    5  21.0   8.0
    6   0.0   9.0
    7   0.0  10.0

2 Answers

You can use pd.merge for this:

df1 = df1.merge(df2, on=['t1'], how='left') 

Which results in:

       c1_x    t1  c1_y
    0   0.0   3.0   NaN
    1   0.0   4.0   NaN
    2   0.0   5.0  20.0
    3   0.0   6.0  26.0
    4   0.0   7.0  23.0
    5   0.0   8.0  21.0
    6   0.0   9.0   NaN
    7   0.0  10.0   NaN

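Because both frames have a column named c1, the merge renames df1's column to c1_x and adds a new column c1_y holding the matched values from df2 (NaN where there is no match). To create the desired output we only need to do the following: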

    df1['c1'] = df1.c1_y.fillna(df1.c1_x)
    df1 = df1[['c1', 't1']]

Output:

         c1    t1
    0   0.0   3.0
    1   0.0   4.0
    2  20.0   5.0
    3  26.0   6.0
    4  23.0   7.0
    5  21.0   8.0
    6   0.0   9.0
    7   0.0  10.0
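As a self-contained sketch of the same merge-and-fill idea, the frames from the question can be built directly from dicts, and the `suffixes` argument of `merge` can keep df1's column named `c1` so only one helper column has to be dropped. The helper column name `c1_new` and the variable names `merged` and `result` are my own choices, not part of the answer above.

```python
import pandas as pd

# Same data as in the question, built in one step
df1 = pd.DataFrame({"c1": [0.0] * 8,
                    "t1": [3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]})
df2 = pd.DataFrame({"c1": [20.0, 26.0, 23.0, 21.0],
                    "t1": [5.0, 6.0, 7.0, 8.0]}, index=[0, 1, 3, 4])

# Left merge keeps every row of df1. With suffixes=("", "_new"),
# df1's column stays "c1" and df2's becomes "c1_new" (hypothetical name).
merged = df1.merge(df2, on="t1", how="left", suffixes=("", "_new"))

# Take df2's value where a match exists, otherwise keep df1's value
merged["c1"] = merged["c1_new"].fillna(merged["c1"])
result = merged.drop(columns="c1_new")
print(result)
```

This assumes each t1 value appears at most once in df2; with duplicate keys in df2, a left merge would produce extra rows.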

2 Comments

I wanted to post a reply with an example, but it seems I can only add short comments. It works even better than I hoped: I was creating this "dummy" column "c1" in df1 filled with zeros, but that is not necessary. Too bad I cannot add my expanded example. I tested it on my real data and it works great.
Indeed, without the dummy column it works more elegantly ;)
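The variant without the dummy column mentioned in the comments might look roughly like the sketch below. It assumes df1 only carries "t1"; the commenter's real data is not shown, so the frames here reuse the values from the question. Filling the unmatched rows with 0 is an optional extra step of mine, not something the comments state.

```python
import pandas as pd

# df1 without the zero-filled dummy column (assumption: only t1 is present)
df1 = pd.DataFrame({"t1": [3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]})
df2 = pd.DataFrame({"c1": [20.0, 26.0, 23.0, 21.0],
                    "t1": [5.0, 6.0, 7.0, 8.0]}, index=[0, 1, 3, 4])

# No overlapping column names besides the key, so no _x/_y suffixes appear.
# Rows of df1 without a match in df2 get NaN in c1.
result = df1.merge(df2, on="t1", how="left")

# Optional: replace the NaN values with 0 to mimic the old dummy column
filled = result.fillna({"c1": 0.0})
print(filled)
```

With no shared non-key column there is nothing to reconcile afterwards, which is why the fillna and column selection from the answer become unnecessary.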

Simply use merge:

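    # A left merge keeps df1's rows in their original order, so res lines up with df1
    res = pd.merge(df1, df2, on='t1', how='left')
    # Overwrite c1 (not t1) with df2's values; keep df1's c1 where there is no match
    df1['c1'] = res['c1_y'].fillna(df1['c1'])
    print(df1)
    ###output:
    ###     c1    t1
    ###0   0.0   3.0
    ###1   0.0   4.0
    ###2  20.0   5.0
    ###3  26.0   6.0
    ###4  23.0   7.0
    ###5  21.0   8.0
    ###6   0.0   9.0
    ###7   0.0  10.0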
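For comparison, here is a sketch of an alternative that avoids merge altogether and uses `Series.map` with a lookup Series instead. This is not what either answer uses; it is a different technique that should give the same result under the assumption that t1 values are unique in df2. The name `lookup` is hypothetical.

```python
import pandas as pd

df1 = pd.DataFrame({"c1": [0.0] * 8,
                    "t1": [3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]})
df2 = pd.DataFrame({"c1": [20.0, 26.0, 23.0, 21.0],
                    "t1": [5.0, 6.0, 7.0, 8.0]}, index=[0, 1, 3, 4])

# Lookup Series: index is df2's t1, values are df2's c1
# (assumes t1 is unique in df2)
lookup = df2.set_index("t1")["c1"]

# map yields NaN where t1 has no entry in the lookup;
# fillna then keeps df1's original c1 for those rows
df1["c1"] = df1["t1"].map(lookup).fillna(df1["c1"])
print(df1)
```

Because map works row by row on df1's own index, no intermediate frame or helper columns are created.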
