I have two DataFrames, df1 and df2, and I am trying to left join them on ID.

df1:

Name  ID  Age
AA    1   23
BB    2   49
CC    3   76
DD    4   27
EE    5   43
FF    6   34
GG    7   65

df2:

ID  Place
1   Germany
3   Holland
7   India

Final = df1.join(df2, on=['ID'], how='left')

Name  ID  Age  Place
AA    1   23   Germany
BB    2   49   null
CC    3   76   Holland
DD    4   27   null
EE    5   43   null
FF    6   34   null
GG    7   65   India

But I would like to fill the Place column with the value from the Name column when Place is null.

Expected output:

Name  ID  Age  Place
AA    1   23   Germany
BB    2   49   BB
CC    3   76   Holland
DD    4   27   DD
EE    5   43   EE
FF    6   34   FF
GG    7   65   India

The solution I can think of is to check the value of Place once the join is complete and replace it with Name if it is null. Please let me know if there is a more elegant way to do this. Thanks.

  • Try this: final = df1.merge(df2,on='ID',how='left').assign(Place=lambda x: x['Place'].fillna(x['Name'])) Commented Jan 30, 2020 at 3:34
  • Sorry, I forgot to mention that I am trying to do this with a PySpark DataFrame. merge and assign do not work with PySpark DataFrames. Thanks. Commented Jan 30, 2020 at 5:21

1 Answer

Yes, thanks. After some searching, I managed to do this with coalesce, as shown in the thread linked below.

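from pyspark.sql.functions import coalesce
# coalesce returns the first non-null argument, so a null Place falls back to Name.
# Apply it to the joined DataFrame (Final), since df1 has no Place column.
Final = Final.withColumn("Place", coalesce(Final.Place, Final.Name))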

Another thread
