
I have two DataFrames, A and B, with the same four columns. I want to merge them so that when the values in the first three columns match, the id values (each of which is a jsonb array) are combined into a single array.

Sample data:

df_A

    name  age  zip    id
    abc   25   11111  ["2722", "2855", "3583"]

df_B

    name  age  zip    id
    abc   25   11111  ["123", "234"]

I want the final output to look like

Final output:

    name  age  zip    id
    ----------------------------------------------------------------
    abc   25   11111  ["2722", "2855", "3583", "123", "234"]

2 Answers

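One quick solution is to set the three key columns as the index on both frames and add them; the list-valued id column is then concatenated element-wise for matching keys:

    keys = ['name', 'age', 'zip']
    # lists in the object-dtype 'id' column are joined by +
    df = (df_A.set_index(keys) + df_B.set_index(keys)).reset_index()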
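For reference, here is a minimal self-contained sketch of that index-addition approach, run on the sample data from the question. It assumes the id values are already Python lists (not raw JSON strings) and that each key combination appears once per frame; the variable names `keys` and `combined` are only illustrative.

```python
import pandas as pd

# Sample frames from the question; 'id' holds Python lists (assumption).
df_A = pd.DataFrame({
    "name": ["abc"], "age": [25], "zip": [11111],
    "id": [["2722", "2855", "3583"]],
})
df_B = pd.DataFrame({
    "name": ["abc"], "age": [25], "zip": [11111],
    "id": [["123", "234"]],
})

keys = ["name", "age", "zip"]

# With the key columns as the index, + aligns rows on the keys and
# concatenates the lists stored in the remaining 'id' column.
combined = (df_A.set_index(keys) + df_B.set_index(keys)).reset_index()

print(combined)
```

If the id column actually holds JSON text, it would need to be parsed into lists first (for example with `json.loads`) before the addition makes sense.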

Another option is to merge, then use a list comprehension to handle the "id" columns.

    output = df_A.merge(df_B, on=['name', 'age', 'zip'])
    output['id'] = [[*x, *y] for x, y in zip(output.pop('id_x'), output.pop('id_y'))]
    output

      name  age    zip                            id
    0  abc   25  11111  [2722, 2855, 3583, 123, 234]
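Note that `merge` defaults to an inner join, so a row whose name/age/zip combination exists in only one frame would be dropped. If such rows should be kept, one possible variation (a sketch, not part of the answer above) is to stack both frames and flatten the lists per key group. The extra "def" row below is hypothetical, added only to show the unmatched case.

```python
from itertools import chain

import pandas as pd

df_A = pd.DataFrame({
    "name": ["abc", "def"],          # "def" row is hypothetical
    "age": [25, 30],
    "zip": [11111, 22222],
    "id": [["2722", "2855", "3583"], ["1"]],
})
df_B = pd.DataFrame({
    "name": ["abc"], "age": [25], "zip": [11111],
    "id": [["123", "234"]],
})

keys = ["name", "age", "zip"]

# Stack the frames (df_A rows first), then flatten all id lists
# that share the same key combination into a single list.
stacked = pd.concat([df_A, df_B], ignore_index=True)
merged = (
    stacked.groupby(keys)["id"]
    .agg(lambda ids: list(chain.from_iterable(ids)))
    .reset_index()
)

print(merged)
```

This also works when a key appears more than once in either frame, since every matching list is simply chained in row order.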
