I am having trouble removing rows from a dataframe that also occur in another dataframe. Below is a simple example with the expected result.

df1

A B
Z 1
X 2
C 3
V 4

df2

A B
DD 66
Z 1
X 2
CC 55

Expected output: df2, with the rows that also occur in df1 dropped.

new df2:

A B
DD 66
CC 55

Edit: I need to match both A and B.

  • Do you want to match on both A and B to remove? Commented Apr 13, 2022 at 14:06
  • @mozway Yes, I need to match both A and B. Commented Apr 13, 2022 at 14:09

2 Answers

IIUC, you can use a left merge with indicator=True and keep only the rows marked left_only:

(df2
 .merge(df1, how='left', indicator=True)  # if unrelated columns use on=['A', 'B']
 .loc[lambda d: d.pop('_merge').eq('left_only')]
)

output:

    A   B
0  DD  66
3  CC  55
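
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same left-merge idea. The frames are rebuilt from the question's example, and the names `merged` and `new_df2` are only illustrative. It passes on=['A', 'B'] explicitly, which should behave the same as the default here because the two frames share only those columns.

```python
import pandas as pd

# Example data copied from the question.
df1 = pd.DataFrame({"A": ["Z", "X", "C", "V"], "B": [1, 2, 3, 4]})
df2 = pd.DataFrame({"A": ["DD", "Z", "X", "CC"], "B": [66, 1, 2, 55]})

# Left merge keeps every row of df2; the _merge column says whether
# a matching (A, B) pair was also found in df1.
merged = df2.merge(df1, on=["A", "B"], how="left", indicator=True)

# Keep only the rows that exist in df2 but not in df1.
new_df2 = merged.loc[merged["_merge"] == "left_only", ["A", "B"]]
print(new_df2)
```

Note that merge builds a fresh index, so the labels 0 and 3 are positions in the merged result, not the original df2 index; add .reset_index(drop=True) if you want 0 and 1.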

Use pandasql:

df2.sql("select * from self where not exists (select 1 from df1 where df1.A=self.A and df1.B=self.B)",df1=df1) 

output:

    A   B
0  DD  66
3  CC  55
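
If you would rather not depend on an extra package, the same NOT EXISTS anti-join can be sketched with the standard-library sqlite3 module plus pandas' to_sql and read_sql_query. This swaps in a different mechanism from the df2.sql(...) call above, so treat it as an alternative under that assumption rather than the same API. The table names simply mirror the dataframe names, and the data is rebuilt from the question.

```python
import sqlite3

import pandas as pd

# Example data copied from the question.
df1 = pd.DataFrame({"A": ["Z", "X", "C", "V"], "B": [1, 2, 3, 4]})
df2 = pd.DataFrame({"A": ["DD", "Z", "X", "CC"], "B": [66, 1, 2, 55]})

# Load both frames into a throwaway in-memory SQLite database.
con = sqlite3.connect(":memory:")
df1.to_sql("df1", con, index=False)
df2.to_sql("df2", con, index=False)

# Keep rows of df2 for which no row in df1 matches on both A and B.
query = """
SELECT *
FROM df2
WHERE NOT EXISTS (
    SELECT 1
    FROM df1
    WHERE df1.A = df2.A AND df1.B = df2.B
)
"""
new_df2 = pd.read_sql_query(query, con)
con.close()
print(new_df2)
```

The result comes back with a fresh index, and SQL does not guarantee row order without an ORDER BY, so add one if the order matters.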
