Easier way of deleting rows in pandas dataframe based on condition from another dataframe

Question

Suppose I have two dataframes:

df1 = pd.DataFrame({"A" : [1,1,2,5], "B" : [1,1,4,5], "C" : ["Adam","Bella","Charlie","Dan"]})
df2 = pd.DataFrame({"A" : [1,1,3,5], "B" : [1,3,6,5]})

and I want to delete the rows in df1 whose values of A and B both match those of some row in df2.

I do this with:

for i, row_1 in df1.iterrows():
    for j, row_2 in df2.iterrows():
        if row_1["A"] == row_2["A"] and row_1["B"] == row_2["B"]:
            index = i
            # inplace=True so the drop actually modifies df1
            df1.drop([index], axis=0, inplace=True)

which gives the intended result:

   A  B        C
2  2  4  Charlie

I was wondering whether there is an easier and faster way to do this. Iterating over every pair of rows is not ideal when the dataframes are large.

user7864386 · Accepted Answer · 2022-04-26 00:21:30Z

1

You can left-merge with the indicator parameter to flag the rows that match; then query to filter the rows that come only from df1:

out = df1.merge(df2, how='left', indicator=True).query('_merge=="left_only"').drop(columns=['_merge'])

Output:

   A  B        C
2  2  4  Charlie
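To see why the filter works, it can help to look at the intermediate frame before the `_merge` filter is applied. The following is a minimal sketch using the question's data; the variable names `merged` and `out` are only illustrative, and boolean indexing is used in place of `query` purely to show the same filter in another form.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 1, 2, 5], "B": [1, 1, 4, 5],
                    "C": ["Adam", "Bella", "Charlie", "Dan"]})
df2 = pd.DataFrame({"A": [1, 1, 3, 5], "B": [1, 3, 6, 5]})

# Left merge on the shared column names (A and B here).
# indicator=True adds a `_merge` column that says where each row came from:
# "both" if the (A, B) pair was found in df2, "left_only" otherwise.
merged = df1.merge(df2, how="left", indicator=True)
print(merged)

# Keep only the rows that have no match in df2, then drop the helper column.
out = merged[merged["_merge"] == "left_only"].drop(columns="_merge")
print(out)
```

Note that `merge` builds a new index for its result, so the labels in `out` are not guaranteed to equal the original df1 labels in general.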



4 Comments

SyntaxError101 Over a year ago

How does the above code satisfy this condition: row_1["A"] == row_2["A"] and row_1["B"] == row_2["B"]?

user7864386 Over a year ago

@PandaPandas it merges on the common columns (in this case A and B); i.e. it matches the A and B values in df1 with the A and B values in df2

SyntaxError101 Over a year ago

How would this work if the columns of the two dataframes did not share the same strings?

user7864386 Over a year ago

@PandaPandas Do you mean common column names? Columns don't contain strings. If there are no common column names, you'll have to specify which columns to merge on, e.g. df1.merge(df2, how='left', left_on=['A','B'], right_on=['AA','BB'], indicator=True), followed by the same filtering as before.
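As a rough sketch of that comment, assume a hypothetical second frame `df2_renamed` whose key columns are called AA and BB (these names come from the comment, the frame itself is made up for illustration). With `left_on`/`right_on`, the right-hand key columns also appear in the merged result, so they are dropped together with `_merge`:

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 1, 2, 5], "B": [1, 1, 4, 5],
                    "C": ["Adam", "Bella", "Charlie", "Dan"]})
# Hypothetical variant of df2 with differently named key columns.
df2_renamed = pd.DataFrame({"AA": [1, 1, 3, 5], "BB": [1, 3, 6, 5]})

out = (
    df1.merge(df2_renamed, how="left",
              left_on=["A", "B"], right_on=["AA", "BB"], indicator=True)
       .query('_merge == "left_only"')          # rows with no match in df2_renamed
       .drop(columns=["AA", "BB", "_merge"])    # remove right-hand keys and the flag
)
print(out)
```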

rhug123 · Accepted Answer · 2022-04-26 00:26:20Z

1

Here is another way:

df1.loc[~df1.set_index(['A','B']).index.isin(df2.to_records(index=False).tolist())]
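The one-liner above can be unpacked into steps. This is a sketch of the same idea rather than the answerer's exact code: it builds the (A, B) keys with `pd.MultiIndex.from_frame` instead of `set_index` and `to_records`, and the names `keys1`, `keys2`, and `mask` are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 1, 2, 5], "B": [1, 1, 4, 5],
                    "C": ["Adam", "Bella", "Charlie", "Dan"]})
df2 = pd.DataFrame({"A": [1, 1, 3, 5], "B": [1, 3, 6, 5]})

# Turn the key columns of each frame into a MultiIndex of (A, B) pairs.
keys1 = pd.MultiIndex.from_frame(df1[["A", "B"]])
keys2 = pd.MultiIndex.from_frame(df2[["A", "B"]])

# True where a df1 pair also occurs in df2.
mask = keys1.isin(keys2)

# Keep the rows whose pair does not occur in df2.
out = df1[~mask]
print(out)
```

Because this filters df1 directly, the original index labels are kept. If the key columns were named differently in df2, selecting those columns when building `keys2` should be enough, since the comparison is on the values and not on the column names.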


1 Comment

SyntaxError101 Over a year ago

How would this work if the columns of the two dataframes did not share the same strings?

Onyambu · Accepted Answer · 2022-04-26 00:34:27Z

0

#!pip install siuba
from siuba import anti_join

anti_join(df1, df2, on = ['A', 'B'])

   A  B        C
2  2  4  Charlie

If you want to anti_join on all shared columns:

anti_join(df1, df2)

   A  B        C
2  2  4  Charlie


2 Comments

SyntaxError101 Over a year ago

How would this work if the columns of the two dataframes did not share the same strings?

Onyambu Over a year ago

@PandaPandas It will just give an empty dataframe.
