Subtracting rows of dataframe A from dataframe B python pandas [duplicate]

Question

I have two dataframes, let's call them A and B. They have exactly the same 7 columns (let's call them col1, col2, col3, col4, col5, col6 and col7). Some of the columns include client_id, client_first_name, client_last_name, telephone number etc. (I can't reveal the exact names for confidentiality purposes).

DataFrame A is much bigger than DataFrame B and some of the entries from DataFrame B are included in DataFrame A (i.e. DataFrame B is a subset of DataFrame A).

The problem is, I want to make sure that the records in DataFrame A are NOT in DataFrame B, i.e. 'subtract' DataFrame B from DataFrame A. How do I do it?

So far, I've been adding an extra column entitled 'group' to both DataFrames, merging them using pd.merge(A, B, how='left', on='col') and then pulling out the rows that ended up with two different values for 'group_x' and 'group_y' (the merge created these two columns).

Is there an easier way to do it? I tried a bunch of things but none of them worked.

Check this answer: stackoverflow.com/a/28902170/2027457

– n1tk, Commented Dec 1, 2016 at 0:34

maxymoo · Accepted Answer · 2016-12-01 00:32:32Z


Yes, your way is OK. You could also do something like dfA.loc[~dfA.col.isin(dfB.col)] if you don't need the merged dataframe.
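As a rough sketch of both ideas (filtering on one key column with isin, and the left merge from the question), here is a small example. The data and the column names client_id and client_first_name are made up for illustration. The second variant uses the indicator option of merge instead of a hand-added 'group' column, and assumes both frames share exactly the same columns.

```python
import pandas as pd

# Hypothetical data; column names and values are placeholders for illustration.
dfA = pd.DataFrame({
    "client_id": [1, 2, 3, 4],
    "client_first_name": ["Ann", "Bob", "Cy", "Di"],
})
dfB = pd.DataFrame({
    "client_id": [2, 4],
    "client_first_name": ["Bob", "Di"],
})

# Option 1: keep rows of dfA whose key is absent from dfB.
# ~ negates the boolean mask produced by isin.
only_in_A = dfA.loc[~dfA["client_id"].isin(dfB["client_id"])]

# Option 2: compare whole rows. A left merge on every column with
# indicator=True adds a "_merge" column marking where each row was found.
merged = dfA.merge(
    dfB.drop_duplicates(),  # avoid duplicating dfA rows if dfB has repeats
    how="left",
    on=list(dfA.columns),
    indicator=True,
)
only_in_A_rows = merged.loc[merged["_merge"] == "left_only"].drop(columns="_merge")

print(only_in_A)
print(only_in_A_rows)
```

Option 1 is enough when a single column such as an ID identifies a record. Option 2 is closer to a row-wise "subtraction", since a row is dropped only if every column matches a row in dfB.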

