Pandas: How to remove rows from a dataframe based on a list?

Question

I have a dataframe, customers, with some "bad" rows; the key in this dataframe is CustomerID. I know I should drop these rows. I have a list called badcu, [23770, 24572, 28773, ...], in which each value corresponds to a different "bad" customer.

Then I have another dataframe, let's call it sales, from which I want to drop all the records for the bad customers, the ones in the badcu list.

If I do the following

sales[sales.CustomerID.isin(badcu)]

I get a dataframe with precisely the records I want to drop, but if I do

sales.drop(sales.CustomerID.isin(badcu))

it returns a dataframe with the first row dropped (which is a legitimate order) and the rest of the rows intact (it doesn't delete the bad ones). I think I know why this happens, but I still don't know how to drop the rows with the incorrect customer IDs.

You should drop by index.

– Eliethesaiyan, Commented Apr 7, 2017 at 4:29
Use sales[~sales.CustomerID.isin(badcu)]

– Vaishali, Commented Apr 7, 2017 at 4:38

Vaishali · Accepted Answer · 2017-04-07 04:42:26Z

77

You need

new_df = sales[~sales.CustomerID.isin(badcu)]
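As a minimal sketch of how this works: isin() builds a boolean mask that is True for the bad customers, and ~ inverts it so that only the other rows are selected. The sample rows and the Amount column below are made up for illustration; only CustomerID and badcu come from the question.

```python
import pandas as pd

# Hypothetical sample data; only CustomerID and badcu appear in the question.
sales = pd.DataFrame({
    "CustomerID": [10001, 23770, 10002, 24572, 28773],
    "Amount": [15.0, 20.0, 7.5, 12.0, 3.0],
})
badcu = [23770, 24572, 28773]

# True where the row belongs to a bad customer.
bad_mask = sales.CustomerID.isin(badcu)

# ~ flips the mask, so boolean indexing keeps only the good customers.
new_df = sales[~bad_mask]
print(new_df)
```

Boolean indexing returns a new dataframe, so sales itself is left unchanged unless the result is assigned back to it.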

3 Comments

ah bon Over a year ago

I used your method to exclude rows based on the iphone numbers in my dataframe, but it doesn't work. Weird.

ah bon Over a year ago

The error is as follows: TypeError: isin() takes 2 positional arguments but 69 were given

Vaishali Over a year ago

@ahbon, did you pass a list of arguments?
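One possible cause of that TypeError, which the thread does not confirm, is passing the values to isin() as separate arguments (for example by unpacking the list) instead of as a single list. A small sketch with hypothetical phone-number strings:

```python
import pandas as pd

# Hypothetical data, used only to illustrate the call signature.
phones = pd.Series(["555-0100", "555-0101", "555-0102"])
blocked = ["555-0100", "555-0102"]

# Correct: isin() takes one list-like argument.
mask = phones.isin(blocked)

# Incorrect: unpacking the list passes each value as its own
# positional argument, which raises a TypeError.
try:
    phones.isin(*blocked)
    raised = False
except TypeError as exc:
    raised = True
    print(exc)
```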

piRSquared · Accepted Answer · 2017-04-07 05:05:38Z

7

You can also use query

sales.query('CustomerID not in @badcu')
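A short sketch of the query approach, again with made-up sample rows. The @ prefix tells query to look up badcu as a Python variable in the calling scope rather than as a column name.

```python
import pandas as pd

# Hypothetical sample data; only CustomerID and badcu appear in the question.
sales = pd.DataFrame({
    "CustomerID": [10001, 23770, 10002, 24572, 28773],
    "Amount": [15.0, 20.0, 7.5, 12.0, 3.0],
})
badcu = [23770, 24572, 28773]

# @badcu refers to the local list; "not in" keeps rows whose
# CustomerID is absent from it.
good_sales = sales.query("CustomerID not in @badcu")
print(good_sales)
```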

2 Comments

ah bon Over a year ago

I also used this method to exclude rows based on the iphone numbers in my dataframe, but it doesn't work. Weird.

abhivemp Over a year ago

@ahbon, I may be a bit late with this comment, but how did it not work? What error did you get? A different result?

Eliethesaiyan · Accepted Answer · 2017-04-07 04:37:32Z

I think the best way is to drop by index. Try it and let me know.

sales.drop(sales[sales.CustomerID.isin(badcu)].index.tolist())
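A minimal sketch of dropping by index, assuming made-up sample rows and hypothetical string index labels. drop() expects index labels, so the bad rows are selected first and their labels are passed to it.

```python
import pandas as pd

# Hypothetical sample data with string labels to make it clear
# that drop() works on index labels, not on a boolean mask.
sales = pd.DataFrame(
    {"CustomerID": [10001, 23770, 10002, 24572]},
    index=["order_a", "order_b", "order_c", "order_d"],
)
badcu = [23770, 24572, 28773]

# Labels of the rows that belong to bad customers.
bad_index = sales[sales.CustomerID.isin(badcu)].index

# drop() returns a new dataframe without those labels.
cleaned = sales.drop(bad_index)
print(cleaned)
```

Note that drop() removes every row carrying a given label, so this assumes the index of sales has no duplicate labels shared between good and bad rows.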
