How do you drop SOME rows based on specific valuem in a column?

Question

there! I have the following situation and any help would be very appreciated.

Let's say I have the following dataframe, containing 2 columns and 90 thousand rows (made this shorter so it can be easily reproduced):

 PRODUCT ID PROBLEM 0 1 OIL LEAK 1 2 FLAT TIRE 2 3 OIL LEAK 3 4 ENGINE ISSUES 4 5 ENGINE ISSUES 5 6 OIL LEAK 6 7 OIL LEAK 7 8 FLAT TIRE 8 9 FLAT TIRE 9 90000 OIL LEAK

I need to drop SOME rows (but not all) based on values from column 'PROBLEM'. Imagine the value 'OIL LEAK' appears in my dataframe 11 thousand times, but I want to keep only 50 entries of this value in my dataframe and delete all the other rows this value appears. For me, it's not important the index of the row that is being droppeg as long as I have 50 registers of this value remaining in my dataframe.

Is there a way to perform it? Thanks in advance!

Please clarify your specific problem or provide additional details to highlight exactly what you need. As it's currently written, it's hard to tell exactly what you're asking. — Community
– Community Bot, Commented Jul 2, 2022 at 7:27

Anynamer · Accepted Answer · 2022-07-02 07:29:23Z

3

You can save 50 oil leaks and concat them after removing for instance?

leaks = df[df['PROBLEM'] == 'OIL LEAK'].head(50) df = df[df['PROBLEM'] != 'OIL LEAK'].concat(leaks)

edited Jul 2, 2022 at 7:29

answered Jul 2, 2022 at 7:16

Anynamer

3341 silver badge6 bronze badges

Sign up to request clarification or add additional context in comments.

1 Comment

W.Coyote Over a year ago

Precisely what I needed. Thanks for the help! Kind regards!

Vitalizzare · Accepted Answer · 2022-07-02 07:48:25Z

In general we can use grouping with cumulative count like this:

df[df.groupby('PROBLEM').cumcount() < 50]

In order to apply this logic only to some values in the PROBLEM column:

counted = df.groupby('PROBLEM').cumcount() max_count = 50 problems_to_cut = ['OIL LEAK'] selected = df[~((counted >= max_count) & (df.PROBLEM.isin(problems_to_cut)))]

Hey there! Thanks for the help! Worked just fine! Another great solution! Best regards

Collectives™ on Stack Overflow

How do you drop SOME rows based on specific valuem in a column?

2 Answers 2

1 Comment

1 Comment

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

1 Comment

1 Comment

Related