Drop columns if rows contain a specific value in Pandas

Question

I am starting to learn Pandas. I have seen a lot of questions here in SO where people ask how to delete a row if a column matches certain value.

In my case it is the opposite. Imagine having this dataframe:

Where you want to know is, if any column has in any of its row the value salty, that column should be deleted, having as a result:

I have tried with several similarities to this:

if df.loc[df['A'] == 'salty']: df.drop(df.columns[0], axis=1, inplace=True)

But I am quite lost at finding documentation onto how to delete columns based on a row value of that column. That code is a mix of finding a specific column and deleting always the first column (as my idea was to search the value of a row in that column, in ALL columns in a for loop.

@cs95 I'm afraid you misread the question...

IanS
– IanS

2019-06-25 13:12:40 +00:00
Commented Jun 25, 2019 at 13:12 — IanS
– IanS, Commented Jun 25, 2019 at 13:12
@IanS Indeed the code was confusing.

cs95
– cs95

2019-06-25 13:13:27 +00:00
Commented Jun 25, 2019 at 13:13 — cs95
– cs95, Commented Jun 25, 2019 at 13:13

cs95 · Accepted Answer · 2019-06-25 13:14:22Z

Perform a comparison across your values, then use DataFrame.any to get a mask to index:

df.loc[:, ~(df == 'Salty').any()]

If you insist on using drop, this is how you need to do it. Pass a list of indices:

df.drop(columns=df.columns[(df == 'Salty').any()])

df = pd.DataFrame({ 'A': ['Mountain', 'Salty'], 'B': ['Lake', 'Hotty'], 'C': ['River', 'Coldy']}) df A B C 0 Mountain Lake River 1 Salty Hotty Coldy

(df == 'Salty').any() A True B False C False dtype: bool df.loc[:, ~(df == 'Salty').any()] B C 0 Lake River 1 Hotty Coldy

df.columns[(df == 'Salty').any()] # Index(['A'], dtype='object') df.drop(columns=df.columns[(df == 'Salty').any()]) B C 0 Lake River 1 Hotty Coldy

This: df.loc[:, ~(df == 'Salty').any()]worked just fine! Thank you!
@cs95 What if I wanted to check for a particular substring instead of a specific string?
@Mega_Noob You might find my answer on substring matching useful: stackoverflow.com/questions/11350770/…

grumpyTofu · Accepted Answer · 2020-10-06 14:33:08Z

The following is locating the indices where your desired column matches a specific value and then drops them. I think this is probably the more straightforward way of accomplishing this:

df.drop(df.loc[df['Your column name here'] == 'Match value'].index, inplace=True)

this was helpful, i was able to use a variation of this in a list comprehension to cull a lot of unwanted columns quickly. [df.drop(df.loc[df['col name'] == item].index, inplace=True) for item in some_list]

bexi · Accepted Answer · 2019-06-25 13:14:32Z

Here's one possibility:

df = df.drop([col for col in df.columns if df[col].eq('Salty').any()], axis=1)

Collectives™ on Stack Overflow

Drop columns if rows contain a specific value in Pandas

3 Answers 3

3 Comments

1 Comment

Comments

Linked

Hot Network Questions