5

I would like to drop a given column from a pandas dataframe IF all the values in the column is "0%".

my df:

data = {'UK': ['11%', '16%', '7%', '52%', '2%', '5%', '3%', '3%'], 'US': ['0%', '0%', '0%', '0%', '0%', '0%', '0%', '0%'], 'DE': ['11%', '16%', '7%', '52%', '2%', '5%', '3%', '3%'], 'FR': ['11%', '16%', '7%', '52%', '2%', '5%', '3%', '3%'] } dummy_df = pd.DataFrame(data, index= ['cat1','cat2','cat3','cat4','cat5','cat6','cat7','cat8'], columns=['UK', 'US', 'DE', 'FR']) 

my code so far:

dummy_df.drop(dummy_df == '0%',inplace=True) 

I get a value error:

ValueError: labels ['UK' 'US' 'DE' 'FR'] not contained in axis 
1

2 Answers 2

11
In [186]: dummy_df.loc[:, ~(dummy_df == '0%').all()] Out[186]: UK DE FR cat1 11% 11% 11% cat2 16% 16% 16% cat3 7% 7% 7% cat4 52% 52% 52% cat5 2% 2% 2% cat6 5% 5% 5% cat7 3% 3% 3% cat8 3% 3% 3% 

Explanation:

The comparison with '0%' you already got, this gives the following dataframe:

In [182]: dummy_df == '0%' Out[182]: UK US DE FR cat1 False True False False cat2 False True False False cat3 False True False False cat4 False True False False cat5 False True False False cat6 False True False False cat7 False True False False cat8 False True False False 

Now we want to know which columns has all Trues:

In [183]: (dummy_df == '0%').all() Out[183]: UK False US True DE False FR False dtype: bool 

And finally, we can index with these boolean values (but taking the opposite with ~ as want don't want to select where this is True): dummy_df.loc[:, ~(dummy_df == '0%').all()].

Similarly, you can also do: dummy_df.loc[:, (dummy_df != '0%').any()] (selects columns where at least one value is not equal to '0%')

Sign up to request clarification or add additional context in comments.

Comments

3

First get the columns where all values != '0%'

In [163]: cols = (dummy_df != '0%').any() In [164]: cols Out[164]: UK True US False DE True FR True dtype: bool 

Then call only cols columns which are True

In [165]: dummy_df[cols[cols].index] Out[165]: UK DE FR cat1 11% 11% 11% cat2 16% 16% 16% cat3 7% 7% 7% cat4 52% 52% 52% cat5 2% 2% 2% cat6 5% 5% 5% cat7 3% 3% 3% cat8 3% 3% 3% 

1 Comment

Shoudn't the all be any (see my answer), as the OP wants to delete the columns where all values are 0% (so a column with any value that is not 0% should be included)

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.