Suppose I have a dataframe where some columns contain a zero value as one of their elements (or potentially more than one zero). I don't specifically want to retrieve these columns or discard them (I know how to do that) - I just want to locate these. For instance: if there is are zeros somewhere in the 4th, 6th and the 23rd columns, I want a list with the output [4,6,23].
3 Answers
Use any() for the fastest, vectorized approach.
For instance,
df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [0, 100, 200], 'col3': ['a', 'b', 'c']}) Then,
>>> s = df.eq(0).any() col1 False col2 True col3 False dtype: bool From here, it's easy to get the indexes. For example,
>>> s[s].tolist() ['col2'] Many ways to retrieve the indexes from a pd.Series of booleans.
Comments
Here is an approach that leverages a couple of lambda functions:
d = {'a': np.random.randint(10, size=100), 'b': np.random.randint(1,10, size=100), 'c': np.random.randint(10, size=100), 'd': np.random.randint(1,10, size=100) } df = pd.DataFrame(d) df.apply(lambda x: (x==0).any())[lambda x: x].reset_index().index.to_list() [0, 2] Another idea based on @rafaelc slick answer (but returning relative locations of the columns instead of column names):
df.eq(0).any().reset_index()[lambda x: x[0]].index.to_list() [0, 2] Or with the column names instead of locations:
df.apply(lambda x: (x==0).any())[lambda x: x].index.to_list() ['a', 'c']