As an addition to rafaelc's answer, here are the corresponding timings (from quickest to slowest) for the following setup:
import numpy as np
import pandas as pd
s = pd.Series([x > 0.5 for x in np.random.random(size=1000)])
>>> timeit np.where(s)[0]
12.7 µs ± 77.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
>>> timeit np.flatnonzero(s)
18 µs ± 508 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
The time difference to boolean indexing really surprised me, since boolean indexing is the more commonly used approach.
>>> timeit s.index[s]
82.2 µs ± 38.9 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> timeit s[s].index
1.75 ms ± 2.16 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
If you need an np.array object, append .values
>>> timeit s[s].index.values
1.76 ms ± 3.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
If you need a slightly easier to read version <-- not in original answer
>>> timeit s[s==True].index
1.89 ms ± 3.52 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Using pd.Series.where <-- not in original answer
>>> timeit s.where(s).dropna().index
2.22 ms ± 3.32 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> timeit s.where(s == True).dropna().index
2.37 ms ± 2.19 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
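Using pd.Series.mask <-- not in original answer
Note that mask replaces the entries where the condition is True, so s.mask(s) as timed here leaves the False entries; s.mask(~s) is the variant that yields the True ones (not timed here).
>>> timeit s.mask(s).dropna().index
2.29 ms ± 1.43 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> timeit s.mask(s == True).dropna().index
2.44 ms ± 5.82 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)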
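What where and mask each keep is easy to check on a tiny Series. The following is only a sketch on a made-up five-element example (not the benchmark data); the variable names are mine.

```python
import pandas as pd

# Small hypothetical example, not the 1000-element benchmark Series.
s = pd.Series([True, False, True, False, False])

# where keeps entries where the condition holds, NaN elsewhere.
kept_by_where = s.where(s).dropna().index.tolist()
# mask does the opposite: it blanks entries where the condition holds.
kept_by_mask = s.mask(s).dropna().index.tolist()
# Inverting the condition makes mask behave like where.
kept_by_inverted_mask = s.mask(~s).dropna().index.tolist()

print(kept_by_where)          # [0, 2]
print(kept_by_mask)           # [1, 3, 4]
print(kept_by_inverted_mask)  # [0, 2]
```

So for the index of the True values, where(s) and mask(~s) agree, while mask(s) returns the complement.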
Using list comprehension
>>> timeit [i for i in s.index if s[i]]
13.7 ms ± 40.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Using Python's built-in filter
>>> timeit [*filter(s.get, s.index)]
14.2 ms ± 28.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
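If you want to re-run a few of these comparisons outside IPython, something like the following could work. It is a rough sketch using the standard-library timeit module instead of the %timeit magic; the seeded generator, the candidates dict and the repeat counts are my own choices, and the absolute numbers will differ by machine and library version.

```python
import timeit

import numpy as np
import pandas as pd

# Seeded generator so the Series is identical on every run (an assumption;
# the setup above used unseeded np.random.random).
rng = np.random.default_rng(0)
s = pd.Series(rng.random(1000) > 0.5)

# A subset of the variants timed above, wrapped as zero-argument callables.
candidates = {
    "np.where(s)[0]": lambda: np.where(s)[0],
    "np.flatnonzero(s)": lambda: np.flatnonzero(s),
    "s.index[s]": lambda: s.index[s],
    "s[s].index": lambda: s[s].index,
    "[i for i in s.index if s[i]]": lambda: [i for i in s.index if s[i]],
}

timings = {}
for name, fn in candidates.items():
    # Best of 3 repeats of 20 calls each, converted to microseconds per call.
    best = min(timeit.repeat(fn, number=20, repeat=3)) / 20
    timings[name] = best * 1e6

# Print from quickest to slowest.
for name, micros in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name:32s} {micros:10.1f} µs")
```

With the default integer index all of these return the same positions, which is what makes the comparison fair.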
Using np.nonzero <-- did not work out of the box for me
>>> timeit np.nonzero(s)
ValueError: Length of passed values is 1, index implies 1000.
Using np.argwhere <-- did not work out of the box for me
>>> timeit np.argwhere(s).ravel()
ValueError: Length of passed values is 1, index implies 1000.
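One possible workaround, assuming the failure comes from handing the Series itself to NumPy (I did not track down the cause, and it may depend on the installed versions), is to convert to a plain array first. This is a sketch only and was not part of the timings above.

```python
import numpy as np
import pandas as pd

s = pd.Series([x > 0.5 for x in np.random.random(size=1000)])

# Convert to a plain boolean ndarray so only NumPy is involved.
arr = s.to_numpy()

# np.nonzero returns a tuple with one array per dimension; take the first.
via_nonzero = np.nonzero(arr)[0]
# np.argwhere returns an (n, 1) array for 1-D input; flatten it.
via_argwhere = np.argwhere(arr).ravel()

print(via_nonzero[:5])
print(via_argwhere[:5])
```

Both results are integer positions, the same as what np.flatnonzero gives on the array.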
For a Series with a non-default index, e.g. s = pd.Series([True, False, True, True, False, False, False, True], index=list('ABCDEFGH')), the expected output is Index(['A', 'C', 'D', 'H'], ...). Keep in mind that some solutions (especially all the np functions) drop the index and return positions in the default integer index instead.
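To illustrate with that labelled Series: the np functions return integer positions, which can be mapped back to labels through s.index. A minimal sketch (variable names are mine):

```python
import numpy as np
import pandas as pd

s = pd.Series([True, False, True, True, False, False, False, True],
              index=list('ABCDEFGH'))

# NumPy works on the underlying array, so it only knows positions.
positions = np.flatnonzero(s.to_numpy())
# Indexing the Index with those positions recovers the labels.
labels_from_positions = s.index[positions]
# Boolean indexing on the index gives the labels directly.
labels_direct = s.index[s]

print(positions.tolist())           # [0, 2, 3, 7]
print(list(labels_from_positions))  # ['A', 'C', 'D', 'H']
print(list(labels_direct))          # ['A', 'C', 'D', 'H']
```

So if you need labels rather than positions, either use one of the pandas-based variants or add the s.index[...] lookup on top of the np result.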