50

I have a pandas DataFrame with a column of integers. I want the rows containing numbers greater than 10. I am able to evaluate True or False but not the actual value, by doing:

df['ints'] = df['ints'] > 10 

I don't use Python very often so I'm going round in circles with this.

I've spent 20 minutes Googling but haven't been able to find what I need....

Edit:

 observationID recordKey gridReference siteKey siteName featureKey startDate endDate pTaxonVersionKey taxonName authority commonName ints 0 463166539 1767 SM90 NaN NaN 150161 12/02/2006 12/02/2006 NBNSYS0100004720 Pipistrellus pygmaeus (Leach, 1825) Soprano Pipistrelle 2006 1 463166623 4325 TL65 NaN NaN 168651 21/12/2008 21/12/2008 NHMSYS0020001355 Pipistrellus pipistrellus sensu stricto (Schreber, 1774) Common Pipistrelle 2008 2 463166624 4326 TL65 NaN NaN 168651 18/01/2009 18/01/2009 NHMSYS0020001355 Pipistrellus pipistrellus sensu stricto (Schreber, 1774) Common Pipistrelle 2009 3 463166625 4327 TL65 NaN NaN 168651 15/02/2009 15/02/2009 NHMSYS0020001355 Pipistrellus pipistrellus sensu stricto (Schreber, 1774) Common Pipistrelle 2009 4 463166626 4328 TL65 NaN NaN 168651 19/12/2009 19/12/2009 NHMSYS0020001355 Pipistrellus pipistrellus sensu stricto (Schreber, 1774) Common Pipistrelle 2009 
0

3 Answers 3

79

Sample DF:

In [79]: df = pd.DataFrame(np.random.randint(5, 15, (10, 3)), columns=list('abc')) In [80]: df Out[80]: a b c 0 6 11 11 1 14 7 8 2 13 5 11 3 13 7 11 4 13 5 9 5 5 11 9 6 9 8 6 7 5 11 10 8 8 10 14 9 7 14 13 

present only those rows where b > 10

In [81]: df[df.b > 10] Out[81]: a b c 0 6 11 11 5 5 11 9 7 5 11 10 9 7 14 13 

Minimums (for all columns) for the rows satisfying b > 10 condition

In [82]: df[df.b > 10].min() Out[82]: a 5 b 11 c 9 dtype: int32 

Minimum (for the b column) for the rows satisfying b > 10 condition

In [84]: df.loc[df.b > 10, 'b'].min() Out[84]: 11 

UPDATE: starting from Pandas 0.20.1 the .ix indexer is deprecated, in favor of the more strict .iloc and .loc indexers.

Sign up to request clarification or add additional context in comments.

4 Comments

I did df['ints'] = df[df.ints > 10] which executes. Now when I do df['ints'].min() I get ValueError: cannot convert float NaN to integer ??
@VinylWarmth, what is your goal? What do you want to achieve? Do you want to get rid of rows where df.ints <= 10
@VinylWarmth, could you please provide a sample data set (DF with 5 rows) and desired result set, because it's not clear what should happen to other rows and columns. Just update your question with sample DFs
I see where I'm going wrong now, thanks for your help
4

I am wondering why no one mentions the builtin functions pandas dataframe has is similar to isin method. Here is a quick example

 df = pd.DataFrame({'cost': [250, 150, 100], 'revenue': [100, 250, 300]},index=['A', 'B', 'C']) cost revenue A 250 100 B 150 250 C 100 300 

Compare DataFrames for greater than inequality or equality elementwise.

df=df[df["cost"].ge(150)] cost revenue A 250 100 B 150 250 

Comments

2

You can also use query:

In [2]: df = pd.DataFrame({'ints': range(9, 14), 'alpha': list('ABCDE')}) In [3]: df Out[3]: ints alpha 0 9 A 1 10 B 2 11 C 3 12 D 4 13 E In [4]: df.query('ints > 10') Out[4]: ints alpha 2 11 C 3 12 D 4 13 E 

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.