43

I have a pandas DataFrame df:

import pandas as pd data = {"Name": ["AAAA", "BBBB"], "C1": [25, 12], "C2": [2, 1], "C3": [1, 10]} df = pd.DataFrame(data) df.set_index("Name") 

which looks like this when printed (for reference):

 C1 C2 C3 Name AAAA 25 2 1 BBBB 12 1 10 

I would like to choose rows for which C1, C2 and C3 have values between 0 and 20.

Can you suggest an elegant way to select those rows?

0

4 Answers 4

58

I think below should do it, but its elegance is up for debate.

new_df = old_df[((old_df['C1'] > 0) & (old_df['C1'] < 20)) & ((old_df['C2'] > 0) & (old_df['C2'] < 20)) & ((old_df['C3'] > 0) & (old_df['C3'] < 20))] 
Sign up to request clarification or add additional context in comments.

5 Comments

Is there a way to use 'or' other than '&'
Love the elegance note :D
Use | in place of & for an "or" condition.
Note that even after nearly 9 years, it still won't work without the parentheses (even for two conditions), and one could argue, that it would be more intuitive.
This can be simplified using the 'between' function. Than you can drop half the conditions and the parentheses. df[df['C1'].between(0, 20) & df[df['C2'].between(0, 20) & df[df['C3'].between(0, 20)]
27

Shorter version:

In [65]: df[(df>=0)&(df<=20)].dropna() Out[65]: Name C1 C2 C3 1 BBBB 12 1 10 

Comments

23

I like to use df.query() for these kind of things

df.query('C1>=0 and C1<=20 and C2>=0 and C2<=20 and C3>=0 and C3<=20') 

Comments

12

A more concise df.query:

df.query("0 <= C1 <= 20 and 0 <= C2 <= 20 and 0 <= C3 <= 20") 

or

df.query("0 <= @df <= 20").dropna() 

Using @foo in df.query refers to the variable foo in the environment.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.