3

Select rows from a DataFrame based on True or False in a column in pandas:

For example,

    import pandas as pd

    # Data extended to 11 rows so that it matches the printed frame below
    df = {'uid': ["1", "1", "1", "1", "2", "2", "2", "3", "3", "3", "3"],
          'type': ["a", "a", "b", "a", "a", "b", "b", "a", "b", "b", "a"],
          'is_topup': ["FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "TRUE", "TRUE", "FALSE", "TRUE", "TRUE", "FALSE"],
          'label': ["FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "TRUE", "TRUE", "FALSE", "TRUE", "TRUE", "FALSE"]}
    df = pd.DataFrame(df)

       uid type is_topup  label
    0    1    a    FALSE  FALSE
    1    1    a    FALSE  FALSE
    2    1    b     TRUE   TRUE
    3    1    a    FALSE  FALSE
    4    2    a    FALSE  FALSE
    5    2    b     TRUE   TRUE
    6    2    b     TRUE   TRUE
    7    3    a    FALSE  FALSE
    8    3    b     TRUE   TRUE
    9    3    b     TRUE   TRUE
    10   3    a    FALSE  FALSE

For each uid, I want to select the rows up to and including the first row where is_topup and label are both TRUE, like this:

      uid type is_topup  label
    0   1    a    FALSE  FALSE
    1   1    a    FALSE  FALSE
    2   1    b     TRUE   TRUE
    4   2    a    FALSE  FALSE
    5   2    b     TRUE   TRUE
    7   3    a    FALSE  FALSE
    8   3    b     TRUE   TRUE

I tried to look at pandas documentation but did not find the answer.

2
  • What do you mean by "and stop in"? Commented Aug 27, 2019 at 7:21
  • Can you explain which rows you want to drop? There doesn't seem to be a logical reason to drop them unless you explain the rule. Why do rows 3, 6, 9 and 10 have to be dropped? It doesn't look like dropping duplicates (0 and 1 are duplicates). Commented Aug 27, 2019 at 7:33

2 Answers

1

I'm not sure this is the most efficient way, but you can use idxmax:

    new_df = df.groupby('uid').apply(lambda x: x[:(x['is_topup'] & x['label']).reset_index(drop=True).idxmax()+1])
    print(new_df)

Output:

           uid type  is_topup  label
    uid
    1   0    1    a     False  False
        1    1    a     False  False
        2    1    b      True   True
    2   4    2    a     False  False
        5    2    b      True   True
    3   7    3    a     False  False
        8    3    b      True   True

2 Comments

Yes, but I get "unsupported operand type(s) for &: 'str' and 'bool'".
Are is_topup and label strings? In that case, replace (x['is_topup'] & x['label']) with (x['is_topup'].eq('TRUE') & x['label'].eq('TRUE')).
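
For reference, here is a minimal sketch of the idxmax approach with the string comparison from the comment above applied. It assumes the columns hold the strings "TRUE"/"FALSE" and uses the 11-row frame printed in the question. The helper name up_to_first_hit is made up for illustration. Note that if a uid has no TRUE/TRUE row at all, idxmax returns 0, so only the first row of that group is kept.

```python
import pandas as pd

# Data taken from the frame printed in the question (11 rows, string flags).
df = pd.DataFrame({
    "uid": ["1", "1", "1", "1", "2", "2", "2", "3", "3", "3", "3"],
    "type": ["a", "a", "b", "a", "a", "b", "b", "a", "b", "b", "a"],
    "is_topup": ["FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "TRUE",
                 "TRUE", "FALSE", "TRUE", "TRUE", "FALSE"],
    "label": ["FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "TRUE",
              "TRUE", "FALSE", "TRUE", "TRUE", "FALSE"],
})


def up_to_first_hit(group):
    """Hypothetical helper: keep rows up to and including the first TRUE/TRUE row."""
    # Compare against the string "TRUE" so that & operates on booleans.
    hit = group["is_topup"].eq("TRUE") & group["label"].eq("TRUE")
    # Position (within the group) of the first True value.
    stop = hit.reset_index(drop=True).idxmax()
    return group.iloc[: stop + 1]


new_df = df.groupby("uid").apply(up_to_first_hit)

# The last index level holds the original row labels.
print(new_df.index.get_level_values(-1).tolist())
```

The printed row labels can be compared directly with the index of the desired output in the question.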
0

It seems to me that a simple

result = df.drop_duplicates() 

should do the trick. At least it would for the example you gave.

1 Comment

Not like that. I want every uid, but only some of its rows. For example, the new uid 3 has 4 rows and I want only 2 of them (rows 7 and 8), i.e. the rows up to the point where is_topup and label change to TRUE, TRUE.
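
The rule described in this comment (per uid, keep everything up to and including the first TRUE/TRUE row) can probably also be written as a plain boolean mask, without apply. This is only a sketch under the same assumptions as the question: string "TRUE"/"FALSE" flags and the 11-row frame shown there. The names hit, hits_before and result are illustrative. Unlike the idxmax version, a uid with no TRUE/TRUE row keeps all of its rows here.

```python
import pandas as pd

# Data taken from the frame printed in the question (11 rows, string flags).
df = pd.DataFrame({
    "uid": ["1", "1", "1", "1", "2", "2", "2", "3", "3", "3", "3"],
    "type": ["a", "a", "b", "a", "a", "b", "b", "a", "b", "b", "a"],
    "is_topup": ["FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "TRUE",
                 "TRUE", "FALSE", "TRUE", "TRUE", "FALSE"],
    "label": ["FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "TRUE",
              "TRUE", "FALSE", "TRUE", "TRUE", "FALSE"],
})

# 1 where both flags are the string "TRUE", else 0.
hit = (df["is_topup"].eq("TRUE") & df["label"].eq("TRUE")).astype(int)

# Number of TRUE/TRUE rows strictly before each row, counted within its uid.
hits_before = hit.groupby(df["uid"]).cumsum() - hit

# Keep a row only if no TRUE/TRUE row precedes it in the same uid.
result = df[hits_before.eq(0)]
print(result)
```

Because this is an ordinary mask, the result keeps the original flat index rather than a (uid, row) MultiIndex.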
