I am trying to create a new column, amount_0_flag, in a DataFrame df. Its values depend on grouping by another column, key: if the sum of amount within a group is 0, every row in that group should get amount_0_flag = True, otherwise False. The df looks like this:

    key  amount  amount_0_flag  negative_amount
      1     1.0           True            False
      1     1.0           True             True
      2     2.0          False             True
      2     3.0          False            False
      2     4.0          False            False

So with df.groupby('key'), the cluster with key=1 should have amount_0_flag set to True for each of its rows, since within that cluster one row has an amount of negative 1 (its negative_amount is True) and the other has an amount of positive 1.

df.groupby('key')['amount'].sum() 

only gives the sum of amount for each cluster, without taking the values in negative_amount into account. How can I also find the clusters (and their rows) whose amounts sum to 0 once negative_amount is considered, using pandas/numpy?
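To make the gap concrete, here is a minimal sketch that rebuilds the sample frame from the table above and compares the plain per-key sum with a sign-aware one. The variable names `unsigned_sums`, `signed`, and `signed_sums` are illustrative and not part of the original post.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "key": [1, 1, 2, 2, 2],
    "amount": [1.0, 1.0, 2.0, 3.0, 4.0],
    "negative_amount": [False, True, True, False, False],
})

# Plain per-key sum: ignores the negative_amount flag entirely.
unsigned_sums = df.groupby("key")["amount"].sum()

# Sign-aware amounts: keep the amount where negative_amount is False,
# otherwise replace it with its negation.
signed = df["amount"].where(~df["negative_amount"], -df["amount"])
signed_sums = signed.groupby(df["key"]).sum()

print(unsigned_sums)
print(signed_sums)
```

With this data the plain sum for key=1 is 2.0, while the sign-aware sum is 0.0, which is the value the flag should be based on.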

  • Would you mind posting the original dataframe and the output you are expecting? Commented Dec 7, 2017 at 18:33

1 Answer

Try this. I created a 'new_column' so you can compare the result against your 'amount_0_flag':

    import numpy as np

    # Give each amount its sign, then flag every row whose group sums to zero.
    df['new_column'] = (
        df.assign(amount_n=df.amount * np.where(df.negative_amount, -1, 1))
          .groupby('key')['amount_n']
          .transform(lambda x: x.sum() == 0)
    )

Output:

       key  amount  amount_0_flag  negative_amount  new_column
    0    1     1.0           True            False        True
    1    1     1.0           True             True        True
    2    2     2.0          False             True       False
    3    2     3.0          False            False       False
    4    2     4.0          False            False       False
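If the amounts are real floating-point values rather than the whole numbers in this sample, an exact comparison with zero can miss groups that cancel only approximately. The following is a hedged variant, not the approach above verbatim: it assumes the flag should be True only when the signed group sum is zero within `np.isclose`'s default tolerance, and the names `sign` and `group_sum` are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (without the expected-flag column).
df = pd.DataFrame({
    "key": [1, 1, 2, 2, 2],
    "amount": [1.0, 1.0, 2.0, 3.0, 4.0],
    "negative_amount": [False, True, True, False, False],
})

# -1.0 where the row is marked negative, +1.0 otherwise.
sign = np.where(df["negative_amount"], -1.0, 1.0)

# Signed sum per key, broadcast back to every row of the group.
group_sum = (df["amount"] * sign).groupby(df["key"]).transform("sum")

# Tolerant zero check instead of an exact == 0 comparison.
df["amount_0_flag"] = np.isclose(group_sum, 0.0)

print(df)
```

Passing the string "sum" to transform avoids a Python-level lambda, which tends to matter on larger frames; the tolerance can be tightened through the `rtol` and `atol` arguments of `np.isclose` if the defaults are too loose for your data.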