2

I am trying to create a rule: whenever the sum of the values across a row of the dataframe is greater than one, the Response column for that row should equal one. Please see below.

    import numpy as np
    import pandas as pd

    df1 = pd.DataFrame(np.random.randint(0,2,size=(10, 4)), columns=list('ABCD'))
    df1['Response'] = 0
    df1
    Out[14]:
       A  B  C  D  Response
    0  0  0  0  0         0
    1  0  1  1  0         0
    2  1  1  1  1         0
    3  0  0  0  0         0
    4  0  1  1  1         0
    5  1  1  0  0         0
    6  1  1  0  0         0
    7  0  1  1  1         0
    8  0  0  0  0         0
    9  0  1  1  1         0

My attempt:

    df1['Response'] = 1 if [sum(df1[i,:]) for i in range(10)] > 1 else 0

However, I get the error below instead of a Response column with three rows equal to 0 and the remaining rows equal to 1:

    TypeError: unhashable type: 'slice'

Any help would be appreciated. Thank you.

  • df1['Response'] = df1.sum(1).gt(1).astype(int) Commented Jan 9, 2019 at 22:58
  • Will all of your numbers be positive? Commented Jan 9, 2019 at 23:13

2 Answers

2

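Check with clip_upper, which sets an upper boundary. Note that clip_upper was removed in pandas 1.0; clip(upper=1) is the equivalent call in current versions.

    df1.sum(1).clip(upper=1)
    Out[153]:
    0    0
    1    1
    2    1
    3    0
    4    1
    5    1
    6    1
    7    1
    8    0
    9    1
    dtype: int64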
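
As a rough sketch (the values are illustrative, not the asker's random data), capping the row sums at 1 is not quite the same as the strict "greater than one" rule stated in the question: a row that sums to exactly 1 is flagged by the cap but not by the strict test. The example assumes a recent pandas, where the cap is written as `clip(upper=1)`.

```python
import pandas as pd

# Illustrative data (an assumption, not the asker's random frame).
# Row sums are 0, 1 and 3 respectively.
df1 = pd.DataFrame(
    {"A": [0, 1, 1], "B": [0, 0, 1], "C": [0, 0, 0], "D": [0, 0, 1]}
)

# Sum only the data columns so an existing Response column is not counted.
row_sums = df1[list("ABCD")].sum(axis=1)

# Capping at 1 flags every row whose sum is at least 1.
capped = row_sums.clip(upper=1)

# A strict "greater than one" test flags only rows whose sum is 2 or more.
strict = row_sums.gt(1).astype(int)

print(capped.tolist())
print(strict.tolist())
```

For the 0/1 data in the question the two agree only because no row happens to sum to exactly 1, so which one to use depends on the rule actually intended.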

0

Try this (it assumes all of the numbers are positive):

    In [1]: import numpy as np
       ...: import pandas as pd
       ...: df1 = pd.read_clipboard()

    In [2]: df1
    Out[2]:
       A  B  C  D  Response
    0  0  0  0  0         0
    1  0  1  1  0         0
    2  1  1  1  1         0
    3  0  0  0  0         0
    4  0  1  1  1         0
    5  1  1  0  0         0
    6  1  1  0  0         0
    7  0  1  1  1         0
    8  0  0  0  0         0
    9  0  1  1  1         0

    In [3]: df1['Response'] = df1.any(axis=1).astype(int)

    In [4]: df1
    Out[4]:
       A  B  C  D  Response
    0  0  0  0  0         0
    1  0  1  1  0         1
    2  1  1  1  1         1
    3  0  0  0  0         0
    4  0  1  1  1         1
    5  1  1  0  0         1
    6  1  1  0  0         1
    7  0  1  1  1         1
    8  0  0  0  0         0
    9  0  1  1  1         1
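
A minimal sketch of why the positivity assumption matters, using made-up values: when a row mixes signs, its entries can cancel, so a row-wise `any` and a sum-based test can disagree.

```python
import pandas as pd

# Made-up values for illustration only; the second row mixes signs.
df = pd.DataFrame({"A": [0, 1, 1], "B": [0, -1, 1]})

# any(axis=1) is True when at least one entry in the row is nonzero.
by_any = df.any(axis=1).astype(int)

# A sum-based test misses rows whose entries cancel out to zero.
by_sum = df.sum(axis=1).gt(0).astype(int)

print(by_any.tolist())
print(by_sum.tolist())
```

With only non-negative entries, as in the question's 0/1 data, the two tests give the same flags.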
