1

I have a dataframe:

 col1 col2 col3 col4 1 1 20 AA NaN 2 2 10 BB NaN 3 3 20 CC NaN 4 1 30 AA NaN 5 2 10 BB NaN 

I need to update col4 such that:

if col3 == 'AA': return col2 + col1 elif col2 == 'BB': return col2 - col1 else: return 

So the output will look something like this:

 col1 col2 col3 col4 1 1 20 AA 21 2 2 10 BB 8 3 3 20 CC NaN 4 1 30 AA 31 5 2 10 BB 8 

I've tried:

df['col4'][df['col3']=='AA'] = df['col2'] + df['col1'] df['col4'][df['col3']=='AA'] = df['col2'] + df['col1'] 

And naturally I get a SettingWithCopyWarning

What is the proper way to achieve this without the warning? I've tried apply and set_value, but I can't seem to pass so many df references into the params.

2 Answers 2

4

you can use nested np.where():

In [5]: df['col4'] = np.where(df.col3 == 'AA', df.col2 + df.col1, np.where(df.col3 == 'BB', df.col2 - df.col1, np.nan)) In [6]: df Out[6]: col1 col2 col3 col4 1 1 20 AA 21.0 2 2 10 BB 8.0 3 3 20 CC NaN 4 1 30 AA 31.0 5 2 10 BB 8.0 
Sign up to request clarification or add additional context in comments.

1 Comment

@Yeile, you are welcome. That's a common pattern for such tasks
1

An uncommon pattern :-)

a = ((df.col3.values[:, None] == ['AA', 'BB']) * [1, -1]).sum(1, keepdims=True) b = np.hstack([np.ones_like(a), np.where(a == 0, np.nan, a)]) df.assign(col4=df[['col1', 'col2']].mul(b).sum(1, skipna=False)) col1 col2 col3 col4 1 1 20 AA 21.0 2 2 10 BB -8.0 3 3 20 CC NaN 4 1 30 AA 31.0 5 2 10 BB -8.0 

2 Comments

It's a very creative solution! ;)
@piRSquared that hacking!

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.