I am trying to understand how this works.

I have this df.

   ticket_id                      address grafitti_status
0     284932  10041 roseberry, Detroit MI             NaN
1     285362  18520 evergreen, Detroit MI             NaN
2     285361  18520 evergreen, Detroit MI             NaN
3     285338     1835 central, Detroit MI             NaN
4     285346     1700 central, Detroit MI             NaN
5     285345     1700 central, Detroit MI             NaN

where

In: df.grafitti_status.unique()
Out: array([nan, 'GRAFFITI TICKET'], dtype=object)

So I am trying to change NaN to 0 and 'GRAFFITI TICKET' to 1.

I used

df.loc[df['grafitti_status'] == 'GRAFFITI TICKET', 'grafitti_status'] = 1 

which works fine, but the same approach for replacing NaN with 0

df.loc[df['grafitti_status'] == np.nan, 'grafitti_status'] = 0
Out: array([nan, 1], dtype=object)

does not work: the NaN values still remain.

and

df['grafitti_status'] = df['grafitti_status'].replace({np.nan:0,'GRAFFITI TICKET':1},inplace=True) 

does not work either, replacing everything with None.

   ticket_id                      address grafitti_status
0     284932  10041 roseberry, Detroit MI            None
1     285362  18520 evergreen, Detroit MI            None
2     285361  18520 evergreen, Detroit MI            None
3     285338     1835 central, Detroit MI            None
4     285346     1700 central, Detroit MI            None
5     285345     1700 central, Detroit MI            None
6     285347     1700 central, Detroit MI            None

Can anybody give me some insight into why it works this way?

I have finally found that I can achieve the desired result with

df.loc[df['grafitti_status'] == 'GRAFFITI TICKET', 'grafitti_status'] = 1
df['grafitti_status'] = df['grafitti_status'].fillna(0)
Out: array([0, 1], dtype=int64)

but this produces the following warning message.

C:\Users\Maria\Anaconda3\lib\site-packages\pandas\core\indexing.py:543: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self.obj[item] = s
C:\Users\Maria\Anaconda3\lib\site-packages\ipykernel_launcher.py:3: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

So I am still not sure what the correct way to do this would be.

1 Answer

np.nan == np.nan evaluates to False, so comparing the column with == np.nan never matches any row. Use the isna function to build the mask instead:

df.loc[df['grafitti_status'].isna(), 'grafitti_status'] = 0 
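
As a minimal sketch of the difference between the two masks, assuming a small made-up frame (the sample rows are illustrative, not the asker's data, and the column is created as object dtype to match the output shown in the question):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the column in the question.
df = pd.DataFrame({
    "ticket_id": [284932, 285362, 285361],
    "grafitti_status": pd.Series(
        [np.nan, "GRAFFITI TICKET", np.nan], dtype=object
    ),
})

# NaN never compares equal to anything, including itself,
# so this mask is all False and selects no rows.
eq_mask = df["grafitti_status"] == np.nan

# isna() tests for missing values directly.
na_mask = df["grafitti_status"].isna()

df.loc[df["grafitti_status"] == "GRAFFITI TICKET", "grafitti_status"] = 1
df.loc[na_mask, "grafitti_status"] = 0

# The column is still object dtype at this point; cast if integers are wanted.
df["grafitti_status"] = df["grafitti_status"].astype(int)
```

The final cast is optional and only matters if an integer column is needed downstream.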

2 Comments

Thanks! It does indeed work. However, it is still giving me a warning message: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead See the caveats in the documentation: pandas.pydata.org/pandas-docs/stable/… self.obj[item] = s
@bluetail Your df is a subset of another DataFrame. When we take a subset, we should add .copy(): df = wholeddf[condition].copy()
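
A rough sketch of the .copy() suggestion, assuming df was produced by filtering a larger frame. The question does not show how df was built, so the contents of wholeddf and the filter condition below are hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical parent frame; the real one is not shown in the question.
wholeddf = pd.DataFrame({
    "ticket_id": [284932, 285362, 285361, 285338],
    "grafitti_status": pd.Series(
        [np.nan, "GRAFFITI TICKET", np.nan, np.nan], dtype=object
    ),
})

# An explicit copy makes df an independent DataFrame, so later
# assignments are not treated as writes to a slice of wholeddf.
# The filter condition is an assumption made up for this example.
df = wholeddf[wholeddf["ticket_id"] > 285000].copy()

df.loc[df["grafitti_status"] == "GRAFFITI TICKET", "grafitti_status"] = 1
df.loc[df["grafitti_status"].isna(), "grafitti_status"] = 0
```

Whether the warning appears without .copy() depends on the pandas version, so the sketch does not try to reproduce it. It only shows that the assignments land in df and leave wholeddf unchanged.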
