5

I have a list of NaN values in my dataframe and I want to replace NaN values with an empty string.

What I've tried so far, which isn't working:

df_conbid_N_1 = pd.read_csv("test-2019.csv", dtype=str, sep=';', encoding='utf-8')
df_conbid_N_1['Excep_Test'] = df_conbid_N_1['Excep_Test'].replace("NaN", "")

3 Answers

11

Use fillna (docs). An example:

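import numpy as np
import pandas as pd

df = pd.DataFrame({'no': [1, 2, 3], 'Col1': ['State', 'City', 'Town'], 'Col2': ['abc', np.nan, 'defg'], 'Col3': ['Madhya Pradesh', 'VBI', 'KJI']})
df
   no   Col1  Col2            Col3
0   1  State   abc  Madhya Pradesh
1   2   City   NaN             VBI
2   3   Town  defg             KJI

# Assign the result back rather than using inplace=True on df.Col2,
# which is chained assignment and does not update df under pandas copy-on-write.
df['Col2'] = df['Col2'].fillna('')
df
   no   Col1  Col2            Col3
0   1  State   abc  Madhya Pradesh
1   2   City                   VBI
2   3   Town  defg             KJI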

3

Simple! You can do it this way:

df_conbid_N_1 = pd.read_csv("test-2019.csv",dtype=str, sep=';',encoding='utf-8').fillna("") 
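
To see why the string replacement in the question has no effect while fillna does, here is a minimal sketch. It assumes a small in-memory CSV (`csv_text` is a hypothetical stand-in for "test-2019.csv", which is not shown in the question) and reads it with the same `dtype=str` and `sep=';'` arguments.

```python
import io

import pandas as pd

# Hypothetical stand-in for "test-2019.csv": the second row has an empty field.
csv_text = "id;Excep_Test\n1;foo\n2;\n3;bar\n"

raw = pd.read_csv(io.StringIO(csv_text), dtype=str, sep=";")

# The empty field is parsed as a missing value, not as the text "NaN",
# so replacing the string "NaN" leaves it untouched.
after_replace = raw["Excep_Test"].replace("NaN", "")

# fillna targets missing values directly.
filled = raw.fillna("")

print(after_replace.isna().sum())      # number of values still missing
print(filled["Excep_Test"].tolist())   # missing value is now an empty string
```

If the empty strings are wanted from the start, chaining `.fillna("")` onto `read_csv` as in the answer above does the same thing in one step.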


0

We have pandas' fillna to fill missing values.


Let's go through some use cases with a sample dataframe:

df = pd.DataFrame({'col1':['John', np.nan, 'Anne'], 'col2':[np.nan, 3, 4]})

   col1  col2
0  John   NaN
1   NaN   3.0
2  Anne   4.0

As mentioned in the docs, fillna accepts the following as fill values:

value: scalar, dict, Series, or DataFrame

So we can replace missing values with a constant, such as an empty string:

df.fillna('')

   col1 col2
0  John     
1          3
2  Anne    4

You can also replace with a dictionary mapping column_name:replace_value:

df.fillna({'col1':'Alex', 'col2':2})

   col1  col2
0  John   2.0
1  Alex   3.0
2  Anne   4.0
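
The dictionary form is also handy when only some columns should get an empty string. As a rough sketch using the same sample dataframe, listing just `col1` in the mapping fills the text column and leaves the numeric column as it was:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": ["John", np.nan, "Anne"], "col2": [np.nan, 3, 4]})

# Only col1 appears in the mapping, so only col1 is filled.
filled = df.fillna({"col1": ""})

print(filled)
print(filled.dtypes)  # col2 is still a float column with its NaN intact
```

This avoids mixing empty strings into a numeric column, which would otherwise stop it from being a purely numeric column.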

Or you can also replace with another pd.Series or pd.DataFrame:

df_other = pd.DataFrame({'col1':['John', 'Franc', 'Anne'], 'col2':[5, 3, 4]})

df.fillna(df_other)

    col1  col2
0   John   5.0
1  Franc   3.0
2   Anne   4.0
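
One detail worth keeping in mind when filling from another DataFrame is that values are matched by index and column labels, not by position. A small sketch, with a hypothetical `other` frame whose rows are deliberately in reverse order:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": [np.nan, "John", "Anne"]}, index=[0, 1, 2])

# Hypothetical fill source with the same labels in a different row order.
other = pd.DataFrame({"col1": ["C", "B", "A"]}, index=[2, 1, 0])

# The missing value at label 0 is filled from label 0 of `other` ("A"),
# not from its first row ("C").
filled = df.fillna(other)

print(filled)
```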

This is very useful, since it lets you fill the missing values in each column with a statistic computed from that column, such as the mean or mode. Say we have:

df = pd.DataFrame(np.random.choice(np.r_[np.nan, np.arange(3)], (3,5)))

print(df)
     0    1    2    3    4
0  NaN  NaN  0.0  1.0  2.0
1  NaN  2.0  NaN  2.0  1.0
2  1.0  1.0  2.0  NaN  NaN

Then we can easily do:

df.fillna(df.mean())

     0    1    2    3    4
0  1.0  1.5  0.0  1.0  2.0
1  1.0  2.0  1.0  2.0  1.0
2  1.0  1.0  2.0  1.5  1.5
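
The example above uses random data, so here is a reproducible sketch of the same idea that also covers the mode. The column names `score` and `city` are made up for illustration; the numeric column is filled with its mean and the text column with its most frequent value.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "score": [1.0, np.nan, 3.0, 5.0],
    "city": ["Rome", "Rome", "Oslo", np.nan],
})

# df.mean() returns a Series indexed by column name, so each numeric
# column is filled with its own mean.
filled = df.fillna(df.mean(numeric_only=True))

# Series.mode() returns the most frequent value(s); take the first one.
filled["city"] = filled["city"].fillna(df["city"].mode().iloc[0])

print(filled)
```

Here the mean of `score` is 3.0 and the mode of `city` is "Rome", so those are the values that end up in the gaps.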
