
pandas newbie here. I have a DataFrame in which some values are '?', and I have successfully replaced them with 'NaN'. I would like to replace each 'NaN' with the average of its column, but the 'NaN' is not removed. I've tried the solution from the question linked below, but it does not work, as the output below shows.

pandas DataFrame: replace nan values with average of columns

Code:

    df = pd.DataFrame(cancer)
    print(df)
    df['A7'] = df['A7'].replace(['?'],"NaN")
    print(df)
    # the code below is where my issue arises
    df.fillna(df.mean())
    print(df)

Before ? is replaced with NaN:

            Scn  A2  A3  A4  A5  A6   A7  A8  A9  A10  CLASS
    [.....]
    21  1054593  10   5   5   3   6    7   7  10    1      4
    22  1056784   3   1   1   1   2    1   2   1    1      2
    23  1057013   8   4   5   1   2    ?   7   3    1      4

Before NaN is replaced with mean:

            Scn  A2  A3  A4  A5  A6   A7  A8  A9  A10  CLASS
    [.....]
    21  1054593  10   5   5   3   6    7   7  10    1      4
    22  1056784   3   1   1   1   2    1   2   1    1      2
    23  1057013   8   4   5   1   2  NaN   7   3    1      4

After NaN is replaced with average:

            Scn  A2  A3  A4  A5  A6   A7  A8  A9  A10  CLASS
    [.....]
    21  1054593  10   5   5   3   6    7   7  10    1      4
    22  1056784   3   1   1   1   2    1   2   1    1      2
    23  1057013   8   4   5   1   2  NaN   7   3    1      4

I'm not sure what I am doing wrong.

  • df.fillna(df.mean()) Commented Jul 15, 2018 at 2:02
  • That doesn't work - see updated question. Commented Jul 15, 2018 at 2:10
  • You may want to use np.nan vs the string 'NaN', they are not the same thing! Commented Jul 15, 2018 at 2:14
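
Following up on the comments above, here is a minimal sketch of how the pieces might fit together. It rests on a few assumptions: the small DataFrame is a hypothetical stand-in for `cancer` built from the three rows shown in the question, and the values in A7 are assumed to be strings (which is usually why a '?' ends up in the column in the first place). Two things likely matter here: the string "NaN" is ordinary text that fillna does not treat as missing, and fillna returns a new object rather than changing df in place.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's `cancer` data (three rows from the question).
# A7 is assumed to hold strings, since it contains the '?' placeholder.
df = pd.DataFrame({
    "Scn": [1054593, 1056784, 1057013],
    "A6": [6, 2, 2],
    "A7": ["7", "1", "?"],
    "A8": [7, 2, 7],
})

# Replace the '?' placeholder with a real missing value, not the string "NaN".
df["A7"] = df["A7"].replace("?", np.nan)

# The column still holds strings, so convert it to a numeric dtype
# before computing a mean.
df["A7"] = pd.to_numeric(df["A7"])

# fillna returns a new Series; assign it back to keep the result.
df["A7"] = df["A7"].fillna(df["A7"].mean())

print(df)
```

With this sample data the missing A7 value is filled with 4.0, the mean of 7 and 1. On the real data the same three steps should apply, but the column names and dtypes there may differ from what is assumed here.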
