pandas newbie here. I have a DataFrame in which some values are '?', and I have successfully replaced those with 'NaN'. I would now like to replace each 'NaN' with the average of its column, but the 'NaN' is not replaced. I've tried the solution from the question linked below, but it does not work for me, as the output further down shows.
pandas DataFrame: replace nan values with average of columns
Code:
    df = pd.DataFrame(cancer)
    print(df)
    df['A7'] = df['A7'].replace(['?'],"NaN")
    print(df)
    # the code below is where my issue arises
    df.fillna(df.mean())
    print(df)

Before ? is replaced with NaN:

            Scn  A2  A3  A4  A5  A6   A7  A8  A9  A10  CLASS
    [.....]
    21  1054593  10   5   5   3   6    7   7  10    1      4
    22  1056784   3   1   1   1   2    1   2   1    1      2
    23  1057013   8   4   5   1   2    ?   7   3    1      4

Before NaN is replaced with mean:

            Scn  A2  A3  A4  A5  A6   A7  A8  A9  A10  CLASS
    [.....]
    21  1054593  10   5   5   3   6    7   7  10    1      4
    22  1056784   3   1   1   1   2    1   2   1    1      2
    23  1057013   8   4   5   1   2  NaN   7   3    1      4

After NaN is replaced with average:

            Scn  A2  A3  A4  A5  A6   A7  A8  A9  A10  CLASS
    [.....]
    21  1054593  10   5   5   3   6    7   7  10    1      4
    22  1056784   3   1   1   1   2    1   2   1    1      2
    23  1057013   8   4   5   1   2  NaN   7   3    1      4

I'm not sure what I am doing wrong.
Regarding df.fillna(df.mean()): np.nan vs the string 'NaN', they are not the same thing!
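To make the comment above concrete, here is a minimal sketch of one way this could be handled. It assumes A7 was read in as strings because of the '?' entries. The small DataFrame is a hypothetical stand-in for the cancer data, with only three columns copied from the rows shown in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's `cancer` data (three columns only).
df = pd.DataFrame({
    "Scn": [1054593, 1056784, 1057013],
    "A6": [6, 2, 2],
    "A7": ["7", "1", "?"],
})

# Use a real missing value (np.nan), not the four-character string "NaN".
df["A7"] = df["A7"].replace("?", np.nan)

# The remaining entries are still strings, so convert the column to numbers.
df["A7"] = pd.to_numeric(df["A7"])

# fillna returns a new object, so assign the result back to keep it.
df["A7"] = df["A7"].fillna(df["A7"].mean())

print(df)
```

Two things differ from the code in the question. First, the string "NaN" is ordinary text, so fillna does not treat it as missing, whereas np.nan is recognised as a missing value. Second, fillna does not modify the DataFrame in place by default, so its result has to be assigned back. To fill every numeric column at once, something like df = df.fillna(df.mean(numeric_only=True)) should work once the columns hold numbers.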