Method for missing data in rows? [duplicate]

Question

I am starting with programming. I would like to treat missing data by replacing it for the mean value of its column instead of its row-mean-value.

I'm trying to pre-process some data, from an Excel datasheet, similar to the one written as follows (the first column, the one with no title, is just the numbering/index; is not a real column):

 A B C D E F 0 100 NaN 5.0 1.0 5 a 1 200 5.0 NaN 3.0 3 a 2 300 4.0 NaN 5.0 6 a 3 400 5.0 4.0 7.0 9 b 4 500 5.0 2.0 NaN 2 b 5 600 4.0 3.0 0.0 4 b

The point is that I would like to replace NaN for the mean value of its column. I am using pandas to extract the values from the datasheet. Then I try with sklearn.impute.SimpleImputer to treat these missing data but it just gives me the mean value of the rows.

dataset = pd.read_excel(io=file_name, sheet_name=sheet) y = dataset.iloc[:, 0].values X = dataset.iloc[:, 1:-1].values from sklearn.impute import SimpleImputer imputer = SimpleImputer(missing_values = np.nan, strategy = 'mean') imputer = imputer.fit(X[:, :]) X[:, :] = imputer.transform(X[:, :])

Does anyone know how could I do that in an efficient way? Is there any method of a class that treats this case? (I do not mind getting into another library).

Thank you very much

Before converting to y and X it's as simple as dataset.fillna(dataset.mean()). The index in pandas is a very powerful tool. — ALollz
– ALollz, Commented Oct 11, 2019 at 15:49

Celius Stingher · Accepted Answer · 2019-10-11 15:50:07Z

If you want to fill NaN values with the mean of the column, you can simply use:

data = {'A':[100,200,300,400,500,600],'B':[np.nan,5,4,5,5,4],'C':[5,np.nan,np.nan,4,2,3],'D':[1,3,5,7,np.nan,0]} df = pd.DataFrame(data) df = df.fillna(df.mean()) print(df)

Output:

 A B C D 0 100 4.6 5.0 1.0 1 200 5.0 3.5 3.0 2 300 4.0 3.5 5.0 3 400 5.0 4.0 7.0 4 500 5.0 2.0 3.2 5 600 4.0 3.0 0.0

Collectives™ on Stack Overflow

Method for missing data in rows? [duplicate]

1 Answer 1

Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Linked

Related