Using a Mask to Insert Values from sklearn Iterative Imputer

Question

I created a set of random missing values to practice with a tree imputer. However, I'm stuck on how to write the imputed values back into my dataframe. My code looks like this:

from sklearn.experimental import enable_iterative_imputer
from sklearn.impute import IterativeImputer

df_post_copy = df_post.copy()
missing_mask = df_post_copy.isna()
imputer = IterativeImputer(max_iter=10, random_state=0)
imputed_values = imputer.fit_transform(df_post_copy)
df_copy[missing_mask] = imputed_values[missing_mask]

Results in:

ValueError: other must be the same shape as self when an ndarray

But the shape matches...

imputed_values.shape
(16494, 29)

The type is:

type(imputed_values)
numpy.ndarray

Since the shape is right, I tried converting it to a pandas dataframe:

test_imputed_values = pd.DataFrame(imputed_values)

When I try:

df_copy[missing_mask] = test_imputed_values[missing_mask]

I get the same error as above.

How do I use a mask to insert the imputed values where needed?

eschibli · Accepted Answer · 2024-05-07 17:02:40Z

imputer.fit_transform(...) returns the complete array: the original values are kept as they were and the (previously) missing values are filled in, so no mask is needed. If you want an updated DataFrame, something like

imputed_values = imputer.fit_transform(df_post_copy)
df_post_copy.loc[:, :] = imputed_values

should work.
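A minimal, self-contained sketch of this approach follows. The small random frame is a hypothetical stand-in for `df_post` (the real data is not shown in the question); it assumes all columns are numeric, and it only illustrates that the observed values survive the full-frame assignment unchanged.

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables the import below)
from sklearn.impute import IterativeImputer

# Hypothetical toy data standing in for df_post (assumption: all-numeric columns)
rng = np.random.default_rng(0)
df_post = pd.DataFrame(rng.normal(size=(50, 4)), columns=list("abcd"))
df_post.iloc[::7, 1] = np.nan    # knock out some values in column "b"
df_post.iloc[3::11, 3] = np.nan  # and some in column "d"

df_post_copy = df_post.copy()
missing_mask = df_post_copy.isna()

imputer = IterativeImputer(max_iter=10, random_state=0)
imputed_values = imputer.fit_transform(df_post_copy)  # full ndarray, same shape as the frame

# Overwrite every cell; index and column labels of df_post_copy are kept
df_post_copy.loc[:, :] = imputed_values

# Cells that were observed before imputation are unchanged
observed = ~missing_mask.to_numpy()
unchanged = np.allclose(df_post_copy.to_numpy()[observed], df_post.to_numpy()[observed])
```

Because the imputer passes observed values through untouched, assigning the whole array has the same effect as writing only into the masked cells.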

Arjun · Accepted Answer · 2024-06-01 13:10:26Z

imputed_values = imputer.fit_transform(df_post_copy)

imputer.fit_transform returns a numpy array with the missing values filled in.

So imputed_values already contains every value, both the original ones and the imputed ones. You can convert it to a dataframe the usual way:

pd.DataFrame(imputer.fit_transform(df_post_copy))

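will return a dataframe with the original values and the missing values filled in. By default the result has plain integer row and column labels; pass index=df_post_copy.index and columns=df_post_copy.columns to keep the original ones.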
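If you still want to insert values through the mask, as the question asks, one option is to give the imputed array the same labels as the original frame and let pandas align the two. The sketch below is an illustration under assumptions: the toy frame and its non-default index are hypothetical, and `DataFrame.mask` is one way to do the masked replacement, not something either answer prescribes.

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables the import below)
from sklearn.impute import IterativeImputer

# Hypothetical toy frame with a non-default index (assumption: all-numeric columns)
rng = np.random.default_rng(1)
df_post_copy = pd.DataFrame(
    rng.normal(size=(40, 3)),
    columns=["x", "y", "z"],
    index=range(100, 140),
)
df_post_copy.iloc[::5, 0] = np.nan
df_post_copy.iloc[2::9, 2] = np.nan
missing_mask = df_post_copy.isna()

imputer = IterativeImputer(max_iter=10, random_state=0)

# Wrap the ndarray with the original labels so both frames align
imputed_df = pd.DataFrame(
    imputer.fit_transform(df_post_copy),
    index=df_post_copy.index,
    columns=df_post_copy.columns,
)

# Replace only the cells where missing_mask is True
filled = df_post_copy.mask(missing_mask, imputed_df)
```

Here `filled` keeps the original index and column names, and only the previously missing cells take their values from `imputed_df`.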
