I have some code that helps me predict some missing values. This is the code:

    from datawig import SimpleImputer
    from datawig.utils import random_split
    from sklearn.metrics import f1_score, classification_report

    df_train, df_test = random_split(df, split_ratios=[0.8, 0.2])

    # Initialize a SimpleImputer model
    imputer = SimpleImputer(
        input_columns=['SITUACION_DNI_A'],  # columns containing information about the column we want to impute
        output_column='EXTRANJERO_A',  # the column we'd like to impute values for
        output_path='imputer_model'  # stores model data and metrics
    )

    # Fit an imputer model on the train data
    imputer.fit(train_df=df_train, num_epochs=10)

    # Impute missing values and return original dataframe with predictions
    predictions = imputer.predict(df_test)

After that I get a new dataframe with fewer rows than the original. How can I insert the predicted values into my original dataframe? Or is there a way to run the code on my whole dataframe rather than only the test split?

1 Answer

If both dataframes share a unique column, or anything else that can act as an ID, then this method will work:

    df_test = df_test.set_index('unique_col')
    # fillna returns a new dataframe, so assign the result back
    df_test = df_test.fillna(predictions.set_index('unique_col'))
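The same idea can be applied to the original dataframe instead of the test split. The following is only a minimal sketch on toy data: the key column `id` is hypothetical, and the assumption that the predicted values live in a separate column called `EXTRANJERO_A_imputed` should be checked against what your imputer actually returns.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the original dataframe; 'id' is a hypothetical unique key.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "SITUACION_DNI_A": ["a", "b", "a", "b"],
    "EXTRANJERO_A": ["no", np.nan, "no", np.nan],
})

# Toy stand-in for the imputer output. Assumption: predictions arrive in a
# separate column named 'EXTRANJERO_A_imputed'; adjust to your actual output.
predictions = pd.DataFrame({
    "id": [2, 4],
    "EXTRANJERO_A_imputed": ["yes", "no"],
})

# Align both frames on the key, then fill only the cells that are missing.
filled = df.set_index("id")
imputed = predictions.set_index("id")["EXTRANJERO_A_imputed"]
filled["EXTRANJERO_A"] = filled["EXTRANJERO_A"].fillna(imputed)
filled = filled.reset_index()
```

Because `fillna` aligns on the index, rows that already had a value are left untouched and the result keeps every row of the original dataframe.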

If the above method does not work, drop the rows with missing values and append the imputer predictions to the dataframe. See the following links for help:

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.append.html

Delete rows if there are null values in a specific column in Pandas dataframe
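A rough sketch of the drop-and-append route, on toy data. Note that `DataFrame.append` was removed in pandas 2.0, so this uses `pd.concat` in its place. The `imputed_rows` frame is hypothetical: it assumes you have already built rows with the same columns as `df`, with the missing values filled in.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the original dataframe.
df = pd.DataFrame({
    "SITUACION_DNI_A": ["a", "b", "a", "b"],
    "EXTRANJERO_A": ["no", np.nan, "no", np.nan],
})

# Hypothetical imputed rows: same columns as df, with the original index
# labels preserved so the rows can be put back in their original positions.
imputed_rows = pd.DataFrame(
    {"SITUACION_DNI_A": ["b", "b"], "EXTRANJERO_A": ["yes", "no"]},
    index=[1, 3],
)

# Keep only the rows where the target column is present...
complete = df.dropna(subset=["EXTRANJERO_A"])

# ...then add the imputed rows back and restore the original row order.
combined = pd.concat([complete, imputed_rows]).sort_index()
```

Preserving the original index on the imputed rows is what lets `sort_index()` restore the row order; without it, the imputed rows would simply end up at the bottom.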
