
Replace the missing values in the variables "s_months" and "incidents" with the respective means of the other ships that share the same type AND the same operation period. Here "s_months" and "incidents" are two columns containing NaN values that we want to fill.

DataFrame named ship

I have computed the means for the required conditions, but I am unable to fill the NaN values in the ship DataFrame. Here are the calculated means, stored as a DataFrame.

DataFrame named shipgroup, holding the means calculated for each combination of "types" and "o_periods"

  • Have a look at this, I hope it helps: Handling missing values Commented Sep 1, 2021 at 7:39
  • Please show a minimal reproducible example so we can see how far you got. Commented Sep 1, 2021 at 7:54
  • @Joooeey I have calculated the means and stored them in another DataFrame. Now I just have to find every row in the first DataFrame that has NaN values, check its "types" and "o_periods", take the means for that "types" and "o_periods" combination from the second DataFrame, and fill them into the first one. Commented Sep 1, 2021 at 8:06
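The lookup described in the last comment could be sketched roughly as follows. This is only an illustration under assumptions: the data is made up, `ship` and `shipgroup` are stand-ins for the frames in the question (whose actual contents are not shown), and `lookup` and `ship_filled` are hypothetical names.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the `ship` frame from the question.
ship = pd.DataFrame({
    "types":     [1, 1, 1, 2, 2],
    "o_periods": [2, 2, 2, 1, 1],
    "s_months":  [63.0, 1095.0, np.nan, 200.0, np.nan],
    "incidents": [0.0, 4.0, np.nan, np.nan, 6.0],
})

keys = ["types", "o_periods"]
vals = ["s_months", "incidents"]

# Stand-in for `shipgroup`: one row of means per (types, o_periods) pair.
# mean() skips NaN by default.
shipgroup = ship.groupby(keys, as_index=False)[vals].mean()

# Left-merge so every row of `ship` gets its group's means, in ship's row order.
lookup = ship[keys].merge(shipgroup, on=keys, how="left")
lookup.index = ship.index  # merge returns a fresh index; realign it with ship

# Fill only the missing cells; existing values are left untouched.
ship_filled = ship.copy()
ship_filled[vals] = ship[vals].fillna(lookup[vals])
```

The left merge keeps one output row per row of `ship`, which is what lets the looked-up means line up with the cells to be filled.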

1 Answer

Use groupby and combine_first to fill NaN:

Minimal Reproducible Example:

    >>> df
       types  o_periods  s_months  incidents
    0      1          2      63.0        0.0
    1      1          2    1095.0        4.0
    2      1          2    3353.0       18.0
    3      1          2       NaN        NaN

    keys = ['types', 'o_periods']
    vals = ['s_months', 'incidents']
    df[vals] = df[vals].combine_first(df.groupby(keys)[vals].transform('mean'))

Output result:

    >>> df
       types  o_periods     s_months  incidents
    0      1          2    63.000000   0.000000
    1      1          2  1095.000000   4.000000
    2      1          2  3353.000000  18.000000
    3      1          2  1503.666667   7.333333
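To check the behaviour with more than one group, here is a small self-contained sketch. The data is made up for illustration, and `group_means` and `filled` are hypothetical names. It uses `fillna` with the transformed frame, which should give the same result as `combine_first` in this case since both frames share the same index and columns.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two (types, o_periods) groups, each with missing cells.
df = pd.DataFrame({
    "types":     [1, 1, 1, 2, 2, 2],
    "o_periods": [2, 2, 2, 1, 1, 1],
    "s_months":  [10.0, 30.0, np.nan, 100.0, np.nan, 300.0],
    "incidents": [1.0, 3.0, np.nan, 4.0, 8.0, np.nan],
})

keys = ["types", "o_periods"]
vals = ["s_months", "incidents"]

# transform("mean") returns a frame aligned row-for-row with df,
# holding each row's group mean (NaN values are skipped in the mean).
group_means = df.groupby(keys)[vals].transform("mean")

# Fill only the missing cells with the mean of the matching group.
filled = df.copy()
filled[vals] = df[vals].fillna(group_means)
```

Because `transform` keeps the original index, no merge or manual lookup is needed. Note that a group whose values are all NaN would still be left as NaN.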