Given the DataFrame df:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame(data=[[np.nan,1], [np.nan,np.nan], [1,2], [2,3], [np.nan,np.nan],
                            [np.nan,np.nan], [3,4], [4,5], [np.nan,np.nan], [np.nan,np.nan]],
                      columns=['A','B'])
    df
    Out[16]:
         A    B
    0  NaN  1.0
    1  NaN  NaN
    2  1.0  2.0
    3  2.0  3.0
    4  NaN  NaN
    5  NaN  NaN
    6  3.0  4.0
    7  4.0  5.0
    8  NaN  NaN
    9  NaN  NaN

I need to replace the NaN values using the following rules:

1) if the NaN is at the beginning, replace it with the first valid value after it

2) if the NaN is in the middle, between two values, replace it with the average of the nearest valid value before it and the nearest valid value after it

3) if the NaN is at the end, replace it with the last valid value

    df
    Out[16]:
         A    B
    0  1.0  1.0
    1  1.0  1.5
    2  1.0  2.0
    3  2.0  3.0
    4  2.5  3.5
    5  2.5  3.5
    6  3.0  4.0
    7  4.0  5.0
    8  4.0  5.0
    9  4.0  5.0

1 Answer

Use add to sum the back-filled and forward-filled values, divide by 2, and finally fill the remaining trailing and leading NaNs with ffill and bfill:

    df = df.bfill().add(df.ffill()).div(2).ffill().bfill()
    print (df)
         A    B
    0  1.0  1.0
    1  1.0  1.5
    2  1.0  2.0
    3  2.0  3.0
    4  2.5  3.5
    5  2.5  3.5
    6  3.0  4.0
    7  4.0  5.0
    8  4.0  5.0
    9  4.0  5.0

Detail:

    print (df.bfill().add(df.ffill()))
         A     B
    0  NaN   2.0
    1  NaN   3.0
    2  2.0   4.0
    3  4.0   6.0
    4  5.0   7.0
    5  5.0   7.0
    6  6.0   8.0
    7  8.0  10.0
    8  NaN   NaN
    9  NaN   NaN
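
For reference, the same chain can be wrapped in a small function and checked against the expected output from the question. This is only a sketch: the function name fill_nan_with_neighbor_mean is hypothetical, and it assumes every column is numeric and contains at least one valid value.

```python
import numpy as np
import pandas as pd


def fill_nan_with_neighbor_mean(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: fill NaNs per column from the nearest valid neighbours."""
    forward = frame.ffill()   # previous valid value; leading NaNs stay NaN
    backward = frame.bfill()  # next valid value; trailing NaNs stay NaN
    # Interior gaps become the mean of both neighbours. Gaps at either edge
    # are still NaN here, because NaN + x is NaN.
    averaged = backward.add(forward).div(2)
    # Trailing NaNs take the last valid value, leading NaNs take the first.
    return averaged.ffill().bfill()


df = pd.DataFrame(
    data=[[np.nan, 1], [np.nan, np.nan], [1, 2], [2, 3], [np.nan, np.nan],
          [np.nan, np.nan], [3, 4], [4, 5], [np.nan, np.nan], [np.nan, np.nan]],
    columns=["A", "B"],
)

result = fill_nan_with_neighbor_mean(df)
print(result)
```

Values that were already valid are left unchanged, because for them the forward-filled and back-filled frames agree, so (x + x) / 2 is x again.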