I have a sparse dataframe with integer values. For example, we create df as:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame(np.nan, index=range(10), columns=['A', 'B', 'C'])
    df.loc[(0,'A')] = 6
    df.loc[(3,'A')] = 8
    df.loc[(4,'B')] = 2

and it looks like this:

         A    B    C
    0    6  NaN  NaN
    1  NaN  NaN  NaN
    2  NaN  NaN  NaN
    3    8  NaN  NaN
    4  NaN    2  NaN
    5  NaN  NaN  NaN
    6  NaN  NaN  NaN
    7  NaN  NaN  NaN
    8  NaN  NaN  NaN
    9  NaN  NaN  NaN

Now I want to recursively fill each NaN value with the previous value minus 1 (if the previous value is not NaN). For example, this code does the trick:

    for j in range(len(df.index)):
        df = df.fillna(value=df.shift(1)-1, limit=1)

and it produces:

         A    B    C
    0    6  NaN  NaN
    1    5  NaN  NaN
    2    4  NaN  NaN
    3    8  NaN  NaN
    4    7    2  NaN
    5    6    1  NaN
    6    5    0  NaN
    7    4   -1  NaN
    8    3   -2  NaN
    9    2   -3  NaN

The problem is that this code, applied to a "real" dataframe, is slow as hell, even if I have a bound on the range of j. Since it looks very close to a simple df.fillna(method='ffill'), which is way faster, I was wondering if there is a way to speed this process up.
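
To make the goal concrete, here is a minimal sketch of one possible loop-free formulation: forward-fill each column, then subtract the number of rows elapsed since the last non-NaN value. The function name ffill_decrement is hypothetical (not part of pandas), and the sketch assumes rows are consecutive steps, so the decrement is always exactly 1 per row. It has not been benchmarked on real data.

```python
import numpy as np
import pandas as pd


def ffill_decrement(df):
    """Hypothetical helper: forward-fill, minus 1 per row since the last non-NaN value."""
    n_rows, n_cols = df.shape
    # Row position of every cell (0, 1, 2, ...), repeated across all columns.
    pos = pd.DataFrame(
        np.repeat(np.arange(n_rows)[:, None], n_cols, axis=1),
        index=df.index,
        columns=df.columns,
    )
    # Position of the most recent non-NaN cell at or above each row
    # (NaN where no value has appeared yet in that column).
    last_pos = pos.where(df.notna()).ffill()
    # Number of rows elapsed since that cell.
    steps = pos - last_pos
    # Original values stay unchanged (steps == 0); filled values drop by 1 per row.
    return df.ffill() - steps


df = pd.DataFrame(np.nan, index=range(10), columns=['A', 'B', 'C'])
df.loc[0, 'A'] = 6
df.loc[3, 'A'] = 8
df.loc[4, 'B'] = 2

result = ffill_decrement(df)
```

On the example dataframe, result matches the desired output table shown earlier: cells before the first value in a column stay NaN, and column C stays entirely NaN.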
Thanks in advance for any answer, insight or comment.