Here's a piece of code. I don't understand why, in the last column (rm-5), I get NaN for the first 4 items.
I understand that in the rm column the first 4 items aren't filled because there isn't enough data to fill the window, but if I shift the column, the calculation should be possible, shouldn't it?
Similarly, I don't understand why 5 items, and not 4, are NaN at the end of the rm-5 column.

    import pandas as pd
    import numpy as np

    index = pd.date_range('2000-1-1', periods=100, freq='D')
    df = pd.DataFrame(data=np.random.randn(100), index=index, columns=['A'])

    df['rm']=pd.rolling_mean(df['A'],5)
    df['rm-5']=pd.rolling_mean(df['A'].shift(-5),5)

    print df.head(n=8)
    print df.tail(n=8)

                       A        rm      rm-5
    2000-01-01  0.109161       NaN       NaN
    2000-01-02 -0.360286       NaN       NaN
    2000-01-03 -0.092439       NaN       NaN
    2000-01-04  0.169439       NaN       NaN
    2000-01-05  0.185829  0.002341  0.091736
    2000-01-06  0.432599  0.067028  0.295949
    2000-01-07 -0.374317  0.064222  0.055903
    2000-01-08  1.258054  0.334321 -0.132972

                       A        rm      rm-5
    2000-04-02  0.499860 -0.422931 -0.140111
    2000-04-03 -0.868718 -0.458962 -0.182373
    2000-04-04  0.081059 -0.443494 -0.040646
    2000-04-05  0.500275 -0.093048       NaN
    2000-04-06 -0.253915 -0.008288       NaN
    2000-04-07 -0.159256 -0.140111       NaN
    2000-04-08 -1.080027 -0.182373       NaN
    2000-04-09  0.789690 -0.040646       NaN
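
To make the two NaN counts easier to check, here is a minimal sketch on a small deterministic series. It assumes a newer pandas where `Series.rolling(5).mean()` replaces `pd.rolling_mean`; the names `shift_before` and `shift_after` are only illustrative and are not in the code above. It compares shifting the input before the rolling mean (what rm-5 does) with shifting the result of the rolling mean.

```python
import numpy as np
import pandas as pd

# Small deterministic series so the NaN positions are easy to inspect.
index = pd.date_range("2000-01-01", periods=12, freq="D")
s = pd.Series(np.arange(12, dtype=float), index=index)

window, lag = 5, 5

# Same order of operations as the rm-5 column: shift first, then roll.
# The shifted series still starts at the first row, so the window is only
# full from the 5th row onward, and the shift leaves `lag` NaN at the end.
shift_before = s.shift(-lag).rolling(window).mean()

# Alternative order: roll first, then shift the result.
# The leading NaN of the rolling mean are shifted off the top.
shift_after = s.rolling(window).mean().shift(-lag)

head_nans = int(shift_before.head(window).isna().sum())  # window - 1 leading NaN
tail_nans = int(shift_before.tail(lag).isna().sum())     # lag trailing NaN

print(pd.DataFrame({"shift_before": shift_before, "shift_after": shift_after}))
print(head_nans, tail_nans)
```

In this sketch the rolling window always needs 5 non-NaN values regardless of any shift, which accounts for the 4 leading NaN, while the 5 trailing NaN come from `shift(-5)` itself. Where both columns have values, they agree.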