
I have the following dataframe:

    df0:
                         A           B          C
    Date
    2017-04-13  884.669983  139.389999  46.900002
    2017-04-17  901.989990  141.419998  47.389999
    2017-04-18  903.780029  140.960007  47.560001
    2017-04-19  899.200012  142.270004  47.000000
    2017-04-20  902.059998  143.800003  47.669998
    2017-04-21  898.530029  143.679993  47.520000

I simply want to create a new dataframe main_df that subtracts row i+1 from row i, takes the absolute value of the resulting row, and stores it as a row in the new dataframe.

Here is what I have tried:

    main_df=pd.DataFrame()
    for i in range(len(df0)):
        main_df.iloc[i]=np.absolute(df0.iloc[i+1]-df0.iloc[i])
    print(main_df)

This raises the error "single positional indexer is out-of-bounds".

That is quite confusing, given that iterating with iloc has worked correctly on other occasions.

Your help would be highly appreciated.

2 Answers


pandas
Use diff

    main_df = df0.diff(-1).abs()

                        A         B         C
    Date
    2017-04-13  17.320007  2.029999  0.489997
    2017-04-17   1.790039  0.459991  0.170002
    2017-04-18   4.580017  1.309997  0.560001
    2017-04-19   2.859986  1.529999  0.669998
    2017-04-20   3.529969  0.120010  0.149998
    2017-04-21        NaN       NaN       NaN

numpy

    main_df = pd.DataFrame(
        np.abs(np.diff(df0.values, axis=0)),
        df0.index[:-1], df0.columns
    )

                        A         B         C
    Date
    2017-04-13  17.320007  2.029999  0.489997
    2017-04-17   1.790039  0.459991  0.170002
    2017-04-18   4.580017  1.309997  0.560001
    2017-04-19   2.859986  1.529999  0.669998
    2017-04-20   3.529969  0.120010  0.149998

OP's iteration
Notice I did three things to fix your code:

  1. I added columns to your initial dataframe
  2. I range from 0 to len(df0) - 1
  3. I looked up the index value for position i so I could use loc to assign new rows

    main_df = pd.DataFrame(columns=df0.columns)
    for i in range(len(df0) - 1):
        idx = df0.index[i]
        main_df.loc[idx] = np.absolute(df0.iloc[i+1]-df0.iloc[i])

                        A         B         C
    Date
    2017-04-13  17.320007  2.029999  0.489997
    2017-04-17   1.790039  0.459991  0.170002
    2017-04-18   4.580017  1.309997  0.560001
    2017-04-19   2.859986  1.529999  0.669998
    2017-04-20   3.529969  0.120010  0.149998


The problem is in the last iteration of the loop: you try to select a row that is not in the dataframe (iloc[i+1]), so you get the error.
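As a small illustration of that point, here is a minimal sketch using a made-up three-row frame (not the data from the question). It assumes nothing beyond standard positional indexing: looking up the position one past the final row with iloc raises an IndexError, which is what i + 1 does on the final pass of range(len(df)).

```python
import pandas as pd

# Hypothetical three-row frame, used only for illustration
df = pd.DataFrame({"A": [1.0, 3.0, 6.0]})

last = len(df) - 1      # position of the final row
row = df.iloc[last]     # valid: this is the last row

try:
    df.iloc[last + 1]   # one position past the end
    raised = False
except IndexError as exc:
    raised = True
    message = str(exc)  # the text of the out-of-bounds error

print(raised)
```

Stopping the loop at len(df) - 1 keeps i + 1 inside the frame.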

Solution:

sub + shift + abs:

    df = df.sub(df.shift(-1)).abs()
    print(df)

                        A         B         C
    Date
    2017-04-13  17.320007  2.029999  0.489997
    2017-04-17   1.790039  0.459991  0.170002
    2017-04-18   4.580017  1.309997  0.560001
    2017-04-19   2.859986  1.529999  0.669998
    2017-04-20   3.529969  0.120010  0.149998
    2017-04-21        NaN       NaN       NaN

If you also need to remove the last NaN row, use iloc to select all rows except the last:

    df = df.sub(df.shift(-1)).abs().iloc[:-1]
    print(df)

                        A         B         C
    Date
    2017-04-13  17.320007  2.029999  0.489997
    2017-04-17   1.790039  0.459991  0.170002
    2017-04-18   4.580017  1.309997  0.560001
    2017-04-19   2.859986  1.529999  0.669998
    2017-04-20   3.529969  0.120010  0.149998
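To check that sub + shift gives the same numbers as diff(-1) or np.diff, here is a minimal self-contained sketch that rebuilds the sample data from the question. The names via_shift, via_diff and via_numpy are only illustrative, and the comparison uses np.allclose as one reasonable way to compare floats.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df0 = pd.DataFrame(
    {
        "A": [884.669983, 901.989990, 903.780029, 899.200012, 902.059998, 898.530029],
        "B": [139.389999, 141.419998, 140.960007, 142.270004, 143.800003, 143.679993],
        "C": [46.900002, 47.389999, 47.560001, 47.000000, 47.669998, 47.520000],
    },
    index=pd.to_datetime(
        ["2017-04-13", "2017-04-17", "2017-04-18",
         "2017-04-19", "2017-04-20", "2017-04-21"]
    ).rename("Date"),
)

# Row i minus row i+1, absolute value, last (all-NaN) row dropped
via_shift = df0.sub(df0.shift(-1)).abs().iloc[:-1]
via_diff = df0.diff(-1).abs().iloc[:-1]

# np.diff computes row i+1 minus row i; the sign disappears after abs
via_numpy = pd.DataFrame(
    np.abs(np.diff(df0.values, axis=0)),
    index=df0.index[:-1],
    columns=df0.columns,
)

same = (
    np.allclose(via_shift.to_numpy(), via_diff.to_numpy())
    and np.allclose(via_shift.to_numpy(), via_numpy.to_numpy())
)
print(same)
```

All three avoid the explicit loop, so there is no i + 1 lookup that can run past the end of the frame.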
