I discovered some strange behavior in Python pandas and wanted to ask whether it is my mistake or an actual bug. Take the following DataFrame:
    import pandas as pd

    data = pd.DataFrame({'k2': [1, 2, 3], 'name': ['joe', 'mark', 'carl']})
    data.set_index('name', drop=False, inplace=True)  # 'name' stays a column because of drop=False

If I create a function that returns a Series object, like this:
    def my_test(i, x):
        # Pairwise k2 difference of the outer row i against every row of x
        x['interrel'] = x.apply(
            lambda row: i['k2'] - row['k2'] if i['name'] != row['name'] else 0,
            axis=1)
        print(x['interrel'])  # these per-row values are correct
        return x['interrel']

and apply that function to the DataFrame created above using
    data.apply(lambda row: my_test(row, data), axis=1)

all I get in the output is the last calculated row, repeated three times. However, the print statement in my_test shows that the calculations themselves are correct. It seems that the individual Series objects are just not assembled into the result correctly.
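For easier reproduction, here is the whole example as one self-contained script. The values in the comments are the differences I compute by hand, and the repeated-row result is what I actually observe on my pandas version (the `result` variable name is only for illustration):

    import pandas as pd

    data = pd.DataFrame({'k2': [1, 2, 3], 'name': ['joe', 'mark', 'carl']})
    data.set_index('name', drop=False, inplace=True)

    def my_test(i, x):
        # Pairwise k2 difference of the outer row i against every row of x
        x['interrel'] = x.apply(
            lambda row: i['k2'] - row['k2'] if i['name'] != row['name'] else 0,
            axis=1)
        print(x['interrel'])
        return x['interrel']

    result = data.apply(lambda row: my_test(row, data), axis=1)
    print(result)

    # The prints inside my_test show the correct differences per outer row:
    #   joe:  [ 0, -1, -2]
    #   mark: [ 1,  0, -1]
    #   carl: [ 2,  1,  0]
    # But every row of `result` comes out as [2, 1, 0], i.e. the values from
    # the last call, instead of the three distinct rows printed above.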
Can you reproduce this problem? Did I get anything wrong about how the apply function should be used?
Please note that this is only an example; I am not asking for another way to compute pairwise differences in pandas.
Any help is appreciated.