I am trying to process and update rows in a dataframe through a function, then return the dataframe so I can keep working with it. When I return the dataframe to the original function call, I get a series instead of the expected column updates. A simple example is below:
    import pandas as pd

    df = pd.DataFrame(['adam', 'ed', 'dra', 'dave', 'sed', 'mike'],
                      index=['a', 'b', 'c', 'd', 'e', 'f'],
                      columns=['A'])

    def get_item(data):
        comb = pd.DataFrame()
        comb['Newfield'] = data
        # create new columns
        comb['AnotherNewfield'] = 'y'
        return pd.DataFrame(comb)

Calling the function using apply:
    >>> newdf = df['A'].apply(get_item)
    >>> newdf
    a    A Newfield AnotherNewfield a adam st...
    b    A Newfield AnotherNewfield e sed st...
    c    A Newfield AnotherNewfield d dave st...
    d    A Newfield AnotherNewfield d dave st...
    e    A Newfield AnotherNewfield s NaN st...
    f    A Newfield AnotherNewfield m NaN str(...
    Name: A, dtype: object
    >>> type(newdf)
    <class 'pandas.core.series.Series'>

I assume that apply() is the wrong tool here, but I am not sure how I should be updating this dataframe via a function otherwise.
Edit: I apologize, it seems I accidentally deleted the sample function in an edit. I have added it back here while I try a few other things I found in other posts.
Testing in a slightly different manner, with individual variables and returning multiple Series, seems to work, so I will see whether I can do this in my actual case and update.
    def get_item(data):
        value = data
        # create new columns
        AnotherNewfield = 'y'
        return pd.Series(value), pd.Series(AnotherNewfield)

    df['B'], df['C'] = zip(*df['A'].apply(get_item))
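For comparison, the same unpacking pattern should also work when the function returns plain scalars in a tuple, which avoids wrapping each value in its own Series. This is only a minimal sketch built on the sample data from the question; the name `get_item_scalars` is hypothetical and not part of the original code.

```python
import pandas as pd

df = pd.DataFrame(
    ['adam', 'ed', 'dra', 'dave', 'sed', 'mike'],
    index=['a', 'b', 'c', 'd', 'e', 'f'],
    columns=['A'],
)

def get_item_scalars(value):
    # Hypothetical variant of get_item: return a plain tuple of scalars,
    # one entry per new column.
    return value, 'y'

# apply() yields one tuple per row; zip(*...) transposes those row tuples
# into one sequence per new column.
df['B'], df['C'] = zip(*df['A'].apply(get_item_scalars))
```

Each new column is assigned positionally here, so this relies on the rows keeping their original order.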
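To update the dataframe itself, assign the result back to column 'A' with df['A'] = df['A'].apply(get_item), rather than assigning the modified series to a new variable.
Your function returns a dataframe for each element of A, so you have a series of dataframes.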
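One possible way to avoid ending up with a series of dataframes is to have the function return a `pd.Series` for each element. `Series.apply` then expands those into a DataFrame whose columns come from the returned Series' index. The sketch below assumes the sample data from the question; `new_cols` and `result` are illustrative names, and joining back with `df.join` is one choice among several.

```python
import pandas as pd

df = pd.DataFrame(
    ['adam', 'ed', 'dra', 'dave', 'sed', 'mike'],
    index=['a', 'b', 'c', 'd', 'e', 'f'],
    columns=['A'],
)

def get_item(value):
    # Return one Series per element; its index labels become column names.
    return pd.Series({'Newfield': value, 'AnotherNewfield': 'y'})

# Because get_item returns a Series, apply() produces a DataFrame
# indexed like df['A'] instead of a Series of objects.
new_cols = df['A'].apply(get_item)

# Attach the new columns to the original frame, aligned on the index.
result = df.join(new_cols)
```

Building a Series per row can be slow on large frames, so for simple cases assigning the columns directly (for example `df['AnotherNewfield'] = 'y'`) may be preferable.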