Enhancement: 'Joiner'-Function for pd.merge/DataFrame.join #8962

@cstotzer

Would it be possible to add a parameter to pd.merge/DataFrame.join that accepts a function which does the actual merging of each pair of joined rows?

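    def my_joiner(left, right):
        # NaN never compares equal to anything, so test with pd.isnull
        left.a = right.a if pd.isnull(left.a) else left.a
        left.b = right.b if left.cnt < 3 else left.b
        ...
        return left

    # proposed (not existing) `joiner` parameter
    pd.merge(df1, df2, joiner=my_joiner)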

As I often use merge/join to impute missing or statistically unreliable data, I find myself writing code like the following over and over again:

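    df3 = pd.merge(df1, df2)
    # prefer the left value, fall back to the right one where it is missing
    df3['a'] = df3.a_x.fillna(df3.a_y)
    # row-wise choice, so apply needs axis=1
    df3['b'] = df3.apply(lambda x: x.b_x if x.cnt > 3 else x.b_y, axis=1)
    ...
    df3 = df3.drop(['a_x', 'a_y', 'b_x', 'b_y'], axis=1)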

This is rather cumbersome and cluttered.
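Until something like this exists, the repeated pattern above can be wrapped in a small helper. The following is only a sketch under assumptions: `merge_with_resolvers`, its `resolvers` argument, the `key` column, and the sample data are all hypothetical names made up for illustration, not part of pandas. Each resolver is vectorized: it receives the merged frame plus the names of the two suffixed columns and returns the combined column.

```python
import numpy as np
import pandas as pd


def merge_with_resolvers(left, right, on, resolvers, how="inner",
                         suffixes=("_x", "_y")):
    """Hypothetical helper: merge, then collapse each overlapping column.

    resolvers maps a column name to a function
    (merged_frame, left_col_name, right_col_name) -> Series.
    """
    merged = pd.merge(left, right, on=on, how=how, suffixes=suffixes)
    for col, resolve in resolvers.items():
        left_col, right_col = col + suffixes[0], col + suffixes[1]
        # build the combined column, then drop the two suffixed originals
        merged[col] = resolve(merged, left_col, right_col)
        merged = merged.drop([left_col, right_col], axis=1)
    return merged


# Illustrative data only; 'key' is an assumed join column.
df1 = pd.DataFrame({"key": [1, 2, 3],
                    "a": [1.0, np.nan, 3.0],
                    "b": [10.0, 20.0, 30.0],
                    "cnt": [5, 2, 4]})
df2 = pd.DataFrame({"key": [1, 2, 3],
                    "a": [9.0, 8.0, 7.0],
                    "b": [100.0, 200.0, 300.0]})

result = merge_with_resolvers(
    df1, df2, on="key",
    resolvers={
        # keep the left value unless it is missing
        "a": lambda m, l, r: m[l].fillna(m[r]),
        # keep the left value only when it rests on more than 3 observations
        "b": lambda m, l, r: m[l].where(m["cnt"] > 3, m[r]),
    },
)
print(result)
```

The resolvers work on whole columns rather than on one row pair at a time, so this is not the same interface as the row-wise `joiner` proposed above; it just removes the suffix bookkeeping while staying vectorized.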

Labels: Enhancement, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)