Open
Labels
Enhancement, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
Description
Would it be possible to add a parameter to pd.merge/DataFrame.join that accepts a function which does the actual merging of each pair of joined rows?

    def my_joiner(left, right):
        # NaN never compares equal to anything, so test with pd.isna
        left.a = right.a if pd.isna(left.a) else left.a
        left.b = right.b if left.cnt < 3 else left.b
        ...
        return left

    # joiner= is the proposed parameter; it does not exist in pandas today
    pd.merge(df1, df2, joiner=my_joiner)

As I often use merge/join to impute missing or statistically unreliable data, I find myself writing code like the following over and over again:
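
    df3 = pd.merge(df1, df2)
    df3['a'] = df3.a_x.fillna(df3.a_y)
    # a row-wise apply needs axis=1
    df3['b'] = df3.apply(lambda x: x.b_x if x.cnt > 3 else x.b_y, axis=1)
    ...
    # column names must be strings, passed via columns=
    df3 = df3.drop(columns=['a_x', 'a_y', 'b_x', 'b_y'])

This is rather cumbersome and cluttered.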
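Until something like this exists, the pattern above can be wrapped once and reused. The following is only a rough sketch of how I imagine it, not a proposed implementation: `merge_with_joiner`, the `key` column, and the sample data are all made up for illustration. Unlike the row-wise `joiner` proposed above, this one receives the whole merged frame so it can stay vectorized.

```python
import numpy as np
import pandas as pd


def merge_with_joiner(left, right, on, joiner, suffixes=("_x", "_y")):
    """Hypothetical helper (not part of pandas): merge, then let `joiner`
    resolve the overlapping columns and drop the suffixed originals."""
    merged = pd.merge(left, right, on=on, suffixes=suffixes)
    # joiner returns a dict mapping a column name to its resolved Series
    resolved = joiner(merged)
    to_drop = []
    for col, values in resolved.items():
        merged[col] = values
        to_drop += [col + s for s in suffixes]
    return merged.drop(columns=to_drop)


def my_joiner(m):
    return {
        # take the right-hand value only where the left one is missing
        "a": m["a_x"].fillna(m["a_y"]),
        # keep the left value when cnt >= 3, otherwise use the right one
        "b": m["b_x"].where(m["cnt"] >= 3, m["b_y"]),
    }


# Made-up sample data; "key" is an assumed join column
df1 = pd.DataFrame({"key": [1, 2, 3],
                    "a": [1.0, np.nan, 3.0],
                    "b": [10.0, 20.0, 30.0],
                    "cnt": [5, 1, 4]})
df2 = pd.DataFrame({"key": [1, 2, 3],
                    "a": [9.0, 2.0, 9.0],
                    "b": [11.0, 21.0, 31.0]})

df3 = merge_with_joiner(df1, df2, on="key", joiner=my_joiner)
```

Here `df3` ends up with the columns `key`, `cnt`, `a` and `b`, with the suffixed columns already removed. A built-in `joiner` parameter would save having to carry a helper like this around in every project.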