series __finalize__ not correctly called in merge? #6923

@wcbeard

I got some help from Jeff on Stack Overflow, but either I'm misunderstanding the way __finalize__ works, or there's a bug in how it's called. My intent was to preserve series metadata after two DataFrames are merged, and I believe __finalize__ should be able to handle this.

I define a couple of DataFrames and then assign metadata values to all of their series:

    import numpy as np
    import pandas as pd

    np.random.seed(10)
    df1 = pd.DataFrame(np.random.randint(0, 4, (3, 2)), columns=['a', 'b'])
    df2 = pd.DataFrame(np.random.randint(0, 4, (3, 2)), columns=['c', 'd'])

    df1
       a  b
    0  1  1
    1  0  3
    2  0  1

    df2
       c  d
    0  3  0
    1  1  1
    2  0  1

Then I assign a metadata field, filename, to each series:

    pd.Series._metadata = ['name', 'filename']

    for c1 in df1:
        df1[c1].filename = 'fname1.csv'
    for c2 in df2:
        df2[c2].filename = 'fname2.csv'

Next I define __finalize__ for Series, which as I understand it can propagate metadata from one series to another, for example during a merge. But when I define a __finalize__ that prints the metadata I've already assigned, the metadata appears to be gone by the time __finalize__ is called.
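For comparison, here is a minimal sketch of the subclassing pattern from the pandas extension docs, in which the default __finalize__ copies every attribute listed in _metadata onto the result of a Series-level operation. The class name MetaSeries and the sample values are hypothetical, chosen only for illustration. The sketch covers a single Series only and says nothing about what merge does with DataFrame columns.

```python
import pandas as pd


class MetaSeries(pd.Series):
    """Hypothetical Series subclass that carries a 'filename' attribute."""

    # Extend (rather than replace) the built-in metadata list so the
    # series name keeps propagating alongside the new field.
    _metadata = pd.Series._metadata + ['filename']

    @property
    def _constructor(self):
        # Make operations on a MetaSeries return a MetaSeries.
        return MetaSeries


s = MetaSeries([1, 2, 3], name='a')
s.filename = 'fname1.csv'

# head() builds a new object and then calls __finalize__(self) on it,
# which copies the attributes named in _metadata from the original.
head = s.head(2)
print(type(head).__name__, head.name, head.filename)
```

If the print shows the file name on the sliced result, the propagation mechanism itself works for Series-to-Series operations, which narrows the question to what is handed to __finalize__ during a merge.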

    def finalize_ser(self, other, method=None, **kwargs):
        # Show what metadata each side carries when __finalize__ runs.
        print('Self meta: {}'.format(getattr(self, 'filename', None)))
        print('Other meta: {}'.format(getattr(other, 'filename', None)))

        # Copy every registered metadata field from other onto self.
        for name in self._metadata:
            object.__setattr__(self, name, getattr(other, name, ''))
        return self

    pd.Series.__finalize__ = finalize_ser

When I call merge, I never see the correct metadata printed:

    df1.merge(df2, left_on=['a'], right_on=['c'], how='inner')

    Self meta: None
    Other meta: None
    Self meta: None
    Other meta: None
    Self meta: None
    Other meta: None
    Self meta: None
    Other meta: None
    Out[5]:
       a  b  c  d
    0  1  1  1  1
    1  0  3  0  1
    2  0  1  0  1

The metadata seems to be lost before the __finalize__ call is reached, even though it's still present on the original series:

    df1.a.filename  # => 'fname1.csv'
    mgd.a.filename  # => AttributeError

Is this expected or is there a bug?
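As a possible stopgap that does not rely on __finalize__ at all, the per-column metadata could be kept in a plain dictionary keyed by column label and looked up after the merge. This is only a sketch, not pandas' own mechanism: the names column_source and merged_sources are hypothetical, and it assumes the column labels are unique across both frames, so merge adds no suffixes.

```python
import numpy as np
import pandas as pd

np.random.seed(10)
df1 = pd.DataFrame(np.random.randint(0, 4, (3, 2)), columns=['a', 'b'])
df2 = pd.DataFrame(np.random.randint(0, 4, (3, 2)), columns=['c', 'd'])

# Hypothetical side table: column label -> source file name.
column_source = {}
column_source.update({col: 'fname1.csv' for col in df1.columns})
column_source.update({col: 'fname2.csv' for col in df2.columns})

mgd = df1.merge(df2, left_on=['a'], right_on=['c'], how='inner')

# Recover the metadata by column label instead of reading it off the Series.
merged_sources = {col: column_source.get(col) for col in mgd.columns}
print(merged_sources)
```

The trade-off is that the mapping lives outside the DataFrame, so it has to be passed around with it and updated by hand whenever columns are renamed.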

Labels: Bug, Reshaping, Testing
