BUG: Duplicate columns get duplicated in DataFrame.align #50845

@rhshadrach

Description

    # With duplicate columns
    df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=list('abb'))
    result = df.drop(columns='a').align(df, join="inner", copy=False)
    print(result[0])
    #    b  b  b  b
    # 0  2  2  3  3
    # 1  5  5  6  6

    # Without duplicate columns
    df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=list('abc'))
    result = df.drop(columns='a').align(df, join="inner", copy=False)
    print(result[0])
    #    b  c
    # 0  2  3
    # 1  5  6

    # With duplicate columns but without the drop(columns=...)
    df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=list('abb'))
    result = df.align(df, join="inner", copy=False)
    print(result[0])
    #    a  b  b
    # 0  1  2  3
    # 1  4  5  6

Duplicate columns get duplicated again during alignment: in the first example, the two `b` columns become four. If self and other have the same columns, a different code path is taken and the bug does not appear.
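
To check whether a given pandas version shows this behavior, a small helper can compare how often each column label occurs before and after alignment. This is only a sketch: `extra_duplicate_labels` is a hypothetical name, not a pandas API, and the `copy` argument is left out because it does not affect the resulting columns.

```python
from collections import Counter

import pandas as pd


def extra_duplicate_labels(before: pd.DataFrame, after: pd.DataFrame) -> dict:
    """Map each label that occurs more often in `after` than in `before`
    to a (count_before, count_after) tuple. Hypothetical helper."""
    before_counts = Counter(before.columns)
    after_counts = Counter(after.columns)
    return {
        label: (before_counts[label], n_after)
        for label, n_after in after_counts.items()
        if n_after > before_counts[label]
    }


# Same setup as the first example in the report.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=list("abb"))
left = df.drop(columns="a")
aligned_left, _ = left.align(df, join="inner")

# An empty dict means no label was multiplied by the alignment.
print(extra_duplicate_labels(left, aligned_left))
```

With the output shown in the report, the helper would flag `b` as going from two occurrences to four; on a version where the behavior has changed, the result may differ.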
