
Calling drop_duplicates method for empty pandas dataframe throws error #20516

@analyticalmonk

Description


Code Sample

>>> pd.DataFrame().drop_duplicates()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/analytical-monk/miniconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 3098, in drop_duplicates
    duplicated = self.duplicated(subset, keep=keep)
  File "/home/analytical-monk/miniconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 3144, in duplicated
    labels, shape = map(list, zip(*map(f, vals)))
ValueError: not enough values to unpack (expected 2, got 0)

Problem description

Currently, calling the drop_duplicates method on an empty DataFrame (simply pd.DataFrame()) raises a ValueError.
Ideally it should return the empty DataFrame, just like it does when at least one column is present.
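
The last frame of the traceback suggests where the error comes from: `labels, shape = map(list, zip(*map(f, vals)))` expects at least one column, because `zip(*...)` over zero items yields nothing to unpack. The sketch below reproduces that pattern in plain Python, without pandas. The names `per_column_codes` and `combine_columns` are hypothetical stand-ins for illustration, not the actual helpers in frame.py.

```python
def per_column_codes(values):
    # Hypothetical stand-in for the per-column helper `f` in the traceback:
    # returns integer codes for the values and the number of distinct values.
    uniques = sorted(set(values))
    codes = [uniques.index(v) for v in values]
    return codes, len(uniques)


def combine_columns(columns):
    # Same unpacking pattern as the failing line in the traceback.
    labels, shape = map(list, zip(*map(per_column_codes, columns)))
    return labels, shape


# With at least one column, zip(*...) yields exactly two tuples to unpack.
print(combine_columns([[1, 1, 2], ["a", "b", "a"]]))
# -> ([[0, 0, 1], [0, 1, 0]], [2, 2])

# With no columns, zip() yields nothing, so the two-name unpacking fails.
try:
    combine_columns([])
except ValueError as exc:
    print("ValueError:", exc)
```

If this reading is right, any fix would need to handle the zero-column case before reaching the unpacking step.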

Expected Output

>>> pd.DataFrame().drop_duplicates()
Empty DataFrame
Columns: []
Index: []
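
Until the behaviour is fixed, one possible workaround is to check for the zero-column case before calling drop_duplicates. This is a minimal sketch, not a proposed patch: `drop_duplicates_safe` is a hypothetical helper name, and it only forwards the `subset` and `keep` arguments seen in the traceback.

```python
import pandas as pd


def drop_duplicates_safe(df, subset=None, keep="first"):
    # Hypothetical helper: a frame with no columns has nothing to compare,
    # so return a copy instead of calling drop_duplicates.
    if df.shape[1] == 0:
        return df.copy()
    return df.drop_duplicates(subset=subset, keep=keep)


# The empty frame from the report comes back unchanged.
empty_result = drop_duplicates_safe(pd.DataFrame())

# A frame with columns is deduplicated as usual.
deduped = drop_duplicates_safe(pd.DataFrame({"a": [1, 1, 2]}))
```

The helper does not support `inplace=True`; it always returns a new DataFrame.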

Output of pd.show_versions()

>>> pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.8.0-58-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_IN
LOCALE: en_IN.ISO8859-1

pandas: 0.20.3
pytest: None
pip: 9.0.1
setuptools: 36.6.0
Cython: None
numpy: 1.13.1
scipy: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.6.0
html5lib: 1.0b10
sqlalchemy: 1.1.14
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None


Labels: Bug, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
