Skip to content

Equality between DataFrames misbehaves if columns contain NaN #18455

@toobaz

Description

@toobaz

Code Sample, a copy-pastable example if possible

In [2]: s = pd.DataFrame(-1, index=[1, np.nan, 2,], ...: columns=[3, np.nan, 1]) ...: In [3]: s + s # good Out[3]: 3.0 NaN 1.0 1.0 -2 -2 -2 NaN -2 -2 -2 2.0 -2 -2 -2 In [4]: s == s # bad Out[4]: 3.0 NaN 1.0 1.0 True NaN True NaN True NaN True 2.0 True NaN True

Problem description

While it is true that np.nan != np.nan, pandas disregards this in indexes (indeed, s.loc[:, np.nan] works), so it should be coherent.

Expected Output

In [4]: s == s Out[4]: 3.0 NaN 1.0 1.0 True True True NaN True True True 2.0 True True True

Output of pd.show_versions()

INSTALLED VERSIONS

commit: b45325e
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-3-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: it_IT.UTF-8
LOCALE: it_IT.UTF-8

pandas: 0.22.0.dev0+201.gb45325e28.dirty
pytest: 3.2.3
pip: 9.0.1
setuptools: 36.7.0
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.5.6
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.2.0dev
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.0.0
openpyxl: None
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.6
lxml: None
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.2.1

Metadata

Metadata

Assignees

No one assigned

    Labels

    Compatpandas objects compatability with Numpy or Python functionsDtype ConversionsUnexpected or buggy dtype conversions

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions