Multiindex slicing with NaNs, unexpected results #25154

@tunnij

Code Sample, a copy-pastable example if possible

import pandas as pd

df = pd.DataFrame(
    pd.np.random.rand(2, 3),
    columns=pd.MultiIndex.from_tuples(
        [('a', 'foo'), ('b', 'bar'), ('b', pd.np.nan)],
        names=['first', 'second'])
)

# EXPECTED slicing everything on first level
df.loc[:, (['a', 'b'])]
Out[35]:
first          a         b
second       foo       bar       NaN
0       0.678021  0.383672  0.074164
1       0.738492  0.992545  0.661247

# EXPECTED just slicing one value from first level
df.loc[:, (['b'])]
Out[29]:
first          b
second       bar       NaN
0       0.383672  0.074164
1       0.992545  0.661247

# EXPECTED slicing out b, bar
df.loc[:, (['b'], ['bar'])]
Out[33]:
first          b
second       bar
0       0.383672
1       0.992545

# UNEXPECTED slicing out b, nan
df.loc[:, (['b'], [pd.np.nan])]
Out[36]:
Empty DataFrame
Columns: []
Index: [0, 1]

# UNEXPECTED slicing out b, [nan, 'bar']
df.loc[:, (['b'], ['bar', pd.np.nan])]
Out[39]:
first          b
second       bar
0       0.383672
1       0.992545

# EXPECTED slicing out b, nan without the index
df.loc[:, ('b', pd.np.nan)]
Out[37]:
0    0.074164
1    0.661247
Name: (b, nan), dtype: float64

Problem description

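When slicing a MultiIndex with a list of labels for a given level, and that list includes NaN, the columns whose label on that level is NaN are not returned. Selecting the same column with a scalar tuple such as ('b', pd.np.nan) works as expected.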

Expected Output

I expect both of the following to work:

df.loc[:, (['b'], ['bar', pd.np.nan])]
Out[40]:
first          b
second       bar       NaN
0       0.383672  0.074164
1       0.992545  0.661247

df.loc[:, (['b'], [pd.np.nan])]
Out[40]:
first          b
second       NaN
0       0.074164
1       0.661247
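
For anyone hitting the same behavior, here is a possible workaround sketch, not a fix for the underlying bug. It builds a boolean mask from the level values instead of passing a list containing NaN to .loc. It assumes a frame shaped like the one in the code sample; the deterministic data and the variable names (first, second, mask, result) are mine, and it imports numpy directly rather than going through pd.np.

```python
import numpy as np
import pandas as pd

# Same column layout as the code sample, with deterministic values.
df = pd.DataFrame(
    np.arange(6.0).reshape(2, 3),
    columns=pd.MultiIndex.from_tuples(
        [("a", "foo"), ("b", "bar"), ("b", np.nan)],
        names=["first", "second"],
    ),
)

# Pull each level out as a flat Index aligned with the columns.
first = df.columns.get_level_values("first")
second = df.columns.get_level_values("second")

# Match NaN explicitly with isna() instead of relying on list-based lookup.
mask = (first == "b") & (second.isin(["bar"]) | second.isna())

# Boolean selection along the column axis keeps both ('b', 'bar') and ('b', NaN).
result = df.loc[:, mask]
print(result)
```

The mask approach avoids label lookup for NaN altogether, at the cost of being more verbose than the tuple-of-lists form shown above.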

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.15.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-327.36.3.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.22.0
pytest: 3.10.0
pip: 18.1
setuptools: 40.5.0
Cython: 0.28.5
numpy: 1.14.2
scipy: 1.0.1
pyarrow: None
xarray: 0.10.9
IPython: 5.8.0
sphinx: 1.8.1
patsy: 0.5.1
dateutil: 2.7.2
pytz: 2018.7
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.7
feather: None
matplotlib: 2.2.3
openpyxl: 2.5.9
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.1.2
lxml: 4.2.1
bs4: 4.6.3
html5lib: 1.0.1
sqlalchemy: 1.2.11
pymysql: None
psycopg2: 2.7.5 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
