.loc sometimes raises KeyError without an error message when called on an unsorted MultiIndex DataFrame

Hello,

I know it is well documented that MultiIndex DataFrames need to be sorted before slicing, and that is fine. Even if you forget this, in most cases (for example, when using .loc with a slicer) pandas gives a helpful error message when you call it on an unsorted DataFrame, which makes it easy to spot the mistake and add the necessary sorting. However, when using .loc without a slicer, a KeyError is raised without that explanatory message, so it looks like a legitimate missing-key error.

Code Sample, a copy-pastable example if possible

Create a test DataFrame

    import numpy as np
    import pandas as pd

    iterables = [['a', 'b'], [2, 1]]
    columns = pd.MultiIndex.from_product(iterables, names=['col1', 'col2'])
    rows = pd.MultiIndex.from_product(iterables, names=['row1', 'row2'])
    df = pd.DataFrame(np.random.randn(4, 4), index=rows, columns=columns)
    print(df)

    col1              a                   b
    col2              2         1         2         1
    row1 row2
    a    2    -1.285010  0.183851 -1.180964  0.885343
         1     0.213501  0.479927  0.142614  0.064209
    b    2     0.250557 -0.612791 -0.275680 -0.134086
         1    -0.853687 -2.397638  0.940984  1.133747

Try to call .loc without a slicer

df.loc['a', 'b']

    ---------------------------------------------------------------------------
    KeyError                                  Traceback (most recent call last)
    <ipython-input-28-b77cac191687> in <module>()
    ----> 1 df.loc['a', 'b']

    /opt/conda/lib/python3.4/site-packages/pandas/core/indexing.py in __getitem__(self, key)
       1223     def __getitem__(self, key):
       1224         if type(key) is tuple:
    -> 1225             return self._getitem_tuple(key)
       1226         else:
       1227             return self._getitem_axis(key, axis=0)

    /opt/conda/lib/python3.4/site-packages/pandas/core/indexing.py in _getitem_tuple(self, tup)
        736     def _getitem_tuple(self, tup):
        737         try:
    --> 738             return self._getitem_lowerdim(tup)
        739         except IndexingError:
        740             pass

    /opt/conda/lib/python3.4/site-packages/pandas/core/indexing.py in _getitem_lowerdim(self, tup)
        849         ax0 = self.obj._get_axis(0)
        850         if isinstance(ax0, MultiIndex):
    --> 851             result = self._handle_lowerdim_multi_index_axis0(tup)
        852             if result is not None:
        853                 return result

    /opt/conda/lib/python3.4/site-packages/pandas/core/indexing.py in _handle_lowerdim_multi_index_axis0(self, tup)
        831             ax0 = self.obj._get_axis(0)
        832             if not ax0.is_lexsorted_for_tuple(tup):
    --> 833                 raise e1
        834
        835         return None

    /opt/conda/lib/python3.4/site-packages/pandas/core/indexing.py in _handle_lowerdim_multi_index_axis0(self, tup)
        820         try:
        821             # fast path for series or for tup devoid of slices
    --> 822             return self._get_label(tup, axis=0)
        823         except TypeError:
        824             # slices are unhashable

    /opt/conda/lib/python3.4/site-packages/pandas/core/indexing.py in _get_label(self, label, axis)
         84             raise IndexingError('no slices here, handle elsewhere')
         85
    ---> 86         return self.obj._xs(label, axis=axis)
         87
         88     def _get_loc(self, key, axis=0):

    /opt/conda/lib/python3.4/site-packages/pandas/core/generic.py in xs(self, key, axis, level, copy, drop_level)
       1482         if isinstance(index, MultiIndex):
       1483             loc, new_index = self.index.get_loc_level(key,
    -> 1484                                                       drop_level=drop_level)
       1485         else:
       1486             loc = self.index.get_loc(key)

    /opt/conda/lib/python3.4/site-packages/pandas/core/index.py in get_loc_level(self, key, level, drop_level)
       5553                     key = tuple(self[indexer].tolist()[0])
       5554
    -> 5555                 return (self._engine.get_loc(_values_from_object(key)),
       5556                         None)
       5557             else:

    pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:3979)()

    pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:3843)()

    pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12265)()

    pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12216)()

    KeyError: ('a', 'b')

Make the same call after sorting the index with sortlevel

    df2 = df.sortlevel(0)
    print(df2.loc['a', 'b'])

    col2         2         1
    row2
    1     0.142614  0.064209
    2    -1.180964  0.885343
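As a defensive variant of the workaround above, the sort can be wrapped in a small guard that only sorts when needed. This is a minimal sketch, not something the report itself proposes: the helper name `ensure_row_index_sorted` is hypothetical, and it uses `sort_index()` rather than `sortlevel(0)` on the assumption that a newer pandas is installed, where `sortlevel` was deprecated in favour of `sort_index`.

```python
import numpy as np
import pandas as pd


def ensure_row_index_sorted(frame: pd.DataFrame) -> pd.DataFrame:
    """Return `frame` with a sorted row index (hypothetical helper name)."""
    # A MultiIndex that is monotonic increasing is already in sorted order.
    if frame.index.is_monotonic_increasing:
        return frame
    # sort_index() sorts the rows by every level of the row MultiIndex.
    return frame.sort_index()


# Same construction as the test DataFrame in the report.
iterables = [["a", "b"], [2, 1]]
rows = pd.MultiIndex.from_product(iterables, names=["row1", "row2"])
columns = pd.MultiIndex.from_product(iterables, names=["col1", "col2"])
df = pd.DataFrame(np.random.randn(4, 4), index=rows, columns=columns)

df_sorted = ensure_row_index_sorted(df)
# Rows with row1 == 'a', columns with col1 == 'b'.
subset = df_sorted.loc["a", "b"]
print(subset)
```

Only the rows are sorted here, matching `sortlevel(0)` in the report; the columns keep their original 2, 1 order.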

Expected Output

The same helpful error message, whether or not the .loc query uses an explicit slicer.

KeyError: 'MultiIndex Slicing requires the index to be fully lexsorted tuple len (2), lexsort depth (1)'
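To unpack the numbers in that message: "tuple len (2)" is the length of the key `('a', 'b')`, and "lexsort depth (1)" says only the first level of the row index is in sorted order. The following plain-Python sketch illustrates that idea on the row labels from the example. It is an illustration under that reading, not pandas' actual implementation, and the function name `lexsort_depth` here is just a local helper.

```python
from typing import Sequence, Tuple


def lexsort_depth(keys: Sequence[Tuple]) -> int:
    """Return how many leading levels of `keys` are in lexicographic order.

    Illustrative stand-in only; pandas computes this on its internal codes.
    """
    if not keys:
        return 0
    n_levels = len(keys[0])
    # Try the deepest prefix first and fall back to shallower ones.
    for depth in range(n_levels, 0, -1):
        prefixes = [key[:depth] for key in keys]
        if all(a <= b for a, b in zip(prefixes, prefixes[1:])):
            return depth
    return 0


# Row labels from the example: the product of ['a', 'b'] and [2, 1].
rows = [("a", 2), ("a", 1), ("b", 2), ("b", 1)]

unsorted_depth = lexsort_depth(rows)        # only the first level is ordered
sorted_depth = lexsort_depth(sorted(rows))  # both levels ordered after sorting
print(unsorted_depth, sorted_depth)  # prints "1 2"
```

A two-element key needs a depth of at least 2, which is why the lookup is only safe after sorting.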

output of `pd.show_versions()`

pd.show_versions()

    INSTALLED VERSIONS
    ------------------
    commit: None
    python: 3.4.4.final.0
    python-bits: 64
    OS: Linux
    OS-release: 4.2.0-27-generic
    machine: x86_64
    processor:
    byteorder: little
    LC_ALL: en_US.UTF-8
    LANG: en_US.UTF-8

    pandas: 0.17.1
    nose: None
    pip: 8.0.2
    setuptools: 20.1.1
    Cython: 0.23.4
    numpy: 1.10.4
    scipy: 0.17.0
    statsmodels: None
    IPython: 4.1.1
    sphinx: None
    patsy: 0.4.0
    dateutil: 2.4.2
    pytz: 2015.7
    blosc: None
    bottleneck: None
    tables: None
    numexpr: None
    matplotlib: 1.5.1
    openpyxl: None
    xlrd: 0.9.4
    xlwt: None
    xlsxwriter: None
    lxml: None
    bs4: None
    html5lib: None
    httplib2: None
    apiclient: None
    sqlalchemy: None
    pymysql: None
    psycopg2: None
    Jinja2: None

.loc sometimes raises KeyError without an error message when called on an unsorted MultiIndex DataFrame #12660
