BUG: Fix __getitem__ KeyError for np.True_/np.False_ column keys by alvinttang · Pull Request #64822 · pandas-dev/pandas

alvinttang · 2026-03-24T07:54:29Z

Summary

The fix in #64639 added PyBool_Check(a) != PyBool_Check(b) to the pyobject_cmp function in khash_python.h to distinguish Python bool from Python int. However, np.True_ and np.False_ are numpy bool scalars — PyBool_Check(np.True_) returns 0 — so they were incorrectly treated as unequal to Python True/False, breaking hash-table lookups for columns created with numpy bool keys.

Root cause: PyBool_Check(True) = 1 but PyBool_Check(np.True_) = 0, so PyBool_Check(a) != PyBool_Check(b) fires and returns 0 (not equal) even though np.True_ == True should hold.

Fix: Narrow the guard by also requiring PyLong_CheckExact(a) || PyLong_CheckExact(b). This ensures the short-circuit only applies when the non-bool side is a Python int. Numpy bool scalars are neither PyBool nor PyLong, so they fall through to PyObject_RichCompareBool, which correctly handles np.True_ == True.
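The difference between the two guards can be modeled in pure Python. This is only a sketch of the comparison logic, not the C code in khash_python.h: `type(obj) is bool` stands in for `PyBool_Check`, `type(obj) is int` stands in for `PyLong_CheckExact`, and `NumpyBoolStandIn` is a hypothetical class that imitates the relevant behavior of a numpy bool scalar (not a `bool` or `int` subclass, but equal to the matching Python bool) so the example runs without numpy.

```python
class NumpyBoolStandIn:
    """Hypothetical stand-in for a numpy bool scalar (assumption, not numpy itself)."""

    def __init__(self, value):
        self._value = bool(value)

    def __eq__(self, other):
        if isinstance(other, NumpyBoolStandIn):
            return self._value == other._value
        if isinstance(other, (bool, int)):
            return self._value == other
        return NotImplemented

    def __hash__(self):
        # Hashes like the wrapped bool, so both land in the same bucket.
        return hash(self._value)


def is_pybool(obj):
    # Models PyBool_Check: true only for Python's own bool.
    return type(obj) is bool


def is_exact_int(obj):
    # Models PyLong_CheckExact: true only for exactly int, not bool.
    return type(obj) is int


def keys_equal_old(a, b):
    # Guard before this PR: any bool/non-bool pair is declared unequal.
    if is_pybool(a) != is_pybool(b):
        return False
    return bool(a == b)  # models PyObject_RichCompareBool(a, b, Py_EQ)


def keys_equal_new(a, b):
    # Guard after this PR: short-circuit only when the other side is an exact int.
    if is_pybool(a) != is_pybool(b) and (is_exact_int(a) or is_exact_int(b)):
        return False
    return bool(a == b)


np_true = NumpyBoolStandIn(True)
print(keys_equal_old(True, np_true))  # old guard rejects the pair
print(keys_equal_new(True, np_true))  # new guard falls through to ==
print(keys_equal_new(True, 1))        # bool vs int stays distinct
```

Under this model the old guard reports `True` and the stand-in as unequal, while the narrowed guard defers to `==` for them and still keeps `True` and `1` apart.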

Changes

pandas/_libs/include/pandas/vendored/klib/khash_python.h: add PyLong_CheckExact guard to the PyBool_Check condition
pandas/tests/frame/indexing/test_getitem.py: add TestGetitemNumpyBool with 4 tests
doc/source/whatsnew/v3.1.0.rst: add entry under Indexing bugs

Test plan

test_getitem_nptrue_column_with_true: df[True] on a {np.True_: ...} DataFrame works
test_getitem_nptrue_column_after_concat: exact repro from the issue (concat then index)
test_getitem_npfalse_column_with_false: same for np.False_/False
test_getitem_bool_int_still_distinct: GH#62888 regression guard, Python True and 1 remain distinct column keys

🤖 Generated with Claude Code

…64749)

The fix in GH#62888 added a PyBool_Check guard to distinguish Python bools from Python ints in the object-dtype hash table. However, that check also prevented equality between np.True_ (a numpy bool scalar, not a PyBool) and Python True, breaking DataFrame.__getitem__ for columns created with numpy bool keys.

Narrow the guard so it only fires when the non-bool side is a Python int (PyLong_CheckExact). Numpy bool scalars fall through to PyObject_RichCompareBool, which correctly returns True for np.True_ == True.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

jbrockmendel · 2026-03-24T14:55:08Z

doc/source/whatsnew/v3.1.0.rst

 - Bugs in setitem-with-expansion when adding new rows failing to keep the original dtype in some cases (:issue:`32346`, :issue:`15231`, :issue:`47503`, :issue:`6485`, :issue:`25383`, :issue:`52235`, :issue:`17026`, :issue:`56010`)
 - Bug in :meth:`Index.get_level_values` mishandling boolean, NA-like (``np.nan``, ``pd.NA``, ``pd.NaT``) and integer index names (:issue:`62169`)
 - Bug in :meth:`MultiIndex.loc` returning incorrect results when indexing with :class:`numpy.datetime64` on a level containing :class:`datetime.date` objects (:issue:`55969`)
+- Bug in :meth:`DataFrame.__getitem__` raising ``KeyError`` when a column was created with a ``numpy`` bool scalar (e.g. ``np.True_``) and accessed with a Python ``bool`` (e.g. ``True``) (:issue:`64749`)


the bug is not present in a released version, does not need a whatsnew

jbrockmendel · 2026-03-24T14:55:54Z

pandas/_libs/include/pandas/vendored/klib/khash_python.h

 // frozenset isn't yet supported
- } else if (PyBool_Check(a) != PyBool_Check(b)) {
+ } else if (PyBool_Check(a) != PyBool_Check(b) &&
+ (PyLong_CheckExact(a) || PyLong_CheckExact(b))) {


better to explicitly catch cnp.bool_?
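One possible reading of this suggestion, sketched in pure Python rather than C: treat numpy bools as bool-like in the guard itself instead of narrowing it with an int check. Everything here is an assumption for illustration. `NumpyBoolStandIn` is a hypothetical substitute for `cnp.bool_`, `is_boolish` is an invented helper, and the reviewer may have had a different shape in mind.

```python
class NumpyBoolStandIn:
    """Hypothetical stand-in for cnp.bool_ (assumption, not numpy itself)."""

    def __init__(self, value):
        self._value = bool(value)

    def __eq__(self, other):
        if isinstance(other, NumpyBoolStandIn):
            return self._value == other._value
        if isinstance(other, (bool, int)):
            return self._value == other
        return NotImplemented

    def __hash__(self):
        return hash(self._value)


def is_boolish(obj):
    # Hypothetical helper: Python bool OR the numpy-bool stand-in.
    return type(obj) is bool or isinstance(obj, NumpyBoolStandIn)


def keys_equal_explicit(a, b):
    # Guard fires whenever exactly one side is bool-like.
    if is_boolish(a) != is_boolish(b):
        return False
    return bool(a == b)


np_true = NumpyBoolStandIn(True)
print(keys_equal_explicit(True, np_true))  # both bool-like, compared with ==
print(keys_equal_explicit(True, 1))        # bool vs int stays distinct
print(keys_equal_explicit(np_true, 1))     # stand-in vs int is also distinct here
```

In this model the explicit variant differs from the `PyLong_CheckExact` approach in one respect: the numpy-bool stand-in and the integer `1` are kept apart as well, whereas the narrowed guard lets that pair fall through to `==`.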

jbrockmendel reviewed Mar 24, 2026

alvinttang wants to merge 1 commit into pandas-dev:main from alvinttang:fix/getitem-nptrue-column
