Skip to content
1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.2.0.rst
Original file line number Diff line number Diff line change
Expand Up @@ -169,6 +169,7 @@ Bug fixes
~~~~~~~~~
- Bug in :class:`AbstractHolidayCalendar` where timezone data was not propagated when computing holiday observances (:issue:`54580`)
- Bug in :class:`pandas.core.window.Rolling` where duplicate datetimelike indexes are treated as consecutive rather than equal with ``closed='left'`` and ``closed='neither'`` (:issue:`20712`)
- Bug in :func:`is_all_strings` while checking object array with no elements is of the string dtype (:issue:`54661`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the bug in is_all_strings or is_string_dtype?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, the bug is in is_all_strings, but we don’ t have this function in api and users see the bug while calling is_string_dtype. That is why I replaced Bug in :func:is_all_strings with: Bug in :func:is_string_dtype.

Looks like we don’t have a test for is_all_strings, do we need one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i meant the user-facing bug. in the docs we generally only reference public functions/methods

Copy link
Contributor Author

@natmokval natmokval Sep 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, thank you. I removed from doc/source/whatsnew/v2.2.0.rst my line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbrockmendel, I updated the PR. Could you please take a look at it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we still want a note, it just needs to refer to is_string_dtype, which I think is public

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, is_string_dtype is a public function, I added the note to v2.1.1.rst. I think my change is a minor and we can put it in the version 2.1.1


Categorical
^^^^^^^^^^^
Expand Down
9 changes: 6 additions & 3 deletions pandas/core/dtypes/common.py
Original file line number Diff line number Diff line change
Expand Up @@ -1664,9 +1664,12 @@ def is_all_strings(value: ArrayLike) -> bool:
dtype = value.dtype

if isinstance(dtype, np.dtype):
return dtype == np.dtype("object") and lib.is_string_array(
np.asarray(value), skipna=False
)
if len(value) == 0:
return dtype == np.dtype("object")
else:
return dtype == np.dtype("object") and lib.is_string_array(
np.asarray(value), skipna=False
)
elif isinstance(dtype, CategoricalDtype):
return dtype.categories.inferred_type == "string"
return dtype == "string"
Expand Down
5 changes: 5 additions & 0 deletions pandas/tests/dtypes/test_dtypes.py
Original file line number Diff line number Diff line change
Expand Up @@ -1212,3 +1212,8 @@ def test_multi_column_dtype_assignment():

df["b"] = 0
tm.assert_frame_equal(df, expected)


def test_empty_object_array_is_string_dtype():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm looks like we have tests for is_string_dtype in a couple of places. one of these days might be worth checking if there is a reason for that and if not, refactoring

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the comment. I think, I found a better place for my test: pandas/tests/dtypes/test_common.py. I combined my test with another one and added a parameterization.

# GH #54661
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick can you remove the space after GH. i like the pattern GH#12345 bc it is something i grep for

Copy link
Contributor Author

@natmokval natmokval Sep 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, done. I agree, it's better to have all GH refers look alike

assert is_string_dtype(pd.Index([], dtype="O"))