BUG: PyArrow string: list indexing and negative indices fail after str.split() #63221

@Elliottt-Chen

Description

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
from pandas.core.frame import Series

print(pd.__version__)
# '3.0.0.dev0+2752.ga885d67965'

s: Series = pd.Series(["A-B", "C-D"]).convert_dtypes(dtype_backend="pyarrow")

s.str.split("-")
# 0    ['A' 'B']
# 1    ['C' 'D']

s.str.split("-").str[0]
# AttributeError: Can only use .str accessor with string values, not unknown-array.
# Did you mean: 'std'?

s.str.split("-").list[0]
# 0    A
# 1    C

s.str.split("-").list[-1]
# pyarrow.lib.ArrowInvalid: Index -1 is out of bounds: should be greater than or
# equal to 0

Issue Description

I noticed that the latest nightly wheel (pandas 3.0.0.dev0+2752.ga885d67965) does not allow indexing into list elements produced by str.split() when using the pyarrow-backed string dtype.

s1 = Series(...).convert_dtypes(dtype_backend="pyarrow")
s1.str.split().str[...] # attempting to access list items via the str accessor
s1.str.split().list[-1] # negative index access via the list accessor

Specifically, list element access through the str accessor (str[...]) and negative indexing through the list accessor (list[-1]) both fail on the result of str.split() for pyarrow-backed string Series in the current nightly build.

The error messages are:

  1. AttributeError: Can only use .str accessor with string values, not unknown-array. Did you mean: 'std'? (str accessor)
  2. pyarrow.lib.ArrowInvalid: Index -1 is out of bounds: should be greater than or equal to 0 (list accessor)
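
For reference, the negative-index semantics requested here can be sketched in plain Python. This is only an illustration of the expected result, not how pandas or PyArrow implement element access: `list_element` is a hypothetical helper, and returning None for missing rows and out-of-range positions is an assumption made for this sketch.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def list_element(
    rows: Sequence[Optional[Sequence[T]]], index: int
) -> list[Optional[T]]:
    """Hypothetical helper: take element `index` from every row, Python-style."""
    out: list[Optional[T]] = []
    for row in rows:
        if row is None:
            # Missing row stays missing (assumption for this sketch).
            out.append(None)
            continue
        # Normalize a negative index against this row's own length,
        # since rows may have different lengths.
        pos = index + len(row) if index < 0 else index
        # Out-of-range positions yield None rather than raising (assumption).
        out.append(row[pos] if 0 <= pos < len(row) else None)
    return out


# Same data as the reproducible example, split in plain Python.
rows = [value.split("-") for value in ["A-B", "C-D"]]
first = list_element(rows, 0)
last = list_element(rows, -1)
print(first)
print(last)
```

The key point is that a negative index has to be resolved per row, against each list's length, which is why a single non-negative scalar index is not enough to express list[-1].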

Expected Behavior

s = s.astype(pd.StringDtype())

s.str.split("-").str[-1]

0 B
1 D
dtype: object

Installed Versions

INSTALLED VERSIONS

commit : a885d67
python : 3.14.0
python-bits : 64
OS : Darwin
OS-release : 24.6.0
Version : Darwin Kernel Version 24.6.0: Wed Oct 15 21:09:41 PDT 2025; root:xnu-11417.140.69.703.14~1/RELEASE_ARM64_T8122
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8

pandas : 3.0.0.dev0+2752.ga885d67965
numpy : 2.3.5
dateutil : 2.9.0.post0
pip : 25.3
Cython : None
sphinx : None
IPython : 9.7.0
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.14.2
bottleneck : None
fastparquet : 2024.11.0
fsspec : 2025.10.0
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : 3.1.6
lxml.etree : None
matplotlib : 3.10.7
numba : None
numexpr : 2.14.1
odfpy : None
openpyxl : 3.1.5
psycopg2 : None
pymysql : None
pyarrow : 22.0.0
pyiceberg : 0.10.0
pyreadstat : 1.3.2
pytest : None
python-calamine : None
pytz : 2025.2
pyxlsb : 1.0.10
s3fs : None
scipy : 1.16.3
sqlalchemy : 2.0.44
tables : 3.10.2
tabulate : 0.9.0
xarray : 2025.11.0
xlrd : 2.0.2
xlsxwriter : 3.2.9
zstandard : None
qtpy : None
pyqt5 : None

Labels

  • Arrow: pyarrow functionality
  • Bug
  • Nested Data: Data where the values are collections (lists, sets, dicts, objects, etc.).
