Closed
Labels
ExtensionArray: Extending pandas with custom dtypes or arrays.
Missing-data: np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate
Strings: String extension data type and string data
Description
Code Sample
import numpy as np
import pandas as pd
s = pd.Series(['A', 'B', 'C', 'Aaba', 'Baca', np.nan, 'CABA', 'dog', 'cat'], dtype="string")
type(s.str.lower()[5])  # returns <class 'float'>
Problem description
The str accessor, when working on a string-typed Series, should return a string-typed Series whose values are only strings or pd.NA. However, some functions (see the list below) return a Series that contains float('nan').
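To make the expectation above concrete, here is a minimal sketch of a check for the "strings or pd.NA only" invariant. The helper name has_only_str_or_na is hypothetical (it is not part of pandas), and the two sample Series are made up for illustration.

```python
import pandas as pd


def has_only_str_or_na(series):
    """Return True if every element is a str or the pd.NA singleton.

    Hypothetical helper for illustration; not a pandas API.
    """
    return all(isinstance(v, str) or v is pd.NA for v in series)


# A string-typed Series stores its missing values as pd.NA.
good = pd.Series(["a", pd.NA], dtype="string")

# An object Series holding float('nan') violates the invariant.
bad = pd.Series(["a", float("nan")], dtype=object)

print(has_only_str_or_na(good))
print(has_only_str_or_na(bad))
```

Applying the same helper to the result of s.str.lower() or s.str.upper() would show whether a given pandas version keeps the invariant.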
Affected functions
So far, I have found these str accessor functions to be affected:
- upper
- lower
- replace
Also, extract(expand=False) on a string-typed Series returns an object-dtype Series, which seems unintended as well.
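The functions listed above can be checked in one pass with a short diagnostic. This is only a sketch: the sample data, the patterns passed to replace and extract, and the report dictionary are assumptions made for illustration, and the printed result depends on the installed pandas version.

```python
import numpy as np
import pandas as pd

# Small string-typed Series with one missing value at position 2.
s = pd.Series(["A", "Aaba", np.nan, "dog"], dtype="string")

# Each entry applies one of the str accessor functions named above.
checks = {
    "upper": lambda x: x.str.upper(),
    "lower": lambda x: x.str.lower(),
    "replace": lambda x: x.str.replace("a", "b", regex=False),
    "extract(expand=False)": lambda x: x.str.extract(r"(a)", expand=False),
}

report = {}
for name, func in checks.items():
    result = func(s)
    missing = result[2]  # the element that was missing in the input
    report[name] = (str(result.dtype), type(missing).__name__)
    print(f"{name}: dtype={result.dtype}, missing value type={type(missing).__name__}")
```

A result with dtype "string" and missing value type NAType matches the expected behavior; a float missing value or an object dtype reproduces the problem described here.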
Expected Output
<class 'pandas._libs.missing.NAType'>
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit : None
python : 3.8.0.final.0
python-bits : 64
OS : Linux
OS-release : 5.0.0-38-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_IL
LOCALE : en_IL.UTF-8
pandas : 1.0.0rc0
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 18.1
setuptools : 40.8.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pytest : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None
numba : None