Code Sample, a copy-pastable example if possible
In [1]: import pandas as pd

In [2]: pd.__version__
Out[2]: '0.26.0.dev0+1563.g1feefc692'

In [3]: pd.SparseArray
Out[3]: pandas.core.arrays.sparse.array.SparseArray

In [4]: pd.arrays.SparseArray
Out[4]: pandas.core.arrays.sparse.array.SparseArray

In [5]: pd.arrays.IntegerArray
Out[5]: pandas.core.arrays.integer.IntegerArray

In [6]: pd.IntegerArray
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-6-12476104dd13> in <module>
----> 1 pd.IntegerArray

C:\Code\pandas_dev\pandas\pandas\__init__.py in __getattr__(name)
    246         return type(name, (), {})
    247
--> 248     raise AttributeError(f"module 'pandas' has no attribute '{name}'")
    249
    250

AttributeError: module 'pandas' has no attribute 'IntegerArray'

In [7]: pd.arrays.StringArray
Out[7]: pandas.core.arrays.string_.StringArray

In [8]: pd.StringArray
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-8-86553ff3c48c> in <module>
----> 1 pd.StringArray

C:\Code\pandas_dev\pandas\pandas\__init__.py in __getattr__(name)
    246         return type(name, (), {})
    247
--> 248     raise AttributeError(f"module 'pandas' has no attribute '{name}'")
    249
    250

AttributeError: module 'pandas' has no attribute 'StringArray'

Problem description
I discovered this while working on #30628. The docs for SparseArray are at the top level (https://dev.pandas.io/docs/reference/api/pandas.SparseArray.html), while the docs for IntegerArray (https://dev.pandas.io/docs/reference/api/pandas.arrays.IntegerArray.html), StringArray (https://dev.pandas.io/docs/reference/api/pandas.arrays.StringArray.html), etc. are at the pandas.arrays level.
In the code, SparseArray is available at both levels, but IntegerArray, StringArray, etc. are available only at the pandas.arrays level.
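The same comparison can be run without reading tracebacks. This is only a rough sketch: the `exposure` helper and the `NAMES` list are made up for illustration, and the result depends on the installed pandas version, so it may not match the session above.

```python
import pandas as pd

# Class names taken from the session above.
NAMES = ["SparseArray", "IntegerArray", "StringArray"]


def exposure(name):
    """Report whether `name` resolves at the top level and under pandas.arrays.

    Hypothetical helper for this issue, not part of pandas.
    """
    return {
        "top_level": hasattr(pd, name),
        "arrays": hasattr(pd.arrays, name),
    }


report = {name: exposure(name) for name in NAMES}

for name, where in report.items():
    print(f"{name}: pd.{name} -> {where['top_level']}, "
          f"pd.arrays.{name} -> {where['arrays']}")
```

Any class where the two flags differ from the others is one of the inconsistencies described here.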
Expected Output
Unsure.
It seems that this should be consistent. Options are:
- Put all *Array classes at top level, and document them that way (i.e., use the pattern currently used for SparseArray). That would involve code and documentation changes for all of the arrays except SparseArray.
- Put all *Array classes at both levels (like SparseArray), but document them at the pandas.arrays level (like IntegerArray and StringArray). That would involve code changes for all of the arrays, and doc changes for SparseArray.
- Put all *Array classes only at the pandas.arrays level and document them all that way. That would involve only changing code and docs for SparseArray and leaving the others alone.
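To make the "both levels" option concrete, here is a minimal sketch using stand-in modules rather than pandas itself. The `pkg` package and the placeholder classes are hypothetical; the point is only that the top-level name would be an alias for the same class object that lives under the arrays namespace.

```python
import types

# Hypothetical stand-in for a package with an `arrays` submodule.
pkg = types.ModuleType("pkg")
pkg.arrays = types.ModuleType("pkg.arrays")


class IntegerArray:
    """Placeholder class, not the pandas implementation."""


class StringArray:
    """Placeholder class, not the pandas implementation."""


# The documented home: every class lives under pkg.arrays.
for cls in (IntegerArray, StringArray):
    setattr(pkg.arrays, cls.__name__, cls)

# "Both levels": re-export the same objects at the top level.
for name in ("IntegerArray", "StringArray"):
    setattr(pkg, name, getattr(pkg.arrays, name))

# Both names refer to one class, so isinstance checks agree.
same_object = pkg.IntegerArray is pkg.arrays.IntegerArray
print(same_object)
```

Under the "only at the arrays level" option, the second loop would simply be dropped, and `pkg.IntegerArray` would raise AttributeError as in the session above.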
It's not clear to me which is preferred.
Output of pd.show_versions()
INSTALLED VERSIONS
commit : 1feefc6
python : 3.7.3.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 Model 158 Stepping 13, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None
pandas : 0.26.0.dev0+1563.g1feefc692
numpy : 1.17.4
pytz : 2019.3
dateutil : 2.8.1
pip : 19.3.1
setuptools : 42.0.2.post20191203
Cython : 0.29.14
pytest : 5.3.2
hypothesis : 4.54.2
sphinx : 2.3.0
blosc : None
feather : None
xlsxwriter : 1.2.6
lxml.etree : 4.4.2
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.10.3
IPython : 7.10.2
pandas_datareader: None
bs4 : 4.8.1
bottleneck : 1.3.1
fastparquet : None
gcsfs : None
lxml.etree : 4.4.2
matplotlib : 3.1.1
numexpr : 2.7.0
odfpy : None
openpyxl : 3.0.2
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 5.3.2
s3fs : None
scipy : 1.3.2
sqlalchemy : 1.3.11
tables : 3.6.1
tabulate : None
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.2.6
numba : 0.46.0