Deprecate Series._from_array ? #19883

@jaumebonet

I am opening this suggestion on @jorisvandenbossche's recommendation.

This issue follows on from #18213 and #19850.

As noted in #18213, _from_array differs from the Series constructor in only one respect, namely how it handles a SparseArray:

    # return a sparse series here
    if isinstance(arr, ABCSparseArray):
        from pandas.core.sparse.series import SparseSeries
        cls = SparseSeries

The same could be achieved in Series.__new__, with something along the lines of:

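    def __new__(cls, *args, **kwargs):
        # arr is mandatory: first positional argument or keyword `arr`.
        arr = kwargs['arr'] if 'arr' in kwargs else (args[0] if args else None)
        if isinstance(arr, ABCSparseArray):
            from pandas.core.sparse.series import SparseSeries
            cls = SparseSeries
        # No explicit obj.__init__ call: Python runs __init__ itself because
        # the returned object is still an instance of Series.
        return object.__new__(cls)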
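
To make the dispatch idea concrete without depending on pandas internals, here is a minimal sketch using toy classes. `Base`, `Sparse` and `SparseData` are hypothetical stand-ins for Series, SparseSeries and SparseArray; none of this is pandas API, and the `arr` keyword is only assumed from the snippet above.

```python
class SparseData(list):
    """Hypothetical stand-in for a sparse array type."""


class Base:
    """Hypothetical stand-in for Series."""

    def __new__(cls, *args, **kwargs):
        # Find the data as the `arr` keyword or the first positional argument.
        arr = kwargs["arr"] if "arr" in kwargs else (args[0] if args else None)
        if isinstance(arr, SparseData):
            cls = Sparse  # swap in the sparse subclass
        obj = object.__new__(cls)
        obj.init_calls = 0  # bookkeeping, only to show how often __init__ runs
        return obj

    def __init__(self, arr=None):
        self.init_calls += 1
        self.arr = arr


class Sparse(Base):
    """Hypothetical stand-in for SparseSeries."""


dense = Base([1, 0, 0, 2, 0])
sparse = Base(SparseData([1, 0, 0, 2, 0]))
print(type(dense).__name__, type(sparse).__name__)
print(sparse.init_calls)
```

Because `Sparse` is a subclass of `Base`, Python calls `__init__` on the returned object automatically, so `__new__` does not need to call it. One design point this sketch leaves open: rebinding `cls` unconditionally would also replace a user-defined subclass of `Base` whenever sparse data is passed in, so a real implementation would have to decide how to treat that case.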

What's the issue?

As @jorisvandenbossche pointed out, a change like this will result in a change of the API, as this:

    >>> s = pd.Series(pd.SparseArray([1, 0, 0, 2, 0]))
    >>> type(s)
    <class 'pandas.core.series.Series'>

will become this (the result currently obtained through Series.from_array):

    >>> s = pd.Series.from_array(pd.SparseArray([1, 0, 0, 2, 0]))
    >>> type(s)
    <class 'pandas.core.sparse.series.SparseSeries'>

I'm not familiar with sparse data structures, but according to the docs all functionality is kept between Series and SparseSeries. Furthermore, a simple

    >>> s = s.to_dense()
    >>> type(s)
    <class 'pandas.core.series.Series'>

should be enough to get back to a Series.

Why change it, then?

Currently, Series._from_array is called in only two functions: DataFrame._ixs and DataFrame._box_col_values. With the proposed change, those calls could be replaced by the default constructor.
That being the case, when subclassing pandas objects, one would be able to declare a more complex _constructor_sliced, such as this:

    @property
    def _constructor_sliced(self):
        def f(*args, **kwargs):
            # adapted from https://github.com/pandas-dev/pandas/issues/13208#issuecomment-326556232
            return DerivedSeries(*args, **kwargs).__finalize__(self, method='inherit')
        return f

This would allow a richer relationship between the subclassed DataFrame and its sliced version, including the transfer of metadata according to the user's specification in __finalize__.
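
The pattern can be illustrated with a pure-Python toy that only borrows pandas' naming. `DerivedFrame`, this `DerivedSeries` and the `units` attribute are hypothetical, and real pandas internals are more involved. The point is only that if slicing goes through a plain constructor-like callable, that callable can be a closure that carries the parent's metadata along.

```python
class DerivedSeries:
    """Hypothetical sliced type carrying user metadata."""

    _metadata = ["units"]

    def __init__(self, values, name=None):
        self.values = list(values)
        self.name = name
        self.units = None

    def __finalize__(self, other, method=None):
        # Copy every declared metadata attribute from the parent object.
        for attr in self._metadata:
            setattr(self, attr, getattr(other, attr, None))
        return self


class DerivedFrame:
    """Hypothetical 2-D container; columns are kept in a dict."""

    def __init__(self, columns, units=None):
        self.columns = dict(columns)
        self.units = units

    @property
    def _constructor_sliced(self):
        def f(*args, **kwargs):
            # The closure captures `self`, so metadata can flow to the slice.
            return DerivedSeries(*args, **kwargs).__finalize__(self, method="inherit")
        return f

    def __getitem__(self, key):
        # Slicing only ever calls the constructor-like callable.
        return self._constructor_sliced(self.columns[key], name=key)


frame = DerivedFrame({"a": [1, 2], "b": [3, 4]}, units="mm")
col = frame["a"]
print(type(col).__name__, col.name, col.units)
```

In this toy, `__getitem__` never needs a special alternate constructor, which mirrors the motivation given above for dropping `_from_array` from the slicing code paths.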
