BUG: DataFrame[Sparse] quantile fails because SparseArray has no reshape  #24600

@jbrockmendel

I tried to simplify Block.quantile so that it only has to handle the 2D case, by having Series.quantile dispatch to the DataFrame implementation. This produced failures in pandas/tests/series/test_quantile.py::test_quantile_sparse. The Series path works, but the DataFrame path raises:

ser = pd.Series([0., None, 1., 2.], dtype='Sparse[float]')
df = pd.DataFrame(ser)

>>> ser.quantile(0.5)
1.0
>>> ser.quantile([0.5])
0.5    1.0
dtype: float64
>>> df.quantile(0.5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/core/frame.py", line 7760, in quantile
    transposed=is_transposed)
  File "pandas/core/internals/managers.py", line 500, in quantile
    return self.reduction('quantile', **kwargs)
  File "pandas/core/internals/managers.py", line 432, in reduction
    axe, block = getattr(b, f)(axis=axis, axes=self.axes, **kwargs)
  File "pandas/core/internals/blocks.py", line 1530, in quantile
    result = _nanpercentile(values, qs * 100, axis=axis, **kw)
  File "pandas/core/internals/blocks.py", line 1484, in _nanpercentile
    mask = mask.reshape(values.shape)
AttributeError: 'SparseArray' object has no attribute 'reshape'
>>> df.quantile([0.5])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/core/frame.py", line 7760, in quantile
    transposed=is_transposed)
  File "pandas/core/internals/managers.py", line 500, in quantile
    return self.reduction('quantile', **kwargs)
  File "pandas/core/internals/managers.py", line 432, in reduction
    axe, block = getattr(b, f)(axis=axis, axes=self.axes, **kwargs)
  File "pandas/core/internals/blocks.py", line 1511, in quantile
    axis=axis, **kw)
  File "pandas/core/internals/blocks.py", line 1484, in _nanpercentile
    mask = mask.reshape(values.shape)
AttributeError: 'SparseArray' object has no attribute 'reshape'
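For illustration only, here is a minimal sketch of one way a 2D-only code path could sidestep the missing `reshape`: densify the sparse values to a NumPy array first, then reshape. The helper name `dense_2d_quantile` is hypothetical and is not part of pandas; this is not a proposal for how the actual fix should look, and densifying gives up the memory savings of the sparse representation.

```python
import numpy as np
import pandas as pd


def dense_2d_quantile(values, q):
    """Hypothetical helper (not a pandas API).

    Densify a 1D array-like, view it as a single row of a 2D block,
    and take the NaN-aware quantile along that row.
    """
    # np.asarray converts a SparseArray into a plain ndarray.
    dense = np.asarray(values, dtype="float64")
    # A plain ndarray supports reshape, unlike the sparse container.
    block = dense.reshape(1, -1)
    # NaN entries are ignored, mirroring the Series result in the report.
    return np.nanpercentile(block, q * 100, axis=1)


sparse = pd.arrays.SparseArray([0.0, np.nan, 1.0, 2.0])
result = dense_2d_quantile(sparse, 0.5)
print(result)
```

With the values from the report, the single-row result agrees with `ser.quantile(0.5)`.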

datetime64[ns, tz] breaks in a slightly different way (presumably all ExtensionBlocks will fail):

dti = pd.date_range('2016-01-01', periods=3, tz='US/Pacific')
ser = pd.Series(dti)
df = pd.DataFrame(ser)

>>> df.quantile(0.5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/core/frame.py", line 7760, in quantile
    transposed=is_transposed)
  File "pandas/core/internals/managers.py", line 500, in quantile
    return self.reduction('quantile', **kwargs)
  File "pandas/core/internals/managers.py", line 473, in reduction
    values = _concat._concat_compat([b.values for b in blocks])
  File "pandas/core/dtypes/concat.py", line 174, in _concat_compat
    return np.concatenate(to_concat, axis=axis)
ValueError: need at least one array to concatenate
>>> df.quantile([0.5])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/core/frame.py", line 7760, in quantile
    transposed=is_transposed)
  File "pandas/core/internals/managers.py", line 500, in quantile
    return self.reduction('quantile', **kwargs)
  File "pandas/core/internals/managers.py", line 473, in reduction
    values = _concat._concat_compat([b.values for b in blocks])
  File "pandas/core/dtypes/concat.py", line 174, in _concat_compat
    return np.concatenate(to_concat, axis=axis)
ValueError: need at least one array to concatenate
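As a user-side stopgap, and assuming `Series.quantile` handles tz-aware data on the pandas version in use, the quantile can be taken one column at a time instead of through the DataFrame path. The helper `columnwise_quantile` below is a hypothetical name for this sketch, not a pandas function.

```python
import pandas as pd


def columnwise_quantile(frame, q):
    """Hypothetical helper (not a pandas API).

    Compute a scalar quantile per column via Series.quantile,
    avoiding the block-level DataFrame reduction entirely.
    """
    return {col: frame[col].quantile(q) for col in frame.columns}


dti = pd.date_range("2016-01-01", periods=3, tz="US/Pacific")
df = pd.DataFrame(pd.Series(dti))
per_col = columnwise_quantile(df, 0.5)
print(per_col)
```

This only works around the symptom for callers; it does not address the block-level failure shown in the tracebacks.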

xref #24583