Closed
Labels
Sparse (Sparse Data Type)
Description
I'd like to add a .sparse accessor to DataFrame, to assist with deprecating SparseDataFrame.
It'll contain:
- from_spmatrix (part of the SparseDataFrame constructor)
- to_dense (SparseDataFrame.to_dense)
- to_coo (SparseDataFrame.to_coo)
- density
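To make the proposed surface concrete, here is a minimal sketch of a round trip through the four members listed above. It assumes a pandas version in which the accessor is available under exactly these names, and that SciPy is installed; the matrix values and column labels are made up for illustration.

```python
import numpy as np
import pandas as pd
from scipy import sparse

# Illustrative 4x3 matrix with 3 non-zero entries out of 12 cells.
mat = sparse.coo_matrix(
    np.array(
        [
            [1.0, 0.0, 0.0],
            [0.0, 0.0, 2.0],
            [0.0, 0.0, 0.0],
            [0.0, 3.0, 0.0],
        ]
    )
)

# Build a DataFrame whose columns all hold sparse values.
df = pd.DataFrame.sparse.from_spmatrix(mat, columns=["a", "b", "c"])

# Convert back to ordinary (dense) columns.
dense = df.sparse.to_dense()

# Convert back to a SciPy COO matrix.
coo = df.sparse.to_coo()

# Fraction of stored (non-fill) values across the whole frame.
density = df.sparse.density
```

The result of `from_spmatrix` is a plain DataFrame, not a SparseDataFrame, which is the point of the deprecation path.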
A few design questions:
- When should _validate raise?
  a. When there are no sparse columns
  b. When there are any non-sparse columns
  c. Never.
  It's slightly easier to implement if we assume everything is sparse.
- Return value of DataFrame.sparse.density. If we mirror SparseDataFrame.density, this returns a float. Would it be more useful to return a Series with the density of each column? (Users can call .mean() if they want the average density.)
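The two design questions can be compared side by side with a small sketch. The `validate` function, its `policy` argument, and the choice of `AttributeError` are hypothetical stand-ins for the three options (a, b, c), not the accessor's actual implementation. The per-column density is built from the existing Series-level `.sparse.density`; the example frames are invented.

```python
import pandas as pd


def n_sparse_columns(df):
    """Count columns whose dtype is a SparseDtype."""
    return sum(isinstance(dtype, pd.SparseDtype) for dtype in df.dtypes)


def validate(df, policy):
    """Hypothetical stand-in for the accessor's _validate hook.

    policy "a": raise when there are no sparse columns
    policy "b": raise when any column is non-sparse
    policy "c": never raise
    """
    n_sparse = n_sparse_columns(df)
    if policy == "a" and n_sparse == 0:
        raise AttributeError("no sparse columns")
    if policy == "b" and n_sparse != df.shape[1]:
        raise AttributeError("some columns are not sparse")


def raises(df, policy):
    """Return True if validate() rejects the frame under the given policy."""
    try:
        validate(df, policy)
    except AttributeError:
        return True
    return False


# One sparse column and one ordinary column.
mixed = pd.DataFrame(
    {"s": pd.arrays.SparseArray([0, 0, 1, 0]), "d": [1, 2, 3, 4]}
)
# Every column sparse.
all_sparse = pd.DataFrame(
    {
        "x": pd.arrays.SparseArray([0, 0, 1, 0]),
        "y": pd.arrays.SparseArray([0, 2, 3, 0]),
    }
)

# How each policy treats the mixed frame.
outcomes = {policy: raises(mixed, policy) for policy in "abc"}

# Density as a Series (one value per column) versus a single float.
per_column = pd.Series(
    {name: col.sparse.density for name, col in all_sparse.items()}
)
overall = per_column.mean()
```

Only option b rejects the mixed frame, which is what lets the rest of the accessor assume everything is sparse. Because every column in a DataFrame has the same length, the mean of the per-column densities equals the density of the whole frame, so returning a Series loses no information.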
I believe that with these methods, essentially all the functionality of SparseDataFrame will be replicable with a DataFrame of sparse values. The main exception is an expanding __setitem__ creating a sparse column by default, but it's OK not to provide that functionality.