BUG: crosstab(dropna=False) does not work like value_counts(dropna=False) #10772

series.value_counts(dropna=False) includes NaN values in its result, but crosstab(series1, series2, dropna=False) does not. I think of crosstab as a kind of 2-dimensional value_counts(), so this is surprising.

    In [31]: x = pd.Series(['a', 'b', 'a', None, 'a'])

    In [32]: y = pd.Series(['c', 'd', 'd', 'c', None])

    In [33]: y.value_counts(dropna=False)
    Out[33]:
    d      2
    c      2
    NaN    1
    dtype: int64

    In [34]: pd.crosstab(x, y, dropna=False)
    Out[34]:
           c  d
    row_0
    a      1  1
    b      0  1

I believe what crosstab(..., dropna=False) really does is just include rows/columns that would otherwise have all zeros, but that doesn't seem the same as dropna=False to me. I would have expected to see a row and column entry for NaN, something like:

           c  d  NaN
    row_0
    a      1  1    1
    b      0  1    0
    NaN    1  0    0
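
In the meantime, a table with that shape can be obtained by replacing the missing values with a placeholder label before calling crosstab. This is only a sketch of a workaround, not a fix: the name SENTINEL and the label 'missing' are my own choices for illustration, and the label has to be something that never occurs as a real value in either series.

```python
import pandas as pd

x = pd.Series(['a', 'b', 'a', None, 'a'])
y = pd.Series(['c', 'd', 'd', 'c', None])

# 'missing' is an arbitrary placeholder chosen for this sketch (an assumption,
# not anything pandas defines); it must not collide with a real value.
SENTINEL = 'missing'

# With the missing entries relabelled, crosstab counts them like any other
# category, so they get their own row and column.
table = pd.crosstab(x.fillna(SENTINEL), y.fillna(SENTINEL))
print(table)
```

The placeholder row and column then play the role of the NaN entries in the table above. The drawback is that the missing values are now ordinary strings, so anything downstream has to know about the placeholder.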

Thoughts on this?

This is on current master (almost 0.17).
