Description
As the first step of moving towards integer-na dtypes as the primary integer type, we need to teach infer_dtype that integer-na is a valid inferred type. Right now:

    In [1]: from pandas.api.types import infer_dtype

    In [3]: infer_dtype([2, 3, 4], skipna=False)
    Out[3]: 'integer'

    In [4]: infer_dtype([2, 3, 4, np.nan], skipna=False)
    Out[4]: 'mixed-integer-float'

    In [5]: infer_dtype([2, 3, 4.2, np.nan], skipna=False)
    Out[5]: 'mixed-integer-float'

[4] could return 'integer-na' to indicate that we might want to infer Int64 dtype; this is distinct from the inferred type of [5], which must become float64.
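The distinction between [4] and [5] can be sketched in plain Python. This is only an illustration of the proposed classification, not the pandas implementation: `sketch_infer_kind` is a hypothetical helper, and the `'other'` label is a placeholder for every case this issue does not discuss.

```python
import math


def sketch_infer_kind(values):
    """Hypothetical helper (not pandas code): classify a list of scalars
    using the labels discussed in this issue, with nulls not skipped."""
    saw_int = saw_float = saw_null = False
    for v in values:
        if v is None or (isinstance(v, float) and math.isnan(v)):
            saw_null = True          # None and NaN both count as nulls
        elif isinstance(v, bool):
            return "other"           # bool is a subclass of int; out of scope here
        elif isinstance(v, int):
            saw_int = True
        elif isinstance(v, float):
            saw_float = True
        else:
            return "other"           # strings, objects, etc. are out of scope
    if saw_int and saw_float:
        return "mixed-integer-float"  # real floats present: must become float64
    if saw_int:
        # Only integers plus (optionally) nulls: a candidate for Int64.
        return "integer-na" if saw_null else "integer"
    return "other"


nan = float("nan")
print(sketch_infer_kind([2, 3, 4]))         # integers only
print(sketch_infer_kind([2, 3, 4, nan]))    # integers and a null
print(sketch_infer_kind([2, 3, 4.2, nan]))  # integers, a real float, and a null
```

The point of the sketch is that a null alone does not force a float result; only a genuine non-null float does.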
This will then allow us to convert integer columns to Int64 when nulls are added, rather than coercing them to float64; this is pretty common in indexing and setting operations.
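For context, the difference between the two outcomes can be seen with the existing nullable Int64 extension dtype. The following is a minimal sketch, assuming a pandas version that ships the `Int64` dtype; it uses `reindex` to introduce a missing row and does not show the behaviour proposed in this issue, only the two dtypes involved.

```python
import pandas as pd

# A plain int64 column: introducing a missing row forces a float64 result.
plain = pd.Series([1, 2, 3]).reindex([0, 1, 2, 3])

# The same data stored as nullable Int64: the missing row becomes <NA>
# and the column stays integer-typed.
nullable = pd.Series([1, 2, 3], dtype="Int64").reindex([0, 1, 2, 3])

print(plain.dtype)
print(nullable.dtype)
```

With the nullable dtype the integer values are preserved exactly, whereas the float64 path can lose precision for large integers.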
Secondly, we can then enable .to_numeric to infer integer-na (or unsigned-na) and the corresponding dtypes (#26272).
Finally, we could support coercing object dtypes that contain only integers and nulls to Int64 (#27267 for .explode() and .infer_objects()).
By itself, this issue is only a very minor user-facing change (infer_dtype itself).