[0.24.0rc1] passing dtype='M8' to Index raises #24753

@TomAugspurger

In 0.23.4

In [1]: import pandas as pd
In [2]: pd.Index([], dtype='M8')
Out[2]: DatetimeIndex([], dtype='datetime64[ns]', freq=None)

In 0.24.0rc1

In [2]: pd.Index([], dtype='M8')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-09844a8ae29c> in <module>
----> 1 pd.Index([], dtype='M8')

~/Envs/dask-dev/lib/python3.7/site-packages/pandas/core/indexes/base.py in __new__(cls, data, dtype, copy, name, fastpath, tupleize_cols, **kwargs)
    306                 else:
    307                     result = DatetimeIndex(data, copy=copy, name=name,
--> 308                                            dtype=dtype, **kwargs)
    309                 return result
    310

~/Envs/dask-dev/lib/python3.7/site-packages/pandas/core/indexes/datetimes.py in __new__(cls, data, freq, start, end, periods, tz, normalize, closed, ambiguous, dayfirst, yearfirst, dtype, copy, name, verify_integrity)
    301             data, dtype=dtype, copy=copy, tz=tz, freq=freq,
    302             dayfirst=dayfirst, yearfirst=yearfirst, ambiguous=ambiguous,
--> 303             int_as_wall_time=True)
    304
    305         subarr = cls._simple_new(dtarr, name=name,

~/Envs/dask-dev/lib/python3.7/site-packages/pandas/core/arrays/datetimes.py in _from_sequence(cls, data, dtype, copy, tz, freq, dayfirst, yearfirst, ambiguous, int_as_wall_time)
    366             data, dtype=dtype, copy=copy, tz=tz,
    367             dayfirst=dayfirst, yearfirst=yearfirst,
--> 368             ambiguous=ambiguous, int_as_wall_time=int_as_wall_time)
    369
    370         freq, freq_infer = dtl.validate_inferred_freq(freq, inferred_freq,

~/Envs/dask-dev/lib/python3.7/site-packages/pandas/core/arrays/datetimes.py in sequence_to_dt64ns(data, dtype, copy, tz, dayfirst, yearfirst, ambiguous, int_as_wall_time)
   1704     inferred_freq = None
   1705
-> 1706     dtype = _validate_dt64_dtype(dtype)
   1707
   1708     if not hasattr(data, "dtype"):

~/Envs/dask-dev/lib/python3.7/site-packages/pandas/core/arrays/datetimes.py in _validate_dt64_dtype(dtype)
   1991             raise ValueError("Unexpected value for 'dtype': '{dtype}'. "
   1992                              "Must be 'datetime64[ns]' or DatetimeTZDtype'."
-> 1993                              .format(dtype=dtype))
   1994     return dtype
   1995

ValueError: Unexpected value for 'dtype': 'datetime64'. Must be 'datetime64[ns]' or DatetimeTZDtype'.

We want that to raise eventually; ns precision should be specified. But we should maybe deprecate the old behavior first?

The best place to do it is probably before we get to arrays, so in DatetimeIndex.__new__ we can check for M8 specifically, warn, then pass through M8[ns].

We should also check

  • np.dtype("M8")
  • 'm8'
  • np.dtype('m8')
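A rough sketch of the check described above, covering the string and `np.dtype` spellings of both `M8` and `m8`. The helper name `normalize_unitless_dtype` and the warning text are made up for illustration and are not pandas API; the sketch relies on `np.datetime_data` reporting the unit `'generic'` for a unit-less dtype.

```python
import warnings

import numpy as np


def normalize_unitless_dtype(dtype):
    """Hypothetical helper (not pandas API): warn on unit-less M8/m8, pass [ns] through."""
    if dtype is None:
        return None
    try:
        np_dtype = np.dtype(dtype)
    except TypeError:
        # Not something numpy understands (e.g. a tz-aware dtype string);
        # leave it for downstream validation.
        return dtype
    # np.datetime_data returns (unit, count); the unit is 'generic' when no
    # precision was given, as with 'M8', 'm8', np.dtype('M8'), np.dtype('m8').
    if np_dtype.kind in "mM" and np.datetime_data(np_dtype)[0] == "generic":
        warnings.warn(
            "Passing a datetime64/timedelta64 dtype without a precision is "
            "deprecated; specify '[ns]' explicitly.",
            FutureWarning,
            stacklevel=2,
        )
        return np.dtype(np_dtype.kind + "8[ns]")
    return dtype


# Usage: unit-less dtypes warn and become [ns]; everything else is untouched.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    fixed = normalize_unitless_dtype("M8")
print(fixed, len(caught))
```

Doing this in `DatetimeIndex.__new__` (and the timedelta equivalent) would keep the stricter validation in the array layer unchanged.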

It's less clear what we should do for something like M8[us]. In the past, we ignored the precision and returned datetime64[ns] anyway:

In [15]: pd.Index(list(pd.date_range('2000', periods=4).asi8), dtype='M8[us]')
Out[15]: DatetimeIndex(['2000-01-01', '2000-01-02', '2000-01-03', '2000-01-04'], dtype='datetime64[ns]', freq=None)

this should arguably raise now...
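If raising is the route taken for explicit non-ns precisions, the check could look something like the following. This is only a sketch: `require_ns_precision` is a hypothetical name, the error message is invented, and unit-less dtypes are deliberately let through here since how to handle them is a separate question.

```python
import numpy as np


def require_ns_precision(dtype):
    """Hypothetical check (not pandas API): reject M8/m8 dtypes with a non-ns unit."""
    np_dtype = np.dtype(dtype)
    if np_dtype.kind in "mM":
        # np.datetime_data returns (unit, count), e.g. ('us', 1) for 'M8[us]'.
        unit, _ = np.datetime_data(np_dtype)
        if unit not in ("ns", "generic"):
            raise ValueError(
                "Unexpected precision {unit!r} in dtype {dtype!r}; "
                "only 'ns' is supported.".format(unit=unit, dtype=np_dtype.name)
            )
    return np_dtype


# Usage: 'M8[ns]' passes through, 'M8[us]' raises instead of being
# silently treated as nanoseconds.
print(require_ns_precision("M8[ns]"))
try:
    require_ns_precision("M8[us]")
except ValueError as err:
    print(err)
```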

Labels: Datetime (Datetime data dtype), Dtype Conversions (Unexpected or buggy dtype conversions)