Closed
Labels
- Algos: Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff
- Bug
- Frequency: DateOffsets
- freq retention: User expects "freq" attribute to be preserved
Description
    dti = pd.date_range('2016-01-01', periods=5)
    dti.value_counts().index.freq  # <-- None
    dti.factorize()[1].freq  # <-- None
    mi = pd.MultiIndex.from_arrays([dti, dti])
    mi.levels[0].freq  # <-- None

There is a comment in tests.indexes.datetimes.test_datetime test_factorize suggesting that freq should be preserved by factorize, but that is not checked and would fail if it were:

    # freq must be preserved
    idx3 = date_range("2000-01", periods=4, freq="M", tz="Asia/Tokyo")
    exp_arr = np.array([0, 1, 2, 3], dtype=np.intp)
    arr, idx = idx3.factorize()
    tm.assert_numpy_array_equal(arr, exp_arr)
    tm.assert_index_equal(idx, idx3)

So the question: do we want to try to preserve freq in factorize?
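To make the gap in that test concrete, here is a minimal sketch of what an explicit freq check could look like. `freq_preserved` is a hypothetical helper, not part of pandas or its test suite, and the example uses a daily frequency rather than the test's `"M"` only to avoid alias differences between pandas versions. The result depends on the installed pandas version, so no expected output is shown.

```python
import pandas as pd


def freq_preserved(original: pd.DatetimeIndex, result: pd.Index) -> bool:
    """Hypothetical helper: True if `result` carries the same freq as `original`."""
    return bool(getattr(result, "freq", None) == original.freq)


# Daily frequency is an assumption made for portability; the test uses freq="M".
idx3 = pd.date_range("2000-01-01", periods=4, freq="D", tz="Asia/Tokyo")
codes, uniques = idx3.factorize()

# The values round-trip, which is all the existing test verifies.
values_match = uniques.equals(idx3)

# The freq comparison is the part the "freq must be preserved" comment implies.
freq_match = freq_preserved(idx3, uniques)

print(f"values match: {values_match}, freq preserved: {freq_match}")
```

If the answer to the question above is yes, a check along these lines would need to be added to the test for the comment to hold.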
xref #33677 for the MultiIndex case
Update: one more case, Categorical:

    dti = pd.date_range('2016-01-01', periods=5)
    cat = pd.Categorical(dti)
    cat.categories.freq  # <-- None
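For anyone re-checking these cases on their own pandas version, the four code paths reported here can be surveyed in one pass. This is only a sketch: the `derived` and `report` names are illustrative, and the printed freq values may differ from the `None` results above depending on the release.

```python
import pandas as pd

dti = pd.date_range("2016-01-01", periods=5)

# Each entry is an index derived from `dti` through one of the reported paths.
derived = {
    "value_counts().index": dti.value_counts().index,
    "factorize()[1]": dti.factorize()[1],
    "MultiIndex.levels[0]": pd.MultiIndex.from_arrays([dti, dti]).levels[0],
    "Categorical.categories": pd.Categorical(dti).categories,
}

# Collect the freq of each derived index; getattr guards against index
# types that have no freq attribute at all.
report = {name: getattr(idx, "freq", None) for name, idx in derived.items()}

for name, freq in report.items():
    print(f"{name}: freq={freq!r} (original: {dti.freq!r})")
```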