
Say I create a pandas DataFrame with two columns, b (a DateTime) and c (an integer). Now I want to make a DatetimeIndex from the values in the first column (b):

    import pandas as pd
    import datetime as dt

    a=[1371215423523845, 1371215500149460, 1371215500273673, 1371215500296504, 1371215515568529, 1371215531603530, 1371215576463339, 1371215579939113, 1371215731215054, 1371215756231343, 1371215756417484, 1371215756519690, 1371215756551645, 1371215756578979, 1371215770164647, 1371215820891387, 1371215821305584, 1371215824925723, 1371215878061146, 1371215878173401, 1371215890324572, 1371215898024253, 1371215926634930, 1371215933513122, 1371216018210826, 1371216080844727, 1371216080930036, 1371216098471787, 1371216111858392, 1371216326271516, 1371216326357836, 1371216445401635, 1371216445401635, 1371216481057049, 1371216496791894, 1371216514691786, 1371216540337354, 1371216592180666, 1371216592339578, 1371216605823474, 1371216610332627, 1371216623042903, 1371216624749566, 1371216630631179, 1371216654267672, 1371216714011662, 1371216783761738, 1371216783858402, 1371216783858402, 1371216783899118, 1371216976339169, 1371216976589850, 1371217028278777, 1371217028560770, 1371217170996479, 1371217176184425, 1371217176318245, 1371217190349372, 1371217190394753, 1371217272797618, 1371217340235667, 1371217340358197, 1371217340433146, 1371217340463797, 1371217340490876, 1371217363797722, 1371217363797722, 1371217363890678, 1371217363922929, 1371217523548405, 1371217523548405, 1371217551181926, 1371217551181926, 1371217551262975, 1371217652579855, 1371218091071955, 1371218295006690, 1371218370005139, 1371218370133637, 1371218370133637, 1371218370158096, 1371218370262823, 1371218414896836, 1371218415013417, 1371218415050485, 1371218415050485, 1371218504396524, 1371218504396524, 1371218504481537, 1371218504517462, 1371218586980079, 1371218719953887, 1371218720621245, 1371218738776732, 1371218937926310, 1371218954785466, 1371218985347070, 1371218985421615, 1371219039790991, 1371219171650043]
    b=[dt.datetime.fromtimestamp(t/1000000.) for t in a]
    c = {'b':b, 'c':a[:]}
    df = pd.DataFrame(c)
    df.set_index(pd.DatetimeIndex(df['b']))
    print df

Everything seems to work fine, except that when I print the DataFrame, it says that it has an Int64Index.

    <class 'pandas.core.frame.DataFrame'>
    Int64Index: 100 entries, 0 to 99
    Data columns (total 2 columns):
    b    100 non-null values
    c    100 non-null values
    dtypes: datetime64[ns](1), int64(1)

Am I doing something wrong, or do I not understand the concept of indexes properly?

  • You are setting the index on a temporary copy of the dataframe and then discarding the copy immediately. Commented Oct 9, 2019 at 20:01

1 Answer

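set_index does not modify the DataFrame in place (unless you pass inplace=True); it returns a new DataFrame, so you need to assign the result back. Otherwise, everything you did is correct.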

    In [7]: df = df.set_index(pd.DatetimeIndex(df['b']))

    In [8]: df
    Out[8]:
    <class 'pandas.core.frame.DataFrame'>
    DatetimeIndex: 100 entries, 2013-06-14 09:10:23.523845 to 2013-06-14 10:12:51.650043
    Data columns (total 2 columns):
    b    100 non-null values
    c    100 non-null values
    dtypes: datetime64[ns](1), int64(1)
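For readers on a current pandas and Python 3, the same behaviour can be sketched as follows. This is only an illustration: it uses the first three timestamps from the question, and the variable name `indexed` is made up for the example. The point is that calling set_index without keeping its return value leaves the original frame unchanged.

```python
import datetime as dt

import pandas as pd

# First three timestamps from the question (microseconds since the epoch).
a = [1371215423523845, 1371215500149460, 1371215500273673]
b = [dt.datetime.fromtimestamp(t / 1000000.0) for t in a]
df = pd.DataFrame({"b": b, "c": a})

# The return value is discarded here, so df keeps its default integer index.
df.set_index(pd.DatetimeIndex(df["b"]))
print(type(df.index).__name__)

# Assigning the result keeps the new index ("indexed" is an illustrative name).
indexed = df.set_index(pd.DatetimeIndex(df["b"]))
print(type(indexed.index).__name__)
```

The first print should report the default integer index type of your pandas version, and the second should report DatetimeIndex.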

Also, as an FYI: in the forthcoming 0.12 release (next week), you can pass unit='us' to pd.to_datetime to specify that the values are microseconds since the epoch.

    In [13]: pd.to_datetime(a,unit='us')
    Out[13]:
    <class 'pandas.tseries.index.DatetimeIndex'>
    [2013-06-14 13:10:23.523845, ..., 2013-06-14 14:12:51.650043]
    Length: 100, Freq: None, Timezone: None
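As a rough sketch of how this could replace the list comprehension in the question, the index can be built directly from the integers and passed to the DataFrame constructor. Only the first two timestamps from the question are used, and `idx` is an illustrative name. Note that to_datetime with a unit interprets the integers as UTC and returns timezone-naive values, whereas dt.datetime.fromtimestamp converts to the machine's local time. That would account for the 09:10 versus 13:10 difference between the two outputs above.

```python
import pandas as pd

# First two timestamps from the question (microseconds since the epoch).
a = [1371215423523845, 1371215500149460]

# unit="us" tells pandas the integers are microseconds since 1970-01-01 UTC.
idx = pd.to_datetime(a, unit="us")

# Build the frame with the DatetimeIndex from the start; no set_index needed.
df = pd.DataFrame({"c": a}, index=idx)
print(df.index)
```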