I've looked through every thread I can find, and the only one relevant to this kind of formatting issue is the one below, but it's for Java:

How parse 2013-03-13T20:59:31+0000 date string to Date

I've got a column with values like 201604 and 201605 that I need to turn into date values like 2016-04-01 and 2016-05-01. To accomplish this, I've done what is below.

    # Create number to build full date
    df['DAY_NBR'] = '01'

    # Convert max and min date to string to do date transformation
    df['MAXDT'] = df['MAXDT'].astype(str)
    df['MINDT'] = df['MINDT'].astype(str)

    # Add the day number to the max date month and year
    df['MAXDT'] = df['MAXDT'] + df['DAY_NBR']

    # Add the day number to the min date month and year
    df['MINDT'] = df['MINDT'] + df['DAY_NBR']

    # Convert max and min date to integer values
    df['MAXDT'] = df['MAXDT'].astype(int)
    df['MINDT'] = df['MINDT'].astype(int)

    # Convert max date to datetime
    df['MAXDT'] = pd.to_datetime(df['MAXDT'], format='%Y%m%d')

    # Convert min date to datetime
    df['MINDT'] = pd.to_datetime(df['MINDT'], format='%Y%m%d')

To be honest, I can work with this output, but it's a little messy because the unique values for the two columns are...

MAXDT Values

    ['2016-07-01T00:00:00.000000000' '2017-09-01T00:00:00.000000000' '2018-06-01T00:00:00.000000000'
     '2017-07-01T00:00:00.000000000' '2017-03-01T00:00:00.000000000' '2018-12-01T00:00:00.000000000'
     '2017-12-01T00:00:00.000000000' '2019-01-01T00:00:00.000000000' '2018-09-01T00:00:00.000000000'
     '2018-10-01T00:00:00.000000000' '2016-04-01T00:00:00.000000000' '2018-03-01T00:00:00.000000000'
     '2017-05-01T00:00:00.000000000' '2018-08-01T00:00:00.000000000' '2017-02-01T00:00:00.000000000'
     '2016-12-01T00:00:00.000000000' '2018-01-01T00:00:00.000000000' '2018-02-01T00:00:00.000000000'
     '2017-06-01T00:00:00.000000000' '2018-11-01T00:00:00.000000000' '2018-05-01T00:00:00.000000000'
     '2019-11-01T00:00:00.000000000' '2016-06-01T00:00:00.000000000' '2017-10-01T00:00:00.000000000'
     '2016-08-01T00:00:00.000000000' '2018-04-01T00:00:00.000000000' '2016-03-01T00:00:00.000000000'
     '2016-10-01T00:00:00.000000000' '2016-11-01T00:00:00.000000000' '2019-12-01T00:00:00.000000000'
     '2016-09-01T00:00:00.000000000' '2017-08-01T00:00:00.000000000' '2016-05-01T00:00:00.000000000'
     '2017-01-01T00:00:00.000000000' '2017-11-01T00:00:00.000000000' '2018-07-01T00:00:00.000000000'
     '2017-04-01T00:00:00.000000000' '2016-01-01T00:00:00.000000000' '2016-02-01T00:00:00.000000000'
     '2019-02-01T00:00:00.000000000' '2019-07-01T00:00:00.000000000' '2019-10-01T00:00:00.000000000'
     '2019-09-01T00:00:00.000000000' '2019-03-01T00:00:00.000000000' '2019-05-01T00:00:00.000000000'
     '2019-04-01T00:00:00.000000000' '2019-08-01T00:00:00.000000000' '2019-06-01T00:00:00.000000000'
     '2020-02-01T00:00:00.000000000' '2020-01-01T00:00:00.000000000']

MINDT Values

    ['2016-04-01T00:00:00.000000000' '2017-07-01T00:00:00.000000000' '2016-02-01T00:00:00.000000000'
     '2017-01-01T00:00:00.000000000' '2017-02-01T00:00:00.000000000' '2018-12-01T00:00:00.000000000'
     '2017-08-01T00:00:00.000000000' '2018-04-01T00:00:00.000000000' '2017-10-01T00:00:00.000000000'
     '2019-01-01T00:00:00.000000000' '2018-05-01T00:00:00.000000000' '2018-09-01T00:00:00.000000000'
     '2018-10-01T00:00:00.000000000' '2016-01-01T00:00:00.000000000' '2016-03-01T00:00:00.000000000'
     '2017-11-01T00:00:00.000000000' '2017-05-01T00:00:00.000000000' '2018-07-01T00:00:00.000000000'
     '2018-06-01T00:00:00.000000000' '2017-12-01T00:00:00.000000000' '2016-10-01T00:00:00.000000000'
     '2018-02-01T00:00:00.000000000' '2017-06-01T00:00:00.000000000' '2018-08-01T00:00:00.000000000'
     '2018-03-01T00:00:00.000000000' '2018-11-01T00:00:00.000000000' '2016-08-01T00:00:00.000000000'
     '2016-06-01T00:00:00.000000000' '2018-01-01T00:00:00.000000000' '2016-07-01T00:00:00.000000000'
     '2016-11-01T00:00:00.000000000' '2016-09-01T00:00:00.000000000' '2017-04-01T00:00:00.000000000'
     '2016-05-01T00:00:00.000000000' '2017-09-01T00:00:00.000000000' '2016-12-01T00:00:00.000000000'
     '2017-03-01T00:00:00.000000000']

I'm trying to build a loop that runs through these dates, and it works, but I don't want to have an index with all of these irrelevant zeros and a T in it. How can I convert these empty timestamp values to just the date that is in yyyy-mm-dd format?

Thank you!

1 Answer

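Unfortunately, I believe pandas always stores datetime values as datetime64[ns] (pandas 2.0 and later also support second, millisecond, and microsecond resolution, but not day resolution), so the time component is always present. Even if you attempt to save the column as datetime64[D], it will not keep day precision.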

It's possible to store these values as strings instead, but the simplest solution is probably to strip the time component when you loop through them (e.g., by calling df['MAXDT'].to_numpy().astype('datetime64[D]') and looping through the resulting NumPy array), or to reformat them with datetime.
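A minimal sketch of those options, assuming a small made-up DataFrame with the same YYYYMM integers as in the question (the sample values and the variable names `max_days`, `max_strings`, and `min_dates` are only for illustration). It also parses the year and month directly with `format="%Y%m"`, which should make the `DAY_NBR` step unnecessary because the day defaults to the 1st:

```python
import pandas as pd

# Hypothetical sample data in the same YYYYMM integer form as the question.
df = pd.DataFrame({"MAXDT": [201604, 201605], "MINDT": [201601, 201602]})

# Parse year and month directly; the day defaults to the first of the month.
for col in ["MAXDT", "MINDT"]:
    df[col] = pd.to_datetime(df[col].astype(str), format="%Y%m")

# Option 1: a day-precision NumPy array to loop over.
max_days = df["MAXDT"].to_numpy().astype("datetime64[D]")

# Option 2: plain 'yyyy-mm-dd' strings.
max_strings = df["MAXDT"].dt.strftime("%Y-%m-%d").tolist()

# Option 3: Python datetime.date objects.
min_dates = df["MINDT"].dt.date.tolist()

for day in max_days:
    print(day)  # prints 2016-04-01, then 2016-05-01
```

Which option fits best depends on what the loop does: the NumPy array and the `date` objects still support date arithmetic, while the strings are only good for display or for use as labels.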
