10

I have a file where each row has this format:

YYYY-MM-DD-HH-MM-SS uint64 float64 float64 uint64 

I've read it with:

pd.read_csv('file.txt', sep=' ', header=None, index_col=0, names= ('C1', 'C2', 'C3', 'C4'), use_unsigned=True, parse_dates=True, infer_datetime_format=True) 

The datetimes constructed are not correct. Can I specify the exact format?

3 Answers 3

18

You can pass a function that parses the correct format to the date_parser kwarg of read_csv, but another option is to not parse the dates when reading, but afterwards with to_datetime (this functions allows to specify a format, and will be faster than a custom date_parser function):

df = pd.read_csv('file.txt', sep=' ', header=None, index_col=0, names= ('C1', 'C2', 'C3', 'C4'), use_unsigned=True) df.index = pd.to_datetime(df.index, format="%Y-%m-%d-%H-%M-%S") 
Sign up to request clarification or add additional context in comments.

Comments

8

I have found this method.

f = lambda s: datetime.datetime.strptime(s,'%Y-%m-%d-%H-%M-%S') pd.read_csv('file.txt', sep=' ', header=None, index_col=0, names= ('C1', 'C2', 'C3', 'C4'), use_unsigned=True, date_parser=f) 

that worked

Comments

5

New shorter solution

Since Pandas 2.0.0 there is a direct way to import dates with specific formats using parse_dates to specify the date-columns and date_format to specify the format.

Example csv

Based on your import I created the example file test.csv with the following content:

datetime float_col int_col 2023-09-14-15-00-00 13.2 7 2023-09-14-15-12-03 13.4 8 

Direct one-line-solution - Import statement:

df = pd.read_csv('test.csv', sep=' ', parse_dates = [0], date_format = '%Y-%m-%d-%H-%M-%S') 

Import result:

enter image description here

Resulting dtypes:

datetime datetime64[ns] float_col float64 int_col int64 

Explanation

parse_dates is a list of column positions with dates. Since the date column is at the 1st position. It is the position 0. For date_format the usual Python format definitions are used.

4 Comments

When using this method, i get read_csv() got an unexpected keyword argument 'date_format'
@thetangler Did you check your Pandas version? I wrote "since Pandas 2.0.0".
My version is Python 3. I’m assuming you meant that this method will work from Python 2 onwards?
@thetangler: No! It is about the "pandas" and not the "python"-version.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.