python read data from file

Question

I have a very simple question: what is the most efficient way to read the different entries of a text file with Python?

Suppose I have a text file like:

    42017 360940084.621356 21.00 09/06/2015 13:08:04
    42017 360941465.680841 29.00 09/06/2015 13:31:05
    42017 360948446.517761 16.00 09/06/2015 15:27:26
    42049 361133954.539315 31.00 11/06/2015 18:59:14
    42062 361208584.222483 10.00 12/06/2015 15:43:04
    42068 361256740.238150 19.00 13/06/2015 05:05:40

In C I would do:

    while (fscanf(file_name, "%d %lf %f %d/%d/%d %d:%d:%d",
                  &id, &t0, &score, &day, &month, &year,
                  &hour, &minute, &second) != EOF) {
        /* ...some instruction... */
    }

What would be the best way to do something like this in Python? I need to store every value in a separate variable, since I have to work with those variables throughout the code.

Thanks in advance!

possible duplicate of Python fastest way to read a large text file (several GB)
– user3636636, Commented Jul 21, 2015 at 9:53
Do you want a list of strings, or a list of values typed according to the column?
– FunkySayu, Commented Jul 21, 2015 at 9:53

FunkySayu · Accepted Answer · 2015-07-21 11:34:30Z


I think muddyfish's answer is good; here is another way (maybe a bit lighter):

    import time

    with open(file_name) as f:
        for line in f:
            identifier, t0, score, date, hour = line.split()
            # You can also get a struct_time from the date and time strings
            timer = time.strptime(date + hour, "%d/%m/%Y%H:%M:%S")
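To get closer to what the fscanf call in the question does, the split fields can be converted to numbers right away. This is only a minimal sketch, assuming the five-column layout shown in the question; the helper name parse_line and the inline sample (used in place of a real file) are illustrative and not part of the original answer.

```python
import io
import time

# Two sample rows copied from the question; io.StringIO stands in for the real file.
SAMPLE = io.StringIO(
    "42017 360940084.621356 21.00 09/06/2015 13:08:04\n"
    "42049 361133954.539315 31.00 11/06/2015 18:59:14\n"
)


def parse_line(line):
    """Split one record and convert each field to a Python type (hypothetical helper)."""
    identifier, t0, score, date, hour = line.split()
    timestamp = time.strptime(date + " " + hour, "%d/%m/%Y %H:%M:%S")
    return int(identifier), float(t0), float(score), timestamp


# Skip blank lines so a trailing empty line does not break the unpacking.
records = [parse_line(line) for line in SAMPLE if line.strip()]
for identifier, t0, score, timestamp in records:
    print(identifier, t0, score, timestamp.tm_mday, timestamp.tm_hour)
```

The struct_time returned by time.strptime exposes the individual parts as tm_mday, tm_mon, tm_year, tm_hour, tm_min and tm_sec.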

edited Jul 21, 2015 at 11:34

answered Jul 21, 2015 at 9:57


3 Comments

muddyfish Over a year ago

Note that id is the name of a built-in function (not a reserved word, but assigning to it shadows the built-in). If you want to use it as an identifier, use id_ = value instead.

urgeo Over a year ago

Thanks FunkySayu! I ended up with something similar. Since I need each single entry (day, month, year, etc.), I was wondering whether there is a faster way, or do I have to call split("/") and split(":") again on the date and time fields?

urgeo Over a year ago

The point is that I have to work with each single entry (for example, do calculations with t0 and with the different days and months), so I need to store the data in separate variables.
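For the follow-up about needing day, month, year and so on as separate values: one option is to split on whitespace, slashes and colons in a single pass with re.split, which yields the same nine fields as the fscanf format string in the question. This is a sketch under the assumption that every line has exactly that layout; the name parse_fields is made up for illustration.

```python
import re

# Split on any run of whitespace, "/" or ":" so the date and time parts come out separately.
SEPARATORS = re.compile(r"[\s/:]+")


def parse_fields(line):
    """Return the nine values of one record as numbers (illustrative helper)."""
    parts = SEPARATORS.split(line.strip())
    if len(parts) != 9:
        raise ValueError("unexpected record layout: %r" % line)
    identifier = int(parts[0])
    t0 = float(parts[1])
    score = float(parts[2])
    day, month, year, hour, minute, second = (int(p) for p in parts[3:])
    return identifier, t0, score, day, month, year, hour, minute, second


fields = parse_fields("42068 361256740.238150 19.00 13/06/2015 05:05:40\n")
identifier, t0, score, day, month, year, hour, minute, second = fields
print(day, month, year, hour, minute, second)
```

If a single date object is more convenient than six integers, datetime.datetime.strptime from the standard library returns an object with day, month, year, hour, minute and second attributes.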

muddyfish · Accepted Answer · 2015-07-21 09:55:04Z

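I would look up the str.split() method.

For example, you could use:

    with open(file_name) as f:
        for line in f:
            # split() with no argument also strips the trailing newline
            data = dict(zip(("id", "t0", "score", "date", "time"), line.split()))
            instructions()  # placeholder for your own processing of data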
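One detail worth checking when splitting lines read from a file: split(" ") and split() behave differently. The short demonstration below uses a line from the question with its trailing newline; the doubled space in the last example is added here for illustration and does not occur in the original data.

```python
line = "42017 360940084.621356 21.00 09/06/2015 13:08:04\n"

# split(" ") cuts on every single space and keeps the newline on the last field.
by_space = line.split(" ")

# split() with no argument cuts on runs of whitespace and drops the trailing newline.
by_whitespace = line.split()

# With two consecutive spaces, split(" ") also produces an empty string.
doubled = "42017  21.00\n".split(" ")

print(repr(by_space[-1]))
print(repr(by_whitespace[-1]))
print(doubled)
```

For whitespace-separated records like these, the no-argument form is usually the safer choice.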

Community · Accepted Answer · 2017-05-23 12:21:42Z

Depending on what you want to do with the data, pandas may be something to look into:

    import pandas as pd

    # Note: nested parse_dates lists and date_parser are deprecated in
    # recent pandas releases; this call targets older versions.
    with open(file_name) as infile:
        df = pd.read_fwf(
            infile,
            header=None,
            parse_dates=[[3, 4]],
            date_parser=lambda x: pd.to_datetime(x, format='%d/%m/%Y %H:%M:%S'),
        )

The double list [[3, 4]], together with the date_parser argument, reads columns 3 and 4 (0-indexed) as a single datetime column. You can then access individual parts of that column with

    >>> df['3_4'].dt.hour
    0    13
    1    13
    2    15
    3    18
    4    15
    5     5
    dtype: int64

(If you don't like the '3_4' key, use the parse_dates argument above as follows:

    parse_dates={'timestamp': [3, 4]}

)

read_fwf is for reading fixed width columns, which your data seems to adhere to. Alternatively, there are functions such as read_csv, read_table and a lot more.
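As an illustration of the read_csv route, here is a sketch that reads the columns as whitespace-separated text and builds the timestamp afterwards with pd.to_datetime. It is not the original answer's code: the column names are chosen here for readability, and the inline sample replaces the real file.

```python
import io

import pandas as pd

# Three rows from the question; io.StringIO stands in for the real file.
DATA = io.StringIO(
    "42017 360940084.621356 21.00 09/06/2015 13:08:04\n"
    "42049 361133954.539315 31.00 11/06/2015 18:59:14\n"
    "42068 361256740.238150 19.00 13/06/2015 05:05:40\n"
)

# Column names are assumptions made for this example; the file itself has no header.
df = pd.read_csv(
    DATA,
    sep=r"\s+",
    header=None,
    names=["id", "t0", "score", "date", "time"],
)

# Join the two text columns and parse them with an explicit day-first format.
df["timestamp"] = pd.to_datetime(
    df["date"] + " " + df["time"], format="%d/%m/%Y %H:%M:%S"
)

print(df[["id", "t0", "score", "timestamp"]])
print(df["timestamp"].dt.day.tolist(), df["timestamp"].dt.hour.tolist())
```

Parsing the date in a separate step keeps the format explicit, which matters here because the dates are day-first.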

(This answer is pretty much a duplicate of this SO answer, but since this question here is more general, I've put this here as another answer, not as a comment.)
