Which is the best alternative to readline in Python?

I want to read a file-like object line by line. This file-like object contains these methods:

  • read(): reads a byte string from the file-like object at the current offset.
  • seek(): seeks to an offset within the file-like object.
  • get_offset(): retrieves the current offset into the file-like object.
  • get_size(): retrieves the size of the file-like object.

I can't read the complete contents of a file larger than 2 GB into memory, so I can't do something like this:

    for line in file_object.read():
        dostuff(line)

I tested some methods to read line by line. They are fine with very small files, but with big files they are very slow, and in any case much slower than a regular file's readline(). This is an example of what I tried:

    text = ''
    while True:
        char = file_object.read(1)
        if char == '':
            return ''
        text += buffer + char
        if char == '\n':
            pos_newline = text.find('\n')
            current_offset += pos_newline + 1
            buffer = text[pos_newline + 1:]
            line = text[:pos_newline]
            return line

I also tried to read 10/50/100 characters at a time.

I can only use the standard library.

EDIT: the "file-like" is not iterable.

  • You don't need .readline(). Iterating directly over the file pointer is enough for line-by-line iteration. Commented Nov 29, 2018 at 16:56
  • @RockyLi Because it's not a file, it's a "file-like". Commented Nov 29, 2018 at 17:06
  • @Austin File pointer? What do you mean with that? Commented Nov 29, 2018 at 17:07
  • @S.Martinelli, file pointer here is file_object. Commented Nov 29, 2018 at 17:08
  • @Austin The file object is not iterable. Commented Nov 29, 2018 at 18:02

1 Answer

You can read in chunks with

    while True:
        chunk = f.read(CHUNK_SIZE)
        if not chunk:  # an empty result ('' or b'') means we've reached the end
            break
        # process chunk
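To get lines rather than raw chunks, the loop above can be turned into a generator that keeps the unfinished tail of each chunk until the next newline arrives. This is only a minimal sketch: `iter_lines` and `chunk_size` are names made up for illustration, and it assumes `read(size)` returns a byte string and returns an empty one at the end of the data. `io.BytesIO` stands in for the real file-like object.

```python
import io


def iter_lines(file_object, chunk_size=64 * 1024):
    """Yield lines as bytes (newline included) from an object that only offers read(size)."""
    pending = b""  # bytes read so far that do not yet end in a newline
    while True:
        chunk = file_object.read(chunk_size)
        if not chunk:  # empty result: end of data
            break
        pending += chunk
        pieces = pending.split(b"\n")
        pending = pieces.pop()  # last piece is an incomplete line (or empty)
        for piece in pieces:
            yield piece + b"\n"
    if pending:  # final line without a trailing newline
        yield pending


# Stand-in for the real object; any object with a compatible read(size) should do.
sample = io.BytesIO(b"first\nsecond\n\nlast")
for line in iter_lines(sample, chunk_size=4):
    print(line)
```

Only one chunk plus the current partial line is held in memory at a time, so the total size of the file should not matter. A larger `chunk_size` generally means fewer calls to `read()`, which is where reading one character at a time loses most of its time.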

But it's quite likely that any file-like object you have can be iterated over directly to get lines.

    for line in f:
        # I don't have readline, but I work anyway hee hee!
        pass
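If the object turns out not to be iterable, one standard-library option is to wrap it in a small `io.RawIOBase` subclass and hand that to `io.BufferedReader`, which then supplies `readline()` and line iteration with its own buffering. The sketch below is hedged: `RawAdapter` and `OnlyRead` are hypothetical names, `OnlyRead` is just a stand-in for the real object, and it assumes `read(size)` returns at most `size` bytes and an empty byte string at the end.

```python
import io


class RawAdapter(io.RawIOBase):
    """Hypothetical adapter exposing an object with read(size) as a raw binary stream."""

    def __init__(self, file_object):
        super().__init__()
        self._file_object = file_object

    def readable(self):
        return True

    def readinto(self, buffer):
        # Assumption: read(size) never returns more than `size` bytes.
        data = self._file_object.read(len(buffer))
        count = len(data)
        buffer[:count] = data
        return count  # 0 signals end of data


class OnlyRead:
    """Stand-in for a non-iterable file-like object that only offers read(size)."""

    def __init__(self, data):
        self._data = data
        self._offset = 0

    def read(self, size):
        chunk = self._data[self._offset:self._offset + size]
        self._offset += len(chunk)
        return chunk


reader = io.BufferedReader(RawAdapter(OnlyRead(b"first\nsecond\nthird")))
lines = list(reader)  # in real use, iterate instead of building a list
print(lines)
```

`io.BufferedReader` pulls data through `readinto()` in buffer-sized pieces, so a loop such as `for line in reader:` should not need to load the whole file at once.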

1 Comment

Thanks for the answer. The problem is that the file object is not iterable. This is the file object: github.com/log2timeline/dfvfs/blob/master/dfvfs/file_io/…
