How to stop iterating using itertools.islice when EOF is reached

Question

I would like to use itertools.islice(self._f, 0, 100, None) to read in a file piece by piece (in blocks of 100 lines) as follows:

    f = open('test.dat', 'r')
    while (some condition I look for):
        f = open(fileName, 'r')
        x = itertools.islice(f, 0, 100, None)
        doSomethingWithX(x)

My problem is, I do not know how long the file is and I am looking for a condition to stop the while loop when the end of the file is reached. But I cannot figure out how it is done.

EDIT: OK, I see the difficulty. Maybe I should rephrase the question for the case where the itertools.islice call is encapsulated in a class, like this:

    class reader:
        def __init__()
            self._f = open('test.dat', 'r')

        def getNext():
            return itertools.islice(self._f, 0, 100, None)

    R = reader()
    while (some condition I look for):
        x = R.getNext()
        doSomethingWithX(x)

Could you please state in your question whether you are trying to iterate over lines or bytes?
– Renae Lider, Commented Aug 16, 2015 at 20:56
I have edited my code to fetch 100 lines per iteration. Is that what you were looking for?
– Renae Lider, Commented Aug 16, 2015 at 21:31

Padraic Cunningham · Accepted Answer · 2015-08-16 21:05:13Z

If you don't mind getting each chunk as a list, you can use the two-argument form of iter with an empty list as the sentinel:

    import itertools

    with open(filename, 'r') as f:
        for x in iter(lambda: list(itertools.islice(f, 100)), []):
            doSomethingWithX(x)

I am not sure which file you are using, as you assign f = ... twice and also use self._f.

Using your edited code:

    import itertools

    class reader:
        def __init__(self):
            self._f = open('out.csv', 'r')

        def getNext(self):
            return itertools.islice(self._f, 100)

    R = reader()
    # iter() keeps calling the lambda until it returns the sentinel []
    for x in iter(lambda: list(R.getNext()), []):
        print(x)

Using a test file with the following contents, and your class code changed to use itertools.islice(self._f, 2):

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10

outputs:

    In [15]: R = reader()

    In [16]: import itertools

    In [17]: for x in iter(lambda: list(R.getNext()),[]):
       ....:     print(x)
       ....:
    ['1\r\n', '2\r\n']
    ['3\r\n', '4\r\n']
    ['5\r\n', '6\r\n']
    ['7\r\n', '8\r\n']
    ['9\r\n', '10']

@andi, the answer still applies: this takes 100 lines at a time until the iterator is exhausted, i.e. until you reach the end of the file. The only question is whether you want a list, or what exactly you want to do with x.
Maybe I am too stu*** to see it :( I will give it a shot, but it might take a while :D. Thanks already for your time.
The [] passed to iter is a sentinel value. We call list on the islice object, so at the end of the file we get an empty list, which equals the sentinel and ends the loop. Until then you get a list of up to 100 lines on each iteration.
I cannot get it to work in the class, sorry. Would you be able to include it in the class? Sorry for the trouble.
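
The sentinel pattern described in the comments above can also be folded into the class itself, as the last comment asks. This is only a minimal sketch: the name ChunkReader, the constructor arguments f and chunk_size, and the __iter__ method are hypothetical additions, not part of the original code, and io.StringIO stands in for a real file.

```python
import io
import itertools


class ChunkReader:
    """Hypothetical variant of the question's reader class."""

    def __init__(self, f, chunk_size=100):
        self._f = f                    # any iterator of lines, e.g. an open file
        self._chunk_size = chunk_size  # assumed parameter, 100 in the question

    def getNext(self):
        # Materialize the slice so that an exhausted file gives [].
        return list(itertools.islice(self._f, self._chunk_size))

    def __iter__(self):
        # iter(callable, sentinel) calls getNext() until it returns [].
        return iter(self.getNext, [])


# io.StringIO stands in for open('test.dat', 'r') here.
R = ChunkReader(io.StringIO("1\n2\n3\n4\n5\n"), chunk_size=2)
chunks = list(R)
for x in chunks:
    print(x)
```

With this shape the caller needs no explicit stop condition: a plain for loop over the reader ends when the file is exhausted.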

user4290866 · Accepted Answer · 2015-08-17 10:20:38Z

So what I was looking for was something like this:

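    import itertools
    import os

    import numpy as np


    class reader:
        def __init__(self, chunkSizes=100):
            self._chunkSizes = chunkSizes  # lines per chunk (assumed default of 100)
            # Binary mode: in Python 3, tell() raises OSError on a text-mode
            # file once iteration has started.
            self._f = open('test.dat', 'rb')
            self._f.seek(0, os.SEEK_END)  # find EOF
            self._EOF = self._f.tell()
            self._f.seek(0)  # go back to beginning

        def getNext(self):
            if self._f.tell() != self._EOF:
                x = np.genfromtxt(itertools.islice(self._f, 0, self._chunkSizes, None), dtype=np.float64)
                return x
            else:
                return []


    R = reader()
    x = R.getNext()
    while np.size(x) > 0:  # comparing an array with [] is not a reliable test
        doSomethingWithX(x)
        x = R.getNext()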
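
One caveat when comparing tell() against the end position: in CPython 3, a file opened in text mode refuses tell() after iteration has started and raises OSError, while a file opened in binary mode keeps reporting its position. The sketch below illustrates this on a temporary file; the file contents and the chunk size of 2 are made up for the demonstration.

```python
import itertools
import os
import tempfile

# Write a small test file with hypothetical contents (10 bytes).
with tempfile.NamedTemporaryFile('wb', suffix='.dat', delete=False) as tmp:
    tmp.write(b"1\n2\n3\n4\n5\n")
    path = tmp.name

# Text mode: iterating disables tell() until the file is exhausted.
with open(path, 'r') as f:
    list(itertools.islice(f, 2))
    try:
        f.tell()
        text_tell_ok = True
    except OSError:
        text_tell_ok = False

# Binary mode: tell() stays valid and can be compared with the file size.
size = os.path.getsize(path)
positions = []
with open(path, 'rb') as f:
    while f.tell() != size:
        list(itertools.islice(f, 2))  # consume one chunk of up to 2 lines
        positions.append(f.tell())

os.remove(path)
print(text_tell_ok, positions)  # False [4, 8, 10]
```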

Renae Lider · Accepted Answer · 2015-08-16 21:48:04Z

You can use the readline method to process the file in chunks of 100 lines:

    def read_chunks(f, chunks=100):
        while True:
            block = [f.readline() for i in range(chunks)]
            block = list(filter(None, block))  # drop the '' that readline returns at EOF
            if not block:
                break  # stop without yielding a trailing empty block
            yield block

    with open("filename") as f:
        for lines in read_chunks(f):
            print(len(lines), lines)
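
The same generator idea can be sketched with itertools.islice, which ties it back to the original question. The function name read_chunks_islice and the chunk_size parameter are hypothetical, and io.StringIO again stands in for an open file.

```python
import io
import itertools


def read_chunks_islice(lines, chunk_size=100):
    """Yield lists of up to chunk_size items until lines is exhausted."""
    it = iter(lines)
    while True:
        block = list(itertools.islice(it, chunk_size))
        if not block:  # empty list: the iterator is exhausted
            return
        yield block


sample = io.StringIO("a\nb\nc\nd\ne\n")
sizes = [len(block) for block in read_chunks_islice(sample, chunk_size=2)]
print(sizes)  # [2, 2, 1]
```

On Python 3.12 and later, itertools.batched(f, 100) offers similar chunking out of the box, yielding tuples instead of lists.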
