I am trying to seek to a token ':path,' in a file, then read the digits that follow it (there can be any number of them) as an integer. So for ':path,123' I seek to the ',' in the file and read the integer 123. I then read the chars between the current seek position and pos+123 (and store them in a list or whatever), seek to the next match for ':path,', and repeat the process.
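To make the record format concrete, here is a minimal sketch of the scan on data that is already in memory. It assumes the payload starts immediately after the last digit of the count (so a payload cannot itself begin with a digit) and that the data is handled as bytes. The names PATH_TOKEN and parse_records are made up for illustration.

```python
import re

# Assumed token layout: ':path,' followed by one or more ASCII digits.
PATH_TOKEN = re.compile(rb":path,(\d+)")

def parse_records(data):
    """Collect every payload announced by a ':path,N' token in data (bytes)."""
    values = []
    pos = 0
    while True:
        m = PATH_TOKEN.search(data, pos)
        if m is None:
            break                      # no further tokens
        length = int(m.group(1))       # the N in ':path,N'
        start = m.end()                # payload assumed to start right after the digits
        values.append(data[start:start + length])
        pos = start + length           # resume after the payload, skipping tokens inside it
    return values

sample = b"junk:path,5hello more:path,9:path,1xyzend\n:path,2ok"
print(parse_records(sample))  # [b'hello', b':path,1xy', b'ok']
```

The second payload contains ':path,1' itself, and it is not treated as a new token because the search resumes only after the 9 counted chars. The newline in the sample plays no special role.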
I would like a function a bit like:
    def fregseek(FILE, current_seek, regex):
        # ... seek forward from current_seek to the next match of regex ...
        # value_found = result of reading the next N chars after :path,[0-9]+
        return next_start_seek, value_found
There may be any number of matches for ':path,' in a line, and that string may also occur within the chars counted by the number after the ','. I have written a messy bunch of rubbish which reads in each line, then for each line chomps off the first N chars indicated by the match, and continues processing the string until it is all eaten up. Then it reads the next line and so on.
This is horrible, I do not want to have to slurp off all the lines from a potentially huge file when all I really need to do is seek (especially since a newline is irrelevant, so having an extra processing step just because lines are easy to pull from files is ridiculous).
So, there it is, that is my problem that I would like to solve. I need to seek to a match, read a value, continue from the end of that value looking for the next match and so on until the file is exhausted.
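One possible approach, sketched under assumptions rather than offered as a definitive answer: memory-map the file and let the standard re module search the map directly, since compiled bytes patterns accept any bytes-like buffer, including an mmap object. The function below follows the signature wished for above, except that it takes a buffer instead of a file handle and returns None when no match is left. The iter_values helper and the default TOKEN pattern are hypothetical, and the payload is again assumed to start right after the digits.

```python
import mmap
import re

# Assumed token layout: ':path,' followed by one or more ASCII digits.
TOKEN = re.compile(rb":path,(\d+)")

def fregseek(buf, current_seek, regex=TOKEN):
    """Find the next token at or after current_seek in a bytes-like buffer.

    Returns (next_start_seek, value_found), or None if there is no match.
    """
    m = regex.search(buf, current_seek)
    if m is None:
        return None
    start = m.end()                    # first char of the payload
    end = start + int(m.group(1))      # one past the last payload char
    return end, bytes(buf[start:end])  # slicing truncates if the file ends early

def iter_values(path):
    """Yield each payload in the file at path, without reading it line by line."""
    with open(path, "rb") as f:
        try:
            mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        except ValueError:             # an empty file cannot be mapped
            return
        with mm:
            pos = 0
            while True:
                result = fregseek(mm, pos)
                if result is None:
                    break
                pos, value = result
                yield value
```

With the map, the operating system pages the file in on demand, so only the payloads that are actually sliced out get copied into Python objects. Whether this beats a hand-written read loop for a particular file is something to measure, not assume.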
If anybody can help me with this I will be happy to hear from them :)
I would like to avoid non-standard libraries if possible. I would also like the shortest code, but that is the least of my concerns: speed and memory consumption are the important factors. That said, I don't want 50 extra lines of code just to bootstrap some library when all I need from it is one small function that I could rip out if only I knew what it was.
I would prefer Python code; however, if Perl beats Python in this regard I will use Perl instead. I am also open to clever sed/awk/bash scripts etc., as long as they are not horribly slower.
Thanks very much in advance.
str/bytes, but using an mmap object instead.