I'm currently trying to run some simple regexes on a very large .txt file (a couple of million lines of text). The simplest code that reproduces the problem:
    file = open("exampleFileName", "r")
    for line in file:
        pass

The error message:
    Traceback (most recent call last):
      File "example.py", line 34, in <module>
        example()
      File "example.py", line 16, in example
        for line in file:
      File "/usr/lib/python3.4/codecs.py", line 319, in decode
        (result, consumed) = self._buffer_decode(data, self.errors, final)
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in position 7332: invalid continuation byte

How can I fix this? Is UTF-8 the wrong encoding? And if it is, how do I find out which one is right?
Thanks and best regards!
Run file -bi [your_filename] in a shell. It prints the file's MIME type and the charset it detected. Then pass that charset as the encoding argument to open().
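As a minimal sketch of the second step, assume file reported charset=iso-8859-1 (the actual value depends on your file, and the detection is a guess rather than a guarantee). The helper count_lines and the small demo file are hypothetical names used only for illustration. The demo file contains the byte 0xed, which is valid ISO-8859-1 but not valid UTF-8 in this position.

```python
import os
import tempfile


def count_lines(path, encoding, errors="strict"):
    """Iterate over a text file line by line using an explicit encoding."""
    count = 0
    with open(path, "r", encoding=encoding, errors=errors) as handle:
        for _ in handle:
            count += 1
    return count


# Hypothetical demo file: two lines of ISO-8859-1 text ("café", "río").
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"caf\xe9\nr\xedo\n")
    demo_path = tmp.name

# Reading it as UTF-8 raises the same kind of error as in the question.
try:
    count_lines(demo_path, "utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# Reading it with the detected encoding works.
latin1_count = count_lines(demo_path, "iso-8859-1")

# Fallback if the real encoding is unknown: keep UTF-8 but substitute
# undecodable bytes with U+FFFD instead of raising.
replaced_count = count_lines(demo_path, "utf-8", errors="replace")

os.remove(demo_path)
```

The errors="replace" fallback loses the original characters, so it is only reasonable when the affected bytes do not matter for the regex you are running.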