I am trying to read lines from a jsonl file, but I am getting the following error.

Traceback (most recent call last):
  File "insertion_script.py", line 12, in <module>
    for line in f.iter():
  File "C:\Users\Administrator\Anaconda3\lib\site-packages\jsonlines\jsonlines.py", line 204, in iter
    skip_empty=skip_empty)
  File "C:\Users\Administrator\Anaconda3\lib\site-packages\jsonlines\jsonlines.py", line 143, in read
    lineno, line = next(self._line_iter)
  File "C:\Users\Administrator\Anaconda3\lib\codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa3 in position 886: invalid start byte

import jsonlines

BH_data = []
with jsonlines.open('2401659.jsonl', 'r') as f:
    for line in f.iter():
        BH_data.append(line)

1 Answer

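The error means your data is not actually UTF-8. Byte 0xA3 is the pound sterling sign (£) in Windows code page 1252 (and in Latin-1), whereas in UTF-8 it can only appear as a continuation byte, which is why the decoder reports an invalid start byte. Try opening the file with that encoding explicitly: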

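import codecs
import jsonlines
with codecs.open('2401659.jsonl', 'r', encoding='cp1252') as jfile:
    with jsonlines.Reader(jfile) as f:
        for line in f.iter():
            BH_data.append(line)  # BH_data is the list from the question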
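If you would rather not hard-code the encoding, here is a minimal sketch of the same idea that tries UTF-8 first and falls back to cp1252. It uses the built-in `open()` and the standard library's `json` module instead of `jsonlines`, so it runs without any third-party package. The helper name `read_jsonl`, the temporary file, and the sample records are all made up for illustration; they are not from the question.

```python
import json
import os
import tempfile


def read_jsonl(path, encodings=("utf-8", "cp1252")):
    """Parse a JSON Lines file, trying each candidate encoding in order."""
    last_error = None
    for encoding in encodings:
        try:
            with open(path, "r", encoding=encoding) as handle:
                # Skip blank lines; parse every other line as one JSON value.
                return [json.loads(line) for line in handle if line.strip()]
        except UnicodeDecodeError as exc:
            last_error = exc  # wrong guess, try the next encoding
    raise last_error


# 0xA3 cannot start a UTF-8 sequence, but it is the pound sign in cp1252.
assert b"\xa3".decode("cp1252") == "\u00a3"

# Build a small cp1252-encoded sample file (made-up data for illustration).
sample = '{"price": "\u00a3 10"}\n{"price": "\u00a3 25"}\n'
with tempfile.NamedTemporaryFile("wb", suffix=".jsonl", delete=False) as tmp:
    tmp.write(sample.encode("cp1252"))
    sample_path = tmp.name

BH_data = read_jsonl(sample_path)
os.remove(sample_path)
print(BH_data)
```

One caveat: cp1252 accepts almost every byte value, so a successful fallback does not prove the guess was right. If the decoded text looks garbled, the file is probably in some other encoding.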
2 Comments

I tried that, but it gives an "unexpected argument" error.
There's always a way.
