Is there any way to fix UnicodeDecodeError: 'utf-8' - python

Question

I'm having issues with my email validator tool; it suddenly won't decode.

I have this error:

      File "C:\Users\vk662\OneDrive - ST\Skrivebord\test\email_check.py", line 70, in <module>
        for row in csv_reader:
      File "C:\Program Files\Python310\lib\codecs.py", line 322, in decode
        (result, consumed) = self._buffer_decode(data, self.errors, final)
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 4: invalid continuation byte

Here is the code around line 70:

    email_list = []
    with open('email_in/test.csv', 'r', encoding='utf-8') as read_obj:
        csv_reader = csv.reader(read_obj, delimiter=';')
        for row in csv_reader:
            if (row):
                result = email_check(row[0],email_list)
                if result["Email ok"]:
                    email_list.append(row[0])
                if result["Email ok"]:
                    email_ok.append(row[0])
                else:
                    str = "~~"
                    for x, y in result.items():
                        if y:
                            str += x + "~~"
                        if x == "Duplicate email" and y:
                            if row[0] in email_ok:
                                email_ok.remove(row[0])
                    email_error.append(row[0] + str)

check image below: https://i.sstatic.net/LE2fS.jpg

Find out which encoding is used. If most of the file is encoded in ASCII, you can instead add the argument errors='replace' to the open call. This will replace undecodable bytes with the Unicode replacement character (U+FFFD, often displayed as a question mark).
– Michael Butscher, Commented Oct 13, 2022 at 18:28
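
A minimal sketch of what errors='replace' does, using made-up sample bytes in memory instead of the real file (the address and the in-memory stream are assumptions for illustration only). Note that the original character is lost, so this only makes sense when the affected fields do not matter:

```python
import csv
import io

# Hypothetical sample row encoded as Latin-1, so 'å' (U+00E5) is the single byte 0xE5.
raw = "p\u00e5ske@example.com;ok\n".encode("latin-1")

# Decoding as UTF-8 with errors="replace" does not raise; the undecodable
# byte is replaced by U+FFFD instead.
stream = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8", errors="replace", newline="")
rows = list(csv.reader(stream, delimiter=";"))

print(rows)
```

With a real file, the equivalent would be passing errors='replace' alongside encoding='utf-8' in the existing open call.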

JHyun Ahn · Accepted Answer · 2022-10-17 17:07:48Z

As Michael Butscher said, I also suggest trying a different encoding. Try 'latin-1' or 'cp1252', which are common encodings for Western European languages.
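
One way to try several encodings in turn is sketched below. The helper name read_rows, the candidate list, and the sample bytes are all assumptions for illustration, not part of the original script. 'latin-1' goes last because it maps every byte to a character and therefore never raises, even when it is the wrong guess:

```python
import csv
import io


def read_rows(raw: bytes, encodings=("utf-8", "cp1252", "latin-1")):
    """Return (encoding, rows) for the first candidate that decodes raw without error."""
    for enc in encodings:
        try:
            text = raw.decode(enc)
        except UnicodeDecodeError:
            continue  # wrong guess, try the next candidate
        reader = csv.reader(io.StringIO(text, newline=""), delimiter=";")
        return enc, list(reader)
    raise ValueError("none of the candidate encodings could decode the data")


# Hypothetical sample: a row saved as cp1252, which is not valid UTF-8.
sample = "p\u00e5l@example.com;x\r\n".encode("cp1252")
used_encoding, rows = read_rows(sample)
print(used_encoding, rows)
```

For a file on disk, the bytes could come from open(path, 'rb').read(). A successful decode only means the bytes are valid in that encoding, so it is still worth checking that names with accented letters look right.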

Another solution is to save your .csv file with UTF-8 encoding. Then the part that opens the CSV will probably work. To do this, search for 'how to save excel csv file as utf-8'. (My Excel is not in English, so I cannot tell you the exact steps. English guides will help you.)

If you made the .csv file with MS Excel, that might be the reason. Excel often saves files in a non-UTF-8 encoding.
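
If re-saving from Excel is inconvenient, the file can also be converted in Python. This is only a sketch: the function name convert_to_utf8 and the file names are hypothetical, and it assumes the source file really is cp1252, which you would need to confirm first. It works on bytes so the line endings stay exactly as they were:

```python
import tempfile
from pathlib import Path


def convert_to_utf8(src: Path, dst: Path, source_encoding: str = "cp1252") -> None:
    """Rewrite src as UTF-8, assuming it is currently encoded in source_encoding."""
    text = src.read_bytes().decode(source_encoding)
    dst.write_bytes(text.encode("utf-8"))


# Demonstration with a temporary file standing in for the real CSV.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "test.csv"
    dst = Path(tmp) / "test_utf8.csv"
    src.write_bytes("p\u00e5l@example.com;x\r\n".encode("cp1252"))
    convert_to_utf8(src, dst)
    converted = dst.read_bytes()

print(converted)
```

After the conversion, the original open call with encoding='utf-8' should be able to read the new file.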

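Here is why you got the error. In a multibyte UTF-8 sequence, only values between 0x80 and 0xBF can be used as non-first bytes. 0xE5 is the first byte of a three-byte sequence, so it must be followed by two bytes in that range. In your file the byte after it is out of range, which is why Python reported 'invalid continuation byte'. This is typical of a file saved in a single-byte encoding such as Latin-1 or cp1252, where 0xE5 on its own is the letter 'å'.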
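
The error can be reproduced in isolation with a few made-up bytes (the sample data here is an assumption chosen to put 0xE5 at position 4, as in the traceback):

```python
# Hypothetical sample: four ASCII bytes, then 0xE5, then more ASCII.
data = b"test\xe5abc"

try:
    data.decode("utf-8")
except UnicodeDecodeError as exc:
    # Keep the details, since exc is not available after the except block.
    reason = exc.reason
    position = exc.start
    bad_byte = data[exc.start:exc.end]

# The same bytes decode without error as Latin-1, where 0xE5 is 'å'.
as_latin1 = data.decode("latin-1")

print(reason, position, bad_byte, as_latin1)
```

Inspecting exc.start on the real file in the same way shows exactly which byte the decoder stopped on, which helps when guessing the true encoding.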

I already tried Latin-1 and also changed something else that wasn't being recognized. Now it's working.
