I'm having an issue with my email validator tool: it suddenly won't decode the input file.

I have this error:

      File "C:\Users\vk662\OneDrive - ST\Skrivebord\test\email_check.py", line 70, in <module>
        for row in csv_reader:
      File "C:\Program Files\Python310\lib\codecs.py", line 322, in decode
        (result, consumed) = self._buffer_decode(data, self.errors, final)
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 4: invalid continuation byte

Here is the code around line 70:

    email_list = []
    with open('email_in/test.csv', 'r', encoding='utf-8') as read_obj:
        csv_reader = csv.reader(read_obj, delimiter=';')
        for row in csv_reader:
            if (row):
                result = email_check(row[0],email_list)
                if result["Email ok"]:
                    email_list.append(row[0])
                if result["Email ok"]:
                    email_ok.append(row[0])
                else:
                    str = "~~"
                    for x, y in result.items():
                        if y:
                            str += x + "~~"
                        if x == "Duplicate email" and y:
                            if row[0] in email_ok:
                                email_ok.remove(row[0])
                    email_error.append(row[0] + str)

See the image below: https://i.sstatic.net/LE2fS.jpg

  • It seems that the CSV file uses an encoding other than "utf-8". Commented Oct 13, 2022 at 11:40
  • @MichaelButscher How can I fix that? Commented Oct 13, 2022 at 12:25
  • Find out which encoding is used. If most of the file is plain ASCII, you can instead add the argument errors='replace' to the open call. This replaces every byte that cannot be decoded with the Unicode replacement character U+FFFD (usually displayed as a question mark in a diamond). Commented Oct 13, 2022 at 18:28

1 Answer

As Michael Butscher said, I also suggest trying a different encoding. Try 'latin-1' or 'cp1252', which are common encodings in Western countries.
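If you are not sure which encoding the file uses, one option is to try a few candidates in order. The following is only a sketch: the helper name `read_rows` and the candidate list are my own assumptions, not part of your script. Note that 'latin-1' maps every possible byte to a character, so it never raises and should come last; a successful read does not prove the encoding is the right one, so check the output for odd characters.

```python
import csv


def read_rows(path, encodings=("utf-8", "cp1252", "latin-1"), delimiter=";"):
    """Return (rows, encoding) for the first candidate that decodes the whole file.

    Hypothetical helper: the name and the default candidate list are assumptions.
    """
    for encoding in encodings:
        try:
            # newline="" is what the csv module documentation recommends for file objects.
            with open(path, "r", encoding=encoding, newline="") as read_obj:
                # list() forces the whole file to be read, so a decode error
                # is raised here, inside the try block.
                rows = list(csv.reader(read_obj, delimiter=delimiter))
            return rows, encoding
        except UnicodeDecodeError:
            continue  # wrong guess, try the next candidate
    raise ValueError(f"none of {encodings} could decode {path!r}")
```

In your script this could be called as `rows, used = read_rows('email_in/test.csv')`, after which you loop over `rows` instead of `csv_reader`.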

Another solution is to save your .csv file with UTF-8 encoding. After that, the part of your code that opens the CSV will probably work as written. To do this, search for 'how to save excel csv file as utf-8'. (My Excel is not in English, so I cannot tell you the exact steps. An English guide will help you.)
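The re-saving can also be done in Python instead of Excel. This is a minimal sketch that assumes the original file is encoded in cp1252; that is a guess, so change `src_encoding` if your file uses something else. The function name `convert_to_utf8` is hypothetical.

```python
def convert_to_utf8(src_path, dst_path, src_encoding="cp1252"):
    """Write a UTF-8 copy of a text file.

    src_encoding is an assumption about the input file, not something detected.
    """
    # newline="" on both sides keeps the original line endings untouched.
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)
```

For example, `convert_to_utf8('email_in/test.csv', 'email_in/test_utf8.csv')` would write a UTF-8 copy (the second file name is made up), which your existing `open(..., encoding='utf-8')` call could then read.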

If you created the .csv file with MS Excel, that might be the reason. Excel often saves CSV files in an encoding other than UTF-8.

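Here is why you got the error. In a multibyte UTF-8 sequence, only bytes between 0x80 and 0xBF can be used as non-first (continuation) bytes. 0xE5 is outside that range: in UTF-8 it is a lead byte that announces a three-byte sequence, so the decoder expects two continuation bytes after it. In a 'latin-1' or 'cp1252' file, 0xE5 is the single letter 'å', and the byte that follows it is usually an ordinary character, not a continuation byte. This is why Python reported 'invalid continuation byte'.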
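You can reproduce the error with a few bytes. The byte string below is made up for illustration (it is not taken from your file); it simply puts 0xE5 at position 4, followed by an ordinary ASCII character. The sketch also shows what 'latin-1' and errors='replace' give for the same bytes.

```python
data = b"test\xe5 ok"  # hypothetical sample: 0xE5 at position 4, then a space


def is_continuation_byte(byte):
    """True if the byte is in 0x80..0xBF, the only values allowed after a lead byte."""
    return 0x80 <= byte <= 0xBF


failing_position = None
failing_byte = None
reason = None
try:
    data.decode("utf-8")
except UnicodeDecodeError as exc:
    failing_position = exc.start   # index of the byte the decoder complains about
    failing_byte = data[exc.start]
    reason = exc.reason

# The byte after 0xE5 is a space (0x20), which is not a continuation byte.
next_is_continuation = is_continuation_byte(data[5])

as_latin1 = data.decode("latin-1")                    # 0xE5 becomes the letter å
as_replaced = data.decode("utf-8", errors="replace")  # 0xE5 becomes U+FFFD

print(failing_position, hex(failing_byte), reason)
print(as_latin1)
print(as_replaced)
```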

2 Comments

I had already tried Latin-1, and I also changed something else that wasn't being recognized. Now it's working.
Great to hear that.
