UnicodeDecodeError (UTF-8) for JSON

Question

BLUF: Why is the decode() method on a bytes object failing to decode ç?

I am receiving a UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position..... When I tracked down the offending character, it turned out to be ç. The error is raised when I read the response from the server:

    import http.client

    conn = http.client.HTTPConnection(host='something.com')
    conn.request('GET', url='/some/json')
    resp = conn.getresponse()
    content = resp.read().decode()  # throws error

I am unable to get the content. If I just do content = resp.read(), it succeeds, and I can write the bytes to a file opened with wb, but wherever the ç occurs it appears as 0xE7 in the file. Even if I open the file in Notepad++ and set the encoding to UTF-8, the character only shows as its hex value.

Why am I not able to decode this UTF-8 character from an HTTPResponse? Am I not correctly writing it to file either?

@kichik No need. requests is just a high-level API for making the same type of requests. It relies on http.client to make the socket connections anyhow. The example I have shown is somewhat false, as I am really making HTTPS connections and requests does not support SSL.
– user8371266, Commented Nov 6, 2017 at 18:11
@kichik Further, the real question is why does decode() not work on a valid UTF-8 character?
– user8371266, Commented Nov 6, 2017 at 18:12
The server doesn't seem to send you actual UTF-8. I was hoping requests would do better at detecting that. The actual UTF-8 representation for ç is b'\xc3\xa7'. The server is sending you CP1252.
– kichik, Commented Nov 6, 2017 at 18:18

Brian M. Sheldon · Accepted Answer · 2017-11-06 18:25:57Z

When you have issues with encoding/decoding, you should take a look at the UTF-8 Encoding Debugging Chart.

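If you look in the chart for the Windows 1252 code point 0xE7, you find that the expected character is ç, which shows that the encoding is CP1252. A single 0xE7 byte followed by an ASCII character is not valid UTF-8, which is why decode() with its default of UTF-8 fails.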
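
To make the difference concrete, here is a minimal sketch of decoding such a body. The sample payload and the helper name decode_body are illustrative assumptions, not part of the original post; the fallback order (declared charset, then UTF-8, then CP1252) is one reasonable choice rather than the only one.

```python
# Illustrative payload: the same JSON text encoded two different ways.
text = '{"name": "façade"}'
utf8_bytes = text.encode("utf-8")      # ç is stored as b'\xc3\xa7'
cp1252_bytes = text.encode("cp1252")   # ç is stored as b'\xe7'


def decode_body(raw, charset=None):
    """Hypothetical helper: try the declared charset, then UTF-8, then CP1252."""
    candidates = [c for c in (charset, "utf-8", "cp1252") if c]
    for candidate in candidates:
        try:
            return raw.decode(candidate)
        except (UnicodeDecodeError, LookupError):
            # Wrong or unknown encoding: move on to the next candidate.
            continue
    # CP1252 leaves a few byte values undefined, so substitute those
    # instead of raising if every candidate failed.
    return raw.decode("cp1252", errors="replace")


# Decoding the CP1252 bytes as UTF-8 reproduces the error from the question.
try:
    cp1252_bytes.decode("utf-8")
except UnicodeDecodeError as exc:
    print("UTF-8 decode failed:", exc)

print(decode_body(cp1252_bytes))  # falls back to CP1252
print(decode_body(utf8_bytes))    # plain UTF-8 succeeds on the first try
```

With http.client, the declared charset (if the server sends one) can be read with resp.headers.get_content_charset() and passed as the second argument. To end up with a file that Notepad++ reads as UTF-8, write the decoded string with open(path, 'w', encoding='utf-8') rather than writing the raw bytes with wb.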
