BLUF: Why is the decode() method on a bytes object failing to decode ç?
I am receiving a UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position..... I tracked the offending byte down to the ç character. The error occurs when I read the response from the server:
    import http.client

    conn = http.client.HTTPConnection(host='something.com')
    conn.request('GET', url='/some/json')
    resp = conn.getresponse()
    content = resp.read().decode()  # throws error

I am unable to get the content. If I just do content = resp.read(), it succeeds and I can write the bytes to a file opened in wb mode, but wherever the ç is, it is replaced with 0xE7 in the file. Even if I open the file in Notepad++ and set the encoding to UTF-8, the character only shows as the hex version.
Why am I not able to decode this UTF-8 character from an HTTPResponse? Am I not correctly writing it to file either?
requests is just a high-level API for making the same type of requests. It relies on http.client to make the socket connections anyhow. The example I have shown is somewhat false, as I am really making HTTPS connections and requests does not support SSL.

[...] decode() not work on a valid UTF-8 character?

[...] ç is b'\xc3\xa7'. The server is sending you CP1252.

[...] resp.getheaders() return?
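
To illustrate the point made in the comments, here is a minimal sketch showing that ç is two bytes in UTF-8 but the single byte 0xE7 in CP1252, and one way to pick the codec from a Content-Type value instead of relying on the UTF-8 default of decode(). The helper name decode_body, its fallback parameter, and the sample header values are assumptions made for illustration. The server's real headers are not shown in the question.

```python
from email.message import Message

# The same character under the two encodings discussed above.
utf8_bytes = "ç".encode("utf-8")     # two bytes
cp1252_bytes = "ç".encode("cp1252")  # one byte, 0xE7

# Reproduce the failure: a lone 0xE7 byte is not valid UTF-8.
try:
    cp1252_bytes.decode()  # decode() defaults to UTF-8
    error_text = ""
except UnicodeDecodeError as exc:
    error_text = str(exc)


def decode_body(body: bytes, content_type: str, fallback: str = "cp1252") -> str:
    """Decode body with the charset named in a Content-Type value.

    Hypothetical helper: falls back to `fallback` when the header
    carries no charset parameter. The cp1252 default is an assumption
    based on the comment above, not something the server confirmed.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset() or fallback
    return body.decode(charset)


# Sample header values below are made up for the example.
text = decode_body(b"gar\xe7on", "application/json; charset=windows-1252")
print(utf8_bytes, cp1252_bytes, text)
```

With a live http.client response, resp.headers is itself an email.message.Message subclass, so resp.headers.get_content_charset() should give the declared charset directly, if the server sends one. Once the body is a str, writing it with open(path, 'w', encoding='utf-8') stores ç as proper UTF-8 rather than the raw 0xE7 byte.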