
Why is the following code still using "ascii" to decode the string? Didn't I tell Python to use "utf-8"? And why didn't 'ignore' suppress the error?

print data.encode('utf-8', 'ignore') 

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 12355:

2 Comments
  • How you explicitly tell Python to handle a string does not affect what it does implicitly in order to print it. Commented Mar 5, 2015 at 20:26
  • @IgnacioVazquez-Abrams I don't think it's the print ... see my answer (I think it's right... I dunno, string encodings sometimes hang me up too) Commented Mar 5, 2015 at 20:41

1 Answer


I assume data is a str:

print isinstance(data, str)

should print True.

encode expects a unicode object, so Python first tries to decode your str to unicode using the default ascii codec. That implicit decode is what fails, which is why you get a UnicodeDecodeError rather than a UnicodeEncodeError. The 'ignore' argument never comes into play because it only applies to the encode step, not to the implicit decode.

try

print data.decode("utf-8","ignore") 
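The same distinction is easier to see in Python 3, where bytes and text are separate types. A sketch (the sample bytes are an assumed stand-in for the asker's data, chosen to include the 0xc2 byte from the error message):

```python
# Python 3 makes the bytes/text split explicit.
data = b"hello\xc2\xa0world"  # UTF-8 bytes; 0xc2 starts a non-breaking space

# Decoding bytes -> str with an explicit codec works:
text = data.decode("utf-8", "ignore")
print(text)

# In Python 2, data.encode('utf-8') implicitly ran data.decode('ascii')
# first, which is what raised the UnicodeDecodeError. Python 3 refuses
# outright: bytes objects have no encode method at all.
print(hasattr(data, "encode"))  # False
```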

1 Comment

And hence why Python 3 won't automatically convert byte strings to Unicode strings - you'll get a completely different error that is much easier to interpret.
