Possible Duplicate:
Python UnicodeDecodeError - Am I misunderstanding encode?
I have a string that I'm trying to make safe for the unicode() function:
>>> s = " foo “bar bar ” weasel"
>>> s.encode('utf-8', 'ignore')
Traceback (most recent call last):
  File "<pyshell#8>", line 1, in <module>
    s.encode('utf-8', 'ignore')
UnicodeDecodeError: 'ascii' codec can't decode byte 0x93 in position 5: ordinal not in range(128)
>>> unicode(s)
Traceback (most recent call last):
  File "<pyshell#9>", line 1, in <module>
    unicode(s)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x93 in position 5: ordinal not in range(128)

I'm mostly flailing around here. What do I need to do to remove the unsafe characters from the string?
Somewhat related to this question, although I was unable to solve my problem from it.
This also fails:
>>> s
' foo \x93bar bar \x94 weasel'
>>> s.decode('utf-8')
Traceback (most recent call last):
  File "<pyshell#13>", line 1, in <module>
    s.decode('utf-8')
  File "C:\Python25\254\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x93 in position 5: unexpected code byte
str has an encode function at all, and whether the "encoding" parameter specifies the result's encoding, or the input's encoding. What exactly are you attempting to do here?
u'my unicode str'.encode('ascii','xmlcharrefreplace').
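
To make the encode/decode distinction raised in the comments concrete, here is a minimal sketch. It assumes the bytes are Windows-1252 (cp1252), which is only a guess based on 0x93 and 0x94 being the curly double quotes in that code page. The variable names are illustrative. In Python 2, calling encode() on a byte str first decodes it implicitly as ASCII, which is why the encode call in the question raises a UnicodeDecodeError rather than an encode error.

```python
# The byte string from the question, written as a bytes literal so the
# sketch behaves the same on Python 2.6+ and Python 3.
raw = b' foo \x93bar bar \x94 weasel'

# decode() turns bytes into text; the codec names the encoding of the INPUT.
# Assumption: the bytes are Windows-1252, where 0x93/0x94 are curly quotes.
text = raw.decode('cp1252')

# Alternative: drop every non-ASCII byte instead of interpreting it.
stripped = raw.decode('ascii', 'ignore')

# encode() turns text into bytes; the codec names the encoding of the RESULT.
as_utf8 = text.encode('utf-8')

# Replace non-ASCII characters with XML numeric character references.
as_ascii_refs = text.encode('ascii', 'xmlcharrefreplace')

print(repr(text))
print(repr(stripped))
print(repr(as_utf8))
print(repr(as_ascii_refs))
```

If the source encoding is not actually cp1252, the first decode may produce the wrong characters or fail, so the 'ignore' variant is the one that only removes the offending bytes without guessing what they mean.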