Judging from your question, I assume you are using Python 2.7.
The reason for the error is:
- Your source file is not saved as UTF-8; it is almost certainly cp1252.
- In cp1252 the 'Œ' character is the single byte '\x8c', which is not valid UTF-8 on its own (0x8C can only appear as a continuation byte).
- You specified UTF-8 as the encoding when decoding your string in the 'except' block.
To see this more clearly, consider:
>>> u = '\x8c'.decode('cp1252')
>>> u
u'\u0152'
So decoding the byte '\x8c' with cp1252 yields the Unicode code point U+0152, whose name is:
>>> import unicodedata
>>> unicodedata.name(u)
'LATIN CAPITAL LIGATURE OE'
However, if we try to decode with UTF-8, we'll get an error:
>>> u = '\x8c'.decode('utf-8')
...
UnicodeDecodeError: 'utf8' codec can't decode byte 0x8c ...
So a lone '\x8c' byte cannot be decoded as UTF-8.
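When the input encoding is not known for certain, one option is to try UTF-8 first and fall back to cp1252 only if that fails. The following is a minimal sketch, assuming the data is either UTF-8 or cp1252; `decode_bytes` is a hypothetical helper name, and bytes literals are used so that it also runs under Python 3:

```python
def decode_bytes(raw, fallback='cp1252'):
    """Decode raw bytes as UTF-8, falling back to another codec on failure."""
    try:
        return raw.decode('utf-8')
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is cp1252.
        return raw.decode(fallback)


# The same text, once as cp1252 bytes and once as UTF-8 bytes.
print(repr(decode_bytes(b'LCL\x8c 3-pack')))
print(repr(decode_bytes(u'LCL\u0152 3-pack'.encode('utf-8'))))
```

Both calls should give the same Unicode string. Note that cp1252 leaves a few byte values undefined (0x81, for example), so the fallback itself can still raise UnicodeDecodeError on arbitrary binary data.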
To fix the problem you can try this:
each = str(each.decode('cp1252').encode('ascii', errors='ignore'))
Or this:
each = str(each.decode('utf-8', errors='ignore').encode('ascii', errors='ignore'))
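To compare what these variants do with the offending byte, here is a small sketch. The sample bytes are an assumption standing in for `each`, and `errors='replace'` is an extra alternative (not part of the two lines above) that leaves a visible '?' instead of silently dropping the character:

```python
raw = b'LCL\x8c 3-pack'  # hypothetical sample: cp1252 bytes containing 'Œ'

# Decode as cp1252, then drop every character that is not ASCII.
dropped = raw.decode('cp1252').encode('ascii', errors='ignore')

# Decode as UTF-8 while skipping the invalid byte, then encode to ASCII.
skipped = raw.decode('utf-8', errors='ignore').encode('ascii', errors='ignore')

# Like the first variant, but mark the lost character with '?'.
marked = raw.decode('cp1252').encode('ascii', errors='replace')

print(dropped == b'LCL 3-pack')
print(skipped == b'LCL 3-pack')
print(marked == b'LCL? 3-pack')
```

For this input the first two variants give the same result; they differ only in where the character is discarded (at the encode step or at the decode step).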
Also in your case you can use ord():
my_str = 'DD-XBS 2 1/2x 17 LCLξ 3-pack'
ascii_str = ''
for sign in my_str:
    if ord(sign) < 128:
        ascii_str += sign
print(ascii_str)  # DD-XBS 2 1/2x 17 LCL 3-pack
However, the best solution is probably just to convert your source file to UTF-8.
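If your editor cannot re-save the file in another encoding, the conversion can also be scripted. This is a minimal sketch, assuming the file really is cp1252; `convert_to_utf8` is a hypothetical helper name, and the file is overwritten in place, so keep a backup:

```python
import io


def convert_to_utf8(path, source_encoding='cp1252'):
    """Rewrite the file at `path` in place as UTF-8."""
    # newline='' keeps the original line endings untouched.
    with io.open(path, 'r', encoding=source_encoding, newline='') as f:
        text = f.read()
    with io.open(path, 'w', encoding='utf-8', newline='') as f:
        f.write(text)
```

Under Python 2 the source file also needs an encoding declaration such as `# -*- coding: utf-8 -*-` on its first or second line, because the default source encoding is ASCII.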