I have a web crawler that gets a lot of these errors:

    UnicodeEncodeError: 'ascii' codec can't encode character '\xe1' in position 27: ordinal not in range(128)

To mitigate these errors, I have implemented a function that encodes the offending characters like this:

    def properEncode(url):
        # Replace each non-ASCII character with its percent-encoded UTF-8 bytes
        url = url.replace("ø", "%C3%B8")
        url = url.replace("å", "%C3%A5")
        url = url.replace("æ", "%C3%A6")
        url = url.replace("é", "%C3%A9")
        url = url.replace("Ø", "%C3%98")
        url = url.replace("Å", "%C3%85")
        url = url.replace("Æ", "%C3%86")
        url = url.replace("í", "%C3%AD")
        return url

These replacements are based on this table: http://www.utf8-chartable.de/
The conversion I do seems to percent-encode each character's UTF-8 bytes as hex. Is there a Python function that does this automatically?
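
For reference, here is a minimal sketch of what an automatic version might look like, assuming Python 3 and the standard library's `urllib.parse.quote`. The name `proper_encode` and the choice of `safe` characters are my own assumptions, not something taken from the crawler; the `safe` set is meant to leave URL delimiters and any existing `%XX` escapes alone.

```python
from urllib.parse import quote


def proper_encode(url):
    """Percent-encode non-ASCII characters in a URL as UTF-8 bytes.

    Assumption: the URL delimiters and '%' listed in `safe` should pass
    through unchanged, so already-encoded parts are not encoded twice.
    """
    return quote(url, safe=":/?#[]@!$&'()*+,;=%")


print(proper_encode("http://example.com/blåbær?q=é"))
```

This covers any character without a hand-written table, since `quote` encodes to UTF-8 by default. One caveat: a literal `%` that is not part of an escape sequence would also be left as-is.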