0

I have a problem: characters such as ñ, ŕ, í, á and ú are discarded when I apply

text = text.encode('ascii', 'ignore') 

before passing the text to a function that needs its input to be ASCII.

Is there a way to force the text to ASCII without losing those characters, or should I change the function to accept Unicode characters?

http://dpaste.com/601417/

2
  • What function? Why does it "need the input to be ascii"? Commented Aug 23, 2011 at 22:24
  • 3
    Use normalization, then throw away the diacritics: stackoverflow.com/questions/175240/… Commented Aug 23, 2011 at 23:18
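
The normalization approach suggested in the comment above could look roughly like the sketch below. It assumes Python 3 `str` input (the question appears to use Python 2, where the same calls apply to `unicode` objects), and `strip_diacritics` is a hypothetical helper name, not something from the original post. It only handles letters that decompose into a base letter plus combining marks; characters such as ß or ø have no such decomposition and would still be dropped by the final `'ignore'` step.

```python
import unicodedata

def strip_diacritics(text):
    """Hypothetical helper: replace accented letters with their base letters."""
    # NFKD splits e.g. "ñ" into "n" followed by U+0303 (combining tilde).
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks and keep everything else.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# After stripping, the remaining characters fit in ASCII.
ascii_text = strip_diacritics("ñ ŕ í á ú").encode("ascii", "ignore").decode("ascii")
print(ascii_text)  # n r i a u
```

This keeps the text readable for an ASCII-only function, but it changes the words (for example "año" becomes "ano"), so it is only appropriate when losing the accents is acceptable.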

2 Answers

5

The 'ascii' encoding can't represent the characters you refer to. You have to choose a different encoding, perhaps 'cp850' or 'latin_1', but then you have to be sure that your output terminal interprets 8-bit codes using the matching code page.

On balance, life is easier if you just go Unicode all the way.
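
To illustrate the difference, here is a small sketch, assuming Python 3 and using a hypothetical helper `can_encode` that is not part of the original answer. It shows that `'ignore'` silently drops the accented letters, that an 8-bit encoding only covers some of them, and that a Unicode encoding such as UTF-8 round-trips all of them.

```python
text = "ñ ŕ í á ú"

# 'ignore' silently discards every character ASCII cannot represent,
# so only the four spaces survive.
ascii_lossy = text.encode("ascii", "ignore")

def can_encode(s, encoding):
    """Hypothetical helper: True if every character of s exists in the encoding."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# latin_1 covers ñ, í, á and ú, but not ŕ.
latin1_ok = can_encode("ñ í á ú", "latin_1")
latin1_r_acute = can_encode("ŕ", "latin_1")

# UTF-8 can represent every Unicode character, so nothing is lost.
round_trip = text.encode("utf-8").decode("utf-8")

print(ascii_lossy, latin1_ok, latin1_r_acute, round_trip == text)
```

Keeping the text as Unicode inside the program and encoding only at the output boundary avoids having to pick a code page that happens to contain every character.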


0

Yes, if you need those characters you should use another encoding (for example a Unicode encoding such as UTF-8). See an ASCII table for all the characters that ASCII includes.
