1

I get the following string from a database:

'23:45 \xe2\x80\x93 23:59' 

and the output should look like

'23:45 - 23:59' 

How can I decode this? I tried UTF-8 decoding, but no luck:

>>> x.decode("utf-8")
u'23:45 \u2013 23:59'

Thank you

3 Answers

7

This is completely correct. The interactive Python interpreter displays the repr() of the string. If you want to see it as a proper string, print it:

>>> print '23:45 \xe2\x80\x93 23:59'
23:45 – 23:59
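
As a rough sketch of why the decode result is already correct, the snippet below (written to run on both Python 2.7 and Python 3; the variable names are only illustrative) checks that the three UTF-8 bytes become a single en dash character. The \u2013 escape is just how repr() spells that character in Python 2.

```python
# UTF-8 bytes as they might come back from the database (taken from the question).
raw = b'23:45 \xe2\x80\x93 23:59'

# Decoding turns the three bytes E2 80 93 into one character, U+2013 (en dash).
text = raw.decode('utf-8')

# The character at index 6 is the en dash itself, not an escape sequence.
same_character = (text[6] == u'\u2013')

# 15 bytes before decoding, 13 characters after.
byte_count, char_count = len(raw), len(text)
```

Printing text on a terminal that can display UTF-8 shows the dash itself rather than the escape.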

3 Comments

Hi ThiefMaster, but how do I get '-' instead of \u2013? Is the only option to use the re package?
The same way: print u'23:45 \u2013 23:59' also outputs 23:45 – 23:59.
I want to put this in a variable, and when I do x = x.decode("utf-8"), I see 'quarter_hour': '23:45 \xe2\x80\x93 23:59' in the output and not 'quarter_hour': '23:45 - 23:59'.
1

The UTF-8 representation of an "en dash" ( http://www.fileformat.info/info/unicode/char/2013/index.htm ) is hex 0xE2 0x80 0x93 (e28093); the decoded character is u"\u2013". It sounds like you want to replace the en dash with an ASCII hyphen/minus (0x2d) before storing it in the variable. That's OK, but the variable won't contain the same character that is stored in the database, any more than if you replaced a Ü ( http://www.fileformat.info/info/unicode/char/dc/index.htm ) with an ASCII U, or replaced a zero (0x30) with a capital O (0x4f).
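
A minimal sketch of that replacement, assuming the database value arrives as UTF-8 bytes as in the question. The helper name to_ascii_dash and the row dictionary are hypothetical, chosen only to mirror the 'quarter_hour' key mentioned in the comments; no re package is needed for a single-character substitution.

```python
def to_ascii_dash(raw_bytes):
    """Decode UTF-8 bytes and swap each en dash (U+2013) for an ASCII hyphen-minus."""
    text = raw_bytes.decode('utf-8')
    return text.replace(u'\u2013', u'-')


# Hypothetical record, modelled on the 'quarter_hour' key from the comments.
row = {'quarter_hour': to_ascii_dash(b'23:45 \xe2\x80\x93 23:59')}
```

After this, row['quarter_hour'] holds '23:45 - 23:59', which no longer matches the character stored in the database.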

1 Comment

See also stackoverflow.com/questions/816285/…, the last answer of which says: "Unidecode looks like a complete solution. It converts fancy quotes to ascii quotes, accented latin characters to unaccented and even attempts transliteration to deal with characters that don't have ASCII equivalents."
1
from unidecode import unidecode
a = u"NOV\u2013DEC 2011"  # contains an en dash
b = unidecode(a)  # output --> NOV-DEC 2011 (with hyphen)

You need to install unidecode first, and import it. I've tried it and it runs well. Hope it helps!
