0

I have a variable in which Unicode characters are typed with string

print(x) # output -> '\u062f\u0631 \u0627\u0628\u0644' print(type(x)) # output -> <class 'str'> 

how can i convert x in utf8 ?

1 Answer 1

1

Use .encode('raw_unicode_escape').decode('unicode_escape') for doubled Reverse Solidi, see Python Specific Encodings

x= '\\u062f\\u0631 \\u0627\\u0628\\u0644' print(x, '->', x.encode('raw_unicode_escape').decode('unicode_escape')) 
\u062f\u0631 \u0627\u0628\u0644 -> در ابل 
Sign up to request clarification or add additional context in comments.

4 Comments

x.encode('ascii').decode('unicode_escape') is sufficient. There's nothing to escape in the original string.
@MarkTolonen you are right (for this particular string). However, your solution fails if x contains a non-ascii character, e.g. x= '"در ابل" is the same as "\\u062f\\u0631 \\u0627\\u0628\\u0644"'. So I'm on safer side…
UnicodeEncodeError: 'utf-8' codec can't encode characters in position 1-2: surrogates not allowed .................. its not working :(
@mehdinora surrogates are Unicode range from U+D800 to U+DFFF. I can't see any such codepoint(s) in your or my minimal reproducible example.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.