Is it possible to convert a string to unicode characters in Python?

For example: Hello => \u0048\u0065\u006C\u006C\u006F

1 Answer

You can't. A string already consists of 'Unicode characters'. What you want to change is not its representation but its actual contents.

That, fortunately, is as simple as

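import re

text = 'Hello'
# Replace each character with \u plus its code point as four uppercase hex digits.
# re.DOTALL lets '.' match newline characters too, so none are left unescaped.
print(re.sub('.', lambda x: r'\u%04X' % ord(x.group()), text, flags=re.DOTALL))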

which outputs

\u0048\u0065\u006C\u006C\u006F 
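
The same replacement can also be written without a regular expression. The following is only a minimal sketch: the function name `to_unicode_escapes` is made up for illustration, and the choice to emit the eight-digit `\U` form for code points above U+FFFF is an assumption about what is wanted there (the four-digit `\u` form cannot represent them). The last line decodes the result with Python's `unicode_escape` codec to show that the escaped text maps back to the original string.

```python
def to_unicode_escapes(text):
    """Return text with every character replaced by a Python-style escape.

    Hypothetical helper, not part of any library. Code points up to U+FFFF
    become \\uXXXX; larger ones become \\UXXXXXXXX (an assumption, see above).
    """
    parts = []
    for char in text:
        code_point = ord(char)
        if code_point <= 0xFFFF:
            parts.append('\\u%04X' % code_point)
        else:
            parts.append('\\U%08X' % code_point)
    return ''.join(parts)


escaped = to_unicode_escapes('Hello')
print(escaped)  # \u0048\u0065\u006C\u006C\u006F

# Decoding the escapes again gives back the original text.
print(escaped.encode('ascii').decode('unicode_escape'))  # Hello
```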

7 Comments

Thank you! What is this conversion called? I've been googling for almost an hour now. I would assume this has already been asked and been answered somewhere.
It's a regular expression that replaces each character with its Unicode escape sequence (\u followed by the character's code point as four hexadecimal digits) through a custom replacement function. But several other approaches are possible; this is only one. The basic idea is that each character gets replaced with another string.
I know it's a regular expression and that it uses a custom replacement function. I was just asking what converting the character H to \u0048 is called.
print(''.join(r'\u{:04X}'.format(ord(char)) for char in 'Hello'))
@furas I like that version too, looks pythonic. :)