How to transform a string into a Unicode character

Question

I would like to write a simple program that reads multiple hex strings and prints them as Unicode characters, for example:

2119 01b4 2602 210c 00f8 1f24 (This should show 'Python' with some symbols)

But I keep getting the following exception:

SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: truncated \uXXXX escape

I'm trying to use '\u' to keep it simple, but if that can't work, I don't mind doing it another way.

My code:

    while True:
        string = input()
        print(f'\u{string}', end='')

I searched and found something in Swift which is exactly what I want to do in Python, but I didn't quite understand that: Print unicode character from variable (swift).

'\u0000' is part of Python's literal syntax. You can't use substitutions to create syntax any more than you could run, say, value = ''' ' + str(something) + ' ''', and then expect f'{value}' to call str(something); if it did work, it would imply serious security bugs.
– Charles Duffy, Commented Nov 30, 2020 at 18:20
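To illustrate the point in the comment above, here is a minimal sketch (the variable names are only for illustration). It shows that an escape is resolved when the literal is compiled, while the same characters assembled at runtime stay ordinary text:

```python
# An escape written in source code is resolved when the literal is compiled.
literal = '\u2119'
assert len(literal) == 1          # a single character, U+2119

# Assembling the same text at runtime gives six ordinary characters:
# backslash, 'u', '2', '1', '1', '9'. Nothing re-parses it as an escape.
built = '\\u' + '2119'
assert len(built) == 6
assert built != literal

# Converting the hex digits explicitly yields the intended character.
converted = chr(int('2119', 16))
assert converted == literal
```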

Charles Duffy · Accepted Answer · 2020-11-30 18:26:17Z

Assuming that you don't really care about whether the \u syntax is used, this would look like:

    while True:
        string = input()
        print(chr(int(string, 16)), end='')
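As a rough sketch of the same chr(int(..., 16)) idea wrapped in reusable functions: the helper names hex_to_char and decode_hex_tokens are hypothetical, and the error handling is an assumption about what a caller might want, not part of the answer above.

```python
def hex_to_char(token: str) -> str:
    """Convert a hex code point such as '2119' to its character."""
    try:
        # int() rejects non-hex text; chr() rejects values outside 0..0x10FFFF.
        return chr(int(token, 16))
    except (ValueError, OverflowError) as exc:
        raise ValueError(f"not a valid code point: {token!r}") from exc


def decode_hex_tokens(text: str) -> str:
    """Decode whitespace-separated hex code points into one string."""
    return ''.join(hex_to_char(token) for token in text.split())


decoded = decode_hex_tokens("2119 01b4 2602 210c 00f8 1f24")
print(decoded)
```

With the sample input from the question, this should print the same six characters as the one-line loop, but all tokens can be passed on a single line.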

If you do in fact care for some reason:

    while True:
        string = input()
        print((br'\u' + string.encode('utf-8')).decode('unicode_escape'), end='')
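One caveat with the \u form is that the escape consumes exactly four hex digits. The sketch below wraps the expression above in a hypothetical helper, decode_with_u_escape, to show what happens with more or fewer digits:

```python
def decode_with_u_escape(hex_digits: str) -> str:
    # Hypothetical helper: prefix the digits with a literal backslash-u
    # and let the unicode_escape codec interpret the result.
    return (br'\u' + hex_digits.encode('ascii')).decode('unicode_escape')


# Exactly four digits: one character.
four = decode_with_u_escape('2119')

# Five digits: the first four form U+1F40, and the trailing 'd' is left over.
five = decode_with_u_escape('1f40d')

# Fewer than four digits: the codec reports a truncated escape.
try:
    decode_with_u_escape('f8')
    short_error = None
except UnicodeDecodeError as exc:
    short_error = exc
```

So with this approach, inputs such as 'f8' would need to be zero-padded to '00f8' first, and code points above FFFF are not reachable.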

tdelaney · Accepted Answer · 2020-11-30 18:59:22Z

The problem is that the Unicode escape takes precedence over the f-string format specification. It sees "\u{str" as a 4-character escape sequence. You can split this into two steps: create the escape and then decode it. Since Unicode code points can need more than 4 hex digits, you may as well go large and use the 8-digit \U escape.

    >>> import codecs
    >>> string = "2119 01b4 2602 210c 00f8 1f24"
    >>> for s in string.split(" "):
    ...     print(codecs.decode(rf"\U{s.zfill(8)}", "unicode-escape"), end="")
    ...
    ℙƴ☂ℌøἤ
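A small sketch of why the padded \U form is convenient: it handles code points above FFFF as well as the 4-digit ones. The helper name decode_code_point is hypothetical, and 1f40d is just an example value that is not in the question.

```python
import codecs


def decode_code_point(hex_digits: str) -> str:
    # The \U escape needs exactly eight hex digits, hence zfill(8).
    # The raw f-string keeps the backslash literal while still
    # interpolating the expression in braces.
    return codecs.decode(rf"\U{hex_digits.zfill(8)}", "unicode-escape")


bmp = decode_code_point("2119")      # fits in four digits
astral = decode_code_point("1f40d")  # needs five digits

# Both agree with the chr(int(..., 16)) route.
assert bmp == chr(int("2119", 16))
assert astral == chr(int("1f40d", 16))
```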

Mark Tolonen · Accepted Answer · 2020-12-01 00:21:09Z

You can't directly construct \uxxxx escape sequences since that is a language construct, but it is more straightforward to use chr to convert Unicode ordinals to characters. Also int(s,16) will convert a hexadecimal string to an integer:

    >>> print(''.join(chr(int(x,16)) for x in input().split()))
    2119 01b4 2602 210c 00f8 1f24
    ℙƴ☂ℌøἤ
