How to Turn string into bytes?

Question

I'm using Python 3 and I have a string that is displayed as bytes:

strategyName=\xe7\x99\xbe\xe5\xba\xa6

I need to turn it into readable Chinese characters by decoding it:

orig = b'strategyName=\xe7\x99\xbe\xe5\xba\xa6'
result = orig.decode('UTF-8')
print(result)

This prints the following, which is what I want:

strategyName=百度

But if I store the same data in a str, it behaves differently:

str0 = 'strategyName=\xe7\x99\xbe\xe5\xba\xa6'
result_byte = str0.encode('UTF-8')
result_str = result_byte.decode('UTF-8')
print(result_str)

strategyName=ç¾åº¦é£é©çç¥

Please help me understand why this happens and how I can fix it.
Thanks a lot.

You have a typo: orig is a bytes, while str0 is a str. Add a b in front of the data for str0 and decode it.
– Mad Physicist, Commented Jan 11, 2019 at 3:26
Put another way, result_byte != orig because the individual bytes in orig are combined to produce the Unicode characters, but each escape sequence in a str literal is already a separate character.
– Mad Physicist, Commented Jan 11, 2019 at 3:28
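The difference described in the comments can be made visible by comparing the two byte sequences. This is only an illustrative sketch using the variable names from the question; the slice offset 13 is simply the length of the ASCII prefix 'strategyName='.

```python
orig = b'strategyName=\xe7\x99\xbe\xe5\xba\xa6'  # bytes: six raw byte values after '='
str0 = 'strategyName=\xe7\x99\xbe\xe5\xba\xa6'   # str: six separate code points (U+00E7, U+0099, ...)

# Encoding the str as UTF-8 turns every code point above U+007F into two bytes,
# so the result is longer than orig and no longer decodes to the Chinese text.
result_byte = str0.encode('UTF-8')

print(len(orig))            # 19
print(len(result_byte))     # 25
print(result_byte == orig)  # False
print(result_byte[13:])     # b'\xc3\xa7\xc2\x99\xc2\xbe\xc3\xa5\xc2\xba\xc2\xa6'
```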

ShadowRanger · Accepted Answer · 2019-01-11 03:30:45Z

Your problem is using a str literal when you're trying to store the UTF-8 encoded bytes of your string. You should just use the bytes literal, but if that str form is necessary, the correct approach is to encode in latin-1 (which is a 1-1 converter for all ordinals below 256 to the matching byte value) to get the bytes with utf-8 encoded data, then decode as utf-8:

str0 = 'strategyName=\xe7\x99\xbe\xe5\xba\xa6'
result_byte = str0.encode('latin-1')  # Only changed line
result_str = result_byte.decode('UTF-8')
print(result_str)
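If the str form shows up in more than one place, the same round trip could be wrapped in a small helper. The name fix_mojibake below is hypothetical and not part of the answer; the sketch assumes every character in the input is at or below U+00FF, otherwise encode('latin-1') raises UnicodeEncodeError.

```python
def fix_mojibake(text: str) -> str:
    """Reinterpret a str whose code points are really UTF-8 byte values.

    Hypothetical helper for illustration only.
    """
    # latin-1 maps code points 0-255 to the identical byte values,
    # so this recovers the original UTF-8 encoded bytes unchanged.
    raw = text.encode('latin-1')
    return raw.decode('utf-8')


str0 = 'strategyName=\xe7\x99\xbe\xe5\xba\xa6'
fixed = fix_mojibake(str0)
print(fixed)                                  # strategyName=百度
print(fixed == 'strategyName=\u767e\u5ea6')   # True
```

Plain ASCII input passes through unchanged, since ASCII is encoded identically in latin-1 and UTF-8.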

Of course, the other approach could be to just type the Unicode escapes you wanted in the first place instead of byte level escapes that correspond to a UTF-8 encoding:

result_str = 'strategyName=\u767e\u5ea6'

No rigmarole needed.
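As a quick check on the title question, going the other way (str to bytes) from the Unicode-escape form should reproduce the bytes literal from the question. A minimal sketch, with text and data as illustrative variable names:

```python
text = 'strategyName=\u767e\u5ea6'  # same str as 'strategyName=百度'

# Encoding a real Unicode str as UTF-8 yields the byte values from the question.
data = text.encode('utf-8')

print(data)                          # b'strategyName=\xe7\x99\xbe\xe5\xba\xa6'
print(data.decode('utf-8') == text)  # True
```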
