I have a string that contains Unicode characters, and I want to convert it to UTF-8 in Python.

s = '\u0628\u06cc\u0633\u06a9\u0648\u06cc\u062a' 

I want to convert s to UTF-8.

2 Answers

Add u as a prefix to the string s, then encode it as UTF-8.

Your code will look like this:

s = u'\u0628\u06cc\u0633\u06a9\u0648\u06cc\u062a'
s_encoded = s.encode('utf-8')
print(s_encoded)
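For reference, here is a minimal sketch of the same round trip, assuming Python 3 (where every str is already Unicode, so the u prefix is optional). The name s_decoded is only illustrative. It shows that encode() returns a bytes object, and that decode() turns it back into the original text:

```python
# Assumes Python 3: str is Unicode text, bytes is encoded data.
s = '\u0628\u06cc\u0633\u06a9\u0648\u06cc\u062a'

s_encoded = s.encode('utf-8')           # str -> bytes
print(type(s_encoded).__name__)         # bytes
print(len(s), len(s_encoded))           # 7 code points, 14 bytes (2 bytes each)

s_decoded = s_encoded.decode('utf-8')   # bytes -> str (illustrative name)
print(s_decoded == s)                   # True
```

Each of these characters lies between U+0080 and U+07FF, which is why UTF-8 needs exactly two bytes per character here.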

I hope this helps.

1 Comment

If the OP is using Python 3 (it seems so), then the u prefix isn't necessary. But the .encode('utf8') is definitely right.

Add the line below at the top of your .py file.

# -*- coding: utf-8 -*- 

It allows you to encode strings directly in your Python script, like this:

# -*- coding: utf-8 -*-
s = '\u0628\u06cc\u0633\u06a9\u0648\u06cc\u062a'
print(s)

Output :

بیسکویت 

2 Comments

The source encoding declaration doesn't really apply here, because the string is entered with ASCII-only characters. It would be different if the string literal were actually composed of Arabic letters (not escape sequences).
A coding line declares the encoding of the source file only. If you have only ASCII characters in the source (as above) it does nothing. In fact, in Python 3, UTF-8 is the default source encoding if undeclared.
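The point made in these comments can be checked with a small sketch, assuming Python 3.7 or later (for bytes.isascii). The helper run_with_declaration and the constant ASSIGNMENT are hypothetical names introduced only for this illustration. It compiles the same ASCII-only source under two different coding declarations and compares the resulting strings:

```python
# The assignment as raw source bytes; the \u escapes are still plain ASCII text here.
ASSIGNMENT = rb"s = '\u0628\u06cc\u0633\u06a9\u0648\u06cc\u062a'"


def run_with_declaration(encoding_name):
    """Compile the assignment under the given coding line and return s (hypothetical helper)."""
    source = (
        b"# -*- coding: " + encoding_name.encode('ascii') + b" -*-\n"
        + ASSIGNMENT + b"\n"
    )
    namespace = {}
    # compile() accepts bytes and honours the coding declaration on the first line.
    exec(compile(source, '<demo>', 'exec'), namespace)
    return namespace['s']


utf8_result = run_with_declaration('utf-8')
latin1_result = run_with_declaration('latin-1')

print(ASSIGNMENT.isascii())           # True: the source itself is ASCII-only
print(utf8_result == latin1_result)   # True: the declaration made no difference
print(len(utf8_result))               # 7
```

The declaration would only start to matter if the literal contained the non-ASCII letters themselves, because then the bytes on disk would have to be decoded using the declared encoding.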
