I have a list of Farsi characters (utf8) like this:

my_list = ['ﯾ', 'ﻲ', 'ﻴ'] 

I would like to replace every occurrence of these characters in a text with another character, such as 'a'. Right now my code looks like this:

text = text.replace('ﻴ', 'a')
text = text.replace('ﻲ', 'a')
text = text.replace('ﯾ', 'a')

Is there any way to do this in one shot, for example using Regex and lists in Python 3?

  • That's not UTF-8, at least not in Python 3. Commented Dec 8, 2014 at 5:19
  • I think they are, but even if they are not, do you know how to do what I want? Commented Dec 8, 2014 at 5:21
  • You are showing a glyph (a visual representation of a character). UTF-8 is an encoding (it translates code points, the numbers assigned to characters, to bytes). So what you have is a string, not "utf-8". Commented Dec 8, 2014 at 5:22

2 Answers

>>> 'ﻴ ﻲ ﯾ'.translate({0xfbfe: 'a', 0xfef2: 'a', 0xfef4: 'a'})
'a a a'

This uses str.translate(), which takes a table mapping code points (integers) to replacement strings.
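Since the question starts from a list rather than from code points, the table can probably be built from the list directly. This is only a sketch: the escapes `\ufbfe`, `\ufef2` and `\ufef4` are assumed to be the three characters in `my_list`, based on the code points used above, and the sample `text` is made up for illustration.

```python
# Assumed contents of my_list, written as escapes so the code points are unambiguous.
my_list = ['\ufbfe', '\ufef2', '\ufef4']

# str.maketrans accepts a dict keyed by single characters
# and converts the keys to integer code points.
table = str.maketrans(dict.fromkeys(my_list, 'a'))

text = '\ufef4 \ufef2 \ufbfe'  # hypothetical sample text
result = text.translate(table)
print(result)  # a a a
```

The table only needs to be built once and can then be reused for any number of strings.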

3 Comments

Those are the code points of the characters.
Thanks, how can I get code points of characters in Python like what you got from my characters?
hex(ord('.')) returns a string of the form '0xXXXX'. Replace the "." with the actual character.
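For a whole list, the same lookup can be done in one pass. A small sketch, again assuming the three characters are U+FBFE, U+FEF2 and U+FEF4 as in the answer above:

```python
# Assumed contents of my_list, written as escapes.
my_list = ['\ufbfe', '\ufef2', '\ufef4']

# ord() gives the integer code point, hex() formats it for display.
code_points = [hex(ord(ch)) for ch in my_list]
print(code_points)  # ['0xfbfe', '0xfef2', '0xfef4']
```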

You could use a for loop instead:

for char in my_list:
    text = text.replace(char, 'a')

This way your list can be of arbitrary length, but the size of your code won't change. Plus, you aren't repeating code.

1 Comment

Thanks for the answer, but I do not want to use a for loop. I know that in Python it is possible to do something like this using re, but I don't know how to do it with UTF-8 characters like the ones I have.
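For what it's worth, a regex version might look like the sketch below. In Python 3, `re` patterns on `str` work on Unicode characters, so a character class built from the list should match each of them. The list contents are assumed to be U+FBFE, U+FEF2 and U+FEF4, and the sample `text` is hypothetical.

```python
import re

# Assumed contents of my_list, written as escapes.
my_list = ['\ufbfe', '\ufef2', '\ufef4']

# Build a character class such as [...] from the list; re.escape guards
# against any entry that happens to be special inside a pattern.
pattern = re.compile('[' + re.escape(''.join(my_list)) + ']')

text = 'x\ufef4y\ufef2z\ufbfe'  # hypothetical sample text
result = pattern.sub('a', text)
print(result)  # xayaza
```

For single-character replacements like this, str.translate() is likely the simpler choice; the regex form mainly helps when the entries can be longer than one character, in which case an alternation with `|` would be needed instead of a character class.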
