Replacing a list of utf8 characters in Python

Question

I have a list of Farsi characters (utf8) like this:

my_list = ['ﯾ', 'ﻲ', 'ﻴ']

And I would like to replace every occurrence of characters in this list in a text with another character like 'a'. Right now my code is like this:

text = text.replace('ﻴ', 'a')
text = text.replace('ﻲ', 'a')
text = text.replace('ﯾ', 'a')

Is there any way to do this in one shot, for example using Regex and lists in Python 3?

I think they are, but even if they are not, do you know how to do what I want?
– TJ1, Commented Dec 8, 2014 at 5:21
You are showing a glyph (a visual representation of a character). UTF-8 is an encoding (it translates "numbers" (called code points) to letters). So what you have is a string, not "utf-8".
– Burhan Khalid, Commented Dec 8, 2014 at 5:22
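To make the distinction in this comment concrete, here is a minimal sketch of how one character relates to its code point and to its UTF-8 bytes in Python 3. The variable names are illustrative, and the character is written as a `\u` escape so the example does not depend on how the glyph is pasted.

```python
# A Python 3 str is a sequence of Unicode code points, not bytes.
ch = '\ufef4'  # one of the characters from the question, as an escape

code_point = ord(ch)              # the integer code point of the character
utf8_bytes = ch.encode('utf-8')   # UTF-8 is one way to serialize it to bytes

print(hex(code_point))            # 0xfef4
print(utf8_bytes)                 # b'\xef\xbb\xb4'
print(len(ch), len(utf8_bytes))   # 1 3
```

The string holds one code point; UTF-8 only comes into play when the string is encoded to bytes or decoded from them.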

Ignacio Vazquez-Abrams · Accepted Answer · 2014-12-08 05:22:54Z

5

>>> 'ﻴ ﻲ ﯾ'.translate({0xfbfe: 'a', 0xfef2: 'a', 0xfef4: 'a'})
'a a a'

str.translate()
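If you would rather not write the code points by hand, the translation table can probably be built from the list itself with `str.maketrans()`. This is a sketch, not part of the original answer; `table`, `text`, and `result` are illustrative names, and the sample text is made up.

```python
# The three characters from the question, written as escapes.
my_list = ['\ufbfe', '\ufef2', '\ufef4']

# dict.fromkeys maps every character to the same replacement;
# str.maketrans converts the single-character keys to code points.
table = str.maketrans(dict.fromkeys(my_list, 'a'))

text = 'x\ufef4y\ufef2z\ufbfe'    # made-up sample input
result = text.translate(table)
print(result)                     # xayaza
```

The resulting table is the same dictionary as the one typed out above, so the list can grow without the call changing.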


3 Comments

Ignacio Vazquez-Abrams Over a year ago

Those are the code points of the characters.

TJ1 Over a year ago

Thanks, how can I get code points of characters in Python like what you got from my characters?

Ignacio Vazquez-Abrams Over a year ago

>>> hex(ord('.'))
'0xXXXX'
Replace the "." with the actual character.
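Applied to the whole list from the question, the same `ord()` and `hex()` calls could look like the sketch below. The characters are written as `\u` escapes only to keep the example independent of how the glyphs are pasted; in practice you would put the literal characters in the list.

```python
# The three characters from the question, written as escapes.
my_list = ['\ufbfe', '\ufef2', '\ufef4']

# ord() gives the integer code point; hex() formats it for display.
code_points = [ord(ch) for ch in my_list]
print([hex(cp) for cp in code_points])   # ['0xfbfe', '0xfef2', '0xfef4']

# chr() is the inverse of ord(), so the round trip restores the list.
print([chr(cp) for cp in code_points] == my_list)   # True
```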

Loocid · Answer · 2014-12-08 05:20:50Z

0

You could use a for loop instead:

for char in my_list:
    text = text.replace(char, 'a')

This way your list can be of arbitrary length without the size of your code changing, and you aren't repeating code.


1 Comment

TJ1 Over a year ago

Thanks for the answer, but I do not want to use a for loop. I know that in Python it is possible to do something like this with re, but I don't know how to do it with UTF-8 characters like the ones I have.
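For the `re` approach mentioned in this comment, one possible sketch is to build a character class from the list and call `sub()` once. Python 3 patterns work on Unicode strings, so these characters need no special handling. The names `pattern`, `text`, and `result` are illustrative, and the sample text is made up.

```python
import re

# The three characters from the question, written as escapes.
my_list = ['\ufbfe', '\ufef2', '\ufef4']

# Build a character class such as [abc] from the list. re.escape guards
# against entries that are regex metacharacters. The list must not be
# empty, because "[]" is not a valid pattern.
pattern = re.compile('[' + ''.join(re.escape(ch) for ch in my_list) + ']')

text = 'x\ufef4y\ufef2z\ufbfe'    # made-up sample input
result = pattern.sub('a', text)
print(result)                     # xayaza
```

For single-character replacements like this, `str.translate()` is usually the simpler tool; a regex is more useful when the things to replace are longer than one character.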
