The problem is the following.

'Β' == 'B'
Out[104]: False

To make things clear, the first is a Greek 'Β' and the second is a Latin 'B'.
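To illustrate why the comparison is False, here is a quick check, assuming Python 3. The variable names are only for illustration; the two characters are written as explicit code points so they cannot be confused on screen.

```python
import unicodedata

greek_b = "\u0392"  # Greek capital letter beta
latin_b = "B"       # Latin capital letter B

# Print the code point and the official Unicode name of each character.
for ch in (greek_b, latin_b):
    print(hex(ord(ch)), unicodedata.name(ch))

# The glyphs look alike, but the code points differ, so the strings differ.
print(greek_b == latin_b)
```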

Python is certainly correct to return False, but for the purposes of the script I'm working on I need such characters to count as the same. I tried several encoding/decoding manipulations, but the characters still count as different. Any ideas?

  • How did Python give you the result False? Can you edit your question to say how you ran this line? Commented Oct 6, 2020 at 23:23
  • It is typed just as you see it. The first 'B' is typed with the English layout on my keyboard, and then I switch to the Greek layout for the second 'B'. Commented Oct 6, 2020 at 23:26
  • Are you only checking letters, or do you need to literally translate words and check for a match? Commented Oct 6, 2020 at 23:30
  • OK, then you will have to include extra logic in your program, since these two characters have different Unicode values, which you can't change. Instead, you can use if statements to check for this case. Commented Oct 6, 2020 at 23:30
  • I am trying to compare vehicle numbers. The Greek ones are in a dataframe column. The other side comes from Selenium reading an HTML table, and Selenium returns those with Latin characters. Can I change the way Selenium reads the table? Commented Oct 7, 2020 at 1:08

1 Answer

Following this other answer,

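data = "UTF-8 DATA".encode("utf-8")  # bytes holding UTF-8 data (Python 3)
udata = data.decode("utf-8")  # decode the bytes into a str
asciidata = udata.encode("ascii", "ignore")  # drop every non-ASCII character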

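This will make you lose data, as you are going from an 8-bit encoding to a 7-bit one (as stated in a comment on the very same answer I am citing). With "ignore", every non-ASCII character, including the Greek 'Β', is dropped rather than converted to a Latin look-alike. It might still work for your problem.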
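If dropping characters is not acceptable, an alternative is to map each Greek capital to the Latin letter it resembles before comparing. The following is a minimal sketch, assuming Python 3 and assuming the vehicle numbers only use Greek capitals that have a Latin look-alike. The table `GREEK_TO_LATIN` and the helper `to_latin_lookalikes` are hypothetical names introduced here, not part of any library.

```python
# Hypothetical mapping from Greek capitals to the Latin capitals they resemble.
# Assumption: only these look-alike letters appear in the vehicle numbers.
GREEK_TO_LATIN = str.maketrans({
    "\u0391": "A",  # Greek Alpha
    "\u0392": "B",  # Greek Beta
    "\u0395": "E",  # Greek Epsilon
    "\u0396": "Z",  # Greek Zeta
    "\u0397": "H",  # Greek Eta
    "\u0399": "I",  # Greek Iota
    "\u039A": "K",  # Greek Kappa
    "\u039C": "M",  # Greek Mu
    "\u039D": "N",  # Greek Nu
    "\u039F": "O",  # Greek Omicron
    "\u03A1": "P",  # Greek Rho
    "\u03A4": "T",  # Greek Tau
    "\u03A5": "Y",  # Greek Upsilon
    "\u03A7": "X",  # Greek Chi
})


def to_latin_lookalikes(text):
    """Replace mapped Greek capitals with Latin look-alikes; leave the rest as is."""
    return text.translate(GREEK_TO_LATIN)


# Greek Beta now compares equal to Latin B after the mapping.
print(to_latin_lookalikes("\u0392") == "B")
```

Applying the same helper to both sides (for example to the dataframe column with `df[col].map(to_latin_lookalikes)`, where `df` and `col` stand in for your own names) keeps the comparison symmetric, and digits or characters outside the table pass through unchanged.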

Good luck!
