Consider this:

    s = u"おはよう"
    print len(s)
    for c in s:
        print c

The output is

    4
    お
    は
    よ
    う

which is what I expect.

Now with emojis:

s = u"hi 🏈" 

Output is

    5
    h
    i
    ????
    ????

Why is that? How can I fix it? I have looked at various links but can't get my head around it. Ideally I would like a solution that works for both Japanese and emoji, but one that handles only ASCII and emoji would be fine too.

  • 2
    Might be a version issue. It works fine in Python 3.5. Commented Feb 8, 2017 at 12:44
  • 3
    It sounds like you have a narrow build. Please see "Python returns length of 2 for single Unicode character string" for more info. Commented Feb 8, 2017 at 12:48
  • 1
    Anyway, the advice is to upgrade to Python 3.5 or 3.6. There is no need to use a version as old as Python 2.7 for this kind of work, especially since easier handling of Unicode is one of the strengths of the Python 3.x series. Commented Feb 8, 2017 at 12:52
  • 5
    I have installed Python 3.x and it works fine. It took me forever to find a good reason to make the switch. Thanks, guys. Commented Feb 8, 2017 at 12:54
  • 2
    Well done, Thomas! It'll take you a little while to get used to the differences, but once you do, you'll wonder how you ever tolerated Python 2's string / Unicode madness. :) Commented Feb 8, 2017 at 12:57
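
For anyone landing here later, the "narrow build" explanation from the comments can be illustrated with a rough sketch. It assumes Python 3.3 or later (not the Python 2 setup in the question), and the helper name `utf16_code_units` is made up for illustration. A narrow Python 2 build stores strings as UTF-16 code units, so a character outside the Basic Multilingual Plane such as the football emoji becomes a surrogate pair and counts as two items:

```python
import struct


def utf16_code_units(text):
    """Return the 16-bit UTF-16 code units for text (hypothetical helper).

    This roughly mirrors what a narrow Python 2 build stores internally.
    """
    raw = text.encode("utf-16-le")
    # Two bytes per code unit, little-endian unsigned shorts.
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))


s = "hi \U0001F3C8"  # "hi " followed by U+1F3C8 (the football emoji)

# On Python 3.3+, len() counts code points: h, i, space, emoji.
print(len(s))

# As UTF-16, the emoji needs two code units (a surrogate pair),
# which is why a narrow build reports a length of 5.
units = utf16_code_units(s)
print(len(units))
print([hex(u) for u in units[-2:]])
```

On Python 3.3+ iterating over `s` yields the emoji as a single character, which matches what the commenters report: switching interpreters is the simplest fix.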
