I am trying to get the span of selected words in a string. When working with the İ character, I noticed the following behavior of Python:
    len("İ")
    Out[39]: 1

    len("İ".lower())
    Out[40]: 2

    # when `upper()` is applied, the length stays the same
    len("İ".lower().upper())
    Out[41]: 2

Why does the length of the uppercase and lowercase forms of the same character differ? That seems very confusing and undesirable to me.
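One way to see what `lower()` actually returns is to list the code points of the result. The following is a minimal sketch, assuming Python 3; the helper name `describe` is made up for illustration, and `İ` is written as the escape `\u0130` so the example does not depend on how the character was typed or normalized.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]

dotted_capital_i = "\u0130"  # İ, LATIN CAPITAL LETTER I WITH DOT ABOVE

# The original string is a single code point.
print(describe(dotted_capital_i))

# The lowercased string is a plain "i" followed by a combining dot above.
print(describe(dotted_capital_i.lower()))

# Uppercasing that result maps "i" to "I" and keeps the combining mark.
print(describe(dotted_capital_i.lower().upper()))
```

This suggests the extra length comes from a combining character appended during case mapping, which would also explain why `upper()` does not bring the length back to 1.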
Does anyone know if there are other characters for which that will happen? Thank you!
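To look for other affected characters, one option is a brute-force scan over every code point, comparing the length before and after case mapping. This is only a sketch, assuming Python 3; `length_changing` is a hypothetical helper name, and the exact results depend on the Unicode version bundled with the interpreter, so no counts are shown.

```python
import sys

def length_changing(case_func):
    """Collect characters whose case-mapped form is not exactly one code point long."""
    return [
        chr(cp)
        for cp in range(sys.maxunicode + 1)
        if len(case_func(chr(cp))) != 1
    ]

lower_changers = length_changing(str.lower)
upper_changers = length_changing(str.upper)

# How many characters change length under each mapping on this interpreter.
print(len(lower_changers), len(upper_changers))

# A well-known uppercase example: the German sharp s expands to two letters.
print("ß".upper())
```

When computing word spans, a consequence is that offsets taken from a lowercased copy of a string may not line up with the original, so it is safer to compute spans on the unmodified text.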
EDIT:
On the other hand, for Î, for example, the length stays the same:
    len('Î')
    Out[42]: 1

    len('Î'.lower())
    Out[43]: 1