  • 2
    Juan: Do you mean that std::string can hold all Unicode characters but will report the length incorrectly? Is there a reason it reports an incorrect length? Commented Dec 31, 2008 at 4:35
  • 4
    When using the UTF-8 encoding, a single Unicode character may be made up of multiple bytes. This is why UTF-8 is smaller when the text uses mostly characters from the standard ASCII set. You need to use special functions (or roll your own) to count the number of Unicode characters. Commented Dec 31, 2008 at 4:39
  • 2
    (Windows specific) Most functions will expect that a string of 1-byte characters is ASCII and a string of 2-byte characters is Unicode (MBCS in older versions). This means that if you are storing UTF-8, you will have to convert it to UTF-16 before calling a standard Windows function (unless you are only using the ASCII portion). Commented Dec 31, 2008 at 4:58
  • 3
    Not only will a std::string report the length incorrectly, but it will also output the wrong string. If some Unicode character is represented in UTF-8 as multiple bytes, each of which std::string treats as its own character, then your typical std::string manipulation routines will probably output several strange characters that result from misinterpreting the one correct character. Commented Dec 15, 2013 at 17:01
  • 2
    I suggest changing the answer to indicate that strings should be thought of as only containers of bytes, and, if the bytes are some Unicode encoding (UTF-8, UTF-16, ...), then you should use specific libraries that understand that. The standard string-based APIs (length, substr, etc.) will all fail miserably with multibyte characters. If this update is made, I will remove my downvote. Commented Oct 7, 2014 at 14:19
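
To illustrate the points raised in the comments above, here is a minimal sketch that treats a std::string as a container of bytes assumed to hold valid UTF-8. The helper names `utf8_code_points` and `utf8_prefix` are hypothetical and not part of the standard library. The sketch counts code points, which is not the same as user-perceived characters (combining sequences span several code points), so a dedicated Unicode library is still the better choice for real text handling.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a string that is
// ASSUMED to contain valid UTF-8. Continuation bytes look like 10xxxxxx,
// so every byte that is not a continuation byte starts a new code point.
std::size_t utf8_code_points(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}

// Hypothetical helper: returns the first n code points of s without
// cutting a multibyte sequence in half (again assuming valid UTF-8).
std::string utf8_prefix(const std::string& s, std::size_t n) {
    std::size_t i = 0;     // byte index into s
    std::size_t seen = 0;  // code points fully included so far
    while (i < s.size()) {
        bool starts_code_point =
            (static_cast<unsigned char>(s[i]) & 0xC0) != 0x80;
        if (starts_code_point) {
            if (seen == n) {
                break;  // the next code point would exceed n
            }
            ++seen;
        }
        ++i;
    }
    return s.substr(0, i);
}
```

For the UTF-8 bytes of "héllo" (written as `"h\xC3\xA9llo"`), `size()` reports 6 bytes while `utf8_code_points` reports 5. A plain `substr(0, 2)` splits the two-byte "é" and leaves a dangling lead byte, whereas `utf8_prefix(s, 2)` keeps the sequence intact.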