A continuation on C++ and UTF8 - Why not just replace ASCII?

Why is there no std::ustring which could replace both std::string, std::wstring in new applications?

With corresponding support in the standard library, of course, similar to how boost::filesystem3::path doesn't care about the string representation and works with both std::string and std::wstring.

2 Answers

Why would you replace anything?

string and wstring are the string classes corresponding to char and wchar_t. In the context of interfacing with the environment, these are meant to carry data encoded in, respectively, "the system's narrow multibyte representation" and "the system's fixed-width encoding".

On the other hand, the u8/u/U literal prefixes, the char16_t and char32_t types, and the corresponding string classes are intended for storing Unicode code point sequences encoded in UTF-8/16/32.
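To make the distinction concrete, here is a minimal sketch of how the same code points occupy a different number of code units in each encoding. The helper names (code_units, euro16, and so on) are hypothetical and exist only for this illustration. The UTF-8 bytes are spelled out by hand because the type of u8"" literals changed from char to char8_t in C++20.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: number of code units in a string of any character type.
template <typename CharT>
std::size_t code_units(const std::basic_string<CharT>& s) {
    return s.size();
}

// U+20AC (EURO SIGN) lies in the Basic Multilingual Plane.
// U+1F600 lies outside it and needs a surrogate pair in UTF-16.
inline std::u16string euro16()  { return u"\u20AC"; }
inline std::u32string euro32()  { return U"\u20AC"; }
inline std::u16string emoji16() { return u"\U0001F600"; }
inline std::u32string emoji32() { return U"\U0001F600"; }

// The UTF-8 encoding of U+20AC, written as explicit bytes.
inline std::string euro8() { return "\xE2\x82\xAC"; }

// Checks the code unit counts for each encoding.
void demo_code_units() {
    assert(code_units(euro8())   == 3);  // three UTF-8 bytes
    assert(code_units(euro16())  == 1);  // one UTF-16 code unit
    assert(code_units(euro32())  == 1);  // one UTF-32 code unit
    assert(code_units(emoji16()) == 2);  // surrogate pair
    assert(code_units(emoji32()) == 1);  // still one UTF-32 code unit
}
```

Note that size() counts code units, not code points or user-perceived characters, in all of these classes.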

The latter is a separate problem domain from the former. The standard doesn't contain a mechanism to bridge the two domains, and a library such as iconv is typically required to make this bridge portable, e.g. by transcoding between WCHAR_T and UTF-32.
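As a rough sketch of what such a bridge might look like with POSIX iconv, the function below transcodes UTF-8 into the system's wchar_t encoding. Several things here are assumptions: the function name utf8_to_wide is made up for this example, the "WCHAR_T" encoding name is accepted by glibc and GNU libiconv but is not guaranteed elsewhere, and some platforms declare the input buffer parameter of iconv() as const char** rather than char**.

```cpp
#include <iconv.h>

#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: transcode UTF-8 into the system's wchar_t encoding.
std::wstring utf8_to_wide(const std::string& utf8) {
    // Assumption: the iconv implementation knows the name "WCHAR_T".
    iconv_t cd = iconv_open("WCHAR_T", "UTF-8");
    if (cd == (iconv_t)-1) throw std::runtime_error("iconv_open failed");

    // iconv wants mutable buffers, so copy the input.
    std::vector<char> in(utf8.begin(), utf8.end());
    // Every output unit consumes at least one input byte, so this is enough room.
    std::vector<wchar_t> out(utf8.size() + 1);

    char* in_ptr = in.data();
    std::size_t in_left = in.size();
    char* out_ptr = reinterpret_cast<char*>(out.data());
    std::size_t out_left = out.size() * sizeof(wchar_t);

    std::size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);
    if (rc == (std::size_t)-1) throw std::runtime_error("iconv failed");

    // Number of wchar_t units actually written.
    std::size_t produced = out.size() - out_left / sizeof(wchar_t);
    return std::wstring(out.data(), produced);
}

// Checks that plain ASCII round-trips to the matching wide string.
void demo_utf8_to_wide() {
    assert(utf8_to_wide("A") == L"A");
}
```

A production version would also need to handle partial input and E2BIG by converting in chunks; this sketch just throws on any failure.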

Here's my standard list of related questions: #1, #2, #3

There are std::u16string and std::u32string. The standard library facilities where you might want to use these, e.g. to name a file to open with fstream, aren't going to be changed to accept them, because they really can't be. For example, some platforms take an almost arbitrary byte string as the name of a file to open, with no specified encoding. Forcing that name through a string type with a specific encoding would break things and be incompatible.
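To illustrate the file name point, here is a small sketch of a UTF-8 well-formedness check. The function name is_well_formed_utf8 and the sample names are made up for this example. A name such as "caf\xE9.txt" (Latin-1 bytes) is acceptable to many POSIX file systems, yet it is not well-formed UTF-8, so it could not be passed losslessly through a string type that promises a Unicode encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: true if `s` is well-formed UTF-8.
bool is_well_formed_utf8(const std::string& s) {
    // Smallest code point that legitimately needs a sequence of each length.
    static const unsigned long min_cp[] = {0, 0, 0x80, 0x800, 0x10000};
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char b0 = static_cast<unsigned char>(s[i]);
        std::size_t len;
        unsigned long cp;
        if (b0 < 0x80)                { len = 1; cp = b0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
        else return false;                      // stray continuation or invalid lead byte
        if (i + len > s.size()) return false;   // truncated sequence
        for (std::size_t k = 1; k < len; ++k) {
            unsigned char b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        if (cp < min_cp[len]) return false;              // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate code point
        if (cp > 0x10FFFF) return false;                 // beyond Unicode range
        i += len;
    }
    return true;
}

// Checks a few sample file names (all hypothetical).
void demo_file_names() {
    assert(is_well_formed_utf8("report.txt"));
    assert(is_well_formed_utf8("\xE2\x82\xAC.txt"));  // U+20AC in UTF-8
    assert(!is_well_formed_utf8("caf\xE9.txt"));      // Latin-1 byte, not UTF-8
}
```

std::string carries all three names unchanged, which is exactly why fstream keeps taking plain byte strings.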

1 Comment

These are available only in C++11/C++0x, which not all compilers or OSes fully support yet.
