Revisions to What is the difference between "UTF-16" and "std::wstring"?

Improved grammar and punctuation. Added links to 'UTF-16', 'BMP' and 'surrogate pairs'.

edited Apr 27, 2015 at 13:45


std::wstring is a container of wchar_t. The size of wchar_t is implementation-defined: Windows compilers tend to use a 16-bit type, Unix compilers a 32-bit type.
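
As a rough illustration of the point above, the following sketch checks at compile time that std::wstring is nothing more than a string of wchar_t, and reports how wide wchar_t is on the current platform. The helper name wchar_bits is made up for this example and is not part of any library.

```cpp
#include <climits>      // CHAR_BIT
#include <cstddef>      // std::size_t
#include <string>
#include <type_traits>

// std::wstring is only an alias for std::basic_string<wchar_t>;
// nothing in the type says which encoding the elements use.
static_assert(std::is_same<std::wstring, std::basic_string<wchar_t>>::value,
              "std::wstring is std::basic_string<wchar_t>");

// Hypothetical helper: width of wchar_t in bits on this platform.
// Typically 16 with Windows compilers and 32 with Unix compilers,
// but the standard leaves the size to the implementation.
constexpr std::size_t wchar_bits() {
    return sizeof(wchar_t) * CHAR_BIT;
}
```

Printing wchar_bits() on a few different toolchains is a quick way to see the difference for yourself.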

UTF-16 is a way of encoding sequences of Unicode code points as sequences of 16-bit integers (code units).
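
To make the encoding concrete, here is a minimal sketch of how a single code point maps to one or two 16-bit integers under UTF-16. The function name encode_utf16 is invented for this example; the arithmetic follows the standard surrogate-pair scheme.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: encode one Unicode code point as UTF-16.
// Code points inside the BMP take one 16-bit unit; code points above
// U+FFFF are split into a high and a low surrogate.
std::vector<std::uint16_t> encode_utf16(std::uint32_t cp) {
    // Precondition: a valid scalar value (in range, not itself a surrogate).
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));

    if (cp < 0x10000) {
        return { static_cast<std::uint16_t>(cp) };
    }
    cp -= 0x10000;  // leaves a 20-bit value
    return {
        static_cast<std::uint16_t>(0xD800 + (cp >> 10)),    // high surrogate: top 10 bits
        static_cast<std::uint16_t>(0xDC00 + (cp & 0x3FF))   // low surrogate: bottom 10 bits
    };
}
```

For example, U+20AC (inside the BMP) encodes to the single unit 0x20AC, while U+1F600 (outside the BMP) encodes to the pair 0xD83D, 0xDE00.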

In Visual Studio, wide character literals (e.g. L"Hello World") that contain no characters outside the BMP end up as UTF-16, but otherwise the two concepts are mostly unrelated. If you use characters outside the BMP, std::wstring will not translate surrogate pairs into Unicode code points for you, even if wchar_t is 16 bits: it stores and counts each half of the pair as a separate element.
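
If you do need code points, you have to decode the surrogate pairs yourself (or use a library that does). The sketch below is one way to do it, assuming the elements of the std::wstring hold UTF-16 code units; decode_utf16 is a made-up name, and unpaired surrogates are simply passed through unchanged rather than reported as errors.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: decode a std::wstring into Unicode code points,
// ASSUMING its elements are UTF-16 code units. std::wstring itself has
// no idea about this; size() and operator[] work on raw elements.
std::vector<std::uint32_t> decode_utf16(const std::wstring& s) {
    std::vector<std::uint32_t> out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const std::uint32_t u = static_cast<std::uint32_t>(s[i]);
        const bool is_high = (u >= 0xD800 && u <= 0xDBFF);
        if (is_high && i + 1 < s.size()) {
            const std::uint32_t lo = static_cast<std::uint32_t>(s[i + 1]);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                // Combine the two 10-bit halves and restore the 0x10000 offset.
                out.push_back(0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00));
                ++i;  // the low surrogate has been consumed
                continue;
            }
        }
        out.push_back(u);  // BMP code point, or an unpaired surrogate left as-is
    }
    return out;
}
```

A std::wstring holding the two units 0xD83D and 0xDE00 reports size() == 2, while decode_utf16 returns the single code point U+1F600.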

edited body

edited Nov 23, 2010 at 12:22

JoeG


std::wstring is a container of wchar_t. The size of wchar_t is not specified - Windows compilers tend to use a 16 bit type, Unix compilers a 32 bit type.

UTF-16 is a way of encoding sequences of unicode code points in sequences of 16 bit integers.

If you use wide character literals (e.g. L"Hello World") that contain no characters outside of the BMP in Visual Studio you'll end up with UTF-16, but mostly the two concepts are unrelated. If you use characters outside the BMP, std::wstring will not translate surrogate pairs into unicode code points for you, even if wchar_t is 16 bits.

answered Nov 22, 2010 at 15:50

JoeG


std::wstring is a container of wchar_t. The size of wchar_t is not specified - Windows compilers tend to use a 16 bit type, Unix compilers a 32 bit type.

UTF-16 is a way of encoding sequences of unicode code points in sequences of 16 bit integers.

If you use wide character literals (e.g. L"Hello World") that contain no characters outside of the BMP in Visual Studio you'll end up with UTF-16, but mostly the two concepts are unrelated. If you use characters outside the BMP, std::wstring will not translate surrogate pairs into unicode code poitns for you, even if wchar_t is 16 bits.
