Revisions to Compare std::wstring and std::string

added a note about ICU

edited Aug 21, 2011 at 22:56

Think twice before doing this: you might not want to compare them in the first place. If you are sure you do and you are on Windows, convert the string to a wstring with MultiByteToWideChar, then compare the two with CompareStringEx.

If you are not using Windows, the analogous functions are mbstowcs and wcscmp. The standard wide-character C functions are not always a drop-in choice under Windows, though; for instance, MSVC flags mbstowcs as deprecated in favor of mbstowcs_s.
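For the non-Windows route, a minimal sketch of the mbstowcs and wcscmp combination might look like the following. The helper names `widen` and `same_text` are made up for illustration. The sketch assumes the narrow string is encoded in the current C locale (as set by `setlocale`) and contains no embedded null characters. Note that wcscmp only compares code units; wcscoll is the locale-aware counterpart.

```cpp
#include <cstdlib>
#include <cwchar>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: converts a multibyte string to a wide string using
// the current C locale. Assumes no embedded null characters.
std::wstring widen(const std::string& narrow) {
    // A wide string never has more characters than the narrow one has bytes,
    // so size() + 1 leaves room for the terminating null.
    std::vector<wchar_t> buffer(narrow.size() + 1);
    std::size_t written =
        std::mbstowcs(buffer.data(), narrow.c_str(), buffer.size());
    if (written == static_cast<std::size_t>(-1))
        throw std::runtime_error("invalid multibyte sequence");
    return std::wstring(buffer.data(), written);
}

// Hypothetical helper: code-unit comparison after conversion.
bool same_text(const std::string& narrow, const std::wstring& wide) {
    return std::wcscmp(widen(narrow).c_str(), wide.c_str()) == 0;
}
```

Calling `same_text("hello", L"hello")` returns true for plain ASCII input in any locale; for anything else the result depends on the locale being set correctly first.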

A cross-platform way to work with Unicode is to use the ICU library.

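Take care to use dedicated functions for Unicode string comparison; don't do it manually. Two Unicode strings can consist of different code points and still represent the same text, for example "é" written as U+00E9 or as "e" followed by the combining accent U+0301.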
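To see why a manual comparison is not enough, here is a small sketch (the names are mine, not from any library) with two spellings of "é" that a plain code-unit comparison treats as different. It only demonstrates the problem; actually treating them as equal needs a Unicode-aware comparison such as CompareStringEx or ICU.

```cpp
#include <cwchar>
#include <string>

// "é" as a single precomposed code point (U+00E9).
const std::wstring precomposed = L"\u00E9";
// "é" as "e" followed by a combining acute accent (U+0301).
const std::wstring decomposed = L"e\u0301";

// Hypothetical helper: plain code-unit comparison, as wcscmp performs it.
bool code_units_equal(const std::wstring& a, const std::wstring& b) {
    return std::wcscmp(a.c_str(), b.c_str()) == 0;
}
```

Here `code_units_equal(precomposed, decomposed)` is false, even though both strings render as the same character.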

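    #include <windows.h>
    #include <string>
    #include <vector>

    using namespace std;

    // Converts a string in the system ANSI code page to UTF-16.
    wstring ConvertToUnicode(const string & str)
    {
        if (str.empty())
            return wstring(); // MultiByteToWideChar fails on zero-length input
        UINT  codePage = CP_ACP;
        DWORD flags    = 0;
        int   length   = static_cast<int>(str.length());
        // First call: ask for the required buffer size in wide characters.
        int resultSize = MultiByteToWideChar
            ( codePage     // CodePage
            , flags        // dwFlags
            , str.c_str()  // lpMultiByteStr
            , length       // cbMultiByte
            , NULL         // lpWideCharStr
            , 0            // cchWideChar
            );
        vector<wchar_t> result(resultSize + 1);
        // Second call: perform the conversion into the buffer.
        MultiByteToWideChar
            ( codePage     // CodePage
            , flags        // dwFlags
            , str.c_str()  // lpMultiByteStr
            , length       // cbMultiByte
            , &result[0]   // lpWideCharStr
            , resultSize   // cchWideChar
            );
        // Pass the length explicitly so embedded nulls do not truncate the result.
        return wstring(&result[0], resultSize);
    }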

word choice

edited Aug 21, 2011 at 21:53

Don Reba

added code sample

edited Aug 21, 2011 at 21:46

Don Reba

added a comparison disclaimer

edited Aug 21, 2011 at 21:35

Don Reba

answered Aug 21, 2011 at 21:27

Don Reba
