How to convert UTF-16 to UTF-8 using C++?

Question

I already know about 'codecvt', 'WideCharToMultiByte', and similar approaches.

I use the Korean language, for example '안녕하세요'.

A message like this can be stored in a normal string class, right?

But in my case, I have a file 'test.txt' that contains '안녕하세요'.

I read 'test.txt' with getline():

ifstream file("test.txt"); string temp; getline(file, temp); cout << temp;

When I print it with cout, the message is broken!

I know this is a wide-character problem, so I tried the MultiByteToWideChar method.

OK, that works well.

But that is not what I want.

What I ultimately want is to read wide-character files and store the contents in a 'string' variable.

So my question is:

How do I convert UTF-16 (wide character/wstring) to UTF-8 (multibyte/string) without changing the message?

I want something like this:

wstring temp = L"안녕하세요";

string temp2 = convert_to_string(temp);

->

string temp2 = "안녕하세요";

Not exactly a duplicate, but this answer may be what you want: stackoverflow.com/questions/52703630/…
– Galik, Commented Dec 14, 2018 at 14:53

Community · Accepted Answer · 2021-10-07 11:02:37Z

As mentioned in the comment, see "Convert C++ std::string to UTF-16-LE encoded string" for code that does the conversion.

But since you already have a wstring holding your Korean string, you avoid the trouble of distinguishing UTF-16-LE from UTF-16-BE, and you can readily find the Unicode code point of each Korean character in the string. So your problem boils down to finding the UTF-8 representation of any code point. That is not hard; see page 3 of https://www.rfc-editor.org/rfc/rfc3629 (also Wikipedia: https://en.wikipedia.org/wiki/UTF-8).

Sample code is in "Convert Unicode code points to UTF-8 and UTF-32".
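The approach described in this answer could be sketched roughly as follows. This is only a minimal illustration, not the code from the linked posts. It assumes the input is held in a std::u16string (on Windows a wstring holds the same UTF-16 code units, but wchar_t is 32 bits on many other platforms). The names append_utf8 and utf16_to_utf8 are hypothetical, and replacing unpaired surrogates with U+FFFD is a choice made for this sketch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: append the UTF-8 bytes of one code point,
// following the bit layout given in RFC 3629.
void append_utf8(std::string& out, char32_t cp) {
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical converter: UTF-16 code units in, UTF-8 bytes out.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate, if any.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;  // unpaired high surrogate (sketch's choice)
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;      // stray low surrogate (sketch's choice)
        }
        append_utf8(out, cp);
    }
    return out;
}
```

With this sketch, a call such as utf16_to_utf8(u"\uC548") (the syllable '안') would produce the three bytes EC 95 88. Hangul syllables all sit below U+10000, so each one is a single UTF-16 code unit and becomes three UTF-8 bytes; the surrogate branch only matters for characters outside that range.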
