
The objective is to print "Uni Würzburg" using C++.

The code I am using:

#include <stdio.h>
using namespace std;

int main() {
    char str0[21] = "Uni Würzburg";
    printf("%s\n", str0);

    char str1[21] = {85,110,105,32,87,'\xc3','\xbc',114,122,98,117,114,103, 0};
    printf("%s\n", str1);

    char str2[20] = "Uni W\x81rzburg";
    printf("%s\n", str2);

    char str3[20] = {85,110,105,32,87,'\x81',114,122,98,117,114,103, 0};
    printf("%s\n", str3);

    return 0;
}

I got the \xc3\xbc bytes by creating a "ü" string and printing its characters.

Output on two different Macs (using both CLion and in bash using g++ test.c -o test):

Uni Würzburg
Uni Würzburg
Uni W�rzburg
Uni W�rzburg

Output on Windows (CLion):

Uni W├╝rzburg
Uni W├╝rzburg
Uni Würzburg
Uni Würzburg

CLion editor and project encodings are in all cases set to UTF-8 and the locale of bash is:

LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=

Why is this happening? Most importantly: What would be a platform independent solution?


1 Answer


There are Unicode string literals that can be used to ensure that your string is encoded as UTF-8:

u8"my_string"

On Linux, your normal string literals will usually already be UTF-8.

On Windows it really depends on your code page. With MSVC you can also pass an additional compiler flag so that the source file is read as UTF-8: /source-charset:utf-8

Note that even if your strings are encoded as UTF-8, printing them (with cout or printf) to a Windows console that uses a non-Unicode code page will still produce the wrong output.
