There are two issues at play here. The first is what characters are allowed in C++ code (and comments), such as variable names. The second is what characters are allowed in strings and string literals.
As noted, C++ compilers must support a very restricted ASCII-based character set (the basic source character set) for the characters allowed in code and comments. In practice, this character set didn't work very well with some European character sets, and especially with some European keyboards that lacked a few characters such as square brackets, so digraphs and trigraphs were introduced. (Trigraphs were later removed in C++17; digraphs remain.) Many compilers now accept more than this character set, but there is no guarantee.
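As a small illustration of the digraphs mentioned above, here is a sketch in which `<:` `:>` stand in for square brackets and `<%` `%>` for braces. The function name third_element is made up for this example.

```cpp
// Digraphs are alternative spellings for tokens missing from some keyboards.
//   <:  :>  mean  [  ]
//   <%  %>  mean  {  }
inline int third_element()
<%
    int values<:3:> = <%10, 20, 30%>;  // same as: int values[3] = {10, 20, 30};
    return values<:2:>;                // same as: return values[2];
%>
```

A conforming compiler treats this exactly like the version written with ordinary brackets and braces.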
As for strings and string literals, C++ has the concept of a wide character (wchar_t) and a wide character string. However, the encoding for that character set is implementation-defined. In practice it's almost always Unicode (typically UTF-16 on Windows, where wchar_t is 16 bits, and UTF-32 on most Unix-like systems), but I don't think there's any guarantee here. Wide character string literals look like L"string literal", and these can be assigned to std::wstring objects.
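A minimal sketch of a wide string literal assigned to a std::wstring follows. The helper names make_greeting and wide_char_bytes are hypothetical, and the non-ASCII characters are written as universal-character-names so the example does not depend on how the compiler decodes the source file.

```cpp
#include <cstddef>
#include <string>

// A wide string literal has type const wchar_t[N] and converts to std::wstring.
inline std::wstring make_greeting() {
    return L"gr\u00fc\u00dfe";  // five characters: g, r, U+00FC, U+00DF, e
}

// The size of wchar_t is implementation-defined:
// commonly 2 bytes on Windows and 4 bytes on Linux and macOS.
inline std::size_t wide_char_bytes() {
    return sizeof(wchar_t);
}
```

Both non-ASCII characters here fit in a single wchar_t on either kind of platform, so the string length is the same everywhere. That would not hold for characters outside the Basic Multilingual Plane on a platform with a 16-bit wchar_t.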
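C++11 added explicit support for Unicode strings and string literals: u8"..." for UTF-8, u"..." for UTF-16 (element type char16_t, string type std::u16string) and U"..." for UTF-32 (element type char32_t, string type std::u32string). The UTF-16 and UTF-32 code units are stored in the platform's native byte order; there are no separate big-endian and little-endian literal forms.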
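The C++11 literal prefixes can be sketched as follows. The helper names utf16_sample and utf32_sample are made up for illustration. The u8 case is checked with sizeof rather than assigned to a std::string, because the element type of u8 literals changed from char to char8_t in C++20.

```cpp
#include <string>

// u8"" literal: UTF-8 code units. U+00B5 (micro sign) takes two of them,
// and the literal also includes the terminating null.
static_assert(sizeof(u8"\u00b5") == 3, "micro sign is two UTF-8 code units");

// u"" literal: char16_t code units in the platform's native byte order.
// U+1F600 lies outside the BMP, so it is stored as a surrogate pair.
inline std::u16string utf16_sample() {
    return u"\u00b5 \U0001F600";
}

// U"" literal: one char32_t per code point.
inline std::u32string utf32_sample() {
    return U"\u00b5 \U0001F600";
}
```

The same three code points produce four UTF-16 code units but only three UTF-32 code units, which is worth remembering when calling size() on these strings.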
I had µ show up as Âµ in my logs. I suspected that GNU g++ had assumed ISO-8859-1 source and double-encoded the two-byte sequence for that one character in the binary. Actually, it understood that the source was UTF-8, based on the locale, and the log contained the correct two-byte sequence. The real cause was that another part of the log contained stray bytes, which introduced byte sequences that are not valid UTF-8. Emacs therefore concluded that the file was most likely ISO-8859-1 and displayed each two-byte character as two separate characters. Fixing those stray bytes fixed the problem.
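One way to locate stray bytes like these is to scan the log for the first byte that does not belong to a well-formed UTF-8 sequence. The following is only a sketch, not the method used above; the function name first_invalid_utf8 is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Returns the offset of the first byte that is not part of a well-formed
// UTF-8 sequence, or std::string::npos if the whole buffer is valid UTF-8.
inline std::size_t first_invalid_utf8(const std::string& bytes) {
    const std::size_t n = bytes.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        if (lead < 0x80) { ++i; continue; }  // plain ASCII byte

        std::size_t len = 0;       // total length of the sequence
        unsigned long min_cp = 0;  // smallest code point allowed at this length
        unsigned long cp = 0;      // decoded code point
        if ((lead & 0xE0) == 0xC0)      { len = 2; min_cp = 0x80;    cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { len = 3; min_cp = 0x800;   cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { len = 4; min_cp = 0x10000; cp = lead & 0x07; }
        else return i;  // stray continuation byte or invalid lead byte

        if (len > n - i) return i;  // sequence truncated at end of buffer
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) return i;  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        // Reject overlong encodings, surrogates and values above U+10FFFF.
        if (cp < min_cp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return i;
        i += len;
    }
    return std::string::npos;
}
```

For example, the UTF-8 encoding of µ (bytes 0xC2 0xB5) passes, while a lone 0xB5, which is how ISO-8859-1 encodes µ, is reported at its offset.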