4

I have been writing in C++ since 2010.

I just accidentally typed the letter “й” in my code, hovered the mouse over it to remove it, and noticed that Visual Studio simply reports that there is no variable “й”.

I wrote int й = 1; and it just compiled!

What did I miss?


I bet it’s probably a feature of C++11, C++14, or something like that.

2
  • 1
    No, that's Visual Studio accepting special characters. GCC doesn't even accept é, never mind й. Commented Mar 28, 2020 at 12:00
  • GCC has poor support for non-ASCII characters. Commented Mar 28, 2020 at 12:09

3 Answers

3

Here's what the standard says ([lex.phases]):

Physical source file characters are mapped, in an implementation-defined manner, to the basic source character set (introducing new-line characters for end-of-line indicators) if necessary. The set of physical source file characters accepted is implementation-defined.

So your particular implementation supports that, but it's not guaranteed to be portable to any other implementation.


2

If you look at Annex E of this paper, you can see that certain Unicode ranges are allowed in identifiers, which includes variable names. These ranges are:

00A8, 00AA, 00AD, 00AF, 00B2-00B5, 00B7-00BA, 00BC-00BE, 00C0-00D6, 00D8-00F6, 00F8-00FF
0100-167F, 1681-180D, 180F-1FFF
200B-200D, 202A-202E, 203F-2040, 2054, 2060-206F
2070-218F, 2460-24FF, 2776-2793, 2C00-2DFF, 2E80-2FFF
3004-3007, 3021-302F, 3031-303F
3040-D7FF
F900-FD3D, FD40-FDCF, FDF0-FE44, FE47-FFFD
10000-1FFFD, 20000-2FFFD, 30000-3FFFD, 40000-4FFFD, 50000-5FFFD, 60000-6FFFD, 70000-7FFFD, 80000-8FFFD, 90000-9FFFD, A0000-AFFFD, B0000-BFFFD, C0000-CFFFD, D0000-DFFFD, E0000-EFFFD
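To check a particular character against these ranges, a small lookup table can help. What follows is a minimal sketch, not something taken from the paper: the names `Range`, `kAllowedRanges`, and `is_allowed_in_identifier` are hypothetical, and the table only transcribes the ranges listed in this answer. It covers extended characters only; ASCII letters, digits, and the underscore belong to the basic source character set and would need separate handling.

```cpp
#include <cstddef>

// Hypothetical helper type: an inclusive range of Unicode code points.
struct Range {
    char32_t first;
    char32_t last;
};

// Transcribed from the ranges quoted in this answer (illustrative only).
constexpr Range kAllowedRanges[] = {
    {0x00A8, 0x00A8}, {0x00AA, 0x00AA}, {0x00AD, 0x00AD}, {0x00AF, 0x00AF},
    {0x00B2, 0x00B5}, {0x00B7, 0x00BA}, {0x00BC, 0x00BE}, {0x00C0, 0x00D6},
    {0x00D8, 0x00F6}, {0x00F8, 0x00FF},
    {0x0100, 0x167F}, {0x1681, 0x180D}, {0x180F, 0x1FFF},
    {0x200B, 0x200D}, {0x202A, 0x202E}, {0x203F, 0x2040}, {0x2054, 0x2054},
    {0x2060, 0x206F},
    {0x2070, 0x218F}, {0x2460, 0x24FF}, {0x2776, 0x2793}, {0x2C00, 0x2DFF},
    {0x2E80, 0x2FFF},
    {0x3004, 0x3007}, {0x3021, 0x302F}, {0x3031, 0x303F},
    {0x3040, 0xD7FF},
    {0xF900, 0xFD3D}, {0xFD40, 0xFDCF}, {0xFDF0, 0xFE44}, {0xFE47, 0xFFFD},
    {0x10000, 0x1FFFD}, {0x20000, 0x2FFFD}, {0x30000, 0x3FFFD},
    {0x40000, 0x4FFFD}, {0x50000, 0x5FFFD}, {0x60000, 0x6FFFD},
    {0x70000, 0x7FFFD}, {0x80000, 0x8FFFD}, {0x90000, 0x9FFFD},
    {0xA0000, 0xAFFFD}, {0xB0000, 0xBFFFD}, {0xC0000, 0xCFFFD},
    {0xD0000, 0xDFFFD}, {0xE0000, 0xEFFFD},
};

constexpr std::size_t kRangeCount =
    sizeof(kAllowedRanges) / sizeof(kAllowedRanges[0]);

// Returns true if code point c lies inside any range in the table.
// Written as a single recursive return so it is also a valid C++11 constexpr
// function; i is the index of the range currently being examined.
constexpr bool is_allowed_in_identifier(char32_t c, std::size_t i = 0) {
    return i < kRangeCount &&
           ((kAllowedRanges[i].first <= c && c <= kAllowedRanges[i].last) ||
            is_allowed_in_identifier(c, i + 1));
}

// Compile-time checks against the table above.
static_assert(is_allowed_in_identifier(0x0439),
              "U+0439 (Cyrillic short i) lies in 0100-167F");
static_assert(is_allowed_in_identifier(0x00E9),
              "U+00E9 (e with acute) lies in 00D8-00F6");
static_assert(!is_allowed_in_identifier(0x00D7),
              "U+00D7 (multiplication sign) sits between 00D6 and 00D8");
```

Under this table, й (U+0439) falls inside 0100-167F, which is consistent with a compiler accepting `int й = 1;`.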

1 Comment

What characterises these ranges? Main alphabets (if that is a thing)?
0

Well, according to MSDN, there doesn't seem to be any restriction on the Unicode characters used to define an identifier:

    struct テスト         // Japanese 'test'
    {
        void トスト() {}  // Japanese 'toast'
    };

    int main()
    {
        テスト \u30D1\u30F3;  // Japanese パン 'bread' in UCN form
        パン.トスト();        // Compiler recognizes UCN or literal form
    }

I am disappointed that cplusplus.com doesn't say anything about this.

3 Comments

Don't use cplusplus.com; use cppreference.com.
cplusplus.com doesn't exactly have a good reputation for accuracy or completeness.
@Shawn Many Russian-language sites say that identifiers must be exclusively Latin.
