
Are C-style macro names subject to the same naming rules as identifiers? After a compiler upgrade, it is now emitting this warning for a legacy application:

warning #3649-D: white space is required between the macro name "CHAR_" and its replacement text
#define CHAR_& 38

This line of code is defining an ASCII value constant for an ampersand.

#define DOL_SN 36
#define PERCENT 37
#define CHAR_& 38
#define RT_SING 39
#define LF_PAR 40

I assume that this definition (not actually referenced by any code, as far as I can tell) is buggy and should be changed to something like "CHAR_AMPERSAND"?

For anyone stumbling upon this: you can get the ASCII values of characters in C (on ASCII-based platforms) by using single quotes: '$', '%', '&'. Expressions like '&'+1 and '&'+2 work as well. Commented Dec 10, 2020 at 20:22
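A minimal sketch of what that comment suggests, assuming an ASCII execution character set. The names CHAR_DOLLAR, CHAR_PERCENT, CHAR_AMPERSAND and print_char_codes are made up for illustration and are not from the legacy code:

```c
#include <stdio.h>

/* Hypothetical names, chosen for illustration only.
   The numeric values are whatever the execution character set assigns;
   on ASCII-based platforms they are 36, 37 and 38. */
enum {
    CHAR_DOLLAR    = '$',
    CHAR_PERCENT   = '%',
    CHAR_AMPERSAND = '&'
};

/* Prints each character constant next to its numeric value. */
void print_char_codes(void)
{
    printf("'$' = %d\n", CHAR_DOLLAR);
    printf("'%%' = %d\n", CHAR_PERCENT);
    printf("'&' = %d\n", CHAR_AMPERSAND);
    printf("'&' + 1 = %d\n", CHAR_AMPERSAND + 1);
}
```

Character constants have type int in C, so they can be used anywhere an integer constant expression is allowed, including enum initializers as above.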

4 Answers


Macro names should only consist of alphanumeric characters and underscores, i.e. 'a-z', 'A-Z', '0-9', and '_', and the first character should not be a digit. Some preprocessors also permit the dollar sign character '$', but you shouldn't use it. Unfortunately I can't quote the C standard since I don't have a copy of it.

From the GCC documentation:

Preprocessing tokens fall into five broad classes: identifiers, preprocessing numbers, string literals, punctuators, and other. An identifier is the same as an identifier in C: any sequence of letters, digits, or underscores, which begins with a letter or underscore. Keywords of C have no significance to the preprocessor; they are ordinary identifiers. You can define a macro whose name is a keyword, for instance. The only identifier which can be considered a preprocessing keyword is defined. See Defined.

This is mostly true of other languages which use the C preprocessor. However, a few of the keywords of C++ are significant even in the preprocessor. See C++ Named Operators.

In the 1999 C standard, identifiers may contain letters which are not part of the “basic source character set”, at the implementation's discretion (such as accented Latin letters, Greek letters, or Chinese ideograms). This may be done with an extended character set, or the '\u' and '\U' escape sequences. The implementation of this feature in GCC is experimental; such characters are only accepted in the '\u' and '\U' forms and only if -fextended-identifiers is used.

As an extension, GCC treats '$' as a letter. This is for compatibility with some systems, such as VMS, where '$' is commonly used in system-defined function and object names. '$' is not a letter in strictly conforming mode, or if you specify the -$ option. See Invocation.
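To illustrate why the compiler complains about the line in the question: going by the warning text, the identifier ends at CHAR_ and the rest, "& 38", is taken as the replacement text. The sketch below writes that reading out with the white space added so it compiles cleanly. The helper name mask_with_char_ is hypothetical and only there to show the expansion:

```c
/* What the preprocessor effectively sees in "#define CHAR_& 38":
   the macro name is CHAR_, and "& 38" is its replacement text. */
#define CHAR_ & 38

/* Hypothetical helper: "value CHAR_" expands to "value & 38",
   a bitwise AND rather than the constant 38. */
int mask_with_char_(int value)
{
    return value CHAR_;
}
```

So the legacy definition never produced a usable constant for the ampersand, which fits with it not being referenced anywhere. Renaming it to something like CHAR_AMPERSAND gives a valid identifier.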


1 Comment

ISO/IEC 9899:TC3 6.10/1 specifies the grammar of preprocessing directives, making explicit that macro names share the exact same rules as identifiers. 6.4.2.1/1 specifies the rules for identifiers.

clang allows a lot of "crazy" characters, although I have struggled to find any rhyme or reason as to why some are allowed and others are not. For example:

#define 💩 ?: /// WORKS FINE
#define ■ @end /// WORKS FINE
#define 🅺 @interface /// WORKS FINE
#define P @protocol /// WORKS FINE

yet

#define ☎ TEL /// ERROR: Macro name must be an identifier.
#define ❌ NO /// ERROR: Macro name must be an identifier.
#define ⇧ UP /// ERROR: Macro name must be an identifier.
#define 〓 == /// ERROR: Macro name must be an identifier.
#define 🍎 APPLE /// ERROR: Macro name must be an identifier.

Who knows. I'd love to, but Google has failed me so far. Any insight on the subject would be appreciated™️.

2 Comments

Have a look at the clang documentation. If it isn't documented there, check the source code, since clang is open source. By the C standard, though, I can tell you all of them are invalid. And since the question isn't asking about any particular environment but about plain C, this would fit better as a separate question than as an answer, and it should probably be flagged as such. It would even be a pretty good one. I'm going to ask it if you aren't going to.

You're right, the same rules apply to macros and identifiers as far as the names are concerned: valid characters are [A-Za-z0-9_], and the first character must not be a digit.

It's common practice to use CAPITALIZED names to differentiate macros from other identifiers, such as variable and function names.
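As a rough sketch of the [A-Za-z0-9_] rule, here is a small checker. The function name is_basic_macro_name is hypothetical, and it deliberately ignores extensions such as '$' and universal character names. It assumes the default "C" locale, where isalpha and isalnum accept only the basic Latin letters and digits:

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if name is non-empty, starts with a
   letter or underscore, and otherwise contains only letters, digits and
   underscores. Compiler extensions ('$', \u escapes) are not handled. */
bool is_basic_macro_name(const char *name)
{
    if (name == NULL || name[0] == '\0')
        return false;

    /* The first character must not be a digit. */
    if (!(isalpha((unsigned char)name[0]) || name[0] == '_'))
        return false;

    /* Remaining characters: letters, digits or underscore. */
    for (size_t i = 1; name[i] != '\0'; ++i) {
        if (!(isalnum((unsigned char)name[i]) || name[i] == '_'))
            return false;
    }
    return true;
}
```

With this check, "CHAR_AMPERSAND" passes while "CHAR_&" from the question is rejected because of the '&'.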



The same rules that specify valid identifiers for variable names apply to macro names, with the exception that macros may have the same names as keywords. Identifier names consist of digits and non-digits and must not start with a digit. Non-digits include the uppercase letters A-Z, the lowercase letters a-z, the underscore, and any implementation-defined characters.

1 Comment

"with the exception that macros may have the same names as keywords": yes, but there is one more exception. A macro may not be named after the preprocessor keyword defined.
