Could you please give me an example of writing a custom gcc preprocessor?

My goal is to replace SID("foo") alike macros with appropriate CRC32 computed values. For any other macro I'd like to use the standard cpp preprocessor.

It looks like it's possible to achieve this goal using -no-integrated-cpp -B options, however I can't find any simple example of their usage.

  • This is not answering your question, but have you considered writing a support script that takes your template file (e.g. example.c.tmpl), computes the CRC and substitutes it into an output file (e.g. example.c) as part of your make process? Commented Aug 23, 2010 at 8:36
  • In my situation this is inconvenient since I don't have a special template; these SID macros can be anywhere in the source code. Of course, I can add a custom Make target which processes all *.cpp sources with a simple script that rewrites the SID macros and then passes the mangled sources to gcc, but I think using a custom preprocessor could be more elegant. Commented Aug 23, 2010 at 11:13
  • Have you tried the standard way: #undef SID #define SID? Commented Aug 28, 2010 at 7:29
  • I think you should re-read the initial post more carefully. This problem cannot be solved with the standard preprocessor. Commented Sep 7, 2010 at 8:03

2 Answers

Warning: dangerous and ugly hack. Close your eyes now.

You can hook in your own preprocessor by adding the '-no-integrated-cpp' and '-B' switches to the gcc command line. '-no-integrated-cpp' makes gcc run preprocessing as a separate pass instead of folding it into compilation, and '-B' names a directory that gcc searches for its subprograms before it uses its internal search path. A preprocessor invocation can be recognized by the '-E' option being passed to the 'cc1', 'cc1plus' or 'cc1obj' program (these are the C, C++ and Objective-C compilers). When there is no '-E' option, pass all the parameters through to the original program. When there is such an option, do your own preprocessing and pass the manipulated file to the original compiler.

It looks like this:

    > cat cc1
    #!/bin/sh
    echo "My own special preprocessor -- $@"
    /usr/lib/gcc/i486-linux-gnu/4.3/cc1 "$@"
    exit $?
    > chmod 755 cc1
    > gcc -no-integrated-cpp -B$PWD x.c
    My own special preprocessor -- -E -quiet x.c -mtune=generic -o /tmp/cc68tIbc.i
    My own special preprocessor -- -fpreprocessed /tmp/cc68tIbc.i -quiet -dumpbase x.c -mtune=generic -auxbase x -o /tmp/cc0WGHdh.s

This example calls the original preprocessor, but prints an additional message and the parameters. You can replace the script by your own preprocessor.
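The dispatch rule described above (intercept only when '-E' is present, otherwise pass everything through) could be sketched as follows. This is a minimal illustration, not a finished wrapper: `is_preprocess_call`, `dispatch` and the `preprocess` callback are hypothetical names, and the path to the real cc1 is assumed to be supplied by the caller.

```python
import subprocess


def is_preprocess_call(args):
    """Return True when the driver asks the front end to preprocess only."""
    return "-E" in args


def dispatch(args, real_cc1, preprocess):
    """Run the real front end, routing -E invocations through `preprocess`.

    `real_cc1` is the path to the genuine cc1 binary (assumed to be known).
    `preprocess` is a hypothetical callback that receives the argument list
    and returns the (possibly modified) list to hand to the real front end.
    """
    if is_preprocess_call(args):
        args = preprocess(list(args))
    # The wrapper's exit status should be the real compiler's exit status.
    return subprocess.call([real_cc1, *args])
```

A script named cc1 in the '-B' directory would call `dispatch(sys.argv[1:], ...)` and exit with the value it returns. With the two invocations from the transcript above, only the first one would reach the callback.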

The bad hack is over. You can open your eyes now.

5 Comments

Hm...correct me if I'm wrong but I thought using this approach I could replace SID macros, save the result to some temp file and then apply the standard preprocessor to this temp file. No?
@pachanga Yes, you need to extract the command line options for the input and output files, and write a second tempfile for the output of your processor (I believe you need to preserve the file extension). Then you pass the processed file as the input file to THE ORIGINAL(TM) preprocessor by patching the input file parameter. But leave all other parameters the way they were, since some of them are position dependent (like -I, -D or -U). After THE ORIGINAL(TM) preprocessor is done, you clean up your tempfile and exit with the exit code of THE ORIGINAL(TM) preprocessor.
Slight improvement: you can automatically find which corresponding preprocessor would have run like this: g++ --print-prog-name=cc1plus So your filter becomes:

    #!/bin/sh
    echo "Own special preprocessor; args = $@"
    $(g++ --print-prog-name=cc1plus) $@
    exit $?
Even better would be exec $(${COLLECT_GCC} --print-prog-name=cc1) $@
This seems like the standard way to perform this task, and not an ugly hack. It also seems like a great hook for doing code gen. The little-known C pre-preprocessor...
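The argument patching described in the comments above might look roughly like this. It is only a sketch under stated assumptions: `TAKES_VALUE` is an incomplete, hand-picked list of options whose value is passed as a separate argument, the first remaining non-option argument is assumed to be the input file, and `find_input_index`, `patch_input` and `transform` are hypothetical names.

```python
import os
import tempfile

# Assumption: options whose value arrives as a separate argument (incomplete).
TAKES_VALUE = {
    "-o", "-dumpbase", "-dumpbase-ext", "-dumpdir", "-auxbase",
    "-auxbase-strip", "-imultiarch", "-imultilib", "-isystem", "-iquote",
    "-idirafter", "-include", "-imacros", "-iprefix", "-isysroot",
    "-I", "-D", "-U", "-MF", "-MT", "-MQ",
}


def find_input_index(args):
    """Return the index of the first argument that looks like the input file."""
    skip = False
    for i, arg in enumerate(args):
        if skip:
            skip = False          # this argument is the value of an option
        elif arg in TAKES_VALUE:
            skip = True
        elif not arg.startswith("-"):
            return i
    return None


def patch_input(args, transform):
    """Swap the input file for a transformed temporary copy.

    Returns (new_args, temp_path). All other arguments keep their position.
    The caller is expected to delete temp_path after the real compiler ran.
    """
    i = find_input_index(args)
    if i is None:
        return list(args), None
    src = args[i]
    with open(src, encoding="utf-8") as f:
        text = f.read()
    # Keep the extension, and stay in the source directory so that
    # #include "local.h" lookups relative to the file still resolve.
    fd, tmp = tempfile.mkstemp(suffix=os.path.splitext(src)[1],
                               dir=os.path.dirname(src) or ".")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(transform(text))
    new_args = list(args)
    new_args[i] = tmp
    return new_args, tmp
```

One likely side effect is that diagnostics and `__FILE__` would name the temporary file rather than the original source, so a real wrapper may want to compensate for that.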
8

One way is to use a program transformation system, to "rewrite" just the SID macro invocation to what you want before you do the compilation, leaving the rest of the preprocessor handling to the compiler itself.

Our DMS Software Reengineering Toolkit is such a system, and it can be applied to many languages including C and specifically the GCC 2/3/4 series of compilers.

To implement this idea using DMS, you would run DMS with its C front end over your source code before the compilation step. DMS can parse the code without expanding the preprocessor directives, build abstract syntax trees representing it, carry out transformations on the ASTs, and then spit out the result as compilable C text.

The specific transformation rule you would use is:

rule replace_SID_invocation(s:STRING):expression->expression = "SID(\s)" -> ComputeCRC32(s); 

where ComputeCRC32 is custom code that does what it says. (DMS includes a CRC32 implementation, so the custom code for this is pretty short.)

DMS is kind of a big hammer for this task. You could use Perl to implement something pretty similar. The difference with Perl (or some other string match/replace hack) is the risk that it might a) find the pattern someplace where you don't want a replacement, e.g.

 ... QSID("foo")... // this isn't a SID invocation 

which you can probably fix by coding your pattern match carefully, b) fail to match a SID call found in surprising circumstances:

 ... SID ( /* master login id */ "Joel" ) ... // need to account for formatting and whitespace 

and c) fail to handle the various kinds of escape characters that show up in the literal string itself:

 ... SID("f\no\072") ... // need to handle all of GCC's weird escapes 

DMS's C front end handles all the escapes for you; the ComputeCRC32 function above would see the string containing the actual intended characters, not the raw text you see in the source code.
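For a script-based approach, the escape handling would have to be done by hand. The following is a rough sketch of decoding the body of a narrow string literal to bytes before hashing it. It assumes the zlib (IEEE) CRC32 variant is the one wanted, covers only simple, octal and hex escapes plus GCC's `\e` extension, and uses the hypothetical names `decode_c_literal` and `sid_value`.

```python
import re
import zlib

# Simple escapes mapped to byte values; \e is a GCC extension (ESC).
_SIMPLE = {"a": 7, "b": 8, "f": 12, "n": 10, "r": 13, "t": 9, "v": 11,
           "\\": 92, "'": 39, '"': 34, "?": 63, "e": 27}
_ESC = re.compile(r"\\(x[0-9A-Fa-f]+|[0-7]{1,3}|.)", re.DOTALL)


def decode_c_literal(body):
    """Decode the text between the quotes of a narrow C string literal."""
    out = bytearray()
    pos = 0
    for m in _ESC.finditer(body):
        out += body[pos:m.start()].encode("utf-8")   # plain text before escape
        esc = m.group(1)
        if esc[0] == "x":
            value = int(esc[1:], 16)                  # \xhh... (ValueError if empty)
        elif esc[0] in "01234567":
            value = int(esc, 8)                       # \o, \oo, \ooo
        elif esc in _SIMPLE:
            value = _SIMPLE[esc]
        else:
            raise ValueError("unsupported escape: \\" + esc)
        if value > 0xFF:
            raise ValueError("escape out of range for a char: \\" + esc)
        out.append(value)
        pos = m.end()
    out += body[pos:].encode("utf-8")
    return bytes(out)


def sid_value(body):
    """CRC32 of the decoded literal (assumes zlib's CRC32 is the intended one)."""
    return zlib.crc32(decode_c_literal(body)) & 0xFFFFFFFF
```

With this sketch, the literal from the example above decodes to the four bytes `f`, newline, `o` and `:` (octal 072), and the CRC is taken over those bytes rather than over the raw source text. Universal character names, wide literals and adjacent-literal concatenation are not handled.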

So it's really a matter of whether you care about the dark-corner cases, or whether you think you may have more special processing to do.

Given the way you've described the problem, I'd be sorely tempted to go the Perl route first and simply outlaw the funny cases. If you can't do this, then the big hammer makes sense.
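As an illustration of the string-matching route, here is a minimal sketch written in Python rather than Perl. It guards against case a) with an identifier-boundary check, tolerates whitespace and `/* */` comments for case b), and simply outlaws case c) by leaving any literal that contains a backslash untouched. The name `replace_sids`, the output format (an unsigned hex constant) and the choice of zlib's CRC32 are all assumptions.

```python
import re
import zlib

# Whitespace or /* ... */ comments that may sit between tokens.
_WS = r"(?:\s|/\*.*?\*/)*"

# SID not preceded by an identifier character, then ( "literal" ).
# The literal may not contain quotes, backslashes or newlines.
_SID = re.compile(
    r"(?<![A-Za-z0-9_])SID" + _WS + r"\(" + _WS
    + r'"([^"\\\n]*)"' + _WS + r"\)",
    re.DOTALL,
)


def replace_sids(source):
    """Replace each simple SID("...") call with its CRC32 as a hex constant."""
    def crc(match):
        value = zlib.crc32(match.group(1).encode("utf-8")) & 0xFFFFFFFF
        return "0x%08Xu" % value
    return _SID.sub(crc, source)
```

Calls that are not rewritten still reach the compiler as `SID(...)`, so leaving SID undefined would turn every outlawed case into a compile error instead of a silent mismatch. Note that a plain regex like this cannot tell code from comments or other string literals, so a SID call quoted inside either would be rewritten too.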
