I'm trying to use an ANTLR v3.2-generated parser in a C++ project using C as the output language. The generated parser can, in theory, be compiled as C++, but I'm having trouble dealing with C++ types inside parser actions. Here's a C++ header file defining a few types I'd like to use in the parser:

    /* expr.h */
    #include <string>

    enum Kind { PLUS, MINUS };

    class Expr {
      // stub
    };

    class ExprFactory {
    public:
      Expr mkExpr(Kind kind, Expr op1, Expr op2);
      Expr mkInt(std::string n);
    };

And here's a simple parser definition:

    /* Expr.g */
    grammar Expr;

    options { language = 'C'; }

    @parser::includes {
      #include "expr.h"
    }

    @members {
      ExprFactory *exprFactory;
    }

    start returns [Expr expr]
      : e = expression EOF { $expr = e; }
      ;

    expression returns [Expr e]
      : TOK_LPAREN k=builtinOp op1=expression op2=expression TOK_RPAREN
        { e = exprFactory->mkExpr(k,op1,op2); }
      | INTEGER
        { e = exprFactory->mkInt((char*)$INTEGER.text->chars); }
      ;

    builtinOp returns [Kind kind]
      : TOK_PLUS { kind = PLUS; }
      | TOK_MINUS { kind = MINUS; }
      ;

    TOK_PLUS : '+';
    TOK_MINUS : '-';
    TOK_LPAREN : '(';
    TOK_RPAREN : ')';
    INTEGER : ('0'..'9')+;

The grammar runs through ANTLR just fine. When I try to compile ExprParser.c, I get errors like

  1. conversion from ‘long int’ to non-scalar type ‘Expr’ requested
  2. no match for ‘operator=’ in ‘e = 0l’
  3. invalid conversion from ‘long int’ to ‘Kind’

In each case, the statement is an initialization of an Expr or Kind value to NULL.

I can make the problem go away for the Expr values by changing everything to Expr*. This is workable, though hardly ideal. But passing around pointers for a simple enum like Kind seems ridiculous. One ugly workaround I've found is to add a second return value, which makes ANTLR put the Kind value into a struct and suppresses the initialization to NULL. That is, builtinOp becomes

    builtinOp returns [Kind kind, bool dummy]
      : TOK_PLUS { $kind = PLUS; }
      | TOK_MINUS { $kind = MINUS; }
      ;

and the first expression alternative becomes

    TOK_LPAREN k=builtinOp op1=expression op2=expression TOK_RPAREN
      { e = exprFactory->mkExpr(k.kind,*op1,*op2); }

There has to be a better way to do this. Am I missing a configuration option for the C language backend? Is there another way to arrange my grammar to avoid this awkwardness? Is there a pure C++ backend I can use?

  • For those who are planning to answer, you could first check antlr.org/pipermail/antlr-interest/2010-February/037764.html to see if it wasn't already proposed to Chris. Commented Feb 24, 2010 at 21:42
  • I'm sure I don't know why asking in another forum warrants a downvote. I'll be happy to upvote/accept any useful answers here. Commented Feb 24, 2010 at 23:50
  • I'd say so if it were mine. Although I thought about downvoting it, I didn't since your question is a very good one: high level of detail, clear question, etc. But I do understand the downvote: asking in more than one forum without mentioning this in either one of them is IMO bad practise. I mean, why not just post a link in here to your post on the ANTLR mail list? That way, others can see what has already been answered and don't spend their time duplicating an (elaborate) answer in here which already was suggested somewhere else. Commented Feb 25, 2010 at 6:51
  • I guess I'd see your point more clearly if I had received any substantive answer in either forum. As it is, I'm about to break etiquette a second way and self-answer. Commented Feb 25, 2010 at 14:16

1 Answer

Here are the solutions I have found to this problem. The crux of the issue is that ANTLR wants to initialize all return values and attributes. For non-primitive types, ANTLR just assumes it can initialize with NULL. So, for example, the expression rule above will be translated into something like

    static Expr expression(pExprParser ctx) {
      Expr e = NULL;   // Declare and init return value
      Kind k;          // declare attributes
      Expr op1, op2;
      k = NULL;        // init attributes
      op1 = NULL;
      op2 = NULL;
      ...
    }
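To make the failure concrete, here is a minimal sketch (hand-written, not ANTLR output) that asks the compiler, via C++11 type traits, whether those initializations could ever be legal for the stub types from the question. The literal `0L` stands in for `NULL`, matching the `0l` in the error messages, and `pointer_accepts_null` is a hypothetical helper name.

```cpp
#include <cassert>
#include <type_traits>

enum Kind { PLUS, MINUS };
class Expr {};  // stub, as in expr.h

// `Expr e = 0L;` needs an implicit long -> Expr conversion; the stub has none.
static_assert(!std::is_convertible<long, Expr>::value,
              "long does not convert to Expr");
// `op1 = 0L;` needs an assignment operator that accepts a long.
static_assert(!std::is_assignable<Expr&, long>::value,
              "a long cannot be assigned to Expr");
// `k = 0L;` fails too: C++ has no implicit integer -> enum conversion.
static_assert(!std::is_assignable<Kind&, long>::value,
              "a long cannot be assigned to Kind");

// Pointers accept a null pointer constant, which is why Expr* compiles.
bool pointer_accepts_null() {
    Expr* p = 0L;
    return p == nullptr;
}
```

The three `static_assert`s correspond to the three errors in the question. The pointer case is the only one that compiles, which is what the first option below relies on.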

The choices, as I see them, are these:

  1. Give the values primitive types that can legally be initialized to NULL. E.g., use Expr* and Kind* instead of Expr and Kind.

  2. Use the "dummy" trick, as above, to push the value into a structure where it won't be initialized.

  3. Use reference parameters instead of return values. E.g.,

    builtinOp[Kind& kind] : TOK_PLUS { kind = PLUS; } | TOK_MINUS { kind = MINUS; } ; 
  4. Augment the classes used as value types with operations that make the above declarations and initializations legal. I.e., for an Expr return value, you need a constructor that can take NULL:

    Expr(long int n); 

    For an Expr attribute, you need a no-arg constructor and an operator= that can take NULL:

    Expr();
    Expr& operator=(long int n);
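In plain C++ terms, the reference-parameter option (#3) amounts to something like the sketch below. This is a hand-written analogue, not what ANTLR generates: `builtin_op` and `parse_op` are hypothetical names, and the `token` parameter is an assumption standing in for the matched input. The point is that the caller declares the `Kind` itself, so nothing ever tries to assign NULL to it.

```cpp
#include <cassert>

enum Kind { PLUS, MINUS };

// Analogue of a rule declared as builtinOp[Kind& kind]:
// the result is written through the reference instead of being returned.
void builtin_op(char token, Kind& kind) {
    kind = (token == '+') ? PLUS : MINUS;
}

// Analogue of the calling rule: it owns the variable and passes it down.
Kind parse_op(char token) {
    Kind k;                 // declared by the caller, no NULL initialization
    builtin_op(token, k);   // filled in by the callee
    return k;
}
```

With this arrangement `parse_op('+')` yields `PLUS` and `parse_op('-')` yields `MINUS`.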

I know it is pretty hacky, but I'm going with #4 for the time being. It just so happens that my Expr class has a fairly natural definition of these operations.
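For anyone wanting to try #4, here is a minimal sketch of what such a class could look like. The `text_` member, the `std::string` constructor, and `empty()` are assumptions made up for illustration; my real Expr class is different. As before, `0L` stands in for `NULL`.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a real expression class.
class Expr {
public:
    Expr() : text_() {}                          // allows `Expr op1;`
    Expr(long) : text_() {}                      // allows `Expr e = NULL;`
    explicit Expr(const std::string& s) : text_(s) {}
    Expr& operator=(long) {                      // allows `op1 = NULL;`
        text_.clear();
        return *this;
    }
    bool empty() const { return text_.empty(); }
private:
    std::string text_;
};

// Mimics the shape of the declarations in the generated rule function.
bool generated_style_init() {
    Expr e = 0L;        // return value, initialized from "NULL"
    Expr op1;           // attribute declaration
    op1 = 0L;           // attribute initialization
    op1 = Expr("42");   // what a rule action would assign later
    e = op1;
    return !e.empty();
}
```

Note that this only helps for class types. An enum like Kind cannot be given constructors or an `operator=`, so it still needs one of the other options.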

P.S. On the ANTLR list, the maintainer of the C backend hints that this problem may be solved in future releases.
