1

I'm trying to create a grammar which allows certain expressions only as part of an expression and not at the very beginning of an expression statement. In the grammar below those certain expressions are arrow function expression and dot expression.

For example these expressions are valid:

a = b => c; a and b => c; x + y and z; x + y . z; 

while these are invalid (the first starts with a dot-expression, and the second is not part of expr nor assignment):

x . y + z; b => c; 

I tried a grammar like this:

parser grammar ExprParser; options { tokenVocab=ExprLexer; } program : (stat | def)+ EOF ; stat: ID '=' expr ';' | startExpr ';' | func ';' ; def : ID '(' ID? (',' ID)* ')' '{' stat* '}' ; arrowExpr : ID '=>' expr ; startExpr : atom | 'not' expr | startExpr '+' expr | startExpr 'and' expr | startExpr 'or' expr ; expr: atom | arrowExpr | 'not' expr | expr '+' expr | expr 'and' expr | expr 'or' expr | expr '.' expr ; atom: ID | INT | func ; func: ID '(' ID? (',' ID?)* ')'; lexer grammar ExprLexer; AND : 'and' ; OR : 'or' ; NOT : 'not' ; EQ : '=' ; COMMA : ',' ; SEMI : ';' ; LPAREN : '(' ; RPAREN : ')' ; LCURLY : '{' ; RCURLY : '}' ; PLUS : '+' ; MINUS : '-'; ARROW : '=>'; DOT : '.'; INT : [0-9]+ ; ID: [a-zA-Z_][a-zA-Z_0-9]* ; WS: [ \t\n\r\f]+ -> skip ; 

but this approach messes up operator precedence. x + y and z; should produce

Correct tree

but instead produces

Incorrect tree

Any ideas on how to solve this?

1 Answer 1

0

I added corresponding rule alternatives to your startExpr rule as follows and that solved it for me. Here is the modified rule:

startExpr : atom | 'not' expr | startExpr '+' startExpr | startExpr '+' expr | startExpr 'and' startExpr | startExpr 'and' expr | startExpr 'or' startExpr | startExpr 'or' expr ; 

Granted, this results in startExpr nodes rather than expr nodes, but I don't think your grammar given the rules can result in the ideal tree that you propose. This however, should still fix the precedence issue and will allow you build an AST, byte code and interpreter from it that should achieve your goals all the same.

The result is a bit more verbose, but does avoid having more gratuitous sub-rules. Since I'm not entirely sure of your intention with the language, I suspect there may be some other precedence issues with the arrow or dot expressions, but I could be wrong (since I'm basing that assumption off knowledge of some existing languages and how they use the operators).

Attached is the resulting graph:

enter image description here

Sign up to request clarification or add additional context in comments.

3 Comments

I don't think that'll fix the issue with the order of precedence, it'll just make the result fit my example. I'm guessing that using both startExpr and expr in one alternative is messing up the left-recursive rule transformation that ANTLR does, but I don't know how to fix that while keeping the grammar nice and concise. Writing the precedence rules explicitly (startExpr: logicalOrExpr; logicalOrExpr : logicalAndExpr ( 'or' logicalAndExpr )* ; etc) solves the precedence issue, but is verbose and I don't like the duplication of expression logic (especially if I have more cases like it).
Ah yes, I see what you mean. Sorry, figured it was worth a shot. I'm tinkering with your grammar in my diagnostic tool to see if I can find a better solution that works.
I believe I found a solution for you and am editing the proposed solution to reflect it.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.