Suppose I have a simple grammar which recognizes lower-case words. Among the words I have some reserved words that I need to address differently.

grammar test;

prog: Reserved | Identifier;

Reserved: 'reserved';
Identifier: [a-z]+;

Given the grammar above, is it guaranteed that for the input "reserved" the lexer will always produce the Reserved token?

  • I assume [a-b] should be [a-z]. Commented May 13 at 8:03
  • Yes. The lexer is not guided by the parser. ANTLR lexers follow three basic rules: 1) The lexer rule that matches the longest input string is selected. 2) If two or more lexer rules match the same string, the first one defined is selected. 3) A string literal used in a parser rule implicitly defines a new lexer rule, placed before all explicit lexer rules, unless an explicit lexer rule already declares that literal. In your grammar, Reserved occurs before Identifier, so Reserved always wins for the input "reserved". Always print out the token types of the tokens when debugging your grammar. Commented May 13 at 9:34

1 Answer

Yes, ANTLR's lexer operates using the following rules:

  1. try to create a token that matches as many characters as possible
  2. if two or more rules match the same characters, let the one defined first "win"

Because the input "reserved" is matched in full by both the Reserved and Identifier rules, the one defined first (Reserved) takes precedence.
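The two rules can be sketched in a few lines of Python. This is only a toy model built on the standard `re` module, not ANTLR's actual implementation; the `RULES` table and the `next_token` helper are hypothetical names chosen for illustration.

```python
import re

# Lexer rules in definition order, mirroring the grammar in the question.
# Toy model of "longest match, then first rule wins"; not ANTLR internals.
RULES = [
    ("Reserved", re.compile(r"reserved")),
    ("Identifier", re.compile(r"[a-z]+")),
]

def next_token(text, pos=0):
    """Return (rule_name, lexeme) for the token starting at pos, or None."""
    best = None
    for name, pattern in RULES:
        m = pattern.match(text, pos)
        if m is None or m.end() == pos:
            continue  # this rule does not match here
        # Only a strictly longer match replaces the current best, so on a
        # tie the rule that was defined earlier is kept.
        if best is None or len(m.group()) > len(best[1]):
            best = (name, m.group())
    return best

print(next_token("reserved"))   # ('Reserved', 'reserved')
print(next_token("reservedx"))  # ('Identifier', 'reservedx')
print(next_token("res"))        # ('Identifier', 'res')
```

Note the second call: "reservedx" becomes a single Identifier token because the longest match is considered before rule order.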

It sometimes happens that keywords can also be used as identifiers. This is often handled by introducing an identifier parser rule that matches either an Identifier token or some reserved keywords:

identifier : Reserved | Identifier;

Reserved : 'reserved';
Identifier : [a-z]+;

and then use identifier in other parser rules instead of directly using Identifier.
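As a rough illustration of what that parser rule does, here is a small Python sketch. It assumes tokens arrive as `(token_type, text)` pairs; `IDENTIFIER_ALTERNATIVES` and `parse_identifier` are hypothetical names, and this is not how ANTLR's generated parser is written.

```python
# Token types accepted by the `identifier` parser rule in the grammar above.
IDENTIFIER_ALTERNATIVES = ("Reserved", "Identifier")

def parse_identifier(token):
    """Accept a (token_type, text) pair if it fits `identifier`; return its text."""
    token_type, text = token
    if token_type not in IDENTIFIER_ALTERNATIVES:
        raise ValueError(f"expected an identifier, got {token_type}")
    return text

# The lexer still labels the keyword as Reserved, but the parser rule accepts it.
print(parse_identifier(("Reserved", "reserved")))  # reserved
print(parse_identifier(("Identifier", "foo")))     # foo
```

The point is that the lexer's decision does not change: the keyword is still a Reserved token, and it is the parser rule that permits it wherever a name is expected.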

EDIT

After rereading your question: yes, parser rule alternatives are tried from left to right (top to bottom). In the rule p : a | b | c;, first a is tried, then b, and lastly c.

Note that in your example prog: Reserved | Identifier;, there is no ambiguity since the input "reserved" will never become an Identifier token (the first part of my answer).
