Have a read of this help page, in particular the part about simplifying the problem. There are lots of rules in your grammar file that were not needed to reproduce the problem. We did not need:
%token BEGINtoken %token COMMAtoken %token ENDtoken %token EOFtoken %token EQtoken %token <value> IDtoken %token IStoken %token PROGRAMtoken %token VARtoken %% start: PROGRAMtoken IDtoken IStoken compoundstatement compoundstatement: BEGINtoken {print_header();} statement semi ENDtoken {print_end();} semi: SEMItoken statement semi | | declaration declaration: VARtoken IDtoken comma comma: COMMAtoken IDtoken comma |
You could have just removed these tokens and rules to get to the heart of the operator precedence issue. We did not need any variables, declarations, assignments or program structure to illustrate the failure. Learning to simplify is the heart of competent debugging and thus programming.
I knew what the issue was on inspection of the grammar, but to test my solution I had to code up a working lexer, some symbol table routines, a main program and other ancillary code.
Lets get to the heart of the problem. You have these token declarations:
%left DIVIDEtoken %left TIMEStoken %left PLUStoken MINUStoken
These tell yacc that if any rules are ambiguous that the operators associate left. Your rules for these operators are:
exp: exp PLUStoken term { $$ = $1 + $3; } | exp MINUStoken term { $$ = $1 - $3; } | term { $$ = $1; } | MINUStoken term { $$ = -$2;} term: factor { $$ = $1; } | factor TIMEStoken term {$$ = $1 * $3; } | factor DIVIDEtoken term { $$ = $1 / $3; }
However, these rules are not ambiguous, and thus the operator precedence declaration is not required. Yacc will follow the non-ambiguous grammar you have used. The ways these rules are written, it tells yacc that the operators have right associativity, which is the opposite of what you want. Now, it could be clearly seen from the simple arithmetic in your example that the operators were being calculated in a right associative way, and you wanted the opposite. There were really big clues there weren't there?
OK. How to change the associativity? One way would be to make the grammar ambiguous again so that the %left declaration is used, or just flip the rules around to invert the associativity. That's what I did:
exp: term PLUStoken exp { $$ = $1 + $3; } | term MINUStoken exp { $$ = $1 - $3; } | term { $$ = $1; } | MINUStoken term { $$ = -$2;} term: factor { $$ = $1; } | term TIMEStoken factor {$$ = $1 * $3; } | term DIVIDEtoken factor { $$ = $1 / $3; }
Do you see what I did there? I rotated the grammar rule around the operator.
Now for some more disclaimers. I said this is a dumb exercise. Interpreting expressions on the fly is a poor use of the yacc tool, and not what happens in real compilers or interpreters. In a realistic implementation, a parse tree would be built and the value calculations would be performed during the tree walk. This would then enable the issues of undeclared variables to be resolved (which also occurs in this exercise). The use of the regs array to store values is also dumb, because there is clearly an ancillary symbol table in use to return a unique integer ID for the symbols. In a real compiler/interpreter those values would also be stored in that symbol table.
I hope this tutorial has helped further students understand these parsing issues in their classwork.