I was working with ANTLR4 to generate an AST of Java source code, but I had to move to ANTLR3 because I could not find much help or documentation and it was really tough to proceed. I managed to generate the AST, but not in a visual format. Then I came across an awesome answer and was able to generate the AST as a DOT file, but there is a slight problem.

My code:

import org.antlr.runtime.CommonTokenStream;
import org.antlr.runtime.ANTLRFileStream;
import org.antlr.runtime.tree.CommonTree;
import org.antlr.runtime.tree.DOTTreeGenerator;
import org.antlr.stringtemplate.StringTemplate;

class Main {

    public static void main(String[] args) throws Exception {
        parseFile("/home/satnam-sandhu/Workstation/ASTGenerator/resource/java/Blabla.java");
    }

    public static void parseFile(String f) throws Exception {
        JavaLexer lexer = new JavaLexer(new ANTLRFileStream(f));
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        JavaParser parser = new JavaParser(tokens);
        CommonTree tree = (CommonTree) parser.compilationUnit().getTree();
        DOTTreeGenerator gen = new DOTTreeGenerator();
        StringTemplate st = gen.toDOT(tree);
        System.out.println(st);
    }
}

I am using Gradle, so I build the project with:

gradle clean build 

Then I run it and redirect the output to a DOT file:

java -jar ASTGenerator.jar > ast.dot 

The issue I am facing is that, for this source code:

class example {
    public static void print(int a) {
        int b = a + 1;
        System.out.println(b);
    }

    public static void main() {
        print(15);
    }
}

I am getting this as the output:

digraph {
  ordering=out;
  ranksep=.4;
  bgcolor="lightgrey";
  node [shape=box, fixedsize=false, fontsize=12, fontname="Helvetica-bold", fontcolor="blue" width=.25, height=.25, color="black", fillcolor="white", style="filled, solid, bold"];
  edge [arrowsize=.5, color="black", style="bold"]

  n0 [label=""];
  n1 [label="class"];
  n2 [label="example"];
  n3 [label="{"];
  n4 [label="public"];
  n5 [label="static"];
  n6 [label="void"];
  n7 [label="print"];
  n8 [label="("];
  n9 [label="int"];
  n10 [label="a"];
  n11 [label=")"];
  n12 [label="{"];
  n13 [label="int"];
  n14 [label="b"];
  n15 [label="="];
  n16 [label="a"];
  n17 [label="+"];
  n18 [label="1"];
  n19 [label=";"];
  n20 [label="System"];
  n21 [label="."];
  n22 [label="out"];
  n23 [label="."];
  n24 [label="println"];
  n25 [label="("];
  n26 [label="b"];
  n27 [label=")"];
  n28 [label=";"];
  n29 [label="}"];
  n30 [label="public"];
  n31 [label="static"];
  n32 [label="void"];
  n33 [label="main"];
  n34 [label="("];
  n35 [label=")"];
  n36 [label="{"];
  n37 [label="print"];
  n38 [label="("];
  n39 [label="15"];
  n40 [label=")"];
  n41 [label=";"];
  n42 [label="}"];
  n43 [label="}"];

  n0 -> n1 // "" -> "class"
  n0 -> n2 // "" -> "example"
  n0 -> n3 // "" -> "{"
  n0 -> n4 // "" -> "public"
  n0 -> n5 // "" -> "static"
  n0 -> n6 // "" -> "void"
  n0 -> n7 // "" -> "print"
  n0 -> n8 // "" -> "("
  n0 -> n9 // "" -> "int"
  n0 -> n10 // "" -> "a"
  n0 -> n11 // "" -> ")"
  n0 -> n12 // "" -> "{"
  n0 -> n13 // "" -> "int"
  n0 -> n14 // "" -> "b"
  n0 -> n15 // "" -> "="
  n0 -> n16 // "" -> "a"
  n0 -> n17 // "" -> "+"
  n0 -> n18 // "" -> "1"
  n0 -> n19 // "" -> ";"
  n0 -> n20 // "" -> "System"
  n0 -> n21 // "" -> "."
  n0 -> n22 // "" -> "out"
  n0 -> n23 // "" -> "."
  n0 -> n24 // "" -> "println"
  n0 -> n25 // "" -> "("
  n0 -> n26 // "" -> "b"
  n0 -> n27 // "" -> ")"
  n0 -> n28 // "" -> ";"
  n0 -> n29 // "" -> "}"
  n0 -> n30 // "" -> "public"
  n0 -> n31 // "" -> "static"
  n0 -> n32 // "" -> "void"
  n0 -> n33 // "" -> "main"
  n0 -> n34 // "" -> "("
  n0 -> n35 // "" -> ")"
  n0 -> n36 // "" -> "{"
  n0 -> n37 // "" -> "print"
  n0 -> n38 // "" -> "("
  n0 -> n39 // "" -> "15"
  n0 -> n40 // "" -> ")"
  n0 -> n41 // "" -> ";"
  n0 -> n42 // "" -> "}"
  n0 -> n43 // "" -> "}"
}

Visualising this with http://viz-js.com/ gives a flat tree: every token hangs directly off a single empty root node.

All my work so far is uploaded here, if you feel like digging deeper into the grammar file I am using. I think the options specified in the grammar file could also be the cause. I am a beginner and cannot proceed without a little help. Thanks in advance. :)

  • You need to mark root nodes in the Java.g file as shown in this Q&A: stackoverflow.com/questions/4931346/… Commented Jan 25, 2018 at 7:43
  • That was helpful. Thanks @Bart Kiers. But I am using the grammar available in ANTLR's official repository, so is there a way to get a Java grammar with the root nodes already marked? Commented Jan 25, 2018 at 8:56
  • Not that I know of. You'll have to add them to the grammar yourself. You're welcome, of course. Commented Jan 25, 2018 at 9:10

1 Answer

The answer I was looking for is given here by Bart Kiers. For anyone who wants to generate DOT files without modifying the grammar, you can print an indented syntax tree with the help of this repository. Since I did not find much documentation on DOT generation in ANTLR4, and the only other option was to modify the ANTLR3 grammar file, I took Federico Tomassetti's example and modified it slightly to generate my own DOT file.

You can print the DOT output with:

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.nio.charset.Charset;
import java.nio.file.Files;
import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.ParserRuleContext;
import org.antlr.v4.runtime.RuleContext;
import org.antlr.v4.runtime.tree.ParseTree;

public class ASTGenerator {

    // Parallel lists with one entry per emitted node:
    // depth in the tree, rule name, and the text the rule matched.
    static ArrayList<String> LineNum = new ArrayList<String>();
    static ArrayList<String> Type = new ArrayList<String>();
    static ArrayList<String> Content = new ArrayList<String>();

    private static String readFile() throws IOException {
        File file = new File("resource/java/Blabla.java");
        byte[] encoded = Files.readAllBytes(file.toPath());
        return new String(encoded, Charset.forName("UTF-8"));
    }

    public static void main(String args[]) throws IOException {
        String inputString = readFile();
        ANTLRInputStream input = new ANTLRInputStream(inputString);
        Java8Lexer lexer = new Java8Lexer(input);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        Java8Parser parser = new Java8Parser(tokens);
        ParserRuleContext ctx = parser.compilationUnit();

        generateAST(ctx, false, 0);

        System.out.println("digraph G {");
        printDOT();
        System.out.println("}");
    }

    // Walks the parse tree in pre-order. Unless verbose is set, a rule node
    // whose only child is another rule node is skipped to keep the tree small.
    private static void generateAST(RuleContext ctx, boolean verbose, int indentation) {
        boolean toBeIgnored = !verbose
                && ctx.getChildCount() == 1
                && ctx.getChild(0) instanceof ParserRuleContext;

        if (!toBeIgnored) {
            String ruleName = Java8Parser.ruleNames[ctx.getRuleIndex()];
            LineNum.add(Integer.toString(indentation));
            Type.add(ruleName);
            Content.add(ctx.getText());
        }
        for (int i = 0; i < ctx.getChildCount(); i++) {
            ParseTree element = ctx.getChild(i);
            if (element instanceof RuleContext) {
                generateAST((RuleContext) element, verbose, indentation + (toBeIgnored ? 0 : 1));
            }
        }
    }

    // Prints the node labels, then one edge from each non-root node's parent.
    private static void printDOT() {
        printLabel();
        int pos = 0;
        for (int i = 1; i < LineNum.size(); i++) {
            pos = getPos(Integer.parseInt(LineNum.get(i)) - 1, i);
            System.out.println((Integer.parseInt(LineNum.get(i)) - 1) + Integer.toString(pos) + "->" + LineNum.get(i) + i);
        }
    }

    // A node's id is its depth followed by its index in the lists.
    private static void printLabel() {
        for (int i = 0; i < LineNum.size(); i++) {
            System.out.println(LineNum.get(i) + i + "[label=\"" + Type.get(i) + "\\n " + Content.get(i) + " \"]");
        }
    }

    // Returns the index of the last node before limit that sits at depth n,
    // which is the parent of the node at index limit.
    private static int getPos(int n, int limit) {
        int pos = 0;
        for (int i = 0; i < limit; i++) {
            if (Integer.parseInt(LineNum.get(i)) == n) {
                pos = i;
            }
        }
        return pos;
    }
}
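
The parent lookup in printDOT (find the last earlier node one level up) can also be done with a stack of ancestors. Giving every node a plain sequential id also sidesteps a possible clash in the depth-plus-index ids: depth 1 at index 150 and depth 11 at index 50 would both come out as 1150. Below is a minimal sketch of that idea, independent of ANTLR. It assumes the nodes are listed in pre-order with their depths, as the code above collects them. The names DotEdges and toDot and the sample data are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class DotEdges {

    // Builds DOT lines from nodes given in pre-order with their depths.
    // Node i gets the id "n" + i, so ids stay unique whatever the depth is.
    // Labels are assumed to contain no characters that need escaping.
    static List<String> toDot(List<Integer> depths, List<String> labels) {
        List<String> lines = new ArrayList<>();
        lines.add("digraph G {");
        for (int i = 0; i < labels.size(); i++) {
            lines.add("  n" + i + " [label=\"" + labels.get(i) + "\"];");
        }

        // Indices of the nodes on the path from the root to the previous node.
        Deque<Integer> ancestors = new ArrayDeque<>();
        for (int i = 0; i < depths.size(); i++) {
            // Drop everything at the same depth or deeper than node i,
            // so that the top of the stack is node i's parent.
            while (ancestors.size() > depths.get(i)) {
                ancestors.pop();
            }
            if (!ancestors.isEmpty()) {
                lines.add("  n" + ancestors.peek() + " -> n" + i + ";");
            }
            ancestors.push(i);
        }
        lines.add("}");
        return lines;
    }

    public static void main(String[] args) {
        // Hypothetical sample: a root with two children, the first of which has two children.
        List<Integer> depths = Arrays.asList(0, 1, 2, 2, 1);
        List<String> labels = Arrays.asList("root", "header", "modifier", "result", "block");
        for (String line : toDot(depths, labels)) {
            System.out.println(line);
        }
    }
}
```

This does a single pass over the list instead of rescanning all earlier nodes for each edge, which may matter for large source files.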

For source code like this:

class example {
    public static void main() {
        int a;
        a = 5;
    }
}

The output will be:

digraph G {
00[label="compilationUnit\n classexample{publicstaticvoidmain(){inta;a=5;}}<EOF> "]
11[label="normalClassDeclaration\n classexample{publicstaticvoidmain(){inta;a=5;}} "]
22[label="classBody\n {publicstaticvoidmain(){inta;a=5;}} "]
33[label="methodDeclaration\n publicstaticvoidmain(){inta;a=5;} "]
44[label="methodModifier\n public "]
45[label="methodModifier\n static "]
46[label="methodHeader\n voidmain() "]
57[label="result\n void "]
58[label="methodDeclarator\n main() "]
49[label="block\n {inta;a=5;} "]
510[label="blockStatements\n inta;a=5; "]
611[label="localVariableDeclarationStatement\n inta; "]
712[label="localVariableDeclaration\n inta "]
813[label="integralType\n int "]
814[label="variableDeclaratorId\n a "]
615[label="expressionStatement\n a=5; "]
716[label="assignment\n a=5 "]
817[label="expressionName\n a "]
818[label="assignmentOperator\n = "]
819[label="literal\n 5 "]
00->11
11->22
22->33
33->44
33->45
33->46
46->57
46->58
33->49
49->510
510->611
611->712
712->813
712->814
510->615
615->716
716->817
716->818
716->819
}

Paste this output into http://viz-js.com/ to render the tree.

You can also redirect the output to an ast.dot file:

java -jar path-to-jar-file.jar > ast.dot 

This is not a perfect method, but it is enough for me. :)
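
One thing to watch for: ctx.getText() goes into the label unescaped, so source code containing a string literal (a double quote) would likely produce DOT that does not parse. Below is a minimal sketch of an escaping helper. It assumes backslashes, double quotes and raw newlines are the characters that need care inside a double-quoted DOT label. DotLabels, escape and node are hypothetical names, not part of ANTLR or Graphviz.

```java
class DotLabels {

    // Escapes text so it can sit inside a double-quoted DOT label.
    // Backslashes are doubled first so the escapes added afterwards are not doubled again.
    static String escape(String text) {
        return text.replace("\\", "\\\\")
                   .replace("\"", "\\\"")
                   .replace("\n", "\\n");
    }

    // Formats one node statement: the rule name on the first label line,
    // the matched text on the second (separated by the DOT escape \n).
    static String node(String id, String ruleName, String text) {
        return id + " [label=\"" + escape(ruleName) + "\\n" + escape(text) + "\"];";
    }

    public static void main(String[] args) {
        // Hypothetical sample: a literal node whose text is the Java string "hi".
        System.out.println(node("n0", "literal", "\"hi\""));
    }
}
```

In the code above, the same escape call could wrap Content.get(i) inside printLabel.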

Hope this helps.
