ANTLR Quick Start: Calculator Example

I will walk you through a simple calculator built with ANTLR and Java to help you understand the basic concepts and rules of ANTLR.

All the code shown below is in this repo

ANTLR

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files. It is widely used to build languages, tools, and frameworks [1].

Configuration

Add the ANTLR runtime to the dependencies and the ANTLR Maven plugin to the build plugins in pom.xml:

<dependency>
    <groupId>org.antlr</groupId>
    <artifactId>antlr4-runtime</artifactId>
    <version>4.9.3</version>
</dependency>
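<plugin>
    <groupId>org.antlr</groupId>
    <artifactId>antlr4-maven-plugin</artifactId>
    <version>4.9.3</version>
    <configuration>
        <!-- The plugin generates only listeners by default; enable the visitor classes used later -->
        <visitor>true</visitor>
    </configuration>
    <executions>
        <execution>
            <goals>
                <goal>antlr4</goal>
            </goals>
        </execution>
    </executions>
</plugin>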

Important Note

  • Recompile the project: ANTLR grammars are defined in ".g4" files. Every time you edit one, recompile the project so that the code ANTLR generates from it is updated.
    If you use Maven and IntelliJ for development, you can run the compile goal from the Maven sidebar.

  • Mark the generated sources root if you are using IntelliJ: ANTLR generates files in the directory "target/generated-sources/antlr4". Make sure it is marked as "Generated Sources Root"; otherwise IntelliJ won't detect the generated code.

Define Grammar

The calculator is deliberately simple: it handles only integers, parentheses, and the add, subtract, multiply, and divide operations, as in 1+2, 1*2, and 1*(2+3). The grammar therefore only needs to recognize integers, parentheses, and the four operator symbols.

ANTLR uses ".g4" files to define grammars. Rules whose names start with an uppercase letter are lexical (token) rules; rules whose names start with a lowercase letter are parser rules.
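To make the lexer side of that split concrete, here is a plain-Java sketch of what the lexical rules of the calculator grammar do to an input string. It is a hand-written illustration, not the lexer ANTLR generates: the class TokenSketch, the tokenize method, and the token names LPAREN and RPAREN are hypothetical (the grammar uses the literals '(' and ')' directly), and the NEWLINE rule is left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenSketch {
    // One named group per lexical rule: INT, WS, MUL, DIV, ADD, SUB,
    // plus two hypothetical names for the parenthesis literals.
    private static final Pattern TOKEN = Pattern.compile(
        "(?<INT>[0-9]+)|(?<WS>[ \\t]+)|(?<MUL>\\*)|(?<DIV>/)"
        + "|(?<ADD>\\+)|(?<SUB>-)|(?<LPAREN>\\()|(?<RPAREN>\\))");

    private static final String[] NAMES =
        {"INT", "WS", "MUL", "DIV", "ADD", "SUB", "LPAREN", "RPAREN"};

    /** Splits the input into "NAME:text" tokens, skipping whitespace. */
    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (!m.lookingAt()) { // no rule matches at this position
                throw new IllegalArgumentException("Unexpected character at index " + pos);
            }
            for (String name : NAMES) {
                if (m.group(name) != null) {
                    // WS is matched but not emitted, like "-> skip" in the grammar
                    if (!name.equals("WS")) {
                        tokens.add(name + ":" + m.group(name));
                    }
                    break;
                }
            }
            pos = m.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [INT:1, MUL:*, LPAREN:(, INT:2, ADD:+, INT:3, RPAREN:)]
        System.out.println(tokenize("1 * (2 + 3)"));
    }
}
```

The parser rules then work on this token sequence rather than on raw characters, which is why whitespace never appears in the expr rule.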

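The "#" symbol in the grammar file introduces a label for an alternative. For each label, ANTLR generates a visitor method that lets us visit that alternative; for example, # MulDiv becomes visitMulDiv and # int becomes visitInt.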

Note that the alternative expr op=(MUL | DIV) expr is listed before expr op=(ADD | SUB) expr. ANTLR gives earlier alternatives higher precedence, which ensures that multiplication and division are performed before addition and subtraction.

grammar Expression;

// Rules starting with an uppercase letter comprise the lexical (token) rules
INT: [0-9]+; // match integers
NEWLINE:'\r'? '\n' ; // return newlines to parser (is end-statement signal)
WS : [ \t]+ -> skip; // toss out whitespace

MUL: '*';
DIV: '/';
ADD: '+';
SUB: '-';

// Rules starting with a lowercase letter comprise the parser rules
expr: expr op=(MUL | DIV) expr   # MulDiv
    | expr op=(ADD | SUB) expr   # AddSub
    | INT                        # int
    | '(' expr ')'               # parens
    ;
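The precedence that the alternative order expresses can be illustrated without ANTLR. The following is a hand-written recursive-descent evaluator, offered only as a sketch of the same behavior: multiplication and division bind tighter than addition and subtraction, and operators of equal precedence are applied left to right. It is not the parser ANTLR generates (ANTLR handles the left-recursive expr rule with its own algorithm), and PrecedenceSketch and its methods are hypothetical names.

```java
public class PrecedenceSketch {
    private final String src;
    private int pos = 0;

    private PrecedenceSketch(String src) {
        this.src = src;
    }

    /** Evaluates an expression made of integers, + - * / and parentheses. */
    public static int eval(String expression) {
        PrecedenceSketch p = new PrecedenceSketch(expression);
        int value = p.addSub();
        p.peek(); // skip trailing whitespace
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("Unexpected input at index " + p.pos);
        }
        return value;
    }

    // Skips spaces and tabs, then returns the next character ('\0' at end of input).
    private char peek() {
        while (pos < src.length() && (src.charAt(pos) == ' ' || src.charAt(pos) == '\t')) {
            pos++;
        }
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    // Lowest precedence level: '+' and '-', applied left to right.
    private int addSub() {
        int value = mulDiv();
        while (true) {
            char op = peek();
            if (op != '+' && op != '-') return value;
            pos++;
            int right = mulDiv(); // operands are complete '*' and '/' groups
            value = (op == '+') ? value + right : value - right;
        }
    }

    // Higher precedence level: '*' and '/', applied left to right.
    private int mulDiv() {
        int value = atom();
        while (true) {
            char op = peek();
            if (op != '*' && op != '/') return value;
            pos++;
            int right = atom();
            value = (op == '*') ? value * right : value / right;
        }
    }

    // An integer or a parenthesized expression.
    private int atom() {
        if (peek() == '(') {
            pos++;
            int value = addSub();
            if (peek() != ')') throw new IllegalArgumentException("Expected ')' at index " + pos);
            pos++;
            return value;
        }
        int start = pos;
        while (pos < src.length() && src.charAt(pos) >= '0' && src.charAt(pos) <= '9') pos++;
        if (start == pos) throw new IllegalArgumentException("Expected integer at index " + pos);
        return Integer.parseInt(src.substring(start, pos));
    }

    public static void main(String[] args) {
        System.out.println(eval("1 + 2 * 3")); // 7, not 9
        System.out.println(eval("8 - 3 - 2")); // 3, left to right
    }
}
```

In the hand-written version the precedence is encoded by which method calls which; in the ANTLR grammar the same information is carried by the order of the alternatives.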

Use Visitor to Calculate

Based on the grammar above, ANTLR generates a visitor interface and a default (base) visitor implementation with one method for each labeled alternative. By extending the base visitor, we can perform the calculations.

public class EvalVisitor extends ExpressionBaseVisitor<Integer> {

    @Override
    public Integer visitParens(ExpressionParser.ParensContext ctx) {
        return visit(ctx.expr());
    }

    /** expr op=(MUL| DIV) expr */
    @Override
    public Integer visitMulDiv(ExpressionParser.MulDivContext ctx) {
        int left = visit(ctx.expr(0)); // get value of left subexpression
        int right = visit(ctx.expr(1)); // get value of right subexpression
        if ( ctx.op.getType() == ExpressionParser.MUL ) return left * right;
        return left / right;
    }

    /** expr op=(ADD| SUB) expr */
    @Override
    public Integer visitAddSub(ExpressionParser.AddSubContext ctx) {
        int left = visit(ctx.expr(0)); // get value of left subexpression
        int right = visit(ctx.expr(1)); // get value of right subexpression
        if ( ctx.op.getType() == ExpressionParser.ADD ) return left + right;
        return left - right;
    }

    /** INT */
    @Override
    public Integer visitInt(ExpressionParser.IntContext ctx) {
        return Integer.valueOf(ctx.INT().getText());
    }
}
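One consequence of returning Integer from the visitor is that left / right in visitMulDiv is Java integer division. The snippet below isolates that arithmetic so it can run without ANTLR; IntDivisionSketch and divide are hypothetical names used only for this illustration.

```java
public class IntDivisionSketch {
    // Same arithmetic as the division branch of visitMulDiv.
    static int divide(int left, int right) {
        return left / right;
    }

    public static void main(String[] args) {
        System.out.println(divide(7, 2));  // 3: the fractional part is discarded
        System.out.println(divide(-7, 2)); // -3: the result is truncated toward zero
        try {
            divide(1, 0);
        } catch (ArithmeticException e) {
            // Integer division by zero throws instead of returning a value
            System.out.println("division by zero is rejected");
        }
    }
}
```

So an input such as 7/2 evaluates to 3 in this calculator, and an input such as 1/0 makes the visitor throw an ArithmeticException unless the visitor is extended to handle it.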

Test

We use the following test to check that the calculator works:

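// Imports assume JUnit 5 and the ANTLR 4 runtime
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.antlr.v4.runtime.CharStream;
import org.antlr.v4.runtime.CharStreams;
import org.antlr.v4.runtime.CommonTokenStream;
import org.junit.jupiter.api.Test;

@Test
void mathTest() {
    String expression = "9 - 2 * (1 + 2) / 1";
    CharStream charStream = CharStreams.fromString(expression);
    ExpressionLexer lexer = new ExpressionLexer(charStream); // characters -> tokens
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    ExpressionParser parser = new ExpressionParser(tokens); // tokens -> parse tree
    EvalVisitor visitor = new EvalVisitor();
    Integer value = visitor.visit(parser.expr()); // walk the tree and evaluate
    assertEquals(3, value);
}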

References


  1. ANTLR website