ANTLR inheritance issues

An update to my earlier post: here’s an example where ANTLR’s grammar inheritance (the import mechanism) fails in an unexpected way. In this example, we parse a sequence of words or integers, printing each one as we go. First, the lexer:
lexer grammar testLexer;

INT	: ('0'..'9')+
	;

WORD
	: ('a'..'z')+
	;

WHITESPACE
	: (' '|'\t'|'\n'|'\r')+
	{skip();}
	;
The parent parser:
parser grammar testBase;
options{
	tokenVocab=testLexer;
}

bunch	:	thingy+
	;

thingy	:
	w=foo
	{
	System.out.println("FoundWord: "+$w.text);
	}
	;

foo	:	WORD;
bar	:	INT;
And finally, the child, which overrides the parent’s “thingy” rule to allow integers within the sequence:
parser grammar testChild;
options{
	tokenVocab=testLexer;
}

import testBase;

// Override parent's handling of thingy: We want to allow integers through the "bar" subrule
thingy
	:wval=foo
	{
	System.out.println("FoundWord: "+$wval.text);
	}
	|ival=bar
	{
	System.out.println("FoundInt: "+$ival.text);
	}
	;
Running the testBase parser will properly handle things like “foo bar baz”, but fail on “foo 25 baz”. From the looks of it, the testChild parser should be able to handle this second case, right? Not quite. Here’s a snippet of the generated Java code. See if you can spot the error which prevents it from compiling.
    // c:\\tmp\\testChild.g:11:1: thingy : (wval= foo | ival= bar );
    public final void thingy() throws RecognitionException {
        testChild_testBase.foo_return wval = null;

        try {
            // c:\\tmp\\testChild.g:12:2: (wval= foo | ival= bar )
            int alt1=2;
            int LA1_0 = input.LA(1);

            if ( (LA1_0==WORD) ) {
                alt1=1;
            }
            else if ( (LA1_0==INT) ) {
                alt1=2;
            }
            else {
                NoViableAltException nvae =
                    new NoViableAltException("", 1, 0, input);

                throw nvae;
            }
            switch (alt1) {
                case 1 :
                    // c:\\tmp\\testChild.g:12:3: wval= foo
                    {
                    pushFollow(FOLLOW_foo_in_thingy39);
                    wval=foo();

                    state._fsp--;
                    	System.out.println("FoundWord: "+(wval!=null?input.toString(wval.start,wval.stop):null));
                    }
                    break;
                case 2 :
                    // c:\\tmp\\testChild.g:16:3: ival= bar
                    {
                    pushFollow(FOLLOW_bar_in_thingy48);
                    bar();

                    state._fsp--;
                    	System.out.println("FoundInt: "+ival.text);
                    }
                    break;
            }
        }
        catch (RecognitionException re) {
            reportError(re);
            recover(input,re);
        }
        finally {
        }
        return ;
    }
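It won’t compile because “ival” is never declared or set, yet it is used in the “FoundInt” println in case 2. The call in case 2 ought to read “ival=bar();”, the same way case 1 has “wval=foo();”, with a matching declaration at the top of the method next to the one for “wval”, but for some reason ANTLR misses out on this. The “$ival.text” reference was also left as a bare “ival.text”, whereas “$wval.text” was expanded into a proper input.toString(...) expression. There’s a way to work around it, but along with some of the other problems it makes me not want to even try to use the feature.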
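
For comparison, here is roughly what the child grammar was meant to do with “foo 25 baz”. This is only a hand-written sketch in plain Java with no ANTLR involved; the class name ThingySketch and the method describe are made up for illustration, it assumes INT is meant to cover the digits 0 to 9, and it splits on whitespace, which is simpler than what the real lexer does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the testChild parser; not generated by ANTLR.
public class ThingySketch {

    // Mimics the overridden "thingy" rule: each WORD yields a FoundWord line,
    // each INT yields a FoundInt line. Tokens are split on whitespace, which
    // is a simplification of the lexer grammar shown earlier.
    static List<String> describe(String input) {
        List<String> out = new ArrayList<>();
        String trimmed = input.trim();
        if (trimmed.isEmpty()) {
            return out; // the real "bunch" rule needs at least one thingy
        }
        for (String tok : trimmed.split("\\s+")) {
            if (tok.matches("[a-z]+")) {
                out.add("FoundWord: " + tok);   // the wval=foo alternative
            } else if (tok.matches("[0-9]+")) {
                out.add("FoundInt: " + tok);    // the ival=bar alternative
            } else {
                throw new IllegalArgumentException("No viable alternative at: " + tok);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : describe("foo 25 baz")) {
            System.out.println(line);
        }
    }
}
```

Running main prints “FoundWord: foo”, “FoundInt: 25” and “FoundWord: baz”, one per line, which is the output the generated testChild parser would be expected to give if it compiled.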
