The Antlr parser has two phases: lexing and parsing.
- The lexer may skip characters the parser does not need to see, e.g. whitespace, comments, etc.
- So-called fragment rules are hidden from the parse tree: they are lexer helper rules that never produce tokens of their own.
We see that Antlr distinguishes between different kinds of hiding in a purely technical way. Parts of the grammar are attributed in different ways, which only works because the author of the grammar implicitly knows how the Antlr framework works.
The Jslp (Javaslang Parser) has only one phase, which combines lexing and parsing. The author of a Jslp grammar should be able to attribute parts of the grammar in a uniform way. Antlr, for example, lets us declare the associativity of operators as <assoc=right> and <assoc=left>. Additionally, it attributes rules with the prefix fragment, and it lets us mark (lexer?) rule alternatives with -> skip. These are three different ways to attribute something, which is too diverse, imo.
Therefore I suggest simplifying attribution, e.g. like this:
rule<hidden> : alternative1
| alternative2<hidden>
| ( subrule1 | subrule2 )<hidden>
| INT op<assoc=right> INT
| ( '/*' ~'*/'* '*/' )<combined, hidden> // same as <combined=true, hidden=true>
;
WS : [ \t\r\n]+<hidden> ; // same as WS<combined, hidden> : [ \t\r\n]+ ;
Attributes are technically <key=value> pairs and semantically properties of the attributed element. In the case of a boolean property, the value may be omitted if it is true, i.e. <hidden=true> is the same as <hidden>.
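To make the shorthand rule concrete, here is a minimal Java sketch of how such an attribute list could be read into a map. It is only an illustration under my own assumptions: the class name Attributes and the method parse are hypothetical and not part of Jslp, and values are kept as plain strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of Jslp: reads an attribute list such as "<combined, hidden>". */
final class Attributes {

    private Attributes() {
    }

    /**
     * Parses "<key=value, flag>" into an ordered map.
     * A key without "=value" is treated as a boolean property set to "true".
     */
    static Map<String, String> parse(String text) {
        String trimmed = text.trim();
        if (trimmed.length() < 2 || !trimmed.startsWith("<") || !trimmed.endsWith(">")) {
            throw new IllegalArgumentException("Expected <...> but got: " + text);
        }
        Map<String, String> result = new LinkedHashMap<>();
        String body = trimmed.substring(1, trimmed.length() - 1).trim();
        if (body.isEmpty()) {
            return result; // "<>" carries no attributes
        }
        for (String part : body.split(",")) {
            String[] keyValue = part.split("=", 2);
            String key = keyValue[0].trim();
            if (key.isEmpty()) {
                throw new IllegalArgumentException("Missing attribute name in: " + text);
            }
            // Boolean shorthand: an omitted value means true, so <hidden> equals <hidden=true>.
            String value = keyValue.length == 2 ? keyValue[1].trim() : "true";
            result.put(key, value);
        }
        return result;
    }
}
```

With this sketch, Attributes.parse("<hidden>") and Attributes.parse("<hidden=true>") produce equal maps, and Attributes.parse("<assoc=right>") maps assoc to right.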
Perhaps it is better for readability to add the rule attributes after the closing ;, like this:
rule : alternative1
| alternative2<hidden>
| ( subrule1 | subrule2 )<hidden>
| INT op<assoc=right> INT
| ( '/*' ~'*/'* '*/' )<combined, hidden> // same as <combined=true, hidden=true>
;<hidden>
WS : [ \t\r\n]+<hidden> ; // same as WS : [ \t\r\n]+ ;<combined, hidden>
On the other hand, Java's annotations are prefixed, so we could also do the same here:
<hidden>
rule : alternative1
| alternative2<hidden>
| ( subrule1 | subrule2 )<hidden>
| INT op<assoc=right> INT
| ( '/*' ~'*/'* '*/' )<combined, hidden> // same as <combined=true, hidden=true>
;
WS : [ \t\r\n]+<hidden> ;
// same as:
// <combined, hidden>
// WS : [ \t\r\n]+ ;
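Independent of where the attributes are written, the intended meaning of hidden in a single-phase parser could be sketched in Java roughly as follows. This is not Jslp code, just an illustration under my own assumptions: Rule, Matcher, oneOrMore and sequence are hypothetical names. A hidden rule consumes input like any other rule but contributes nothing to the result.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not Jslp code: hidden rules match input but add nothing to the result. */
final class HiddenRules {

    private HiddenRules() {
    }

    /** Returns the end index of the match starting at pos, or -1 if there is no match. */
    interface Matcher {
        int match(String input, int pos);
    }

    /** A named rule with a single boolean attribute, hidden. */
    static final class Rule {
        final String name;
        final boolean hidden;
        final Matcher matcher;

        Rule(String name, boolean hidden, Matcher matcher) {
            this.name = name;
            this.hidden = hidden;
            this.matcher = matcher;
        }
    }

    /** Matches one or more characters taken from the given character set. */
    static Matcher oneOrMore(String chars) {
        return (input, pos) -> {
            int end = pos;
            while (end < input.length() && chars.indexOf(input.charAt(end)) >= 0) {
                end++;
            }
            return end > pos ? end : -1;
        };
    }

    /**
     * Applies the rules one after another directly on the characters (no separate lexer).
     * Returns "name:text" for each non-hidden rule, or null if a rule fails
     * or some input is left over.
     */
    static List<String> sequence(String input, Rule... rules) {
        List<String> result = new ArrayList<>();
        int pos = 0;
        for (Rule rule : rules) {
            int end = rule.matcher.match(input, pos);
            if (end < 0) {
                return null;
            }
            if (!rule.hidden) {
                result.add(rule.name + ":" + input.substring(pos, end));
            }
            pos = end; // hidden rules still consume their input
        }
        return pos == input.length() ? result : null;
    }
}
```

For example, with an INT rule over digits, a hidden WS rule over whitespace, and an op rule over +, the call sequence("1 + 2", INT, WS, op, WS, INT) yields [INT:1, op:+, INT:2]: the whitespace is consumed but leaves no trace in the result.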