If I had to guess, this Postgres parser grammar was produced by taking a grammar designed for a different parser generator and translating it as literally as possible. As a result, it contains a lot of constructs that are very unidiomatic ANTLR.
For example, you see this pattern a lot:
from_clause
: FROM from_list
|
;
Ending a rule with an empty alternative (| ;) means it will always be considered a valid match, even when it consumes no input. This is painful to work with: in the visitor for any rule that uses it as a sub-rule, context.from_clause() comes back non-null whether or not a FROM clause is present, so you can't simply write if (context.from_clause() != null) to see if you have a real match; the code to check for it is significantly messier.
The idiomatic way to do this in ANTLR is to define it as:
from_clause
: FROM from_list
;
and then have any rule that uses it invoke it as from_clause?.
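As a rough sketch of what that looks like on the calling side (select_stmt, SELECT and target_list are placeholder names for illustration, not necessarily what this grammar actually uses):

```antlr
// Hypothetical caller: the optionality lives here, not inside from_clause.
select_stmt
    : SELECT target_list from_clause?
    ;
```

With this shape, context.from_clause() should return null in the visitor when the clause is absent, so the simple null check works.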
Also, some things are just really weird. For example, the following:
from_list
: non_ansi_join
| table_ref (COMMA table_ref)*
;
non_ansi_join
: table_ref (COMMA table_ref)+
;
Unless I'm overlooking some crucial detail, the non_ansi_join rule here is entirely superfluous: the other alternative of from_list, table_ref (COMMA table_ref)*, already matches everything non_ansi_join matches, plus the single-table case. Again, this feels like it was translated too literally from some other parser generator's grammar.
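If that reading is right, from_list could presumably be reduced to something like the following, with non_ansi_join removed entirely (a sketch only, assuming nothing downstream depends on the non_ansi_join context):

```antlr
// One or more table_ref entries separated by commas;
// covers both the single-table and the comma-join case.
from_list
    : table_ref (COMMA table_ref)*
    ;
```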
Would it be possible to clean this grammar up a bit?