Generalised LR parsing
Standard LR parsers work in deterministic time on LR grammars, or
grammars which are almost LR such as the ANSI-C grammar (which
includes the well-known if-then-else ambiguity and is therefore not
strictly LR). Grammars which are non-LR result in LR-table conflicts which
must be resolved at parse time through some sort of search
mechanism. The best known approach to generalising LR parsers to
handle all possible grammars is due to Tomita. Tomita described
several algorithmic variants with later contributions from Farshi,
Rekers, Aycock and Horspool, and others. We have developed
enhancements to both the standard approach and the Aycock and Horspool
variants. As well as being more efficient, these parsers correctly
handle some difficult cases.
This technical report
contains a survey and analysis of previous work in the area along with
our new algorithms, proofs of their correctness and experimental results.