Generalised LR parsing

Standard LR parsers work in deterministic time on LR grammars, or grammars which are almost LR such as the ANSI-C grammar (which includes the well-known if-then-else ambiguity and is therefore not strictly LR). Grammars which are non-LR result in LR-table conflicts which must be resolved at parse time through some sort of search mechanism. The best known approach to generalising LR parsers to handle all possible grammars is due to Tomita. Tomita described several algorithmic variants with later contributions from Farshi, Rekers, Aycock and Horspool, and others. We have developed enhancements to both the standard approach and the Aycock and Horspool variants. As well as being more efficient, these parsers correctly handle some difficult cases.

This technical report contains a survey and analysis of previous work in the area along with our new algorithms, proofs of their correctness and experimental results.