The RDP parser generator
RDP compiles attributed LL(1) grammars decorated with C-language
semantic actions into recursive descent parsers. It can automatically
build Abstract Syntax Trees using our Reduced Derivation Tree
model. It has built-in support for symbol table handling, set
manipulation and generalised graph representation.
RDP is written in strict ANSI C and produces strict ANSI C. RDP was
developed using Borland C++ versions 3.1 and 5.1 on a PC and has also
been built with no problems on Alpha/OSF-1, DECstation/Ultrix, HP
Apollo/HPUX, Sun 4/SunOS, Solaris, Linux and NetBSD 0.9 hosts, all using
GCC as well as a variety of vendors' own compilers. RDP also compiles for
MS-DOS under Microsoft C V7.0, gcc (using the djgpp port) and several
other compilers. I have reports of successful builds on Mac, Amiga and
the Acorn Archimedes.
RDP is C++ 'clean', i.e. no identifiers are used that clash with C++
reserved words. RDP-generated code has been used with g++ applications, and
compiles with g++ and with the Borland compiler in C++ (as opposed to ANSI C)
mode.
RDP itself, and the language processors it generates, use standard library
modules to manage symbol tables, sets, graphs, memory allocation, text
buffering, command line argument processing and scanning. The RDP scanner is
programmed by loading tokens into a symbol table at the start of each run. In
principle, the RDP scanner can be used to support runtime extensible
languages, such as user defined operators in Algol-68, although nobody has had
the nerve to try this yet.
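
The idea of a scanner that is programmed at run time can be sketched as follows. This is only an illustration under my own assumptions: it uses a small fixed-size array where RDP uses its hash-coded symbol table, and the names `token_table`, `token_table_load` and `scan_token` are hypothetical, not the interface of scan.c. Tokens are loaded before scanning starts, the scanner picks the longest loaded token that matches the input, and loading another token later extends the language without recompiling anything.

```c
#include <assert.h>
#include <string.h>

#define MAX_TOKENS 32

/* Hypothetical token table; RDP itself keeps tokens in a symbol table. */
typedef struct {
    const char *spelling;
    int code;
} token_entry;

typedef struct {
    token_entry entries[MAX_TOKENS];
    int count;
} token_table;

void token_table_init(token_table *t)
{
    assert(t != NULL);
    t->count = 0;
}

/* Add one token; returns 1 on success, 0 if the table is full. */
int token_table_load(token_table *t, const char *spelling, int code)
{
    if (t->count >= MAX_TOKENS)
        return 0;
    t->entries[t->count].spelling = spelling;
    t->entries[t->count].code = code;
    t->count++;
    return 1;
}

/* Skip spaces, then return the code of the longest loaded token that
   prefixes *text and advance *text past it. Returns -1 (and leaves
   *text at the offending character) if no token matches. */
int scan_token(const token_table *t, const char **text)
{
    size_t best_len = 0;
    int best_code = -1;
    int i;

    while (**text == ' ')
        (*text)++;
    for (i = 0; i < t->count; i++) {
        size_t len = strlen(t->entries[i].spelling);
        if (len > best_len &&
            strncmp(*text, t->entries[i].spelling, len) == 0) {
            best_len = len;
            best_code = t->entries[i].code;
        }
    }
    *text += best_len;
    return best_code;
}
```

With only `<` and `<=` loaded, the input `<=>` scans as `<=` followed by an unrecognised `>`. After a further call such as `token_table_load(&t, "<=>", 3)` the same input scans as a single token, which is the kind of run-time extension that user-defined operators would need.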
RDP produces complete runnable programs with built-in help information and
command line switches that are specified as part of the EBNF file. In this
sense RDP output is far more shrink-wrapped than that of the usual parser
generators, which helps beginning students.
The RDP text buffering routines automatically handle nested files, error
message reporting and text data buffering to provide an efficient general
purpose front end. This is also a great help to new users since writing
efficient (and correct) text buffering and scanning routines from scratch is,
in my experience, harder than it appears.
The RDP graph handling package provides a general framework for building
graph data structures that may then be output in a form suitable for display
with the VCG (Visualisation of Compiler Graphs) tool. RDP-generated parsers
can be set to automatically build derivation trees in a form suitable for
human viewing.
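
For readers who have not met VCG, the sketch below writes a tiny graph in VCG's textual graph description format (`graph:`, `node:` and `edge:` entries, with edges naming their endpoints via `sourcename` and `targetname`). It is a stand-alone illustration, not the interface of graph.c: the `edge` structure, the `write_vcg` function and the fixed graph title are my own assumptions, and node names are assumed not to contain double quotes.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical edge record: both ends are identified by node title. */
typedef struct {
    const char *from;
    const char *to;
} edge;

/* Write the nodes and edges to 'out' as a VCG graph description.
   Returns 0 on success, -1 if the stream reports a write error. */
int write_vcg(FILE *out, const char **nodes, int node_count,
              const edge *edges, int edge_count)
{
    int i;

    assert(out != NULL);
    fprintf(out, "graph: {\n  title: \"derivation\"\n");
    for (i = 0; i < node_count; i++)
        fprintf(out, "  node: { title: \"%s\" label: \"%s\" }\n",
                nodes[i], nodes[i]);
    for (i = 0; i < edge_count; i++)
        fprintf(out, "  edge: { sourcename: \"%s\" targetname: \"%s\" }\n",
                edges[i].from, edges[i].to);
    fprintf(out, "}\n");
    return ferror(out) ? -1 : 0;
}
```

Passing two nodes called `expr` and `term` and a single edge between them produces a five-line file that a VCG viewer should lay out as a parent with one child.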
RDP generates itself (you mean you use a parser generator which _isn't_
written in itself?) which is a nice demonstration of the bootstrapping
technique used for porting compilers to new architectures.
What you get
- The machine generated source for the RDP translator (rdp.c). RDP checks that
the source grammar is LL(1) and explains exactly why a non-LL(1) grammar is
unacceptable. This version of RDP does not attempt to rework a grammar by
itself.
- The decorated EBNF file describing RDP that was processed by the RDP
executable to produce its own source code (rdp.bnf). This is good for
boggling undergraduates' minds with.
- Decorated EBNF files for the languages minicalc, minicond, miniloop
and minitree that are used as examples in the case study manual. The
languages illustrate the development of a simple programming language
by way of a syntax checker, two interpreters and finally two syntax-directed
compilers that produce assembly language for a mythical machine called
MVM (Mini Virtual Machine). One of the compilers is a single pass
translator and the other uses a tree-based intermediate form.
- The decorated EBNF file for mvmasm, an assembler for the language
produced by the above compilers along with a simulator called mvmsim for the
resulting executable files.
- An EBNF file describing a not particularly standard Pascal, with some
extensions for Turbo Pascal, from which RDP generates a fully working parser.
- A set of functions to automate the handling of command line arguments (arg.c).
- Routines to implement general graph data structures (graph.c).
- A set of wrapper functions for the standard C memory allocation routines
with built-in fatal error handling for out-of-memory errors (memalloc.c).
- A programmable scanner with integrated error handling (scan.c and scanner.c).
- A set-handling package that supports dynamically sizable sets (set.c).
- A hash-coded symbol table with support for multiple symbol tables,
nested scope rules and arbitrary user data (symbol.c).
- A standard text buffering package with integrated messaging utilities that
are used for all communication with the user (textio.c).
- Sources and makefiles for everything which you may use freely on condition
that you send copies of any modifications, enhancements and ill-conceived
changes you might make back to me so that I can improve RDP.
- User, library support, tutorial and case study manuals.
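
The hash-coded symbol table with nested scopes listed above can be pictured with the following sketch. It is a minimal illustration of the technique, not the code in symbol.c: the type and function names (`symbol_table`, `table_insert`, `table_lookup`, `scope_open`, `scope_close`), the bucket count and the hash function are all assumptions of mine. New symbols go on the front of their hash chain, so an inner declaration hides an outer one with the same name, and closing a scope only has to strip entries from the front of each chain.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BUCKETS 64

/* Hypothetical symbol record carrying arbitrary user data. */
typedef struct symbol {
    char *name;
    int scope;             /* nesting depth at which it was declared */
    void *user_data;
    struct symbol *next;   /* next symbol on the same hash chain */
} symbol;

typedef struct {
    symbol *bucket[BUCKETS];
    int depth;             /* current nesting depth, 0 = outermost */
} symbol_table;

static unsigned hash_name(const char *s)
{
    unsigned h = 0;

    while (*s != '\0')
        h = h * 31u + (unsigned char)*s++;
    return h % BUCKETS;
}

/* Each call creates an independent table; returns NULL if out of memory. */
symbol_table *table_new(void)
{
    symbol_table *t = malloc(sizeof *t);
    int i;

    if (t == NULL)
        return NULL;
    for (i = 0; i < BUCKETS; i++)
        t->bucket[i] = NULL;
    t->depth = 0;
    return t;
}

/* Declare a name in the current scope; returns NULL if out of memory. */
symbol *table_insert(symbol_table *t, const char *name, void *user_data)
{
    unsigned h = hash_name(name);
    symbol *s = malloc(sizeof *s);

    assert(t != NULL);
    if (s == NULL)
        return NULL;
    s->name = malloc(strlen(name) + 1);
    if (s->name == NULL) {
        free(s);
        return NULL;
    }
    strcpy(s->name, name);
    s->scope = t->depth;
    s->user_data = user_data;
    s->next = t->bucket[h];        /* front of chain: shadows outer names */
    t->bucket[h] = s;
    return s;
}

/* Find the innermost visible declaration of a name, or NULL. */
symbol *table_lookup(const symbol_table *t, const char *name)
{
    symbol *s;

    for (s = t->bucket[hash_name(name)]; s != NULL; s = s->next)
        if (strcmp(s->name, name) == 0)
            return s;
    return NULL;
}

void scope_open(symbol_table *t)
{
    t->depth++;
}

/* Discard every symbol declared in the innermost scope, then leave it.
   At depth 0 this empties the table and the depth stays at 0. */
void scope_close(symbol_table *t)
{
    int i;

    for (i = 0; i < BUCKETS; i++) {
        while (t->bucket[i] != NULL && t->bucket[i]->scope == t->depth) {
            symbol *dead = t->bucket[i];
            t->bucket[i] = dead->next;
            free(dead->name);
            free(dead);
        }
    }
    if (t->depth > 0)
        t->depth--;
}
```

A typical use is to call `scope_open` on entering a block, `table_insert` for each declaration, `table_lookup` for each use, and `scope_close` on leaving the block, at which point any outer declaration of the same name becomes visible again.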
Versions and support
RDP has had six main releases including the original 1.0 release in
February 1994. The current version is 1.5, released in May 1998.
RDP has now been used pretty ferociously by lots of people in industry
and academia and has stood up very well. We intend to stabilise RDP
with this release, although we will, of course, continue to respond
to bug reports.
We have an internal beta release (version 1.6) that you can pick up
from the GTB page if you are brave. Version 1.6 has a lot of extra
features designed to support advanced users and is mainly intended as
a bootstrap vehicle for GTB, our new Grammar ToolBox tool. RDP version
1.6 is rather profligate in its use of memory, and is not as well
documented as version 1.5, so our advice is to use version 1.5 in the
first instance and contact us before deciding to move up to the later
version.