ABSTRACT

A bottom-up parser verifies the syntax of a tokenized source program by constructing the parse tree for this program in a left-to-right and bottom-up way. That is, it reads the input string representing the tokenized program from left to right and works up towards the root of the parse tree. To put it in terms of derivations, it builds up the rightmost derivation of the input string in reverse so it starts from the string and proceeds towards the start symbol. Each step of this parsing process represents a shift or a reduction. The former consists in shifting the current input symbol onto the pushdown. During a reduction, the parser selects a handle⎯that is, an occurrence of the right-hand side of a rule in the current sentential form, and after this selection, it reduces the handle to the rule’s left-hand side so that this reduction, viewed in reverse, represents a rightmost derivation step. If the parser always precisely determines how to make each step during this bottom-up parsing process, then it works deterministically, and the deterministic bottom-up parsers obviously fulfill a key role in practice. In this chapter, we discuss two fundamental deterministic bottom-up parsing methods. First, we describe precedence parsing, which is a popular deterministic bottom-up parsing method for expressions whose operators and their priorities actually control the parsing process. Then, we describe LR parsing, where L stands for a left-to-right scan of the input string and R stands for a rightmost derivation constructed by the parser. LR parsers are as powerful as the deterministic pushdown automata, so they represent the strongest possible parsers that work in a deterministic way. That is probably why they are so often implemented in reality, and we discuss them in detail later in this chapter.