Syntax-Directed Translation and Intermediate Code Generation

ABSTRACT

If the source program is lexically and syntactically correct, the compiler translates it into a machine-independent intermediate representation. This translation is usually directed by the syntax analyzer during the parse of the program, hence the name syntax-directed translation. Whenever the parser makes a step according to a rule, the translation performs an action on attributes attached to the pushdown symbols involved in that step. In intermediate code generation, these attributes usually address fragments of the code produced so far, and the attached action composes these fragments into a larger piece of generated code. In this way, the action generates the code that determines how to interpret the rule it is attached to. From a broader perspective, these actions bridge the gap between the syntax analysis of a source program and its interpretation in terms of intermediate code. In this sense, they define the meaning of the rules they are attached to, which is why they are often called semantic actions.

As explained in Section 3.2, the syntax analyzer verifies the syntactic structure of a source program by finding its parse tree. However, syntax-directed translation does not use this tree as the intermediate representation of the program because it contains superfluous information, such as nonterminals, that is irrelevant to the rest of the compilation process. Instead, syntax-directed translation usually produces an abstract syntax tree or, briefly, a syntax tree as the actual intermediate representation of the program. Derived from a parse tree, the syntax tree still maintains the essential syntactic skeleton of the source program needed to generate the resulting target code properly.
However, it abstracts away any information immaterial to the remaining translation process, so it is a significantly more succinct and, therefore, more efficient representation of the program than the parse tree it is derived from. Besides syntax trees, we describe the generation of two important non-graphic intermediate representations, namely three-address code and postfix notation. In addition to generating intermediate code, syntax-directed translation also performs various semantic-analysis checks, such as type checking.

Synopsis. In this chapter, we discuss the syntax-directed translation of tokenized source programs into their intermediate representation. Most importantly, we explain how to generate abstract syntax trees from source programs in this way. Besides these trees, we describe the generation of an important non-graphic intermediate representation called three-address code. In Section 6.1, we explain how a bottom-up parser directs the generation of various types of intermediate code for conditions and expressions, including syntax trees, three-address code, and postfix notation. Then, in Section 6.2, we sketch syntax-directed translation under the guidance of a top-down parser. Specifically, we describe the actions this translation performs during the top-down parse of declarations. In Section 6.3, we discuss the semantic analysis of a source program. As already noted, by analogy with the generation of intermediate code, semantic analysis consists of performing actions directed by a parser. Special attention is paid to the verification of the most common semantic aspects of the source program, such as type checking. As semantic analysis makes heavy use of the symbol table, we outline some common symbol-table organizations in Section 6.4.
Finally, in Section 6.5, we describe basic software tools that allow us to create a lexical analyzer and a syntax-directed translator automatically.
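
To make the idea of semantic actions composing code fragments more concrete, the following Python sketch translates a tokenized infix expression into the three representations mentioned above: a syntax tree, postfix notation, and three-address code. It is only an illustration under simplifying assumptions, not the method developed in this chapter: it uses a small operator-precedence style of bottom-up parsing, handles only identifiers, `+`, `*`, and parentheses, and does no error recovery. The names `translate`, `PREC`, `operands`, `operators`, and the temporaries `t1`, `t2`, ... are hypothetical and chosen only for this example.

```python
from itertools import count

# Assumed operator precedences for this sketch: '*' binds tighter than '+'.
PREC = {"+": 1, "*": 2}


def translate(tokens):
    """Translate a list of tokens (identifiers, '+', '*', '(', ')').

    Returns a triple (syntax_tree, postfix, three_address_code).
    Assumes the input is a well-formed expression.
    """
    # Attribute stack: one entry (tree, postfix, place) per parsed subexpression.
    # 'place' names where the value of the subexpression is held.
    operands = []
    operators = []      # pending operators and '('
    code = []           # three-address instructions generated so far
    temps = count(1)    # generator of fresh temporary names

    def reduce():
        # Semantic action for a reduction of the form E -> E op E:
        # pop the attributes of both operands and compose larger fragments.
        op = operators.pop()
        r_tree, r_post, r_place = operands.pop()
        l_tree, l_post, l_place = operands.pop()
        temp = f"t{next(temps)}"
        code.append(f"{temp} = {l_place} {op} {r_place}")
        operands.append(((op, l_tree, r_tree), l_post + r_post + [op], temp))

    for tok in tokens:
        if tok == "(":
            operators.append(tok)
        elif tok == ")":
            while operators[-1] != "(":
                reduce()
            operators.pop()  # discard the matching '('
        elif tok in PREC:
            # Reduce while the operator on top binds at least as tightly
            # (this makes both operators left-associative).
            while (operators and operators[-1] != "("
                   and PREC[operators[-1]] >= PREC[tok]):
                reduce()
            operators.append(tok)
        else:
            # An identifier is a leaf: its tree, postfix, and place are itself.
            operands.append((tok, [tok], tok))

    while operators:
        reduce()

    tree, postfix, _place = operands.pop()
    return tree, postfix, code
```

For the token list `["a", "+", "b", "*", "c"]`, this sketch returns the syntax tree `("+", "a", ("*", "b", "c"))`, the postfix sequence `a b c * +`, and the three-address code `t1 = b * c` followed by `t2 = a + t1`. Note that the syntax tree contains no nonterminals or parentheses; only the operators and operands needed for the remaining translation survive.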