ABSTRACT

Ever since the advent of reduced instruction set computers (RISC) [67, 122] and their pipelined execution of instructions, instruction scheduling techniques have gained importance: they rearrange instructions to “cover” the delay, or latency, required between an instruction and its dependent successor. Without such reordering, the pipeline would stall, wasting processor cycles. Pipeline stalls also occur when executing control transfer instructions, such as branch and jump instructions. In architectures that support delayed branching, where control transfers take effect in a delayed manner [68], instruction reordering is again useful to fill the stall cycles with useful instructions. Instruction scheduling can be limited to a single basic block (a region of straight-line code with a single point of entry and a single point of exit, delimited by decision points and merge points [3, 73, 107]) or extended across multiple basic blocks. The former is referred to as basic block scheduling, and the latter as global scheduling [3, 73, 107].
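The covering of latencies described above can be illustrated with a minimal, hypothetical sketch of greedy cycle-by-cycle scheduling within a single basic block. The instruction names, dependences, and latencies below are invented for illustration, and the priority rule (plain source order) is deliberately simplistic; it is not the scheduler of any particular compiler.

```python
def list_schedule(instrs, deps, latency):
    """Greedily schedule a basic block, cycle by cycle.

    instrs:  instruction names in source order (illustrative)
    deps:    {instr: set of predecessor instrs whose results it needs}
    latency: {instr: cycles until its result is available}
    Returns the issue order, with "nop" marking unavoidable stall cycles.
    """
    ready_at = {}          # cycle at which each scheduled result becomes available
    scheduled = []
    remaining = list(instrs)
    cycle = 0
    while remaining:
        # An instruction is ready once every predecessor's result is available.
        ready = [i for i in remaining
                 if all(p in ready_at and ready_at[p] <= cycle
                        for p in deps.get(i, ()))]
        if ready:
            i = ready[0]            # naive priority: first in source order
            scheduled.append(i)
            remaining.remove(i)
            ready_at[i] = cycle + latency[i]
        else:
            scheduled.append("nop") # stall: no independent instruction covers it
        cycle += 1
    return scheduled

# With a lone load feeding an add, a stall cycle is unavoidable:
#   list_schedule(["ld1", "add1"], {"add1": {"ld1"}}, {"ld1": 2, "add1": 1})
#   -> ["ld1", "nop", "add1"]
# Hoisting an independent second load covers that latency:
#   list_schedule(["ld1", "ld2", "add1", "add2"],
#                 {"add1": {"ld1"}, "add2": {"ld2", "add1"}},
#                 {"ld1": 2, "ld2": 2, "add1": 1, "add2": 1})
#   -> ["ld1", "ld2", "add1", "add2"]   (no stalls)
```

The second case shows the essence of basic block scheduling: moving an independent instruction into what would otherwise be a stall cycle.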