ABSTRACT

Most DSP and media processors today exploit instruction-level parallelism (ILP) in their architectures. In traditional ILP processors with several functional units (FUs), the main obstacle to scaling is the connection between the shared register file (RF) and the functional units. When a VLIW (Very Long Instruction Word) processor issues several operations in parallel, the RF and the bypassing logic [4][6] become very complex [1]. Assuming each instruction has three register operands, two sources and one destination, each FU requires two RF read ports to fetch its source operands and one write port to write back its result. That is, if the processor has n FUs, the RF needs 2n read ports and n write ports. As n grows in a VLIW design, this has a serious negative impact on chip area and hardware cost [2].
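
The port arithmetic above can be illustrated with a minimal sketch. It assumes the three-operand model stated in the text (two source reads and one result write per FU); the function name `rf_ports` and its parameters are hypothetical and introduced only for illustration.

```python
def rf_ports(num_fus, reads_per_fu=2, writes_per_fu=1):
    """Return (read_ports, write_ports) for a shared RF serving num_fus FUs.

    Assumes every FU needs its own dedicated ports, as in the
    three-operand model: two source reads and one result write.
    """
    if num_fus < 0:
        raise ValueError("num_fus must be non-negative")
    return reads_per_fu * num_fus, writes_per_fu * num_fus


if __name__ == "__main__":
    # Port counts grow linearly with the number of functional units.
    for n in (1, 2, 4, 8):
        reads, writes = rf_ports(n)
        print(f"n={n}: {reads} read ports, {writes} write ports")
```

For example, under these assumptions a processor with n = 8 FUs would need 16 read ports and 8 write ports on a single shared RF.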