ABSTRACT

1 INTRODUCTION

CMU, with its industrial partners, has developed a 32-bit floating-point programmable systolic array for the high-speed execution of many essential computations in signal and image processing, such as the fast Fourier transform (FFT) and convolution. The machine is a one-dimensional systolic array that, in general, takes inputs at one end cell and produces outputs at the other end, with data and control all flowing in one direction. The initial version of the machine has 10 cells. Each cell is capable of performing 10 million floating-point operations per second (10 MFLOPS) and is built on a single board using only off-the-shelf components. We call this machine the Warp processor, suggesting that it can perform various transformations at a very high speed.
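To make the one-directional data flow concrete, the following is a minimal software sketch of a generic linear systolic array computing a convolution. It is an illustration only and does not represent the Warp cell design or its programming model. The function name `systolic_convolve`, the register names, and the timing scheme (samples delayed two cycles per cell, partial sums delayed one cycle per cell) are assumptions chosen for the example. Each cell holds one fixed weight, samples enter at the left end cell, and finished results leave from the right end cell.

```python
def systolic_convolve(weights, samples):
    """Cycle-by-cycle simulation of a hypothetical unidirectional linear
    systolic array computing y[n] = sum_k weights[k] * samples[n - k].

    Assumed timing (illustrative, not the Warp design): a sample takes two
    cycles to cross a cell and a partial sum takes one cycle, so cell k
    meets sample n - k exactly when the partial sum for output n passes.
    """
    if not weights:
        raise ValueError("at least one cell (weight) is required")
    k, n = len(weights), len(samples)

    # Per-cell registers: two delay stages for samples, one for partial sums.
    x_a = [0.0] * k
    x_b = [0.0] * k
    y = [0.0] * k
    outputs = []

    # n + k - 1 results, the first of which appears after k - 1 fill cycles.
    for t in range(n + 2 * k - 2):
        new_x_a, new_x_b, new_y = [0.0] * k, [0.0] * k, [0.0] * k
        for c in range(k):
            if c == 0:
                # The left end cell reads the external input stream.
                x_in = samples[t] if t < n else 0.0
                y_in = 0.0
            else:
                # Every other cell reads only its left neighbour's registers.
                x_in = x_b[c - 1]
                y_in = y[c - 1]
            new_y[c] = y_in + weights[c] * x_in  # multiply-accumulate step
            new_x_a[c] = x_in                    # first sample delay stage
            new_x_b[c] = x_a[c]                  # second sample delay stage
        # All cells latch their new register values simultaneously.
        x_a, x_b, y = new_x_a, new_x_b, new_y
        if t >= k - 1:
            # The right end cell emits one finished result per cycle.
            outputs.append(y[-1])
    return outputs


if __name__ == "__main__":
    print(systolic_convolve([1, 2], [3, 4, 5]))  # prints [3.0, 10.0, 13.0, 10.0]
```

In this sketch every cell performs one multiply-accumulate per cycle, so once the pipeline is full an array of k cells sustains k multiply-accumulates per cycle while communicating only with its immediate neighbours.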