ABSTRACT

Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376 12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376 12.2 Synthesis of VLSI Processor Array Architectures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378

12.2.1 Traditional Space-Time Mapping Methodologies . . . . . . . . . . . . . . . . . . . . . . . . . . . 378 12.2.2 A Constraint-Driven Unified Array Synthesis Methodology . . . . . . . . . . . . . . . . . 380

12.2.2.1 Data Dependence Analysis of the Algorithm . . . . . . . . . . . . . . . . . . . . . 380 12.2.2.2 Determining the Space Transformation. . . . . . . . . . . . . . . . . . . . . . . . . . . 380 12.2.2.3 Constraint-Driven Localization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381 12.2.2.4 Determining All Permissible Linear Schedules . . . . . . . . . . . . . . . . . . . 382 12.2.2.5 Applying the Space-Time Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382

12.3 HOM Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383 12.3.1 The Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383 12.3.2 LRA Synthesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383

12.3.2.1 Data Dependence Analysis of the Algorithm . . . . . . . . . . . . . . . . . . . . . 383 12.3.2.2 Determining the Space Transformation. . . . . . . . . . . . . . . . . . . . . . . . . . . 385 12.3.2.3 Constructing a Suitable LRA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385

12.3.3 Moments Generating Triangular Array Architectures . . . . . . . . . . . . . . . . . . . . . . . 386 12.4 Fourth-Order Cumulants Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389

12.4.1 The Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389 12.4.2 LRA Synthesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 390

12.4.2.1 Data Dependence Analysis of the Algorithm . . . . . . . . . . . . . . . . . . . . . 390 12.4.2.2 Determining the Space Transformation. . . . . . . . . . . . . . . . . . . . . . . . . . . 391 12.4.2.3 Construction of the LRA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391 12.4.2.4 A Farm of Processors Cumulants Generating Architecture . . . . . . . . 393

12.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 396 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 396

Higher order statistics (HOS), such as the higher order moments (HOM) and cumulants, have been recognized as important tools in modern signal processing and data analysis, because they overcome well-known limitations of the autocorrelation/power spectrum based second-order methods. We present here the systematic synthesis of a modular VLSI processor array architecture that can compute in real-time HOM and cumulants up to any specified maximal order. We describe first the steps of a novel design methodology that can be followed to explore the large space of available solutions to identify and characterize those architectures that meet certain designer-imposed constraints. Then we use it to (i) systematically formulate locally recursive algorithms (LRA) and linear space-time operators that can transform them into planar parallel array architectures with optimal characteristics for the real-time estimation of all moment lags, up to any desirable order higher than three and (ii) determine how the moments generating VLSI architecture can be minimally extended to provide in real-time estimates of the fourth-order cumulants, whereas still retaining its optimal characteristics.