ABSTRACT

The breakdown of Dennard scaling, coupled with persistently growing transistor counts, has greatly increased the importance of application-specific hardware acceleration; such an approach offers significant performance and energy benefits compared to general-purpose solutions. This chapter aims to help designers make full and optimal use of the Xilinx Vivado HLS methodology when implementing matrix multiplication, a fundamental algorithm used in a variety of application domains. During project creation, Vivado HLS provides a simple wizard for specifying the project name, the type of platform, and the clock period. The Product loop is the innermost loop, performing the actual multiplication and accumulation of matrix elements. The Col loop is the outer loop, which feeds the next column element data, together with the passed row element data, to the Product loop.
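
The loop structure described above can be sketched as a labeled C++ loop nest of the kind Vivado HLS accepts as input. This is a minimal illustration, not the chapter's actual source: the function name matrixmul, the element type elem_t, the dimension N, and the enclosing Row loop are assumptions introduced here; only the Col and Product loop names come from the text.

```cpp
// Hypothetical dimension and element type, chosen only for illustration.
const int N = 3;
typedef int elem_t;

// Sketch of an N x N matrix multiplication: res = a * b.
// The loop labels give each loop a name; Row is an assumed label for the
// loop that selects the row passed down to Col and Product.
void matrixmul(elem_t a[N][N], elem_t b[N][N], elem_t res[N][N]) {
    Row: for (int i = 0; i < N; i++) {
        // Col pairs the current row of a with the next column of b.
        Col: for (int j = 0; j < N; j++) {
            elem_t acc = 0;
            // Product multiplies matching elements and accumulates the sum.
            Product: for (int k = 0; k < N; k++) {
                acc += a[i][k] * b[k][j];
            }
            res[i][j] = acc;
        }
    }
}
```

Accumulating into a local variable and writing res[i][j] once per Col iteration keeps the Product loop free of repeated accesses to the output array, which is one common way to write such a kernel.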