ABSTRACT

Artificial Intelligence (AI) hardware accelerators are a special category of chips dedicated to improving the performance and efficiency with which machine learning models can be deployed. They are useful in image, language, and pattern recognition tasks for various models, including but not limited to Convolutional Neural Networks and Artificial Neural Networks. This paper examines how these accelerators are developed, the challenges they face in speed, size, and power consumption, and the hardware development efforts aimed at effective performance. It also traces developments in this area and the possible future of AI accelerators that use concurrency, pipelining, and optimal memory management techniques. Finally, the design is implemented on FPGA or ASIC platforms that provide hardware acceleration, enhancing key components such as convolution engines and custom processing units. Optimization techniques are introduced to balance the trade-offs in semiconductor IC design by reducing size, cost, and power dissipation while increasing throughput.