ABSTRACT

In recent years, with the continuous development of artificial intelligence (AI) theory, neural network models have been applied increasingly widely across many scenarios, especially in image processing, natural language processing, and audio and video processing. On certain tasks, the accuracy of artificial neural networks even exceeds human performance. To further improve recognition accuracy, neural networks have, over the past few years, moved steadily toward deeper hierarchies and more complex topologies, enabling them to abstract higher-dimensional features and achieve ever better task performance. However, these complex network structures, with their large numbers of parameters, place an increasingly heavy computational burden on current computers. One problem raised by this massive amount of computation is energy: the energy a computer consumes to complete a full computation is prohibitive, especially for many embedded applications, so reducing the energy consumption of neural networks has become a hot research topic over the past two years. Another problem is memory access: completing the computation requires frequently fetching parameters from storage, and this frequent memory access (the memory wall problem) has two consequences. First, it causes serious energy consumption; second, it introduces serious latency, which is unacceptable for real-time applications.