ABSTRACT
Modern machine learning applications are often deployed in the cloud to exploit the computational power of clusters. However, this in-cloud computing scheme cannot satisfy the demands of emerging edge-intelligence scenarios, including providing personalized models, protecting user privacy, adapting to real-time tasks, and reducing resource costs. To overcome the limitations of conventional in-cloud computing, tiny machine learning (TinyML) has emerged, which executes the end-to-end ML procedure entirely on user devices, without unnecessary involvement of the cloud. Despite its promising advantages, implementing a high-performance TinyML system still faces many severe challenges, such as insufficient user training data, backward-propagation blocking, and limited peak processing speed. Given the substantial room for improvement in the implementation and acceleration of TinyML systems, a comprehensive analysis of the latest research progress is needed, along with potential optimization directions from the system perspective. This book presents a software and hardware synergy of TinyML techniques, covering model-level neural network design, algorithm-level training optimization, and hardware-level instruction acceleration. These techniques can foster fruitful discussion and inspire researchers to further advance the field of edge intelligence.