ABSTRACT

Neural network models are widely used for data analytics in application domains such as vision sensing, speech recognition, and gesture recognition. Until recently, large amounts of data had to be transmitted from their source to servers where neural networks were deployed for analytics, adding latency and energy consumption for data transmission and processing. In the Internet-of-Things era, edge computing has become a popular approach: neural network models are deployed on edge devices, such as always-on microcontroller systems near the data source, to reduce latency and energy consumption. Because microcontroller systems have limited memory and compute capabilities, a common workflow is to train a model on servers, reduce its size through quantization, and then deploy the quantized model on a microcontroller system, where inference is performed. Evaluating neural network models with different architectures on microcontroller systems with dissimilar computing and memory capacities is important when selecting an edge device for a particular application. In this paper, we evaluate the latency and energy consumption of three microcontroller systems by deploying neural network models with different architectures for vision sensing, keyword detection, and gesture recognition applications. Our results show that the microcontroller system with a Cortex-M4F processor has the lowest energy consumption across all three applications, and that the Cortex-M4/M4F systems achieve lower latency.