DeFINet: Portable CNN Network for Facial Expression Recognition

doi:10.1201/9781003052098-23

ABSTRACT

Convolutional neural networks have gained much attention in the field of facial expression recognition (FER). However, their high computation and memory costs limit their application on resource-limited devices. To address these challenges, we propose DeFINet, a Decoupled Features Integrated Network for FER. DeFINet consists mainly of two blocks: decoupled macro feature extraction (DMFExt) and appearance feature integration (AFIt). The DMFExt block follows a decoupled design with very few parameters to capture abstract edge features from the spatial structure of facial expressions. The AFIt block, in turn, enhances the quality of the decoupled response features by extracting fine-tuned features. Together, the two blocks keep the number of trainable parameters in DeFINet small, which lowers computation and memory costs while achieving higher accuracy. We evaluate DeFINet on three standard datasets: MMI, CK+ and DISFA. Quantitative and qualitative comparisons with the state-of-the-art networks VGG-16, VGG-19 and ResNet demonstrate the effectiveness of DeFINet under both person-dependent and person-independent strategies.