ABSTRACT

The HMAX model developed by Serre et al. imitates the process of visual recognition in the primate visual cortex. However, it is limited in modelling V2 neurons and higher levels of the visual cortex. This chapter extends the model in several biologically plausible ways and constructs a five-layer computational model, denoted the Sparse-HMAX model. First, as in the original HMAX model, Gabor filters are used to describe the response properties of V1 neurons, and C1 image patches are described with HOG descriptors. Then, multiple-firing k-means is integrated into the HMAX model to emulate V2 neural responses, and non-negative sparse coding is used to model V4 neurons. To investigate the efficacy of the proposed model, experiments are performed on three public databases: Caltech101, Caltech256, and GRAZ-01. Experimental results demonstrate that the Sparse-HMAX model substantially improves on the original HMAX model in both recognition accuracy and processing speed for object recognition. Its recognition performance is also comparable to that of prevailing approaches.