ABSTRACT

The hierarchical MAX model (HMAX) is a bio-inspired model that mimics visual information processing in the visual cortex. However, it does not model processing at lower levels such as the retina and lateral geniculate nucleus (LGN), and it does not sufficiently specify the properties of higher-level neurons. We therefore develop an extended HMAX model, denoted E-HMAX, with the following biologically plausible modifications. First, contrast normalization is applied to the input image to simulate processing in the human retina and LGN. Second, Log Polar Gabor (GLoP) filters replace Gabor filters to simulate the properties of V1 simple cells. Third, sparse coding on multi-manifolds (SCMM) replaces Euclidean distance in computing the V4 simple cell response. In addition, a template learning method based on dictionary learning on multi-manifolds (DLMM) is proposed to select informative templates during the template learning stage. Experimental results show that the proposed model substantially outperforms the standard HMAX model and is comparable to state-of-the-art approaches such as EBIM and OGHM-HMAX.
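
As an illustration of the first modification, the following is a minimal sketch of contrast normalization on an input image. The abstract does not specify which form of normalization E-HMAX uses, so this sketch assumes the simplest global variant (zero mean, unit standard deviation); the function name `contrast_normalize` and the parameter `eps` are hypothetical and are not taken from the paper.

```python
import numpy as np


def contrast_normalize(image, eps=1e-8):
    """Global contrast normalization (an assumed variant, not the paper's).

    Subtracts the mean intensity and divides by the standard deviation,
    so the output has zero mean and approximately unit standard deviation.
    `eps` guards against division by zero on constant images.
    """
    img = np.asarray(image, dtype=np.float64)
    centered = img - img.mean()
    return centered / (centered.std() + eps)


if __name__ == "__main__":
    # A tiny 2x2 grayscale "image" used only for demonstration.
    demo = np.array([[0, 50], [100, 150]])
    print(contrast_normalize(demo))
```

A local variant, which normalizes each pixel by the statistics of its neighbourhood, may be a closer analogue of retinal and LGN processing; the global form is shown here only because it is the shortest to state.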