ABSTRACT

This chapter describes a system that locates facial features in images of human faces using active perception and neural network techniques. Its primary application is differential motion encoding for video-telephone systems. The bandwidth of video information transmitted down a telephone channel can be reduced by concentrating more of the available information rate on the expressive parts of the face, which requires knowing where the eyes and mouth lie in the image. The feature location system uses a number of neural networks trained on multiple-resolution images. In recall, the networks can be used to home in on the required feature in any face similar in nature to the training examples. Preliminary results of this work were presented at the SPIE International Symposium on Advances in Intelligent Robotics Systems, 1989 (Luckman and Allinson, 1990). Here, we describe a new technique for locating features and compare the original system, based on supervised feature maps, with a similar one based on multi-layer perceptrons (MLPs).