ABSTRACT

The greater power of multi-layer networks was realized long ago, but it was only recently shown how to make them learn a particular function, using "back-propagation" or other methods. This absence of a learning rule, together with the demonstration by Minsky and Papert that simple perceptrons can represent only linearly separable functions, led to a waning of interest in layered networks until recently. The back-propagation algorithm is central to much current work on learning in neural networks. It has been studied extensively in the past few years, and many extensions and modifications have been proposed. Because the basic algorithm converges exceedingly slowly in a multi-layer network, many variations have been suggested to speed it up; other goals include the avoidance of local minima and the improvement of generalization ability. A different approach to training a layered network is based on optimizing the internal representations of the input patterns formed by the hidden layers.