CONTENTS
12.1  Introduction
      12.1.1  Brief History of the Origins of ANNs
12.2  Basic Concepts and Terminology
12.3  Learning Algorithms and Paradigms
12.4  Perceptrons and Multilayer Perceptrons
      12.4.1  Classification of Data into Two Classes: Basic Geometry
      12.4.2  Perceptron Learning Algorithm
      12.4.3  Multilayer Perceptrons and the Backpropagation Algorithm
      12.4.4  Backpropagation Algorithm
              12.4.4.1  Backpropagation with Momentum
12.5  Support Vector Machines
      12.5.1  SVMs for Linearly Separable Classes
      12.5.2  Construction of the SVM Solution in the Linearly Separable Case
      12.5.3  SVM with Soft Margins
      12.5.4  SVM for Linearly Nonseparable Data
12.6  RBF Networks
      12.6.1  Interpolation with Fewer RBFs than Data Points
12.7  Universal Approximation Using Neural Networks
12.8  Neural Networks for Optimization
      12.8.1  Gradient Dynamical Systems for Unconstrained Optimization
      12.8.2  Liapunov Stability of Gradient Dynamical Systems
      12.8.3  Hopfield-Tank Neural Networks Written as Gradient Dynamical Systems
              12.8.3.1  Hopfield-Tank Network as Associative Memory
              12.8.3.2  Hopfield-Tank Net as Global Optimizer
      12.8.4  Gradient Dynamical Systems That Solve Linear Systems of Equations
      12.8.5  Discrete-Time Neural Networks
12.9  Annotated Bibliography
      12.9.1  Software
      12.9.2  Journals That Publish ANN-Related Research
      12.9.3  Data Sets
12.10 Discussion
      12.10.1 Some Recommendations
12.11 Applications
References
ABSTRACT

This chapter opens with a brief history of the origins of artificial neural networks (ANNs) and then discusses the main learning algorithms and paradigms. These include perceptrons and multilayer perceptrons, together with the backpropagation algorithm for the problem of classifying data. Support vector machines are then discussed for both linearly separable and nonseparable data, followed by a discussion of radial basis function (RBF) networks and interpolation using such networks. A discussion of universal approximation serves to contextualize the mathematical properties of neural networks. Neural networks for optimization are presented from the point of view of gradient dynamical systems. The application of these dynamical neural networks to the real-time solution of linear systems is presented. The chapter ends with an annotated bibliography, a list of neural network software suites, a list of journals that publish ANN-related research, and a list of websites that make data sets publicly available.
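As a preview of the simplest method surveyed in the chapter (the perceptron learning rule of Section 12.4.2), the following minimal sketch shows the core idea: whenever a training point is misclassified, the weight vector is nudged toward or away from it. The function names, learning-rate parameter, and toy data are illustrative assumptions, not taken from the chapter.

```python
import numpy as np

def perceptron_train(X, y, epochs=100, lr=1.0):
    """Classic perceptron rule for targets in {-1, +1}.

    On each misclassified point x, update w <- w + lr * target * x.
    Converges in finitely many updates when the classes are
    linearly separable (perceptron convergence theorem).
    """
    # Augment each input with a constant 1 so the bias is learned as a weight.
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])
    w = np.zeros(Xa.shape[1])
    for _ in range(epochs):
        errors = 0
        for xi, target in zip(Xa, y):
            if target * np.dot(w, xi) <= 0:  # misclassified or on the boundary
                w += lr * target * xi
                errors += 1
        if errors == 0:  # every point strictly on the correct side: stop
            break
    return w

def perceptron_predict(X, w):
    """Classify rows of X with the learned (augmented) weight vector."""
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])
    return np.sign(Xa @ w)
```

For example, on the linearly separable AND-style data `X = [[0,0],[0,1],[1,0],[1,1]]` with labels `[-1,-1,-1,+1]`, the loop terminates once an epoch passes with no updates, and the learned hyperplane separates the two classes.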