CONTENTS

12.1 Introduction
  12.1.1 Brief History of the Origins of ANNs
12.2 Basic Concepts and Terminology
12.3 Learning Algorithms and Paradigms
12.4 Perceptrons and Multilayer Perceptrons
  12.4.1 Classification of Data into Two Classes: Basic Geometry
  12.4.2 Perceptron Learning Algorithm
  12.4.3 Multilayer Perceptrons and the Backpropagation Algorithm
  12.4.4 Backpropagation Algorithm
    12.4.4.1 Backpropagation with Momentum
12.5 Support Vector Machines
  12.5.1 SVMs for Linearly Separable Classes
  12.5.2 Construction of the SVM Solution in the Linearly Separable Case
  12.5.3 SVM with Soft Margins
  12.5.4 SVM for Linearly Nonseparable Data
12.6 RBF Networks
  12.6.1 Interpolation with Fewer RBFs than Data Points
12.7 Universal Approximation Using Neural Networks
12.8 Neural Networks for Optimization
  12.8.1 Gradient Dynamical Systems for Unconstrained Optimization
  12.8.2 Liapunov Stability of Gradient Dynamical Systems
  12.8.3 Hopfield-Tank Neural Networks Written as Gradient Dynamical Systems
    12.8.3.1 Hopfield-Tank Network as Associative Memory
    12.8.3.2 Hopfield-Tank Net as Global Optimizer
  12.8.4 Gradient Dynamical Systems That Solve Linear Systems of Equations
  12.8.5 Discrete-Time Neural Networks
12.9 Annotated Bibliography
  12.9.1 Software
  12.9.2 Journals That Publish ANN-Related Research
  12.9.3 Data Sets
12.10 Discussion
  12.10.1 Some Recommendations
12.11 Applications
References

ABSTRACT This chapter begins with a brief history of the origins of artificial neural networks (ANNs) and then discusses the main learning algorithms and paradigms. These include perceptrons and multilayer perceptrons, together with a presentation of the backpropagation algorithm for the problem of classifying data. Support vector machines are then discussed for both linearly separable and nonseparable data, followed by a discussion of radial basis function (RBF) networks and interpolation using such networks. A discussion of universal approximation serves to contextualize the mathematical properties of neural networks. Neural networks for optimization are presented from the point of view of gradient dynamical systems, and the application of these dynamical neural networks to the real-time solution of linear systems of equations is described. The chapter ends with an annotated bibliography, a list of neural network software suites, a list of journals that publish ANN-related research, and a list of websites that make data sets publicly available.
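As a preview of the gradient-dynamical-system viewpoint developed in Section 12.8, the sketch below simulates, by forward-Euler integration, the flow dx/dt = -Aᵀ(Ax - b), whose energy function E(x) = ½‖Ax - b‖² is minimized at the solution of Ax = b. This is a minimal illustration only; the matrix A, vector b, step size, and iteration count are illustrative choices, not values taken from the chapter.

```python
import numpy as np

# Gradient dynamical system for solving A x = b:
# dx/dt = -A^T (A x - b) descends E(x) = 0.5 * ||A x - b||^2,
# so the equilibrium of the flow is the solution of A x = b.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])       # illustrative coefficient matrix
b = np.array([9.0, 8.0])         # illustrative right-hand side

x = np.zeros(2)                  # initial state of the "network"
dt = 0.05                        # Euler integration step
for _ in range(2000):
    x = x - dt * A.T @ (A @ x - b)   # one forward-Euler step of the flow

print(x)                         # converges toward the solution of A x = b
```

The step size must be small enough relative to the largest eigenvalue of AᵀA for the discrete iteration to remain stable, which is the kind of question the Liapunov-stability analysis of Section 12.8.2 addresses for the continuous-time flow.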