ABSTRACT

Machine learning, a sub-field of artificial intelligence, is the study of computer algorithms that improve automatically through experience. Although the term was coined in 1959, machine learning builds on questions and methods developed earlier in linear algebra, mathematical analysis, optimization, and mathematical statistics. In this chapter we give a brief overview of machine learning. First, a few basic regression and classification techniques are introduced with examples. Then we introduce a few key concepts, such as Bayes' theorem and prior and posterior distributions. Some machine-learning algorithms are based on estimating the likelihood function and using Bayes' theorem to obtain the posterior distribution, but most algorithms can be viewed as trying to model the posterior distribution directly. This leads to linear logistic regression, the perceptron, shallow neural networks, artificial neural networks, and finally convolutional neural networks. Machine-learning methods involve the estimation of parameters, often by optimization. Determining which method to use and setting the so-called hyper-parameters require additional consideration of over- and under-fitting and the need to divide the data into three parts, denoted the training, validation, and test sets. Progress in machine learning has been swift during the last decade.

The chapter ends with a few examples of computational architectures for solving different types of machine-learning problems, along with some dimensionality-reduction techniques.