ABSTRACT

Neural networks are a broad class of predictive models that have enjoyed considerable popularity. A neural network consists of a collection of objects, known as neurons, organized into an ordered set of layers. Because they require large datasets and extensive computing power, neural networks did not become a general-purpose learning technique until the early 2000s. This chapter aims to demystify neural networks by showing that they can be understood as a natural extension of linear models. Training a neural network involves updating the parameters describing the connections between neurons in order to minimize some loss function over the training data; typically, the number of neurons and their connections are held fixed during this process. Many thousands of iterative updates are generally needed to converge to a reasonable minimizer of the loss function, and the stochastic gradient descent algorithm performs these updates with one small modification to ordinary gradient descent: each step computes the gradient using only a small, randomly chosen subset of the training data.
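
As a brief illustration of the training loop sketched above, the following minimal example applies stochastic gradient descent to a linear model with squared-error loss, echoing the chapter's view of neural networks as extensions of linear models. The mini-batch size, learning rate, and synthetic data here are illustrative assumptions, not details drawn from the chapter itself.

```python
import numpy as np

def sgd_linear(X, y, lr=0.01, batch_size=32, epochs=100, seed=0):
    """Fit weights of a linear model y ~ X @ w by stochastic gradient descent."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(epochs):
        order = rng.permutation(n)               # shuffle once per pass over the data
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            Xb, yb = X[idx], y[idx]              # the randomly chosen mini-batch
            resid = Xb @ w - yb                  # residuals on the mini-batch
            grad = 2 * Xb.T @ resid / len(idx)   # gradient of the mean squared error
            w -= lr * grad                       # one small, iterative update
    return w

# Illustrative usage on synthetic data (values are assumptions for the demo).
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=500)
print(sgd_linear(X, y))  # recovered weights should be close to true_w
```

Each update touches only one mini-batch rather than the full dataset, which is the small modification that distinguishes stochastic gradient descent from ordinary gradient descent and makes the many thousands of required iterations computationally feasible.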