ABSTRACT

The computational complexity of a learning algorithm becomes the critical limiting factor when one envisions very large datasets. This contribution advocates stochastic gradient algorithms for large-scale machine learning problems. The first section describes the stochastic gradient algorithm. The second section presents an analysis explaining why stochastic gradient algorithms are attractive when data is abundant. The third section discusses the asymptotic efficiency of estimates obtained after a single pass over the training set. The last section reports empirical evidence.
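For concreteness, the stochastic gradient update takes the following form, a minimal sketch in standard notation: the loss $Q$, learning rate $\gamma_t$, and example $z_t$ are generic placeholders rather than the definitions introduced later in the text. At each step, the parameters $w_t$ are updated using the gradient of the loss on a single randomly drawn example,
$$
w_{t+1} \;=\; w_t \;-\; \gamma_t \, \nabla_w Q(z_t, w_t),
$$
instead of the gradient averaged over the full training set, which is what makes each iteration inexpensive when the dataset is very large.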