ABSTRACT

This chapter presents some of the overarching concepts of machine learning and shows how basic ideas from data processing and statistics arise within it. Machine learning algorithms all work by taking a vector of input values, producing an output for that input vector, and then moving on to the next input. The purpose of learning is to get better at predicting the outputs, whether they are class labels or continuous regression values. There is also a second way to evaluate the accuracy of a learning system, which unfortunately reuses the word precision with a different meaning: the machine learning algorithm is treated as a measurement system. The chapter then considers the endpoint of machine learning, looking at the outputs and at what needs to be done with the input data in terms of having multiple datasets. It also provides a quick summary of a few important statistical concepts.