ABSTRACT

This chapter covers applications of probability, statistics, and machine learning to environmental monitoring. Starting with basic probability theory, it formulates the fundamentals of machine learning classification from the point of view of Bayes’ theorem, emphasizing the concepts of false negative errors, false positive errors, and the confusion matrix. Discrete random variables are introduced to support the analysis of counts and proportions, as well as contingency analysis. Analysis of the error, or confusion, matrix is covered using the kappa statistic and contingency tables. The chapter then introduces multiple linear regression, including approaches to selecting explanatory variables, with emphasis on collinearity issues and stepwise regression. The chapter ends with an introduction to classification and regression trees and their application to supervised classification.