ABSTRACT

This chapter explores more advanced statistical estimators of the density function. It deals with histograms, which were the first non-parametric density estimators. The chapter details kernel density estimators as improvements over histograms and begins to tackle the choice of the bandwidth matrix, offering practical insights into choosing amongst the most common data-based algorithms. It also describes a rigorous mathematical framework for optimal bandwidth selection. Kernel smoothing plays an important role for low- to moderate-dimensional data. Examples include spatial analysis of microscopy images and the analysis of geographical trends, clusters and filaments, where the data is inherently low dimensional and where increasing the number of dimensions does not necessarily lead to better statistical analysis, since the resulting visualisations are no longer easy to interpret. Indeed, the visual component of kernel smoothing for exploratory data analysis is crucial for many non-mathematician users.