ABSTRACT

In this chapter, methods for density estimation are presented. It is discussed how to estimate an almost arbitrary density function with the help of a set of observations from a distribution. First, it is shown that there is no unbiased density estimator. Then the trade-off between approximation and estimation of a function is discussed, and histograms are introduced. The kernel density estimator is defined, and the asymptotic expansions of its bias and variance are derived. The asymptotically optimal bandwidth is calculated. The bias-variance trade-off is discussed and illustrated. The theoretical background of the following bandwidth selection rules is given: the rule of thumb, the "SJ" (Sheather-Jones) rule, and cross-validation. The nearest-neighbor estimators and the orthogonal series estimators are also mentioned. The last section gives some theoretical insights into the optimal minimax convergence rates.
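To fix notation for the reader, the kernel density estimator and its asymptotically optimal (AMISE-minimizing) bandwidth can be written in the usual form; the chapter's own notation may differ:

\[
  \hat f_h(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right),
  \qquad
  h_{\mathrm{opt}} = \left( \frac{\int K(u)^2\,du}{n\,\mu_2(K)^2 \int f''(x)^2\,dx} \right)^{1/5},
\]

where \(X_1,\dots,X_n\) are the observations, \(K\) is the kernel, \(h\) is the bandwidth, and \(\mu_2(K) = \int u^2 K(u)\,du\).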

The underlying theoretical background of the various methods is explained. All methods are demonstrated with a series of examples and plots. The Old Faithful data set is used for multiple problems. Some of the most important R code is given. The chapter ends with a short list of problems useful for written exams.
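As an illustration of the kind of computation the chapter refers to, and not the chapter's own code, a minimal R sketch on the built-in faithful data set could compare the bandwidth rules mentioned above:

    ## Kernel density estimates of Old Faithful eruption durations
    ## under three bandwidth selectors available in base R's density().
    x <- faithful$eruptions
    d_rot <- density(x, bw = "nrd0")  # rule-of-thumb bandwidth
    d_sj  <- density(x, bw = "SJ")    # Sheather-Jones bandwidth
    d_cv  <- density(x, bw = "ucv")   # least-squares cross-validation
    plot(d_rot, main = "Old Faithful eruption durations")
    lines(d_sj, lty = 2)
    lines(d_cv, lty = 3)

Overlaying the three curves makes the effect of the bandwidth choice on smoothness directly visible.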