ABSTRACT

There are two main reasons for beginning our study of kernel smoothing with the univariate kernel density estimator. The first is that nonparametric density estimation is an important data analytic tool which provides a very effective way of showing structure in a set of data at the beginning of its analysis. This was demonstrated in Section 1.2. It is especially effective when standard parametric models are inappropriate. This is illustrated in Figures 2.1 (a) and 2.1 (b), which show density estimates based on data corresponding to the incomes of 7,201 British households for the year 1975 (see e.g. Park and Marron, 1990). The data have been divided by the sample average. Figure 2.1 (a) is a parametric density estimate based on modelling income by the lognormal family of distributions, which have densities of the form

$$f(x; \theta_1, \theta_2) = \phi\{(\ln x - \theta_1)/\theta_2\}/(\theta_2 x), \quad x > 0,$$

where $\phi(x) = (2\pi)^{-1/2} e^{-x^2/2}$, $-\infty < \theta_1 < \infty$ and $\theta_2 > 0$. The values of $\theta_1$ and $\theta_2$ were chosen by maximum likelihood. Figure 2.1 (b) is a nonparametric kernel estimate based on a version of the transformation kernel density estimator as described in Section 2.10. The main lesson is that the interesting bimodal structure (which has important economic significance) that is uncovered by the kernel estimator is completely missed by imposing a unimodal parametric model such as the lognormal. Since one of the main aims of data analysis is to highlight important structure in the data, it is desirable to have a density estimator that does not assume that the density has a particular functional form.
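The comparison described above can be imitated with a minimal Python sketch, given here only as an illustration under stated assumptions. The income data are not reproduced, so a synthetic bimodal sample stands in for them; its mixture parameters, the bandwidth `h = 0.05`, and all function names are hypothetical choices made for this sketch. The nonparametric estimate is a plain fixed-bandwidth Gaussian kernel estimator, not the transformation kernel density estimator of Section 2.10. The lognormal fit uses the standard closed-form maximum likelihood estimates: the mean of the log data for $\theta_1$ and the divisor-$n$ standard deviation of the log data for $\theta_2$.

```python
import numpy as np


def lognormal_pdf(x, theta1, theta2):
    """Lognormal density phi{(ln x - theta1)/theta2} / (theta2 * x), for x > 0."""
    x = np.asarray(x, dtype=float)
    z = (np.log(x) - theta1) / theta2
    phi = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return phi / (theta2 * x)


def lognormal_mle(sample):
    """Closed-form maximum likelihood estimates of (theta1, theta2)."""
    logs = np.log(np.asarray(sample, dtype=float))
    theta1_hat = logs.mean()
    # Maximum likelihood uses the divisor n, not n - 1.
    theta2_hat = np.sqrt(np.mean((logs - theta1_hat) ** 2))
    return theta1_hat, theta2_hat


def gaussian_kde(points, sample, h):
    """Fixed-bandwidth Gaussian kernel density estimate evaluated at `points`."""
    points = np.asarray(points, dtype=float)[:, None]
    sample = np.asarray(sample, dtype=float)[None, :]
    z = (points - sample) / h
    n = sample.shape[1]
    return np.exp(-0.5 * z**2).sum(axis=1) / (n * h * np.sqrt(2.0 * np.pi))


# Hypothetical stand-in for the income data: a two-component lognormal mixture.
rng = np.random.default_rng(0)
sample = np.concatenate([
    rng.lognormal(mean=-1.0, sigma=0.25, size=3000),
    rng.lognormal(mean=0.3, sigma=0.30, size=4000),
])
sample = sample / sample.mean()  # divide by the sample average, as in the text

theta1_hat, theta2_hat = lognormal_mle(sample)

# Evaluate both estimates near the lower mode, in the trough, and near the upper mode.
check_points = np.array([0.38, 0.70, 1.30])
par_vals = lognormal_pdf(check_points, theta1_hat, theta2_hat)
kde_vals = gaussian_kde(check_points, sample, h=0.05)

print("lognormal fit:", par_vals)
print("kernel estimate:", kde_vals)
```

In this synthetic setting the kernel estimate dips between the two evaluation points near the modes, whereas the fitted lognormal, being unimodal, cannot reproduce such a dip. The bandwidth is fixed by hand here purely for illustration.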