ABSTRACT

This chapter considers the task of estimating a density from a sample of independent and identically distributed observations. A bar chart represents the frequencies of the values of a categorical variable as the heights of bars, one bar per category. A histogram applies the same idea to a continuous variable: the observations are grouped into intervals, and a bar plot is constructed for the resulting categorical variable, with bars ordered according to the order of the intervals. Construction of a histogram therefore requires selecting a bar width and a bar starting point. One generally chooses the end points of the intervals generating the bars to be round numbers. The choice of bandwidth balances the variance and bias of the density estimator, just as the choice of bin width does for the histogram. If the bandwidth is too large, the density estimate is too smooth and hides features of the data; an excessively large bandwidth wipes out all detail from the data set. In the arsenic concentration example, the estimate with the excessively large bandwidth assigns a large probability to negative concentrations.
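
The effect of bandwidth on mass assigned to impossible negative values can be illustrated with a minimal sketch. It assumes a Gaussian kernel, which the abstract does not specify, and uses synthetic positive, right-skewed data in place of the chapter's arsenic measurements. The function names gaussian_kde and mass_below_zero are hypothetical and introduced only for this illustration.

```python
import math
import numpy as np

def gaussian_kde(x_grid, data, h):
    """Gaussian kernel density estimate at the points x_grid, bandwidth h.

    Assumption: a Gaussian kernel; other kernels would work similarly.
    """
    z = (x_grid[:, None] - data[None, :]) / h
    kernel_sum = np.exp(-0.5 * z**2).sum(axis=1)
    return kernel_sum / (len(data) * h * math.sqrt(2 * math.pi))

def mass_below_zero(data, h):
    """Probability that the Gaussian kernel estimate assigns to values < 0.

    Each observation x contributes Phi(-x / h), the normal CDF at -x / h.
    """
    terms = [0.5 * (1.0 + math.erf(-x / (h * math.sqrt(2.0)))) for x in data]
    return float(np.mean(terms))

# Synthetic stand-in data (not the chapter's data): strictly positive values.
rng = np.random.default_rng(0)
data = rng.lognormal(mean=0.0, sigma=1.0, size=200)

# Larger bandwidths smooth more and push more mass below zero.
for h in (0.05, 0.5, 5.0):
    print(f"bandwidth={h}: mass below zero = {mass_below_zero(data, h):.4f}")
```

Because every observation here is positive, the mass below zero grows as the bandwidth grows, which mirrors the behaviour described for the excessively large bandwidth.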