ABSTRACT

A histogram graphically summarizes a data set by conveying its distribution with vertical bars. Histograms are easy to create and computationally inexpensive, so they can be applied to massive data sets. A frequency histogram is obtained by first creating a set of bins, or intervals, that cover the range of the data set, and then counting the observations that fall into each bin; the height of each bar is the bin count. It is important that these bins do not overlap and that they have equal width. A relative frequency histogram is obtained by setting the height of each bar to the relative frequency of the observations that fall into the bin. A density histogram is a histogram that has been normalized so that the total area of the bars is one. Boxplots have been in use for many years. They are an excellent way to visualize summary statistics such as the median, to study the distribution of the data, and to supplement multivariate displays with univariate information.
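
The three histogram variants described above can be illustrated with a minimal Python sketch. The function name `histogram`, its parameters, and the sample data are hypothetical and chosen only for illustration. The sketch assumes equal-width, non-overlapping bins that are half-open on the right, with the largest observation assigned to the last bin.

```python
def histogram(data, num_bins):
    """Return bin edges and the frequency, relative-frequency, and density heights.

    Illustrative sketch only: bins have equal width, do not overlap, and
    together cover the range [min(data), max(data)].
    """
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("data must contain at least two distinct values")
    width = (hi - lo) / num_bins
    edges = [lo + i * width for i in range(num_bins + 1)]

    # Frequency histogram: count the observations falling into each bin.
    counts = [0] * num_bins
    for x in data:
        # Bins are [edge_i, edge_{i+1}); the maximum is placed in the last bin.
        i = min(int((x - lo) / width), num_bins - 1)
        counts[i] += 1

    n = len(data)
    # Relative frequency histogram: bar height is the share of observations.
    relative = [c / n for c in counts]
    # Density histogram: rescale so that the bar areas (height * width) sum to one.
    density = [r / width for r in relative]
    return edges, counts, relative, density


sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]  # made-up data for illustration
edges, counts, relative, density = histogram(sample, num_bins=4)
print(counts)  # -> [3, 5, 1, 1]
```

In this sketch the relative frequencies sum to one, while the density heights sum to one only after each is multiplied by the bin width. That distinction is what separates the relative frequency histogram from the density histogram.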