ABSTRACT

Data visualization, the use of attractive and effective figures to display all data points so that readers can see overall patterns, study inherent relationships, and detect outliers, has long been an interesting yet challenging task for statisticians (Cleveland, 1985; Tufte, 1983, 1997, 2006). Many types of plots have been introduced, including those named for their shapes, such as the bag, bar, bike, box, bubble, pie, pyramid, spaghetti, rainbow, violin, and waterfall plots (Allison, 2012; Hyndman and Shang, 2010). This chapter introduces another plot named for its shape, the thunderstorm or raindrop scatter plot, and illustrates how to produce it. The plot displays data in which two or more y-axis values correspond to a single x-axis value for each of several subjects in a population. Each subject's data resemble a raindrop, and when many subjects are plotted together the figure resembles a thunderstorm, hence the name. The thunderstorm/raindrop scatter plot is a useful tool for data visualization and outlier detection. Sample figures are produced using the SAS/GRAPH Annotate facility with PROC GPLOT and the HIGHLOW statement in PROC SGPLOT.