ABSTRACT

This chapter begins with a discussion of a theoretical framework for data visualization known as “the grammar of graphics.” The simplest of the five named graphs are scatterplots, also called bivariate plots. They allow learner to visualize the relationship between two numerical variables. The first way of addressing overplotting is to change the transparency/opacity of the points by setting the alpha argument in geom_point(). The second way of addressing overplotting is by jittering all the points. This means giving each point a small “nudge” in a random direction. In the left-hand regular scatterplot, observe that the 4 points are superimposed on top of each other. Scatterplots display the relationship between two numerical variables. They are among the most commonly used plots because they can provide an immediate way to see the trend in one numerical variable versus another.