ABSTRACT

The simple xy scatter plot has been in use for a long time, at least since the eighteenth century, and it has many virtues. Indeed, according to Tufte (1983):

Now let’s have a look at an example of a scatter plot. For this we will use the data shown in Table 7.1, which were collected in a study investigating the possible link between alcohol consumption and the death rate per 100,000 of the population from cirrhosis and alcoholism (data collected before West Germany ceased to exist as a separate country). A scatter plot of the data that includes appropriate labels for each bivariate observation can be constructed using the following SAS code, which also produces the value of Pearson’s correlation coefficient for the two variables (see later for the definition of this coefficient):
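A minimal sketch of such a program is given below. It is not the original listing: the data set name drinking, the variable names country, alcohol and cirrhosis, and the three data lines are assumptions made for illustration, and the values are placeholders rather than the entries of Table 7.1. The use of proc sgplot with the datalabel= option is likewise only one possible way of labelling the points.

```sas
/* Hypothetical data set; the data lines are placeholders, NOT Table 7.1. */
data drinking;
  /* country is read with column input (columns 1-12);         */
  /* alcohol and cirrhosis are then read with list input.      */
  input country $ 1-12 alcohol cirrhosis;
datalines;
Land Alpha   10.0  20.0
Land Beta    15.0  25.0
Land Gamma   20.0  40.0
;
run;

/* Scatter plot with each point labelled by its country name. */
proc sgplot data=drinking;
  scatter x=alcohol y=cirrhosis / datalabel=country;
run;

/* With no var statement, all numeric variables are correlated. */
proc corr data=drinking;
run;
```

In this sketch the country names must begin in column 1 of each data line, since column input reads exactly the columns named on the input statement.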

Some of the country names are longer than the default length of eight characters for character variables, so column input is used to read them in. The values of the two numeric variables can then be read in with list input. This is an example of

Pearson correlations by default. A var statement would normally be used, since the default is to include all numeric variables.
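The effect of a var statement can be illustrated with a short sketch. The data set example, its variables, and its values are all hypothetical; the extra numeric variable year is included only to show a variable that one would probably not want in the correlation analysis.

```sas
/* Hypothetical data with an extra numeric variable (year). */
data example;
  input alcohol cirrhosis year;
datalines;
10.0 20.0 1970
15.0 25.0 1971
20.0 40.0 1972
;
run;

/* The var statement restricts the analysis to the two variables of  */
/* interest; without it, year would also be included by default.     */
proc corr data=example;
  var alcohol cirrhosis;
run;
```

When a data set contains only the variables of interest, as in the two-variable case discussed above, the var statement can safely be omitted.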