ABSTRACT

In this chapter, you will learn how to compute several associational statistics, after you first learn how to make and interpret scatterplots. An assumption of the Pearson product moment correlation is that the variables are related in a linear (straight line) way, so we will examine scatterplots to see whether that assumption is reasonable. Second, the Pearson correlation and the Spearman rho will be computed. The Pearson correlation is used when you have two variables that are normal/scale, and the Spearman is used when one or both of the variables are ordinal. Third, you will compute a correlation matrix indicating the associations among all the pairs of three or more variables. Fourth, you will compute simple or bivariate regression, which is used when one wants to predict scores on a normal/scale dependent (outcome) variable from one normal/scale independent (predictor) variable. Last, we will provide an introduction to a complex associational statistic, multiple regression, which is used to predict a scale/normal dependent variable from two or more independent variables. As stated in Chapter 7, correlations can vary from –1.00 (a perfect negative correlation or association) through .00 (no correlation) to +1.00 (a perfect positive correlation). Note that +1.00 and –1.00 are equally high or strong, but they lead to different interpretations. A high positive correlation between aptitude and grades would mean that students with higher aptitude tended to have higher grades, those with lower aptitude tended to have lower grades, and those in between had grades that were neither especially high nor especially low. A high negative correlation would mean that students with high aptitude tended to have low grades; also, high grades would be associated with low aptitude. With a zero correlation there is no consistent association: a student with high aptitude might have low, medium, or high grades.
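To make these ideas concrete outside of any particular statistics package, here is a minimal Python sketch (using numpy and scipy) of the statistics named above. The aptitude, grade, and motivation scores are hypothetical and serve only to illustrate the calls; they are not data from this chapter.

```python
# A minimal sketch with hypothetical data: Pearson r, Spearman rho,
# a correlation matrix, bivariate regression, and multiple regression.
import numpy as np
from scipy import stats

# Hypothetical scale scores for 10 students.
aptitude   = np.array([45, 52, 58, 60, 63, 67, 70, 74, 80, 85])
grades     = np.array([2.1, 2.4, 2.3, 2.9, 3.0, 2.8, 3.3, 3.4, 3.7, 3.9])
motivation = np.array([3, 4, 4, 5, 6, 5, 7, 7, 8, 9])  # third variable for the matrix

# Pearson product moment correlation (both variables normal/scale).
r, p = stats.pearsonr(aptitude, grades)
print(f"Pearson r = {r:.2f}, p = {p:.3f}")

# Spearman rho (use when one or both variables are ordinal).
rho, p_rho = stats.spearmanr(aptitude, grades)
print(f"Spearman rho = {rho:.2f}, p = {p_rho:.3f}")

# Correlation matrix: Pearson r for every pair of three or more variables.
matrix = np.corrcoef([aptitude, grades, motivation])
print(np.round(matrix, 2))

# Bivariate (simple) regression: predict grades (outcome) from aptitude (predictor).
result = stats.linregress(aptitude, grades)
print(f"grades = {result.intercept:.2f} + {result.slope:.3f} * aptitude "
      f"(r = {result.rvalue:.2f})")

# Multiple regression: predict grades from two predictors at once
# (ordinary least squares via the normal equations).
X = np.column_stack([np.ones_like(aptitude, dtype=float), aptitude, motivation])
coef, *_ = np.linalg.lstsq(X, grades, rcond=None)
print("intercept, b_aptitude, b_motivation:", np.round(coef, 3))
```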

Assumptions and Conditions for the Pearson Correlation (r) and Bivariate Regression

1. The two variables have a linear relationship. We will show how to check this assumption with a scatterplot in Problem 9.1. (Pearson r will not detect a curvilinear relationship unless you transform the variables, which is beyond the scope of this book.)

2. Scores on one variable are normally distributed for each value of the other variable and vice versa. If degrees of freedom are greater than 25, failure to meet this assumption has little consequence. Statistics designed for normally distributed data are called parametric statistics. Remember that we checked variables for skewness and normality in Chapter 4.
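As a rough illustration of screening data against these two assumptions (again a Python sketch with hypothetical, randomly generated scores rather than this chapter's data), one can inspect a scatterplot for a straight-line pattern and use skewness as an approximate index of normality:

```python
# A minimal sketch with hypothetical data: checking linearity and normality.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(42)
aptitude = rng.normal(65, 10, size=100)                     # roughly normal scores
grades = 0.03 * aptitude + rng.normal(0, 0.3, size=100)     # linearly related outcome

# Assumption 1: linearity -- the points should follow a roughly
# straight-line (not curved) pattern.
plt.scatter(aptitude, grades)
plt.xlabel("aptitude")
plt.ylabel("grades")
plt.title("Check for a linear relationship")
plt.show()

# Assumption 2: normality -- skewness near 0 suggests an approximately
# normal distribution (a common rule of thumb is |skewness| < 1).
print("skew(aptitude):", round(stats.skew(aptitude), 2))
print("skew(grades):  ", round(stats.skew(grades), 2))
```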