ABSTRACT

Regression analysis estimates or predicts the scores of one variable (called the criterion or dependent variable) from one or more other variables (called the predictors or independent variables). To predict the criterion, the criterion is related to, or regressed on, the predictor(s). Simple or bivariate regression (Galton 1886) involves one predictor, whereas multiple regression uses two or more predictors. In the social sciences, one of the main purposes of multiple regression is not so much to predict the score of one variable from others as to identify the smallest subset of variables that is most strongly related to the criterion and to estimate the percentage of variance in the criterion explained by those variables. For example, we may be interested in finding out which variables are most strongly related to coursework marks and how much of the variance in the marks those variables explain. Generally, the variable most highly related to the marks is entered first into the regression equation, followed by the variables that are next most strongly related to the marks once their relationship with the variables already entered is taken into account. If later variables are strongly associated with the variables already entered, they are unlikely to account independently for much more of the variance than those previously entered, and so they are unlikely to be included as predictors. Although we will demonstrate the calculation of multiple regression with a few cases, the technique should only be used when a relatively large number of cases is available. Under those circumstances, multiple regression is a very valuable statistical procedure. We will begin by describing bivariate regression, which involves just one predictor.
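The idea sketched above can be illustrated numerically. The short Python sketch below, using made-up coursework data (the variable names and values are purely illustrative, not from the text), fits a bivariate regression with one predictor and then a multiple regression with two, comparing the proportion of variance in the criterion (R²) that each model explains. Adding a second predictor can only raise R²; whether the increase is large enough to justify keeping the predictor is the question stepwise entry addresses.

```python
import numpy as np

# Hypothetical data: coursework marks (the criterion) and two
# predictors, hours studied and a prior test score. Illustrative only.
hours = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 12.0])
prior = np.array([55.0, 60.0, 58.0, 70.0, 72.0, 80.0])
marks = np.array([50.0, 58.0, 62.0, 70.0, 75.0, 85.0])

def r_squared(X, y):
    """Proportion of variance in y explained by a least-squares fit on X."""
    X1 = np.column_stack([np.ones(len(y)), X])     # add intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)  # least-squares coefficients
    resid = y - X1 @ beta
    ss_res = resid @ resid
    ss_tot = (y - y.mean()) @ (y - y.mean())
    return 1.0 - ss_res / ss_tot

# Bivariate (simple) regression: one predictor.
r2_hours = r_squared(hours[:, None], marks)

# Multiple regression: a second predictor is worth entering only if it
# accounts for appreciably more variance than the first already does.
r2_both = r_squared(np.column_stack([hours, prior]), marks)

print(f"R^2, hours alone:        {r2_hours:.3f}")
print(f"R^2, hours + prior test: {r2_both:.3f}")
```

If the two predictors are themselves strongly correlated, the gap between the two R² values will be small, which is exactly why a correlated later variable tends not to be entered as a predictor.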