ABSTRACT

After data collection is complete and before the formal regression analyses begin, the investigator must screen the data. This process, typically called data cleaning, involves examining each variable in the data set to ensure that no errors have been made. Quantitative models rest on assumptions that must be met before valid conclusions can be drawn about the population of interest. Regression analysis has its own set of assumptions: normality, linearity, homoscedasticity, absence of multicollinearity, and independence of errors. This chapter illustrates the syntax and resulting histograms for two of the variables (oxygen uptake and age). It also illustrates a fairly even spread of the standardized residuals plotted against the standardized predicted values. Multicollinearity is assessed using bivariate correlations, tolerance, and the variance inflation factor (VIF). The Durbin-Watson statistic tests the assumption of independence of the errors of prediction.
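
As a rough illustration of the last two diagnostics, the sketch below computes tolerance, VIF, and the Durbin-Watson statistic from their standard definitions: tolerance for a predictor is 1 minus the R-squared obtained by regressing it on the other predictors, VIF is the reciprocal of tolerance, and Durbin-Watson is the sum of squared successive residual differences divided by the sum of squared residuals. It is not the chapter's own syntax. The function names `tolerance_and_vif` and `durbin_watson` and the synthetic predictors are assumptions made for this example only, and Python with NumPy stands in for whatever statistical package the chapter uses.

```python
import numpy as np

def tolerance_and_vif(X):
    """Return (tolerance, VIF) arrays, one entry per predictor column of X.

    Hypothetical helper: each predictor is regressed on the remaining
    predictors (plus an intercept); tolerance = 1 - R^2, VIF = 1 / tolerance.
    A perfectly collinear predictor gives tolerance 0 and an infinite VIF.
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    tol = np.empty(k)
    for j in range(k):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r_squared = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
        tol[j] = 1.0 - r_squared
    return tol, 1.0 / tol

def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals taken in observation order."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Synthetic predictors for illustration only (not the chapter's data).
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.9 * x1 + rng.normal(scale=0.3, size=200)  # strongly related to x1
x3 = rng.normal(size=200)                        # unrelated to the others
tol, vif = tolerance_and_vif(np.column_stack([x1, x2, x3]))
print("tolerance:", tol.round(3))
print("VIF:", vif.round(2))
```

The Durbin-Watson statistic ranges from 0 to 4, and values near 2 are consistent with uncorrelated errors. A low tolerance (equivalently, a high VIF) flags a predictor that is largely explained by the others; the cutoff applied is a matter of convention and is not fixed by this sketch.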