ABSTRACT

Collinearity is a phenomenon where one explanatory variable in a multiple regression model is highly correlated with another. Observe in the right-hand plot that the relationship between debt and income is clearly negative for the “medium-low” and “medium-high” credit_limit brackets, while the relationship is somewhat flat for the “low” credit_limit bracket. The only credit_limit bracket where the relationship remains positive is for the “high” credit_limit bracket. Let’s quantify the relationship of our outcome variable and the two explanatory variables using one type of multiple regression model known as an interaction model. Let’s perform the last of the three common steps in an exploratory data analysis: creating data visualizations. Since our regression models will consider more than one explanatory variable, the interpretation of the associated effect of any one explanatory variable must be made in conjunction with the other explanatory variables included in learner model.