Chapter

# Logistic Regression and Discriminant Analysis

## ABSTRACT

Logistic regression and discriminant analysis, like multiple regression, are useful when you want to predict an outcome or dependent variable from a set of predictor variables. They are similar to linear regression in many ways. However, logistic regression and discriminant analysis are more appropriate when the dependent variable is categorical. Logistic regression is useful because it does not rely on some of the assumptions on which multiple regression and discriminant analysis are based. As with other forms of regression, multicollinearity (high correlations among the predictors) can lead to problems for both logistic regression and discriminant analysis.

Logistic regression is helpful when you want to predict a categorical variable from a set of predictor variables. It is useful when some or all of the independent variables are dichotomous; others can be continuous. Binary logistic regression is similar to linear regression except that it is used when the dependent variable is dichotomous. Multinomial logistic regression is used when the dependent or outcome variable has more than two categories, but it is beyond the scope of this chapter.

Discriminant analysis, on the other hand, is most useful when you have several continuous independent variables and, as in logistic regression, an outcome or dependent variable that is categorical. The dependent variable can have more than two categories. If so, more than one discriminant function will be generated (number of functions = number of levels of the dependent variable minus 1). For the sake of simplicity, we will limit our discussion to the case of a dichotomous dependent variable. Discriminant analysis is useful when you want to build a predictive model of group membership based on several observed characteristics of each participant. It creates a linear combination of the predictor variables that provides the best discrimination between the groups.
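To make the link between binary logistic regression and linear regression concrete, the following is a minimal Python sketch, not the chapter's SPSS procedure. It assumes a model with one dichotomous and one continuous predictor; the function names and the coefficient values are hypothetical and chosen only for illustration. The predictors are combined linearly, as in linear regression, and the logistic function then converts that sum into a predicted probability of belonging to the group coded 1.

```python
import math


def predicted_probability(intercept, coefficients, predictors):
    """Predicted probability of the group coded 1 under a binary logistic model."""
    # Linear combination of the predictors, just as in linear regression.
    linear_predictor = intercept + sum(
        b * x for b, x in zip(coefficients, predictors)
    )
    # The logistic function maps any real number into the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def classify(probability, cutoff=0.5):
    """Assign a case to group 1 if its probability reaches the cutoff, else group 0."""
    return 1 if probability >= cutoff else 0


# Hypothetical coefficients (assumed for illustration, not estimated from data):
# the first predictor is dichotomous (0/1), the second is continuous.
intercept = -1.0
coefficients = [0.8, 0.05]

case = [1, 4.0]  # one participant's predictor values
p = predicted_probability(intercept, coefficients, case)
print("predicted probability:", p)
print("predicted group:", classify(p))
```

The 0.5 cutoff is a common default rather than a requirement; a different cutoff changes only the classification step, not the predicted probabilities.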
### Conditions of Logistic Regression

Binary logistic regression requires that the dependent or outcome variable be dichotomous and, as with most other statistics, that the outcomes be mutually exclusive; that is, a single case can be represented only once and must be in one group or the other. Logistic regression also requires large samples to be accurate. Some say there should be a minimum of 20 cases per predictor, with a minimum of 60 total cases. These requirements need to be satisfied before doing the statistical analysis with SPSS. As with multiple regression, multicollinearity is a potential source of confusing or misleading results and needs to be assessed (see Problem 8.3).

### Assumptions of Logistic Regression

Logistic regression, unlike multiple regression and discriminant analysis, has very few assumptions, which is one reason this technique has become popular, especially in health-related fields. There are no distributional assumptions; however, observations must be independent, and the independent variables must be linearly related to the logit (the natural log of the odds) of the dependent variable.

### Conditions of Discriminant Analysis

For accurate results, it is important that the sample size of the smallest group (35 in Problem 8.3) exceed the total number of predictor variables in the model (there are four in Problem 8.3, so this requirement is met). Multicollinearity is again an issue with which you need to be concerned. To test for multicollinearity, you can conduct a multiple regression analysis with the variables and request the Collinearity diagnostics (see Problem 8.3).
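The sample-size rules of thumb and the multicollinearity check described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not a replacement for the SPSS output: the function names are hypothetical, the second group size (40) and the predictor scores are made up, and the variance inflation factor (VIF) is computed only for the special case of two predictors, where VIF = 1 / (1 - r²) and r is their Pearson correlation. With more predictors, each VIF comes from regressing one predictor on all the others.

```python
import math


def logistic_sample_ok(n_cases, n_predictors, per_predictor=20, minimum_total=60):
    """Rule of thumb: at least 20 cases per predictor and at least 60 cases overall."""
    required = max(per_predictor * n_predictors, minimum_total)
    return n_cases >= required


def discriminant_sample_ok(group_sizes, n_predictors):
    """The smallest group must have more cases than there are predictors."""
    return min(group_sizes) > n_predictors


def pearson_r(x, y):
    """Pearson correlation of two equal-length lists (neither may be constant)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def vif_two_predictors(x1, x2):
    """VIF for the two-predictor case only: 1 / tolerance, with tolerance = 1 - r^2."""
    tolerance = 1.0 - pearson_r(x1, x2) ** 2
    if tolerance <= 0.0:
        return math.inf  # the predictors are perfectly collinear
    return 1.0 / tolerance


# Smallest group of 35 and four predictors, as in Problem 8.3;
# the second group size (40) is a made-up value for illustration.
print("discriminant sample ok:", discriminant_sample_ok([35, 40], 4))
print("logistic sample ok:", logistic_sample_ok(n_cases=75, n_predictors=4))

# Made-up scores on two predictors, used only to show the VIF calculation.
x1 = [2, 4, 5, 7, 9]
x2 = [1, 3, 6, 6, 10]
print("VIF:", vif_two_predictors(x1, x2))
```

A VIF near 1 indicates that the two predictors share little variance; larger values signal increasing multicollinearity. Any cutoff for "too large" is a convention that varies by source.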