ABSTRACT

In recent years, large-scale cross-section data sets containing several hundred, or even thousands, of observations on the characteristics of individuals, firms and even towns/cities/regions have become available. The availability of this type of data, together with advances in software, has made it possible for practitioners to perform empirical investigations in the fields of social science, economics and marketing. A distinguishing feature of regression models in this type of investigation is that the dependent variable is specified as a qualitative variable, representing the response of an individual, or firm, to a particular question.
For example, in an investigation of the incidence of R&D activities by firms, a firm is either undertaking R&D activities, or not, given its characteristics (e.g. sales, exports). The dependent variable, the R&D status of the firm, is dichotomous, representing the response of the firm to the question: are you undertaking R&D activities? The response is either positive or negative. The dependent variable is therefore a categorical variable, represented by a dummy variable taking two values: if the answer is positive, it takes a value of one, otherwise it takes a value of zero.
Alternatively, in a study of the incidence of private health insurance, using a large cross-section sample of data on the characteristics of individuals, an individual either has private health insurance, or not, given his or her characteristics (e.g. income, occupation, age). The question being asked of each individual is: do you have private health insurance? The response is either positive or negative. If it is positive, the dependent variable, the insurance ownership status of the individual, takes a value of one, otherwise it takes a value of zero. In these examples, the dependent variable is defined on the characteristics of individuals/firms. It is a dichotomous qualitative variable eliciting a ‘yes’ or a ‘no’ response.
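The coding of a dichotomous response as a dummy variable can be illustrated with a minimal sketch. The survey answers below are hypothetical, and the names `responses`, `to_dummy` and `rd_status` are introduced purely for illustration; they do not come from any data set discussed in this chapter.

```python
# Hypothetical answers from five firms to the question:
# "Are you undertaking R&D activities?" (illustrative data only)
responses = ["yes", "no", "yes", "yes", "no"]


def to_dummy(response: str) -> int:
    """Return 1 for a positive ('yes') response and 0 otherwise."""
    return 1 if response.strip().lower() == "yes" else 0


# The dependent variable: R&D status of each firm, coded 1 or 0.
rd_status = [to_dummy(r) for r in responses]

# Because the variable only takes the values 0 and 1, its sample mean
# is the proportion of firms giving a positive response.
share_positive = sum(rd_status) / len(rd_status)

print(rd_status)
print(share_positive)
```

The sample mean of a variable coded in this way equals the proportion of positive responses in the sample.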
Qualitative response models raise a number of interesting problems concerning estimation, interpretation and analysis. In this introductory chapter we deal with binary qualitative dependent variable regression models, as well as multinomial and ordered logit regression models.

Key topics

The linear probability model (LPM)

The logit and probit models

The multinomial logit model

The ordered logit/probit models