ABSTRACT

Categories matter a great deal in empirical analysis, because the average level of a numerical variable, or the relations between variables, may differ markedly across categories. For example, rural or urban location affects the consumption and production patterns of households. Similarly, wage and salary earnings may differ between men and women, even for the same level of education and years of experience. Occupational status affects both the health (for example, mortality rates) and the wealth of people. Categorical variables allow us to classify our data into a set of mutually exclusive categories with respect to some qualitative criterion: for example, men/women; rural/urban; occupation; regions, countries or continents; policy regimes. In practice, this type of variable is inevitably discrete in nature inasmuch as we only consider a definite (usually limited) number of categories. For example, the gender variable has only two categories (male/female), while a variable on occupational status usually distinguishes among eight to ten categories. The distinctive nature of these variables, therefore, is that they do not measure anything, but assign a quality to our data (i.e. they are qualitative, not quantitative). Hence, we cannot compute an average for this type of variable, but we can count how many observations in our data set fall in each group defined by a qualitative categorical variable (its frequency). This chapter deals with ways in which these variables can be employed in empirical analysis to look deeper into the structure of our data. In particular, it shows that the use of categorical variables helps us to guard against making unwarranted generalisations based on the assumption that homogeneity prevails when, in fact, we are lumping together things which should be kept separate. We have already come across categorical variables in Chapter 6, when we introduced dummy variables. In this chapter, we take this analysis further and look at the categories behind the dummies.
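
As a minimal illustration of the two points made above (a categorical variable can be counted but not averaged, and a pooled average can hide differences between categories), consider the following sketch. The data are invented purely for illustration, and the names `households`, `frequencies`, `pooled_mean` and `category_means` are our own assumptions, not drawn from any data set used in this book.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical records: (location, monthly household consumption).
# The figures are made up for illustration only.
households = [
    ("rural", 100.0),
    ("rural", 120.0),
    ("rural", 110.0),
    ("urban", 200.0),
    ("urban", 220.0),
]

# The categorical variable itself has no meaningful average,
# but we can count how many observations fall in each category.
frequencies = Counter(location for location, _ in households)

# Pooled average of the numerical variable: this implicitly
# assumes that all households form one homogeneous group.
pooled_mean = mean(value for _, value in households)

# Average of the numerical variable within each category.
by_category = defaultdict(list)
for location, value in households:
    by_category[location].append(value)
category_means = {
    location: mean(values) for location, values in by_category.items()
}

print(frequencies)
print(pooled_mean)
print(category_means)
```

In this invented example the pooled average lies between the two category averages and describes neither group well, which is the kind of unwarranted generalisation that breaking the data down by category is meant to expose.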