ABSTRACT

Women have consistently earned lower wages than men in US labor markets, although this gap has narrowed in recent decades (Blau, 1998). Understanding the sources of sex differences in wages is vital to determining why the wage gap between men and women persists. Previous research has focused on the impact of the occupational segregation of men and women on the wage gap (e.g. Macpherson and Hirsch, 1995), the effect of industry segregation (e.g. Fields and Wolff, 1995), and, to a lesser extent, the segregation of men and women into different employers (Blau, 1977; Bielby and Baron, 1984; Carrington and Troske, 1998). These studies all find evidence that the wage gap falls considerably after accounting for segregation.

Evidence on the contribution to the wage gap of within-establishment, within-occupation segregation is far harder to find. Indeed, we are not aware of any empirical work on this issue that uses large data sets representative of a wide array of industries. The reason for this is the paucity of data sets containing detailed demographic information for multiple workers in the same establishment. As a result, studies of the effects of establishment and occupation-establishment segregation have used unusual, quite narrow data sets. For example, the best-known study is by Groshen (1991), which uses surveys of wages for a subset of occupations in five specific industries included as part of the Bureau of Labor Statistics (BLS) Industry Wage Surveys (IWS). In earlier work, Blau (1977) used the BLS Area Wage Surveys to provide a decomposition of the sex gap in wages, including evidence on the importance of an individual’s sex within occupation, establishment, and job cell. Her data covered subsets of three broad occupations in three large northeastern cities.

The focus in these studies on a handful of industries or occupations provides
something closer to a set of case studies, with the lack of representativeness limiting their usefulness in assessing the forces at work in generating the sex wage gap in the United States. Our goal in this chapter is to use a much broader and more nationally representative data set to estimate the contributions of sex segregation by industry, occupation, and occupation-establishment cell (job cell) to the sex wage gap. For our analysis, we construct and use an extended version of the Worker-Establishment Characteristics Database (WECD) to decompose the source of male-female wage differentials. Like the WECD, this data set uses the US Census Bureau’s Standard Statistical Establishment List (SSEL) to identify the employers of individuals who responded to the long form of the 1990 Decennial Census. However, whereas the WECD is limited to manufacturing plants, this new data set (the New WECD, or NWECD) includes workers and establishments from all sectors of the economy and all regions.1 Nonetheless, because of the constraints imposed by matching employees to employers, some nonrepresentative characteristics of the data set are unavoidable.

Using the NWECD, we provide new estimates of the role of various dimensions of sex segregation in generating sex differences in wages. Although in some respects our evidence may be viewed as complementary to that in the earlier studies, in our view the NWECD, while having some shortcomings, is clearly better suited to characterizing the effects of sex segregation in US labor markets. Our results indicate that a sizable fraction of the sex gap in wages is accounted for by the segregation of women into lower-paying occupations, industries, establishments, and occupations within establishments. We also find, however, that a very substantial part of the sex gap in wages remains attributable to the individual’s sex.

The data used in this study come from a match between worker records from the 1990 Sample Edited Detail File (SEDF) and establishment records in the 1990 SSEL. The 1990 SEDF consists of all household responses to the 1990 Decennial Census long form. As part of the Decennial Census, one-sixth of all households receive a “long-form” survey, which asks a number of questions about each member of the household (“person questions”) as well as about the housing unit (“housing questions”). Those receiving the long form are asked to identify each employed household member’s (1) occupation, (2) employer location, and (3) employer industry in the previous week. The Census Bureau then assigns occupational, industrial, and geographic codes to long-form responses. Thus, the SEDF contains the standard demographic information for workers collected on the long form of the Decennial Census, along with detailed location information and a three-digit Census industry code for each respondent’s place of work.

The SSEL is an annual list of business establishments maintained by the US
Census Bureau. The SSEL contains detailed location information and a four-digit SIC code for each establishment, along with a unique establishment identifier that is common to other Census Bureau economic surveys and censuses. It also includes information on total payroll expenses, employment, and whether or not the establishment is part of a multi-establishment firm.

We matched workers and establishments using the detailed location and industry
information available in both data sets. We did this because the employer name is not available on both the establishment and the worker records.

Briefly, the first step in the matching process was to keep only establishments
unique to an industry-location cell. Next, all workers indicating that they work in the same industry-location cell as a retained establishment were linked to the establishment. The matched data set is the NWECD. Because the SEDF contains only a sample of workers, and because not all workers are matched, the matched data set includes a sample of workers at each establishment. Complete details of the matching procedure are provided in the Appendix.

In our matched sample, we impose some restrictions on both individuals and
establishments. We include only individuals who report usually working between 30 and 65 hours per week, and 30 or more weeks in the last year (1989). These restrictions on hours and weeks are intended to pick out full-time, full-year workers who are less likely to have changed jobs in the past year, and to exclude those whose hours are so high that they may have held multiple jobs. We make these restrictions for three reasons. First, because the Decennial Census collects data on earnings from all jobs, rather than wages on the current job, we need to try to eliminate variation in wages that stems from multiple job-holding at a point in time or during the previous calendar year.2 Second, because the 1990 Decennial Census asks workers to report the address of the establishment where they worked in the previous week, while the earnings data are for the previous calendar year, job changing may lead to inaccurate measurement of earnings in the matched data. Imposing restrictions that get us closer to full-year, full-time workers should disproportionately eliminate workers who have changed jobs.3 Finally, the IWS data, with which we eventually are interested in drawing some comparisons, cover only full-time workers. We also restrict the sample to workers aged 18-65, with a constructed hourly wage ((annual earnings/weeks worked)/usual hours worked per week) in the range $2.50-$500, and exclude those working in establishments in public administration (in order to restrict our focus to the private sector).

We also require that establishments have total employment of at least twenty-five, for two reasons: first, when we compared average establishment-level worker earnings in the matched observations in the SEDF with average payroll expenses in the SSEL, these corresponded much more closely for establishments with twenty-five or more workers; second, the IWS industry samples included mainly establishments with twenty-five or more workers. In addition, to ensure that we have a reasonable basis for estimating the characteristics of an establishment’s workforce, we required that the number of matched workers be at least 5 percent of employment as reported in the SSEL. Finally, we eliminated the less than 0.1 percent of establishments that reported earnings exceeding $600,000 per worker.

Table 12.1 documents the effects of these various matching rules and exclusion
restrictions on the sample size, the number of matched workers, and average earnings and employment calculated from both the SSEL and SEDF data. We define measures of establishment earnings per worker from data in both the SSEL and the SEDF. For the SSEL, earnings per worker are constructed as Total Annual Payroll/Total Employment. For the SEDF, establishment earnings per worker are created by averaging the annual wages and salaries of all workers matched to the establishment.4 The table shows that 7 percent of establishments can be assigned