ABSTRACT

Panel data refers to data sets consisting of multiple observations on each sampling unit. Such data can be generated by pooling time-series observations across a variety of cross-sectional units, including countries, states, regions, firms, or randomly sampled individuals or households. Two well-known examples in the United States are the Panel Study of Income Dynamics (PSID) and the National Longitudinal Survey (NLS). The PSID began in 1968 with 4802 families, including an oversampling of poor households. Annual interviews were conducted, and socioeconomic characteristics were recorded for each of the families and for roughly 31000 individuals who have been in these or derivative families. More than 5000 variables have been collected. The NLS followed five distinct segments of the labor force. The original samples include 5020 older men, 5225 young men, 5083 mature women, 5159 young women, and 12686 youths. Blacks, Hispanics, poor whites, and military personnel were oversampled in the youth survey. The list of variables collected runs into the thousands. Panel data sets have also been constructed from the U.S. Current Population Survey (CPS), a monthly national household survey conducted by the Census Bureau. The CPS generates the unemployment rate and other labor force statistics. Compared with the NLS and PSID data sets, the CPS contains fewer variables, spans a shorter period, and does not follow movers. However, it covers a much larger sample and is representative of all demographic groups. European panel data sets include the German Socio-Economic Panel, the Swedish study of household market and non-market activities, and the Intomart Dutch panel of households.