ABSTRACT

This chapter explores how to perform regression analysis with data from complex sample surveys. It first reviews the traditional model-based approach to regression analysis, as taught in introductory statistics courses. It then discusses a design-based approach to regression, presents methods for calculating standard errors of regression coefficients, and compares the design-based and model-based approaches. Many investigators performing regression analyses on complex survey data simply fit the model with standard software and report the parameter estimates and standard errors given by the software. Many regression textbooks discuss regression estimation using weighted least squares as a remedy for unequal variances. The purpose of a regression analysis, however, often differs from that of an analysis to estimate population means and totals, and survey statisticians have long debated whether the sampling weights are relevant for inference in regression. In linear regression, the response variable is usually considered to be approximately continuous, for example birth weight, income, or leaf area.