ABSTRACT

This chapter discusses strategies and tools for characterizing the relationships between variables. It first describes dummy variables, which are 0/1 binary variables that indicate a yes/no feature or one category of a categorical factor. The chapter emphasizes care in defining the proper reference group when a factor has multiple categories, and in interpreting the coefficient estimates on dummy variables relative to that reference group. The chapter then presents several methods of estimating non-linear relationships, including interactions of variables, spline functions, quadratic (polynomial) functions, and logarithmic functions. Finally, the chapter describes weighted regression models (Weighted Least Squares), in which observations that represent a larger part of a population are given correspondingly greater weight in the model.