ABSTRACT

Regression model building, as we have seen, can be straightforward, but it can

also be tricky. Many times, if the researcher knows which variables are

important and of interest, little effort is needed. However, when a researcher

is exploring new areas or consulting for others, this is often not the case. In

these situations, it can be valuable to collect a wide range of data on variables

thought to influence the outcome of the dependent variable, y. The entire process may be viewed as

1. Identifying independent predictor x_i variables of interest

2. Collecting measurements on those x_i variables related to the observed measurements of the y_i values

3. Selecting significant x_i variables by statistical procedures, in terms of increasing SSR and decreasing SSE

4. With the selected variables, validating the conditions under which the model is adequate
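
The selection step above, judging a predictor by how much it increases SSR and decreases SSE, can be sketched in a few lines of Python. This is a minimal illustration only: the data are made up, the helper names (solve, fit_sums_of_squares) are hypothetical, and the partial F ratio shown is one common way to judge the added predictor, not necessarily the procedure used later in the text.

```python
# Illustrative sketch: how SSR and SSE change when a predictor is added.
# All data and helper names are hypothetical, chosen only for demonstration.

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_sums_of_squares(columns, y):
    """Least-squares fit of y on an intercept plus the given predictor columns.

    Returns (SSR, SSE) for the fitted model.
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(X[i][j] * X[i][k] for i in range(n)) for k in range(p)]
           for j in range(p)]
    xty = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
    b = solve(xtx, xty)
    fitted = [sum(bj * xij for bj, xij in zip(b, row)) for row in X]
    y_bar = sum(y) / n
    ssr = sum((f - y_bar) ** 2 for f in fitted)           # regression sum of squares
    sse = sum((yi - f) ** 2 for yi, f in zip(y, fitted))  # error sum of squares
    return ssr, sse

# Made-up measurements for two candidate predictors and one response.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]
y = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9, 15.2, 16.8]

y_mean = sum(y) / len(y)
sst = sum((yi - y_mean) ** 2 for yi in y)        # total sum of squares

ssr_1, sse_1 = fit_sums_of_squares([x1], y)      # model with x1 only
ssr_12, sse_12 = fit_sums_of_squares([x1, x2], y)  # model with x1 and x2

extra_ss = ssr_12 - ssr_1      # gain in SSR from adding x2 (equals the drop in SSE)
df_error = len(y) - 3          # n minus the 3 fitted coefficients
partial_f = extra_ss / (sse_12 / df_error)  # partial F ratio for x2 given x1

print(f"x1 only : SSR={ssr_1:.4f}  SSE={sse_1:.4f}")
print(f"x1 + x2 : SSR={ssr_12:.4f}  SSE={sse_12:.4f}")
print(f"extra SS for x2 given x1 = {extra_ss:.4f}, partial F = {partial_f:.2f}")
```

Because SST = SSR + SSE is fixed for a given set of y values, any gain in SSR from an added predictor is matched by an equal drop in SSE. The question in the selection step is whether that gain is large relative to the remaining error.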

It is not uncommon for researchers to collect data on more variables than are

practical for use in regression analysis. For example, in a laundry detergent

validation study for which the author recently consulted, two methods were

used, one for top-loading machines and another for front-loading machines.

The main difference between the machines was water volume. Several

microorganism species were used in the study, against three concentrations of an

antimicrobial laundry soap. Testing was conducted by two teams of

technicians at each of six different laboratories over a five-day period. The number

of variables needed to answer the research question, "Do significant differences in

the data exist among the test laboratories?", was extreme.