ABSTRACT

This chapter illustrates the steps to follow in the regression or partial least-squares analysis of a data set. The discussion starts with preparing the data for analysis, covers the straightforward initial stages of both linear and nonlinear methods, and finally moves on to the more complex tricks of the trade and the pitfalls of the methods. There is no standard protocol for using these methods.[1-3]

One outcome of a quantitative structure-activity relationship (QSAR) analysis is a suggestion of which molecular properties are related to potency; in automated methods this exploration is called variable selection. The first phase of variable selection is complete once the user has prepared the input data, because often no additional properties will be considered. The subsequent statistical analyses determine whether quantitative relationships exist between these input properties and potency and, if so, what form they take. Thus, variable selection is the cornerstone of all QSAR methods, even though it is more often discussed for methods that automatically examine large data sets.