ABSTRACT

This chapter extends the matrix formulation of classical linear regression, with n ≥ 2 observations in m = 2 unknowns, to the general least squares problem with n ≥ m > 2. The simultaneous equation system is represented by the n-row observation vector and the n-row by m-column design matrix, from which the m-row solution vector is obtained by methods such as Cramer's Rule, Gaussian and Gauss-Jordan elimination, and Cholesky factorization. These matrices also yield elegant estimates of the four basic measures of fit. A fifth measure, the condition number, is based on the ratio of the maximum to minimum eigenvalues of the system and reflects the solution's utility for predicting anything beyond the observations themselves. Methods for improving poorly conditioned solutions include ridge regression, in which the design matrix elements are contaminated with noise, and the generalized inverse based on the eigenvalue and eigenvector decomposition of the system. The latter singular value approach uses the information density and resolution matrices to identify the subsets of observations and unknowns, respectively, that are most linearly independent in the context of the presumed forward model. It requires considerably more computational labor to implement, but both approaches offer significant advantages for improving the utility of a poorly performing solution.
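
The following is a minimal NumPy sketch of the quantities surveyed above, not the chapter's own implementation: an overdetermined system with n > m, its least squares solution via the normal equations and Cholesky factorization, the condition number from the singular values of the design matrix, and the ridge and generalized-inverse alternatives. The variable names, the damping value, and the small synthetic system are assumptions introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 3                      # n observations, m unknowns (n >= m)
A = rng.standard_normal((n, m))  # design matrix (synthetic, for illustration)
b = rng.standard_normal(n)       # observation vector

# Least squares via the normal equations A^T A x = A^T b,
# solved with a Cholesky factorization of A^T A.
ATA, ATb = A.T @ A, A.T @ b
L = np.linalg.cholesky(ATA)
x_ls = np.linalg.solve(L.T, np.linalg.solve(L, ATb))

# Condition number: ratio of largest to smallest singular value of A.
s = np.linalg.svd(A, compute_uv=False)
cond = s.max() / s.min()

# Ridge (damped) solution: add a small positive constant to the diagonal
# of the normal equations; lam is an assumed damping parameter.
lam = 0.1
x_ridge = np.linalg.solve(ATA + lam * np.eye(m), ATb)

# Generalized (Moore-Penrose) inverse via the singular value decomposition.
x_gi = np.linalg.pinv(A) @ b

print(cond)
print(x_ls, x_ridge, x_gi)
```

For a well-conditioned system the three solutions agree closely; as the condition number grows, the least squares solution becomes unstable while the ridge and generalized-inverse solutions trade a small bias for that stability.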