ABSTRACT

Consider the linear regression model

$$
Y_i = x_i'\beta + U_i, \qquad i = 1, \dots, n, \tag{4.1}
$$

with observations $Y_1, \dots, Y_n$ and an unknown, unobservable parameter $\beta \in \mathbb{R}^p$, where $x_i \in \mathbb{R}^p$, $i = 1, \dots, n$, are either given deterministic vectors or observable random vectors (regressors), and $U_1, \dots, U_n$ are independent errors with a joint distribution function $F$. Often we consider the model in which the first component $\beta_1$ of $\beta$ is an intercept; that is, $x_{i1} = 1$, $i = 1, \dots, n$. The distribution function $F$ is generally unknown; we only assume that it belongs to some family $\mathcal{F}$ of distribution functions. Denoting

$$
Y = (Y_1, \dots, Y_n)', \qquad
X = X_n = \begin{pmatrix} x_1' \\ \vdots \\ x_n' \end{pmatrix}, \qquad
U = (U_1, \dots, U_n)',
$$

we can rewrite (4.1) in the matrix form

$$
Y = X\beta + U. \tag{4.2}
$$
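As a concrete illustration of the model (4.1)–(4.2), the following Python sketch simulates one data set; the sample size, the dimension, the value of $\beta$, and the choice of a standard normal $F$ are illustrative assumptions, not part of the model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions and "true" parameter (assumptions for this example only).
n, p = 100, 3
beta = np.array([1.0, 2.0, -0.5])

# Design matrix with an intercept column (x_{i1} = 1) and random regressors.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])

# Independent errors U_1, ..., U_n; a standard normal F is chosen for illustration.
U = rng.normal(size=n)

# The model Y = X beta + U of (4.2).
Y = X @ beta + U
```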

The most popular estimator of $\beta$ is the classical least squares estimator (LSE) $\widehat{\beta}$. If $X$ has rank $p$, then $\widehat{\beta}$ is given by

$$
\widehat{\beta} = (X'X)^{-1}X'Y. \tag{4.3}
$$

By the Gauss–Markov theorem, $\widehat{\beta}$ is the best linear unbiased estimator of $\beta$, provided the errors $U_1, \dots, U_n$ have a finite second moment. Moreover, $\widehat{\beta}$ is the maximum likelihood estimator of $\beta$ if $U_1, \dots, U_n$ are normally distributed.
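The formula (4.3) translates directly into code. The sketch below, using the same kind of simulated data as above (the data-generating choices are again assumptions made only for the example), computes the LSE from the normal equations and checks it against a standard least squares solver.

```python
import numpy as np

# Simulated data as in the sketch after (4.2); choices are illustrative assumptions.
rng = np.random.default_rng(1)
n, p = 100, 3
beta = np.array([1.0, 2.0, -0.5])
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y = X @ beta + rng.normal(size=n)

# LSE from the normal equations: beta_hat solves (X'X) beta_hat = X'Y,
# i.e. beta_hat = (X'X)^{-1} X'Y as in (4.3), valid when X has rank p.
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# Numerically more stable equivalent using a least squares solver.
beta_hat_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)

print(beta_hat)                               # close to the true beta
print(np.allclose(beta_hat, beta_hat_lstsq))  # True: both compute the LSE
```

Forming $(X'X)^{-1}$ explicitly only mirrors (4.3); in practice, solving the normal equations or using a QR-based least squares routine is preferable numerically.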