Multivariate Data and Their Analysis | 13 | A Whistle-Stop Tour of Sta

ABSTRACT

Multivariate data: Data resulting from the measurement of the values of several random variables on each individual in a sample, leading to a vector-valued or multidimensional observation for each. An example of a multivariate data set is shown in Table 10.1.

The majority of data sets collected by researchers in all disciplines are multivariate. Although in some cases it may make sense to isolate each variable and study it separately, in the main it does not. Because the whole set of variables is measured on each sample member, the variables will be related to a greater or lesser degree. Consequently, if each variable is analyzed in isolation, the full structure of the data may not be revealed. With the great majority of multivariate data sets, all the variables need to be examined simultaneously in order to uncover the patterns and key features in the data; hence the need for techniques that allow a multivariate analysis of the data.

The multivariate data matrix: An n × p matrix X used to represent a set of multivariate data for n individuals, each having p variable values:

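\[
X =
\begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{np}
\end{pmatrix}
\]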

where xij represents the value of the jth variable for the ith individual.
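
As a rough illustration of this layout, the sketch below stores a small data matrix as a list of rows in plain Python. The data values, the variable meanings, and the helper names `element`, `observation`, and `variable` are all hypothetical and chosen only for this example. The helpers take the 1-based indices used in the text, whereas Python lists are 0-based.

```python
# Hypothetical data: n = 4 individuals, p = 3 variables
# (say height in cm, weight in kg, age in years); the values are made up.
X = [
    [172.0, 68.5, 34.0],
    [165.0, 59.0, 41.0],
    [180.0, 82.3, 29.0],
    [158.0, 51.2, 52.0],
]

n = len(X)      # number of individuals (rows of X)
p = len(X[0])   # number of variables (columns of X)


def element(X, i, j):
    """Return x_ij: the value of variable j for individual i (1-based indices)."""
    return X[i - 1][j - 1]


def observation(X, i):
    """Return the p-dimensional observation (row i) for individual i."""
    return list(X[i - 1])


def variable(X, j):
    """Return the n recorded values of variable j (column j)."""
    return [row[j - 1] for row in X]


print(n, p)               # dimensions of the data matrix
print(element(X, 2, 3))   # third variable for the second individual
print(observation(X, 3))  # the full observation vector for individual 3
print(variable(X, 2))     # all values of the second variable
```

Each row is one individual's vector-valued observation, and each column collects one variable across the whole sample, which is why a single-variable analysis only ever looks at one column at a time.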

Mean vectors: For p variables, the population mean vector is μ′ = [μ1, μ2, …, μp], where μi = E(xi). An estimate of μ′, based on n p-dimensional observations, is