ABSTRACT

In symbolic data analysis, numeric distributional symbolic data are also known as numerical modal data. This chapter contributes a principal component analysis (PCA) method for numeric distributional data, whose realizations can be histograms, empirical distributions, or empirical estimates of parametric distributions. Treating numeric distributional data as observed random variables, each with a probability density function, we present an exact probability density function for each principal component using the inversion theorem and define a covariance matrix for numeric distributional data. Based on this variance-covariance structure, we establish a PCA method for distributional data, called DPCA, which also takes into account the link between the covariance matrix and the projection. The effectiveness of the DPCA method is illustrated by a simulated numerical experiment and two real-life cases: the evaluation of journals from the Chinese Science Citation Database and the innate structure of China's stock market.