ABSTRACT

The ordinary least squares (OLS) model is the best-known method for modeling a causal relationship between a dependent variable and a set of independent variables. This chapter presents an OLS model for data described by numeric distributional variables. The model is formulated at the population level because well-grounded and accepted inferential procedures for distributional variables are not yet available. It is based on a particular decomposition of the 2-norm Wasserstein distance, which leads to a model with two components. These components are related, respectively, to the internal (within-data) and the between-data variability inherent in distributional-valued data. An application to a climatic dataset illustrates the usefulness of the approach. An analysis of the residuals supports the application and suggests directions for further development.