ABSTRACT

Extending the linear regression model to histogram-valued data is not straightforward. Because the space of histograms is only semi-linear, the model parameters cannot be negative, yet a linear relation between histogram-valued variables should be allowed to be either direct or inverse. The Distribution and Symmetric Distribution Model solves this problem and allows distributions to be predicted directly from other distributions. The model uses the Mallows distance to measure the dissimilarity between observed and predicted distributions, and its parameters are obtained by solving a quadratic optimization problem subject to non-negativity constraints on the unknowns. As in classical regression, a goodness-of-fit measure with values between 0 and 1 is derived from the model.
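
As a rough illustration of two ingredients named above, the following Python sketch represents a histogram by its quantile function, computes the Mallows distance as the L2 distance between two quantile functions, and builds the symmetric distribution that stands in for a negative coefficient. It is a minimal sketch, not the model's implementation: the function names (`histogram_quantile`, `mallows_distance`, `symmetric`) are hypothetical, values are assumed to be uniformly spread within each bin, every bin is assumed to have positive probability, and the integral is approximated with a midpoint rule.

```python
from bisect import bisect_right


def histogram_quantile(edges, probs):
    """Quantile function of a histogram.

    Assumption of this sketch: values are uniform inside each bin and
    every bin has strictly positive probability.
    """
    cum = [0.0]
    for p in probs:
        cum.append(cum[-1] + p)

    def q(t):
        # Find the bin whose cumulative-probability range contains t.
        i = min(bisect_right(cum, t) - 1, len(probs) - 1)
        frac = (t - cum[i]) / probs[i]
        return edges[i] + frac * (edges[i + 1] - edges[i])

    return q


def mallows_distance(q1, q2, n=1000):
    """L2 distance between two quantile functions on [0, 1],
    approximated with an n-point midpoint rule."""
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        total += (q1(t) - q2(t)) ** 2
    return (total / n) ** 0.5


def symmetric(q):
    """Quantile function of the symmetric distribution, t -> -q(1 - t).

    Multiplying q by -1 alone would give a decreasing function, which is
    not a valid quantile function; reversing t restores monotonicity.
    """
    return lambda t: -q(1.0 - t)


# Hypothetical example data: X is uniform on [0, 2], Y is uniform on [1, 3].
qx = histogram_quantile([0.0, 1.0, 2.0], [0.5, 0.5])
qy = histogram_quantile([1.0, 2.0, 3.0], [0.5, 0.5])
distance_xy = mallows_distance(qx, qy)
qx_sym = symmetric(qx)
```

Here Y is X shifted by one unit, so `distance_xy` is approximately 1. The function `qx_sym` is non-decreasing and describes a distribution on [-2, 0], which is how an inverse relation can be expressed while all coefficients stay non-negative.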