Statistical modeling | 10 | Algebraic Statistics

ABSTRACT

This chapter is devoted to the statistical applications of the polynomial encoding of probabilities on a sample space D described as a zero-dimensional variety. We bring together the results on algebraic modeling of Chapter 2 with the treatment of probability theory of Chapter 5 to discuss statistical modeling and analysis. It will appear that the polynomial algebraic description is well suited to describing operations on statistics (random variables) and on the algebra of parameters, especially in particular cases such as the lattice case. We must underline the distinction between the two applications, namely computation with random variables and computation with parameters, the latter being of quite a different complexity. In the first case we work on ideals of points, while in the second case we have to deal with generic algebraic varieties. The computational load in the second case is much higher, and most of the general algorithms available at the time of writing this book cannot handle the symbolic solution of larger problems, a typical case being the explicit symbolic solution of the maximum likelihood equations. The solution we suggest consists in adopting a hybrid approach: use the
algebraic approach where it works very well, for example in the description of the structure of models based on conditional independence assumptions, and switch to a numerical approach when the symbolic solution is computationally infeasible, as in the solution of some maximum likelihood equations. Many research fields that use a computational commutative algebra approach meet the same problem nowadays, and there is an important ongoing research effort to develop efficient hybrid algorithms. As far as this book is concerned, we restrict ourselves to these generic remarks and will not discuss the matter further.

After a description of how all basic problems of modeling and estimation are encoded in a polynomial way, we make, in Section 6.2, a systematic review of how basic results on statistical modeling and estimation are translated into our framework.

We close the present chapter (and the book) with a long, detailed example (see 6.9 below) on the treatment of a special graphical model (see Figure 6.1), which leads us to a brief glimpse of toric ideals. This example provides the evidence for our overall conclusion as discussed above. Namely,

is most computational method where the theory of designs is concerned. It supplies interesting, albeit mainly conceptual, tools for the study of statistical models. The effectiveness of the algebraic methods in statistical models relies either on special cases, such as the lattice case treated here, or on a drastic improvement of the efficiency of computer algebra algorithms.

In the final Section 6.10, we make a number of concise statements pertaining to the structure of maximum likelihood equations in the case of a lattice sample space.

In this chapter, D is a design with N points, τ a term-ordering and Est = {x^α : α ∈ L} the corresponding saturated set of identifiable terms. With L′ or M we indicate a subset of L, L_0 = L \ {0}, M_0 = M \ {0} and L′_0 ⊆ L \ {0}.
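To make the notation concrete, the following is a minimal sketch of how the exponent set L of Est might be computed for a small design D. It rests on assumptions not stated in the text: τ is taken to be the graded lexicographic ordering, and Est is obtained by scanning monomials in increasing τ-order and keeping those whose vectors of values on D are linearly independent of the vectors of all smaller monomials. The function names (`evaluate`, `reduce_against`, `identifiable_terms`) are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import product


def evaluate(alpha, point):
    """Value of the monomial x^alpha at one design point (exact arithmetic)."""
    value = Fraction(1)
    for exponent, coordinate in zip(alpha, point):
        value *= Fraction(coordinate) ** exponent
    return value


def reduce_against(vector, basis):
    """Subtract multiples of the stored rows so that the result is zero
    at every stored pivot position; a zero result means 'in the span'."""
    vector = list(vector)
    for pivot, row in basis:
        if vector[pivot] != 0:
            factor = vector[pivot] / row[pivot]
            vector = [v - factor * r for v, r in zip(vector, row)]
    return vector


def identifiable_terms(design):
    """Exponents alpha of the monomials x^alpha kept by the greedy scan.

    Assumption: tau is graded lex, encoded by the sort key (degree, alpha).
    """
    n_points = len(design)
    n_vars = len(design[0])
    max_degree = n_points - 1  # an order ideal with N elements has degree < N

    candidates = sorted(
        (a for a in product(range(max_degree + 1), repeat=n_vars)
         if sum(a) <= max_degree),
        key=lambda a: (sum(a), a),
    )

    basis = []  # list of (pivot index, reduced evaluation vector)
    kept = []   # exponents of the monomials retained so far
    for alpha in candidates:
        values = [evaluate(alpha, p) for p in design]
        remainder = reduce_against(values, basis)
        pivot = next((i for i, v in enumerate(remainder) if v != 0), None)
        if pivot is not None:
            basis.append((pivot, remainder))
            kept.append(alpha)
        if len(kept) == n_points:
            break
    return kept


# Three-point design in two factors: exponents of 1, x2, x1.
print(identifiable_terms([(0, 0), (1, 0), (0, 1)]))
# → [(0, 0), (0, 1), (1, 0)]
```

For the full 2 × 2 factorial design with levels 0 and 1, the same scan keeps the exponents of 1, x2, x1 and x1·x2, so the number of retained terms equals N in both cases. Exact rational arithmetic is used so that the linear independence test is not affected by rounding.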