In Chapter 6 we introduced mixture models as a statistical method for clustering. Mixture models allow us to model complex distributions as weighted combinations of simpler ones. This is particularly useful for clustering, as data that we might wish to cluster is likely to have multiple modes, and it is natural to imagine each mode (cluster) being modelled by a separate distribution. To introduce inference in mixture models, we used the EM algorithm, which produces point estimates of each component's parameters, together with the component membership probabilities of each data point.
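As a reminder of what EM returns, the following is a minimal sketch for a one-dimensional mixture of Gaussians. It is not the Chapter 6 implementation: the function name `em_gmm_1d`, the quantile-based initialisation, the fixed iteration count and the synthetic two-cluster data are all assumptions made for illustration. The outputs correspond to the two quantities described above: point estimates of each component's parameters (`pi`, `mu`, `var`) and a matrix `q` of membership probabilities with one row per data point.

```python
import numpy as np

def em_gmm_1d(x, k=2, n_iter=100):
    """Fit a k-component 1-D Gaussian mixture by EM (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    # Assumed initialisation: means at evenly spaced quantiles of the data,
    # a shared variance and equal mixing proportions.
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)
    var = np.full(k, x.var())
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: membership probability q[n, j] of point n for component j,
        # computed in log space for numerical stability.
        log_p = (np.log(pi) - 0.5 * np.log(2 * np.pi * var)
                 - 0.5 * (x[:, None] - mu) ** 2 / var)
        log_p -= log_p.max(axis=1, keepdims=True)
        q = np.exp(log_p)
        q /= q.sum(axis=1, keepdims=True)
        # M-step: point estimates of each component's parameters,
        # weighted by the membership probabilities.
        nk = q.sum(axis=0)
        pi = nk / len(x)
        mu = (q * x[:, None]).sum(axis=0) / nk
        var = (q * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-9
    return pi, mu, var, q

# Synthetic data with two well-separated modes (hypothetical example).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-3.0, 1.0, 200), rng.normal(3.0, 1.0, 200)])
pi, mu, var, q = em_gmm_1d(x)
```

Each row of `q` sums to one, so a hard cluster assignment, if one is wanted, can be read off with `q.argmax(axis=1)`. Note that `pi`, `mu` and `var` are single values with no accompanying measure of uncertainty, which is what is meant by point estimates.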