A Tale of Two (Types Of) Memberships
Mixed membership models such as the Grade of Membership and latent Dirichlet allocation models have primarily focused on the analysis of binary and categorical data. In this chapter, we explore the performance of two types of membership models on continuous data: one with a classic mixed membership structure and one with a partial membership structure. The Bayesian partial membership model was recently proposed by Heller et al. (2008) as a promising alternative to mixed membership, motivated by continuous data. The exponential family form of the Bayesian partial membership model allows for computationally efficient modeling of a variety of data types. Heller et al. (2008) demonstrated a partial membership analysis of a discrete dataset. In this work, our motivating example is a dataset of continuous variables describing NBA (National Basketball Association) players and their playing styles. Although NBA players are typically assigned to one of five player positions, the language used to describe players and playing styles often suggests individual-level mixtures. We compare the exponential family form of the Bayesian partial membership model with the general mixed membership model on simulated binary and continuous data. We then extend the partial membership framework to account for correlated membership scores. Based on the properties of the two types of models and the nature of the NBA data, we argue for choosing a partial membership model over a mixed membership model in this case. We show how NBA players can be modeled as individual-level mixtures using the correlated partial membership model. To our knowledge, this is the first individual-level mixture analysis of continuous data.