Chapter 12

The Why and How of Nonnegative Matrix Factorization

With Nicolas Gillis

Nicolas Gillis, Department of Mathematics and Operational Research, Faculté Polytechnique, Université de Mons

12.1 Summary
12.2 Introduction
12.3 The Why — NMF Generates Sparse and Meaningful Features
  12.3.1 Image Processing — Facial Feature Extraction
  12.3.2 Text Mining — Topic Recovery and Document Classification
  12.3.3 Hyperspectral Unmixing — Identify Endmembers and Classify Pixels
12.4 The How — Some Algorithms
  12.4.1 Standard NMF Algorithms
    12.4.1.1 First-Order Optimality Conditions
    12.4.1.2 Multiplicative Updates
    12.4.1.3 Alternating Least Squares
    12.4.1.4 Alternating Nonnegative Least Squares
    12.4.1.5 Hierarchical Alternating Least Squares
    12.4.1.6 Comparison
    12.4.1.7 Stopping Criterion
    12.4.1.8 Initialization
  12.4.2 Near-Separable NMF
    12.4.2.1 Self-Dictionary and Sparse Regression Framework
    12.4.2.2 Geometric Algorithms
12.5 Connections with Problems in Mathematics and Computer Science
12.6 Conclusion
Acknowledgments
Bibliography


12.1 Summary

Nonnegative matrix factorization (NMF) has become a widely used tool for

the analysis of high-dimensional data as it automatically extracts sparse and meaningful features from a set of nonnegative data vectors. We first illustrate this property of NMF on three applications, in image processing, text mining, and hyperspectral imaging — this is the why. Then we address the problem of solving NMF, which is NP-hard in general. We review some standard NMF algorithms, and also present a recent subclass of NMF problems, referred to as near-separable NMF, that can be solved efficiently (that is, in polynomial time), even in the presence of noise — this is the how. Finally, we briefly describe some problems in mathematics and computer science closely related to NMF via the nonnegative rank.
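To make the problem concrete before the algorithmic sections, here is a minimal sketch of NMF computed with multiplicative updates, one of the standard algorithms reviewed in Section 12.4.1.2. It factorizes a nonnegative data matrix M as a product WH of two nonnegative factors of small inner dimension r. The dimensions, rank, iteration count, and the small constant guarding against division by zero are illustrative choices, not prescriptions from the chapter; NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic nonnegative data of exact rank r, so a near-exact
# factorization M ~ W H with W >= 0, H >= 0 exists.
m, n, r = 20, 30, 5
M = rng.random((m, r)) @ rng.random((r, n))

# Random positive initialization (initialization strategies are
# discussed in Section 12.4.1.8).
W = rng.random((m, r))
H = rng.random((r, n))
eps = 1e-9  # guard against division by zero

# Multiplicative updates: each step rescales the entries of H and W
# by a nonnegative ratio, so nonnegativity is preserved automatically.
for _ in range(500):
    H *= (W.T @ M) / (W.T @ W @ H + eps)
    W *= (M @ H.T) / (W @ H @ H.T + eps)

rel_err = np.linalg.norm(M - W @ H) / np.linalg.norm(M)
print(rel_err)
```

Because the updates only multiply entries by nonnegative factors, W and H remain entrywise nonnegative throughout, which is exactly the constraint that makes the extracted features sparse and interpretable in the applications of Section 12.3.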