ABSTRACT

Principal components analysis (PCA) constructs components that are linear in the original features, which can limit its performance on data that violate this assumption. Generalized low rank models (GLRMs), a generalization of PCA and matrix factorization, have become a popular approach to dimension reduction. While PCA is limited to numeric data, GLRMs can handle mixed numeric, categorical, ordinal, and boolean data with an arbitrary number of missing values. A more sophisticated use of a GLRM is to fit a model whose reduced archetypes are then applied to future, unseen data. When applying a GLRM to unseen data, a regularizer helps reduce overfitting and improves generalization. In this respect GLRMs behave much like supervised models, with several hyperparameters that can be tuned to optimize performance.
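To make the factorization concrete, the following is a minimal sketch (not the paper's implementation) of the simplest GLRM instance: a quadratically regularized low-rank model fit by alternating least squares, where the loss is summed only over observed entries so missing values are handled naturally. The function name `glrm_quadratic` and all parameter choices are illustrative assumptions.

```python
import numpy as np

def glrm_quadratic(A, mask, k, gamma=0.1, iters=50, seed=0):
    """Alternating least squares for a quadratically regularized GLRM:
    minimize  sum over observed (i, j) of (A_ij - x_i . y_j)^2
              + gamma * (||X||_F^2 + ||Y||_F^2),
    where `mask` is True at observed entries and k is the target rank.
    Illustrative sketch only; real GLRM solvers support other losses
    and regularizers (e.g. hinge, huber, L1, non-negativity)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    X = rng.standard_normal((m, k))   # row archetypal representations
    Y = rng.standard_normal((k, n))   # column archetypes
    I = gamma * np.eye(k)
    for _ in range(iters):
        # Update each row of X with Y fixed: a closed-form ridge solve
        # restricted to that row's observed columns.
        for i in range(m):
            obs = mask[i]
            Yo = Y[:, obs]
            X[i] = np.linalg.solve(Yo @ Yo.T + I, Yo @ A[i, obs])
        # Update each column of Y with X fixed, symmetrically.
        for j in range(n):
            obs = mask[:, j]
            Xo = X[obs]
            Y[:, j] = np.linalg.solve(Xo.T @ Xo + I, Xo.T @ A[obs, j])
    return X, Y
```

With `gamma = 0` and no missing entries this reduces to PCA up to a change of basis; the regularization term is what the abstract refers to as the lever for controlling overfitting when the learned archetypes `Y` are reused on unseen rows.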