ABSTRACT

This chapter covers three aspects of linear algebra that are important for data science: the storage of data in matrices, matrix decompositions, and eigenproblems. Linear algebra techniques and perspectives are fundamental to many data science methods. The chapter discusses how matrices and vectors are used to store data in data science contexts; most programming languages support declaring a variable to be a scalar, a one-dimensional array, or a two-dimensional array. It also presents several standard ways of factoring matrices that occur in applications, including but by no means limited to data science. It is a commonplace among people working with data that much more time is spent on managing the data than on analyzing it, and this has certainly been the author's experience. When assigning projects related to this material in general computational science and computational data science courses, the author's first instruction to the students is to assemble a dataset.