ABSTRACT

This chapter describes the general structure of data for statistical models. It introduces the data frame, a class of objects that represents the data typically encountered in fitting models. Statistical models nearly always treat the underlying observational data as organized by variables, which are statistical abstractions for the different things that can be observed. Most models proceed by creating a numeric matrix that describes all the terms included in the model. These objects belong to the "model.matrix" class and can be generated by calling model.matrix(); they encode every term in a model appropriately, producing a numeric matrix suitable for fitting. Plots and numerical summaries also play a critical role in statistical modeling. Numerical summaries provide an incisive, although quite limited, quantification of aspects of the data, such as the variation of measurements of a single variable or the degree of correlation between measurements of different variables.