ABSTRACT

In this chapter, we demonstrate key programming and statistical techniques that statisticians are likely to encounter in daily practice.

One of the most common needs in analytic practice is to repeat an analysis for subgroups within the data. For example, one may need to stratify a linear regression by gender or refit a model for each replicate in a simulation experiment. The basic tools for this kind of replication in base R are the by() function and the apply() family of functions (A.5.2). The syntax for these functions can be complicated, however, and several packages reproduce and extend the functionality they provide. One of these is the dplyr package developed by Hadley Wickham, demonstrated below; another is the doBy package.
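
As a minimal sketch of the base R approach described above, the following fits the same linear regression within each subgroup, first with by() and then with split() and sapply(). The built-in mtcars data, the model mpg ~ wt, the grouping variable cyl, and the helper name fit_coef are illustrative assumptions, not examples taken from the chapter.

```r
# Illustrative helper (hypothetical name): fit mpg ~ wt on a data frame
# and return the estimated coefficients.
fit_coef <- function(d) coef(lm(mpg ~ wt, data = d))

# by() splits the data frame by the grouping variable and applies the
# function to each piece, here once per number of cylinders (4, 6, 8).
coef_by <- by(mtcars, mtcars$cyl, fit_coef)

# split() followed by sapply() performs the same stratified fit and
# collects the results in a matrix with one column per stratum.
coef_apply <- sapply(split(mtcars, mtcars$cyl), fit_coef)

print(coef_by)
print(coef_apply)
```

The two calls give the same estimates; the sapply() form may be more convenient when the per-group results need further processing, since it returns an ordinary matrix.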