ABSTRACT

This chapter is primarily a refresher on the biostatistical techniques most applicable to research based on electronic health records (EHRs). It begins by introducing an analytic dataset that serves as a practical example of the methods presented in this chapter and the next. The analysis starts by describing the variables in this dataset and checking assumptions implicit in hypothesis testing, such as how the data are distributed. The chapter then presents biostatistical techniques for quantifying crude, unadjusted bivariable associations, including the t-test, analysis of variance, rank-sum test, and chi-squared test. Following this, the basics of regression are described, focusing on four types of models: linear, logistic, Poisson, and Cox proportional hazards (survival analysis). The chapter concludes with a discussion of sensitivity analyses and missing data methods as they apply to EHR data. Throughout the chapter, the theory is illustrated with examples drawn from the analytic dataset.