ABSTRACT

This chapter presents the basic variable and data management techniques needed to turn electronic health record (EHR)-derived data into a polished dataset ready for statistical analysis. It serves as a crash course in core methods and approaches to data management. The chapter is divided into sections covering the manipulation of variables, observations, and the final dataset. Each section is independent of the others, so only certain sections may apply to a given situation; however, the reader is encouraged to become familiar with all the methods covered. The methods are not meant to be an exhaustive treatment of data management, but rather a general overview of common operations that can readily be adapted to the researcher's specific needs. If the data are still in a comma-separated values (CSV) file, they should be imported into the preferred statistical platform at this point using the software's import (or read) function.
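
As an illustration of the import step only, the following is a minimal sketch in Python using the standard-library csv module. The chapter does not prescribe a platform, so the choice of language is an assumption, as are the column names (patient_id, age, sex, systolic_bp), the sample values, and the helper to_int_or_none; none of them comes from a real EHR extract. An in-memory string stands in for the CSV file so that the example is self-contained.

```python
import csv
import io

# Hypothetical EHR extract. In practice this would be a file on disk,
# opened with open(<path>, newline="") instead of io.StringIO.
raw_csv = io.StringIO(
    "patient_id,age,sex,systolic_bp\n"
    "1001,64,F,132\n"
    "1002,71,M,\n"
    "1003,58,F,118\n"
)


def to_int_or_none(value):
    """Convert a CSV field to int, treating an empty field as missing."""
    value = value.strip()
    return int(value) if value else None


# csv.DictReader uses the header row as keys and returns every field as a
# string, so numeric columns are converted explicitly on import.
records = []
for row in csv.DictReader(raw_csv):
    records.append({
        "patient_id": row["patient_id"],  # kept as text: an identifier, not a quantity
        "age": to_int_or_none(row["age"]),
        "sex": row["sex"],
        "systolic_bp": to_int_or_none(row["systolic_bp"]),
    })

print(len(records), "rows imported")
print(records[1])
```

Here empty fields are read in as missing values rather than zeros, which keeps later summaries from being skewed by placeholder numbers. Most statistical platforms offer an equivalent import function with its own conventions for missing data.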