ABSTRACT

To perform the same checks on all the data consistently throughout the course of the study, data management groups create a list of checks at the start of the study, often called an edit check specification. (The term data validation procedures is also common.) These specifications typically use a template built in a Microsoft Excel® spreadsheet with one row per check. The data manager uses the study database design (see Chapter 3) to identify the fields and create checks using the edit check template as a guide. Most edit checks will be run automatically by the clinical data management (CDM) or electronic data capture (EDC) system being used for the study. A few checks will be performed manually, and others will be run outside of the data management system. In this chapter, we look at how edit checks are chosen and specified. Chapter 8, “Cleaning Data,” describes how these edit checks are put into practice for a given study. (Author’s note: understanding the process of cleaning data can be helpful in understanding how edit checks are defined. Readers new to clinical data management should read Chapter 8 first as background to this chapter.)

After the database is fully defined (although perhaps not yet built), the data manager will go through every field and determine what assumptions are to be made about data values in that field. Some of these assumptions will be enforced by the database itself and do not have to be defined as an edit check. For example, fields defined as dates will automatically restrict values to valid calendar dates, and coded fields will restrict responses to the values in the codelist for that field. Most other assumptions about the data, such as valid ranges, will be programmed to run within the database or EDC system. Those that are difficult to program within the limits of the system will be run outside the database or EDC system in programs such as SAS®. A few checks require medical knowledge or other human insight and will be performed manually.
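
To make the idea of one row per check concrete, the following is a minimal Python sketch of how a specification might be represented and run. It is an illustration only, not the format of any particular CDM or EDC system: the class EditCheck, the function run_checks, the check IDs, the field names, and the blood pressure limits are all hypothetical. Each row records where the check is meant to run (in the system, in an external program, or manually), and manual checks carry no programmed test.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class EditCheck:
    """One row of a hypothetical edit check specification."""
    check_id: str
    field: str
    description: str
    run_by: str  # "system", "external", or "manual" (assumed labels)
    # Returns True when the record passes; None for manual checks.
    test: Optional[Callable[[dict], bool]] = None


def run_checks(record, checks):
    """Return the IDs of the programmed checks that the record fails."""
    failures = []
    for check in checks:
        if check.test is None:
            continue  # manual checks are reviewed by a person, not run here
        if not check.test(record):
            failures.append(check.check_id)
    return failures


# Illustrative rows only; the ranges are assumptions, not clinical guidance.
checks = [
    EditCheck("VS001", "sysbp", "Systolic BP between 60 and 250 mmHg", "system",
              lambda r: r["sysbp"] is None or 60 <= r["sysbp"] <= 250),
    EditCheck("VS002", "diabp", "Diastolic BP lower than systolic BP", "external",
              lambda r: None in (r["sysbp"], r["diabp"]) or r["diabp"] < r["sysbp"]),
    EditCheck("AE001", "aeterm", "Adverse event term is medically plausible", "manual"),
]

record = {"sysbp": 300, "diabp": 80}
print(run_checks(record, checks))  # ['VS001']
```

In this sketch, missing values pass the range and cross-field checks; whether a missing value should itself raise a discrepancy is a separate decision that would need its own row in the specification.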