ABSTRACT

Program evaluation is not someone's personal opinion about a program, nor casual observation of it, nor even observation based on journalistic or managerial familiarity with it. Defensible observations are those collected in a systematic manner. 'Systematic' means that the process by which the observations are collected and analyzed is both replicable and valid. The four types of validity are internal validity, external validity, measurement validity, and statistical validity. Internal validity refers to the accuracy of causal claims. A relatively straightforward way to increase the reliability of measurement is related to the way to increase statistical validity. Poor measurement reliability and validity threaten the validity of evaluation studies for several reasons. Because of the connection among systematic nonrandom measurement error (NRME), random measurement error (RME) in program variables, and internal invalidity, separating measurement reliability and validity, the topics of this chapter, from internal validity, the topic of the next chapter, is rather artificial.