ABSTRACT

This chapter first summarizes the types of reliability used with continuous interval (scale) data collected from self-administered instruments (and other types of measurement instruments), based on classical test theory (CTT) and item response theory (IRT). Next, it reviews CTT and IRT methods for analyzing item responses, with the item as the unit of analysis and interpretation, along with methods for assessing consistency of response to categorical items. Finally, for cases in which ratings are generated by observers rather than by the participants themselves, methods of computing the consistency of ratings across raters are explained. Reliability is consistency of response across persons or items, and it is important primarily because higher reliability gives us greater faith in the validity of our results. We choose the type of reliability of most interest based on the purposes of the survey. Regardless of which of the four types of reliability is used in a research study, the standard error of measurement can be computed and reported. When the consistency of ratings provided by different raters is to be determined, percent agreement or Cohen’s kappa can be computed for categorical variables, and the Pearson correlation coefficient and the intraclass correlation coefficient (ICC) can be computed for interval data.