ABSTRACT

Anyone who has taught survey construction or attempted to build a rating scale has encountered the numerous choices available for developing response options, along with a bewildering range of opinions about which are best. Often a simple ‘yes/no’ suffices; sometimes a frequency scale (‘never/sometimes/often/always’) is called for; and, almost certainly, a Likert-type scale (‘strongly disagree’ to ‘strongly agree’) is considered, if not automatically chosen as the default. What many fail to consider is the critical role that the response options, both the number of categories and their definitions, play in the quality of the data collected. Any ambiguity in the questions and scales attenuates reliability, which in turn limits validity. We must keep validity at the forefront of our minds at every stage of the test-construction process, and one key place where it is often overlooked is rating scale construction. Validity arguments should be addressed throughout the entire process, beginning with the logic and theory behind the construction of the survey items. Remember that validity, unlike reliability, is not established by reference to a single statistic. Validity is an argument that entails judgment about meaning, and meaning stems from theory: from the type of questions we ask, the way we ask them, of whom we ask them, and how we summarize and interpret the data. Thus, how rating scales are constructed, from the very beginning, matters greatly. Rating scale construction should therefore be viewed as the researcher’s one chance to communicate with respondents as they use the response restrictions we impose to record their attitudes, behaviors, or achievements on the construct of interest.