ABSTRACT

Chapter 17 discusses the construction of psychophysical scales, which include measures derived from pair comparison and rank order scaling. The chapter points out that, while valuable in and of themselves, these methods serve as a foundation for today’s more common techniques of unidimensional and multidimensional scaling. Stimulus scaling is shown to bring the measurement logic of physics to the social laboratory, where, unlike in physics, no obvious physical metrics exist for the qualities being measured (e.g., beauty, taste, and preference). The chapter discusses how methods of scaling stimuli represent the combined judgments of large groups of respondents with high accuracy. It emphasizes the distinction between stimulus scaling and the more usual scales of individual differences: stimulus scaling is concerned with judgments of different stimuli, not with the persons who made the judgments. Methods of pair comparison and rank order scaling, and the strengths of each, are presented. The process of combining respondents’ judgments to provide an interval scale is explained. The final section of the chapter discusses the role of stimulus scaling in the development of multidimensional scaling.

Developing and using scales of high utility, generality, and psychometric quality (i.e., of high reliability and validity) is a common part of the social researcher’s job description. As discussed in Chapter 16, questionnaires and rating scales are constructed to measure differences among individuals. For example, a multiple-item scale of attitudes toward nuclear energy has as its purpose the classification of individuals along a continuum, so that those with the highest scores are defined as most in favor of adopting nuclear energy, and those with the lowest scores as least in favor. Those scoring in the middle are considered intermediate in their attitudes toward this energy source. In this form of scaling, the items are used to compute a summary score that places each person at some point on the scale. All items are designed to tap the same construct; that is, the items are assumed to differ only in terms of measurement error. So, in our attitudes toward nuclear energy scale, we assume that the questions (items) we pose to respondents, which are used to create the summary score, all tap the same underlying construct (in this case, attitudes toward nuclear energy). Measures of this variety are called tests of differences among individuals, or more commonly, tests of individual differences. Individual difference assessments are the most common form of measure used in contemporary social research.
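
To make the logic of an individual differences scale concrete, the following is a minimal sketch of how a summary score might be computed and used to order respondents. The respondent labels, the four items, the 1-to-5 response format, and the use of a simple sum are all assumptions made for illustration; they are not taken from an actual nuclear energy attitude scale.

```python
# Hypothetical data: each respondent answers four items on a 1-5 scale,
# where higher values indicate a more favorable attitude toward nuclear energy.
responses = {
    "respondent_a": [5, 4, 5, 4],
    "respondent_b": [2, 1, 2, 2],
    "respondent_c": [3, 3, 4, 3],
}


def summary_score(item_responses):
    """Sum the item responses into a single scale score (assumed scoring rule)."""
    return sum(item_responses)


# One summary score per person; the items themselves are not of interest here.
scores = {person: summary_score(items) for person, items in responses.items()}

# Arrange respondents along the continuum, from least to most favorable.
ranking = sorted(scores, key=scores.get)

print(scores)
print(ranking)
```

Note that the output describes the persons, not the items: the items serve only as interchangeable indicators of the same construct, and the scores locate each respondent along the continuum.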