ABSTRACT

Along with this definition, we contended that the cultural factors that shape the process of thinking in test-taking are so complex that culture should not be treated as a factor to correct or control for, but as a phenomenon intrinsic to tests and testing. We argued that both test developers and test users should examine cultural validity with the same level of rigor and attention they use when they examine other forms of validity. The notion of cultural validity in assessment is consistent with the concept of multicultural validity (Kirkhart, 1995) in the context of program evaluation, which recognizes that cultural factors shape the sensitivity of evaluation instruments and the validity of the conclusions on program effectiveness. It is also consistent with a large body of literature that emphasizes the importance of examining instruction and assessment from a cultural perspective (e.g., Ladson-Billings, 1995; Miller & Stigler, 1987; Roseberry, Warren, & Conant, 1992). Thus, although cultural validity is discussed in this chapter primarily in terms of large-scale assessment, it is applicable to classroom assessment as well. This fact will become more evident as the reader proceeds through the book. Although the notion of cultural validity is conceptually clear, translating it into fair assessment practices is a formidable endeavor whose success is limited by two major challenges. The first challenge stems from the fact that the concept of culture is complex and lends itself to multiple interpretations: each person has their own conception of culture, yet the term is used as though everybody understood the concept in the same way. As a result of this complexity, it is difficult to pinpoint the specific actions that should be taken to properly address culture.
For example, the notion of “cultural responsiveness” or “cultural sensitivity” is often invoked by advocates as critical to attaining fairness (e.g., Gay, 2000; Hood, Hopson, & Frierson, 2005; Tillman, 2002). However, available definitions of cultural sensitivity cannot be readily operationalized into observable characteristics of tests or their process of development. The second challenge has to do with implementation. Test developers take different sorts of actions intended to address different aspects of cultural and linguistic diversity. Indeed, these days it is virtually impossible to imagine a test that has not gone through some kind of internal or external scrutiny intended to address potential cultural or linguistic bias at some point in its development. Yet it is extremely difficult to determine when some of those actions are effective and when they simply address superficial aspects of culture and language or underestimate their complexities. For example, the inclusion of a cultural sensitivity review stage performed by individuals from different ethnic backgrounds is part of current standard practice in the process of test development. While necessary, this strategy may be far from sufficient to properly address cultural issues. There is evidence that teachers of color are more aware than white, mainstream teachers of the potential challenges that test items may pose to students of color; however, in the absence of appropriate training, teachers of color are no more effective than white teachers at identifying and addressing the specific challenges that test items pose regarding culture and language (Nguyen-Le, 2010; Solano-Flores & Gustafson, in press).