Creating Computerized Adaptive Tests of Music Aptitude: Problems, Solutions, and Future Directions
Standardized listening tests for assessing music aptitude have been available for almost as long as standardized written tests of verbal and quantitative skills. Carl Seashore published his classic Measures of Musical Talent in 1919, and new tests have continued to emerge with each passing decade (see, e.g., Bentley, 1966; Davies, 1970; Drake, 1957; Gordon, 1965, 1979, 1982, 1989; Karma, 1985; Kwalwasser & Dykema, 1939; Long, 1965; Seashore, Lewis, & Saetveit, 1960; Stankov & Horn, 1980; Webster, 1992; Wing, 1948, 1961, 1970). One important lesson we have learned from such tests over the years is that it is very difficult to measure music listening skill both accurately and efficiently. Several factors account for this problem.

One factor is an inherent limitation of any fixed-item test. To measure a broad range of skill accurately with such a test, there must be items targeted at all relevant ability levels. To accomplish this, the test must be reasonably long and therefore time consuming. The problem of test length is exacerbated further in many music aptitude tests because items frequently offer only two response alternatives (e.g., whether two musical excerpts are the same or different). Because the probability of guessing the correct answer is high for items with only two choices, such tests have to be longer than tests whose items offer more alternatives to achieve comparable levels of reliability.
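The reliability cost of two-alternative items can be illustrated with a small simulation (a hypothetical sketch, not part of the original text): examinee abilities and item responses are generated under a simple model with a guessing floor of 1/k for k response alternatives, as in the three-parameter logistic item response model, and parallel-forms reliability is estimated by correlating two independent administrations of the same test.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_reliability(n_items, n_choices, n_examinees=5000, seed=0):
    """Estimate parallel-forms reliability of a guessable test.

    Each simulated examinee has ability theta ~ N(0, 1). Each item is
    answered correctly with probability c + (1 - c) * logistic(theta),
    where c = 1 / n_choices is the chance of a lucky guess (a simple
    guessing floor; the model and parameters are illustrative only).
    """
    rng = random.Random(seed)
    c = 1.0 / n_choices
    form_a, form_b = [], []
    for _ in range(n_examinees):
        theta = rng.gauss(0, 1)
        p = c + (1 - c) / (1 + math.exp(-theta))
        # Two independent administrations ("parallel forms") per examinee.
        form_a.append(sum(rng.random() < p for _ in range(n_items)))
        form_b.append(sum(rng.random() < p for _ in range(n_items)))
    return pearson(form_a, form_b)


# Same test length, different numbers of response alternatives: the
# two-alternative (same/different) test shows lower reliability, so it
# would have to be lengthened to match the four-alternative test.
r2 = simulate_reliability(n_items=30, n_choices=2)
r4 = simulate_reliability(n_items=30, n_choices=4)
```

Under this sketch the two-choice test's reliability falls below the four-choice test's because guessing both compresses the range of true scores and adds random error, which is the intuition behind the test-length argument above.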