ABSTRACT

The increasing power and sophistication of speech recognition and synthesis have encouraged speculation that human factors problems in implementing speech interfaces will diminish dramatically as the technology develops (Zue, Glass, Goddeau, Goodine, Hirschmann, Leung, Phillips, Polifroni, and Seneff, 1991). Advanced speech interfaces are currently being integrated into prototypes of civil and military cockpits (Gerlach and Onken, 1993; Onken, 1995; Turner, 1995; Turner, 1996; Steeneken and Gagnon, 1996). Speech-based interfaces are being introduced to increase the time available for head-up flight and thereby to improve flight performance and safety. The claimed advantage of delivering information via speech is a reduction in the vast quantity of information normally presented on visual displays in the cockpit, releasing the pilot from head-down management of cockpit systems. However, in a simulated multi-task environment, self-reports and performance measures at moderate to high levels of workload have shown that overall performance with speech-based multi-modal interfaces is degraded. In particular, the use of multi-modal interfaces degraded performance on tasks requiring extended processing of information and recall of information from memory.