ABSTRACT

The greatest residual risk in today's safe aerospace systems is drift into failure. Drift into failure is a slow, incremental movement of system operations towards the edge of their safety envelope. This movement is driven by pressures of scarcity and competition that subtly influence the many decisions and trade-offs made daily by operators and management hierarchies. The opacity of the complex sociotechnical systems that surround the operation of uncertain technology means that people do not stop the drift (e.g. Perrow, 1984; Vaughan, 1996). Often they do not even see it. Accidents that lie at the end of drift are ‘the effect of a systematic migration of organisational behaviour toward accidents under the influence of pressure towards cost-effectiveness in an aggressive, competitive environment’ (Rasmussen and Svedung, 2000, p. 14). Drift into failure is hard to recognise because it is about normal people doing normal work in (seemingly) normal organisations, not about obvious breakdowns or failures or errors. Drift into failure is scary for all kinds of stakeholders because it reveals how harm can occur in organisations designed to prevent it. Drift into failure is also difficult to model and predict using current approaches in aerospace human factors, which are largely limited to a structuralist vocabulary. Our language of failures is a language of mechanics. We describe accident ‘trajectories’; we seek causes and effects, interactions. We look for ‘initiating failures’, or triggering events, and trace the successive domino-like collapse of the system that follows. This worldview sees sociotechnical systems as machines with parts in a particular arrangement (blunt versus sharp ends, defences layered throughout), with particular interactions (trajectories, domino effects, triggers, initiators), and a mix of independent or intervening variables (blame culture versus safety culture). This is the worldview inherited from Descartes and Newton, the worldview that has successfully driven technological development since the scientific revolution half a millennium ago. The worldview, and the language that accompanies it, is based on particular notions of natural science, and exercises a subtle but very powerful influence on our understanding of sociotechnical success and failure today. Yet this worldview may be lagging behind the sociotechnical developments that have taken place in aerospace, leaving us less than well equipped to understand failure, let alone anticipate or prevent it. This paper looks at a case of drift into failure, and proposes how we may need new kinds of models to capture the workings of, and predict, such drift.

Drifting into failure

The 2000 Alaska Airlines Flight 261 accident is an example of drift. The MD-80 crashed into the Pacific Ocean off California after the trim system in its tail snapped. Prima facie, the accident seems to fit a simple category that has come to dominate recent accident statistics: mechanical failures as a result of poor maintenance. A single component failed because people did not maintain it well. Indeed, there was a catastrophic failure of a single component (a jackscrew-nut assembly). A mechanical failure, in other words. The break instantly rendered the aircraft uncontrollable and sent it plummeting into the Pacific. But such accidents do not happen just because somebody suddenly errs or something suddenly breaks: there is supposed to be too much built-in protection against the effects of single failures. Consistent with the patterns of drift into failure, it was the protective structures, the surrounding organisations (including the regulator), that themselves contributed, in ways inadvertent, unforeseen and hard to detect. The organised social complexity surrounding the technological operation (the maintenance committees, working groups, regulatory interventions and approvals, manufacturer inputs), all intended to protect the system from breakdown, actually helped set its course to the edge of the envelope and across it.