Linking User and System Models to Analyse the Causes of Major Accidents

- Summary Report EPSRC Grant GR/L27800 -

Chris Johnson

Department of Computing Science, University of Glasgow, Glasgow G12 8QQ.

johnson@dcs.gla.ac.uk,

http://www.dcs.gla.ac.uk/~johnson

Accident reports often conclude that operator intervention exacerbates the problems created by systems failures. Other reports have described the ways in which human interaction can also mitigate some of the consequences of major failures. It is, therefore, worrying that few techniques can be used to analyse the interaction between systems failures and operator intervention (Snowdon and Johnson, 1999).

Modelling Knowledge-Based Errors

The starting point for my work was to demonstrate that epistemic logics provide a link between the formal methods of systems engineers and the user models that have been developed in cognitive psychology. Fagin, Halpern, Moses, and Vardi (1995) introduce a number of epistemic operators: E_group (read as everyone in the group knows), C_group (it is common knowledge in the group) and D_group (knowledge is distributed amongst the group). These operators were important for my early work because shared assumptions often lie at the heart of major accidents. For example, in one incident the Departures officer did not anticipate any conflict between SAB603 and BAW818 because they believed that their colleagues would divert SAB603 to the left, away from BAW818. "The assumption was that SAB603 would be turned left away from the departure runway and Air Departures, acting on this expectation failed to inform Air Arrivals of the second departure i.e., BAW818" (AAIB, 1998):

E_atc_officers(alter(arrivals, sab_603, left))

The previous clause states that all of the ATC officers know that Air Arrivals will alter the course of SAB603 to the left. The near miss stemmed from the Arrivals Officer's decision to alter the course of SAB603 to the right. This decision went against normal procedure but "Air Arrivals decided to turn SAB603 right because he considered that a left turn would have been a possible confliction to the Midhurst departure (AFR 813) and a right turn would cause less disruption as he was not aware of the Brookmans Park departure (BAW818)" (AAIB, 1998):

Kn_arrivals(conflict(sab603, 27l, afr813)) ^ not Kn_arrivals(conflict(sab603, 27l, baw818)) =>
alter(arrivals, sab_603, right)
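The behaviour of these operators can be illustrated with a short, self-contained sketch. It uses a toy possible-worlds model written in Python; the worlds, propositions and accessibility relations are invented for this illustration rather than taken from the report's formal model, and the example is deliberately set up so that the shared-knowledge condition fails, mirroring the mismatch between Air Departures and Air Arrivals described above.

# Minimal possible-worlds (Kripke) sketch of the K and E operators used above.
# The worlds, propositions and accessibility relations are illustrative
# assumptions, not the model constructed in the original analysis.

worlds = {
    "w1": {"alter(arrivals, sab603, left)": True},
    "w2": {"alter(arrivals, sab603, left)": False},
}

# Worlds that each agent considers possible from the actual world w1.
access = {
    "departures": {"w1": {"w1"}},        # Departures takes the left turn for granted
    "arrivals":   {"w1": {"w1", "w2"}},  # Arrivals cannot rule out the alternative
}

def holds(world, prop):
    return worlds[world].get(prop, False)

def K(agent, prop, world):
    # K_agent(prop): prop holds in every world the agent considers possible.
    return all(holds(w, prop) for w in access[agent][world])

def E(agents, prop, world):
    # E_group(prop): every agent in the group knows prop.
    return all(K(agent, prop, world) for agent in agents)

print(E(["departures", "arrivals"], "alter(arrivals, sab603, left)", "w1"))  # False

Because Air Arrivals cannot rule out the world in which SAB603 is not turned left, the E operator over the two officers evaluates to false: the assumption was not, in fact, shared.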

The interpretation of the epistemic operators relies upon Kripke semantics. Further details are provided in (Johnson, 1997, 1999). I have used the formal underpinnings of this existing notation to construct a number of proofs, for example to show that accident reports often contain contradictions about what operators did and did not know about the systems that they were operating (Johnson, 2000); a simple illustration of this kind of consistency check is sketched below. Although epistemic notations provide powerful support for analysing these knowledge-based errors, they provide little or no support for analysing the physiological and perceptual factors that contribute to operator intervention in many major accidents. I, therefore, began to focus on Barnard's Interacting Cognitive Subsystems (ICS) model. ICS has a diagrammatic form that can be converted back into an underlying textual representation of a logic-based model (Duke, Barnard, Duce and May, 1995). I now had to show how the ICS technique, which had deliberately been developed to model normal or expert performance, could be used to model the human contribution to accidents.
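The consistency check mentioned above can be made concrete with a minimal sketch: given a set of knowledge assertions extracted from a report, it flags any agent who is asserted both to know and not to know the same proposition. The assertions below are invented for illustration and are not drawn from any particular report.

# Illustrative check for contradictory knowledge assertions extracted from a
# report. The assertions below are invented examples.

assertions = [
    ("knows",         "arrivals", "conflict(sab603, 27l, afr813)"),
    ("does_not_know", "arrivals", "conflict(sab603, 27l, baw818)"),
    ("knows",         "arrivals", "conflict(sab603, 27l, baw818)"),  # clashes with the line above
]

def contradictions(statements):
    # Return (agent, proposition) pairs asserted both as known and as not known.
    known = {(agent, prop) for kind, agent, prop in statements if kind == "knows"}
    unknown = {(agent, prop) for kind, agent, prop in statements if kind == "does_not_know"}
    return known & unknown

print(contradictions(assertions))  # {('arrivals', 'conflict(sab603, 27l, baw818)')}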

Modelling Skill-Based Errors

My initial modelling had been largely driven by inferences about the cognitive influences that led to the operator behaviours described in accident reports. I had no means of directly accessing personnel to determine whether my interpretation was correct. Some of these issues were resolved in collaboration with David Wright, a consultant at the Western General Infirmary in Edinburgh. He had pioneered the use of reporting procedures as a means of addressing critical incidents in healthcare provision. A range of contributory factors was identified for the incidents reported over a twelve-month period (Busse and Wright, 2000).

Our intention was to go beyond high-level categories, such as thoughtlessness, to build more detailed models of common failures. For example, Figure 1 uses an ICS model to show how a skill-based error can lead to a dislodged endotracheal tube. The fixing of the tube in the larynx by pumping air into the cuff is carried out at a skill-based level and does not require the knowledge-based processing that would be provided by the implicational subsystem. Instead, body-state information from the proprioceptive subsystem (1) is sufficient to enable the propositional subsystem (2) to interpret the state of the tube and to send motor movement information to the limb subsystem (3).


Figure 1: Skill-based Error during the Insertion of an Endotracheal Tube
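As a rough illustration of the flow shown in Figure 1, the skill-based exchange can be encoded as a small directed graph. The encoding below is an assumption made for this sketch; it is not the logic-based ICS notation of Duke, Barnard, Duce and May (1995).

# Illustrative encoding of the skill-based ICS flow in Figure 1.
# The graph structure is an assumption made for this sketch only.

skill_based_flow = {
    # (source subsystem, destination subsystem): information passed
    ("proprioceptive", "propositional"): "body-state information about the tube and cuff",  # (1)
    ("propositional", "limb"): "motor commands to secure the tube",                         # (2) -> (3)
}

def subsystems_involved(flow):
    # Every ICS subsystem that takes part in a given flow.
    involved = set()
    for source, destination in flow:
        involved.update((source, destination))
    return involved

# No knowledge-based processing is needed, so the implicational
# subsystem does not appear in the skill-based flow.
print("implicational" in subsystems_involved(skill_based_flow))  # False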

Alternative Explanations of Human Error

ICS provides a framework that can be used to reason about alternative hypotheses for the underlying cognitive processes that lead to an endotracheal tube becoming dislodged. For instance, the previous paragraph assumed that securing the tube via the air cuff was a skill-based task. There are, however, exceptions to the general procedures that govern the insertion and securing of the tube. Usually, staff will re-secure the tube by pumping air into the cuff. With some types of device, however, this has a counter-productive effect: pumping air into the cuff forces the cuff and, therefore, the endotracheal tube further out of the larynx. Such exceptions create a risk of mistakes at the rule-based level of performance. Operators often opt for a well-known and practised rule even when there may be signs that such an approach is not justified (Reason, 1990, Leape 1998). Figure 2 illustrates the cognitive causes of stereotypical behaviour under high workload. The processing of such an exception requires increased cognitive effort. The body-state information (1) that is passed to the propositional subsystem (2) now represents only a small part of the cognitive demands placed upon the operator. The operator must determine which make of tube is involved and so the interpretation of the body-state information needs to draw on implicational input (3).


Figure 2: Problems at Higher Levels of Cognition during the Insertion of an Endotracheal Tube
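Continuing the illustrative encoding used after Figure 1, the difference between the routine and the exceptional case can be expressed as the additional subsystem exchange that Figure 2 introduces. Again, the representation is an assumption made only for this sketch.

# Illustrative contrast between the routine (Figure 1) and exceptional
# (Figure 2) flows. The encoding is an assumption made for this sketch.

skill_based_flow = {
    ("proprioceptive", "propositional"): "body-state information about the tube and cuff",
    ("propositional", "limb"): "motor commands to secure the tube",
}

rule_based_flow = {
    ("proprioceptive", "propositional"): "body-state information about the tube and cuff",  # (1)
    ("implicational", "propositional"): "knowledge of the make of tube in use",             # (3)
    ("propositional", "limb"): "motor commands appropriate to this make of tube",
}

# The exchange that the exceptional case adds over the routine one:
extra = [edge for edge in rule_based_flow if edge not in skill_based_flow]
print(extra)  # [('implicational', 'propositional')]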

Grounding the Analysis of Human Error

Many accident and incident reports diagnose the apparent causes of human error without any supporting method or justification. As a result, several recent studies have shown that individual investigators cannot replicate the findings of their colleagues (Lekberg, 1997). Different investigators assign a multitude of different causes to the same incidents. An important motivating factor behind my work has been that techniques, such as those shown above, might help to document the reasons why investigators diagnose particular causal factors behind operator error. My intention has not been to attempt to ensure a consistent interpretation of these causal factors but rather to ensure that human factors analysts provide a clear and thorough explanation of the grounding for their decisions.

Going Beyond Simple Error Taxonomies

The attribution of error is a fundamental concern in my work. Bodies such as the International Civil Aviation Organisation (ICAO) require that this be done without apportioning blame to the individuals concerned: the focus of accident and incident investigations must be to improve safety. However, the attribution of error is often a social and psychological process rather than a matter of objective fact (Snowdon, 2000). This is graphically illustrated by Figures 1 and 2, which presented alternative explanations for the problems with the insertion of endotracheal tubes. These alternative interpretations are often not addressed in the analysis of human `error'. In particular, incident reporting systems often assign only a single causal factor to each of the occurrences that they examine. My work in this project has rejected this view and instead sought to treat each incident or accident as the starting point for a more detailed analysis of the causal factors that lie behind categories such as `thoughtlessness' (Busse and Johnson, 1999).

Minimising Attribution Errors

Many professions, and especially clinical staff, are trained to accept responsibility for their own actions. This creates the opportunity for attribution error: individuals are significantly more likely to attribute an `error' to situational factors when they committed it themselves. However, when looking for reasons why others were involved in an `error', they are more likely to blame the person rather than those situational factors. I have argued that system and user modelling techniques help to avoid these problems by providing a record of the reasons why certain conclusions were reached about the causes of human `error'. However, my fieldwork has not enabled me to identify any incidents or accidents in which analysts were directly implicated. Without further investigation it is, therefore, difficult to be sure that my techniques would reduce attribution error.

Supporting Observational Studies

More recently my work on this project has begun to focus on `pathological' incidents and accidents that are not easily amenable to the techniques presented in this report. For example, a recent study revealed radically different test results for the level of distress experienced by babies during delivery in two sister hospitals. Initially, it was suspected that poor staff training and procedures were to blame. It was subsequently discovered that one hospital only performed the test if staff suspected that the baby was distressed. This led to a higher percentage of positive test results than in the other hospital, which routinely tested all infants (Harris, Jagodzinski and Greene, 1999). This incident illustrates how many incidents and accidents have managerial and organisational causes that are not easily modelled using existing analytical techniques. As a result of these observations, one member of this project qualified as a healthcare assistant. They then used cognitive modelling together with workplace observations to examine the relationship between individual human `error mechanisms' and the surrounding context of work. The results of this practical combination of analytical and observational studies are to be the focus of their PhD thesis (Busse and Wright, 2000).

Space restrictions in this document have forced me to focus on the healthcare work driven by the RA, Daniela Busse. However, Peter Snowdon, the project student, pursued a similar combination of practical and theoretical work on user and system modelling with Gordon Crick at the UK Health and Safety Executive. Peter's focus was on producing diagrammatic techniques that enable HSE inspectors to explicitly represent the justifications that support their causal analysis of human `error' and systems `failure'. These techniques share some features with the approach pursued by Daniela Busse, but there are also differences. In particular, his approach had to be accessible to HSE inspectors with only minimal training in human factors and with no knowledge of ICS. This work has resulted in a software prototype for the causal analysis of incidents and accidents. This was developed in conjunction with the HSE and is available to the research community at:

http://www.dcs.gla.ac.uk/~snowdonp

References