Companies employ stupid people. Daily we see that accidents are caused by negligent behavior. We always seem to focus on the poor sap who pressed the wrong button; who missed the warning sign; or who simply did the wrong thing. Do we conclude that there is an underclass of people who come to work to hurt themselves or others? Or are some people simply too careless to work? Or perhaps something else is going on? Nigel Heaton investigates.

As a society, we appear to like simple solutions. Someone is to blame. Once we have eliminated mechanical failure, technical failure and deliberate sabotage, we blame the operator. It is ‘human error’. We are comforted by the idea that if we could just get rid of the stupid people and only employ the clever ones, all accidents would be eliminated. Organizations can be seen to act. “We are eliminating human error by training everyone and sacking anyone who makes a mistake”. If someone does something wrong, it is clear where the blame lies. Yet after more than 120 years of post-industrial society, we still see a plethora of errors that range from the benign – I pulled that door when I should have pushed – to the tragic – he fell asleep at the wheel. Can it really be true that if we just got rid of the 5% of the workforce who are too stupid or too lazy to follow instructions, errors and accidents would disappear?

The reality is that ‘human error’ is an excuse. It is mostly a symptom and rarely a cause. Much like an infection, if we don’t accurately identify the pathology, we cannot treat the disease. Worse, if we treat the symptom but not the underlying condition, the problem will continue to recur and may even get worse. We might have experienced a near miss, a free lesson. The next time the incident occurs it might be a lot, lot worse.

We know that we commit errors every day. We drive our partner’s car and activate the windscreen wipers instead of the indicators. The outcome is not a large-scale disaster, merely an inconvenience. In fact, we commit errors much more often than we realize because the vast majority of errors don’t have any measurable effect at all. We can even make very significant mistakes repeatedly and nothing bad ever happens. It is only on the 100th occasion that disaster strikes, and we realize that the warning signs had been there all along, if only we had the vision to see them.

What is human error?

The immediate cause of most errors is the person on the front line not doing the right thing. This gives rise to a common trait in most people – hindsight bias. There is a very obvious problem: the operator did not perform as expected, therefore the operator caused the problem. We judge the accident by the result (what happened) and blame the direct cause of the accident (the operator). What is clear is that this often misses a much more interesting story and does not help prevent recurrence. We see hindsight bias in events such as the tragedy that killed seven motorists on the M5. One individual, who was in charge of a bonfire, was charged with manslaughter because the smoke may have been a contributory factor in the crashes. After the charge had been dismissed, counsel for the defendant stated that the police and local council were “motivated by desire to find someone to blame for this terrible accident, simply for the sake of it”.

We see a perverse reverse of hindsight bias in cases of ‘extreme’ heroism. For example, the behavior of Chesley ‘Sully’ Sullenberger demonstrates how a pilot, when faced with the most extreme conditions imaginable, is able to throw away the rulebook and perform the ‘miracle on the Hudson’. Imagine the way in which the authorities would have reacted if the wing had clipped the river and everyone had died. The official investigations would almost certainly have jumped on pilot error, just as they initially did in the case of the Kegworth air crash and the Mull of Kintyre helicopter crash. We like a simple explanation and a single point of blame. The truth tends to be much more complex.

We can simplify our concerns into three main areas:

  1. The behavior of the person
  2. The design and environment in which the error occurred
  3. The overarching management system

For the purposes of this brief article, we will concentrate on the behavior of the person.

Why did they behave as they did?

Our understanding of what drives behavior starts with identifying which of two underlying conditions existed immediately before the error: either the person knew what they should be doing, or they did not. The simplest, and probably the rarest, of errors is the one a person makes because they did not have the correct knowledge. Sometimes this is due to a lack of training; at other times they believe they have been trained, but it was the wrong training. These knowledge-based errors are the easiest to correct: we simply provide the knowledge. It is worth noting that additional training is only an appropriate control when the root cause of the error is a lack of knowledge. Training or re-training people will not, per se, eliminate error.

If the error-provoking condition is not a lack of knowledge, then the operators know what they need to do. Errors made when operators know what is expected of them can be split into two types. The first, and most concerning, is the error caused by inattention or forgetfulness – the slip or lapse. The operator simply makes a mistake – they were too busy, too distracted, under too much pressure. A slip is a ‘whoops’ type of error: I turned on the wipers when I meant to indicate. A lapse is caused by forgetfulness: I left my high-visibility jacket over my chair when I ran out into the yard to see what was going on. These types of errors can be very costly. They are incredibly hard to control and can cost lives. Atul Gawande has produced his ‘Checklist Manifesto’ as one approach to ensuring that routine safety-critical procedures are followed despite slips and lapses. His excellent four-part Reith Lectures explaining the issue are still available on the BBC.

The final, and most common, cause of human error is the violation. Despite its rather unpleasant name, not all violations are equal. We find ourselves in a situation where the operators know the rules (so they do not need training), they do not forget (so a checklist will not help them); instead, they choose to ignore the rule.

Violations cover a wide spectrum of error-provoking conditions. Sully risked his life and the lives of all his passengers because, had he followed the rules, they would all have died. More prosaically, Alan Chittock was temporarily suspended from his job for breaking the railway rules when he went onto a live track to rescue a disabled woman who had fallen onto it in her wheelchair. In both cases, the violation was prompted by a desire to save lives. If the violation is a heroic action, we judge it by the intention and not the outcome (avoiding hindsight bias).

This type of behavior is at one end of the spectrum of violations. The other end is occupied by behavior that is designed to deliberately sabotage or cause damage to the system, the operator or others. There are many points along the spectrum that determine how ‘serious’ the violation is. Operators may behave in the way that they believe managers wish them to behave. “I know that they say they want us to follow the safe system, but if we did it would take all day, and they value speed above all else”. Operators may be tempted to take shortcuts to get home early or to hit production targets. They may work in the same way as everyone else, even if they know that this is in violation of the safe way of working.

Why do we care?

When faced with an accident, we want to find simple answers. We know someone is to blame. We blame those with the least power and those least likely to complain – the injured party or the person who was the direct cause of the accident. We rarely look at the management system that provokes the error or the physical designs that make errors incredibly easy to make.

Human error is complex; it is the external manifestation of an underlying condition. If we treat the wrong thing, we risk the problem recurring. In the 18th century, a common treatment for smallpox was to prescribe chocolate. It made the patient feel better, but unfortunately did little to prevent the development of the disease or make it less contagious. The doctor felt better because he was doing something, and the patients felt better (they were given loads of chocolate). It turns out chocolate was not a cure for smallpox.

In the same way, blaming operator error might make us feel better. Managers can be assured that it was just the stupid person at the coalface who made the mistake. And the next time? It was another stupid person. And again, and again.

We need to understand human error with a view to preventing history from repeating itself. We need to conduct robust investigations and challenge ourselves about the full range of error-provoking conditions. We need to build systems that are forgiving, and robust enough to accommodate heroic behavior. We need to remember that to err is human and to forgive is divine, and, after all, we’re only human.
