Insights from Across Industries


Things will go wrong – a key issue for operational resilience is how we respond when they do. Responding well to operational failures requires insight, preparation and practice, as well as an ability to learn from errors. This is easier in a culture where staff acknowledge mistakes, an effective system exists for recording them, and the organization is open to learning from them.

In a joint 2018 discussion paper, three UK regulators (the Bank of England, the Prudential Regulation Authority and the Financial Conduct Authority) noted that the financial system needs to be able to absorb shocks, rather than contribute to them. “The financial sector needs an approach to operational risk management that includes preventative measures and the capabilities – in terms of people, processes and organizational culture – to adapt and recover when things go wrong,” the regulators elaborated.

It helps to recognize that different types of failure warrant different types of response. In a paper for the Harvard Business Review, Dr. Amy Edmondson cites three broad categories of failure: preventable, complexity‐related and intelligent.

  • Preventable failures in predictable operations are “bad.” These involve deviations from routine procedures in, say, a manufacturing process. Some firms have built continual learning into their production processes to ensure continuous improvement.
  • Unavoidable failures in complex systems arise when there is a high level of uncertainty in the work environment, for example when triaging patients in a hospital emergency room or running a fast‐growing start‐up. Even minor process failures in these circumstances can – in combination – lead to catastrophic failures. As Edmondson notes, “To consider them bad is not just a misunderstanding of how complex systems work; it is counterproductive.” Small process failures are inevitable; avoiding consequential ones means rapidly identifying and correcting the small failures before they combine.
  • Intelligent failures at the frontier, in contrast, can be considered ‘good,’ as they can help firms discover new drugs, new products and new ways of doing things that provide a competitive edge. One of the interesting features of today's financial services industry is the clash of cultures between the ‘fail fast, learn fast’ mentality of innovative fintechs and the more traditional approach of ‘avoid failure, and then look for who to blame when things go wrong.’

Disaster Prevention Challenges

From an operational resilience point of view, paying attention to preventable (even quite minor) failings makes a lot of sense, as these can in combination trigger a catastrophic process failure. Evidence of this has been found in several public enquiries investigating the causes of various disasters.

Consider, for example, the UK enquiries into the Bradford Football Club fire, the sinking of the Herald of Free Enterprise and the King's Cross Underground fire. What did all three of these disasters have in common? All were the culmination of a number of smaller events, including design and management deficiencies.

Each disaster could have been averted – or, at the very least, mitigated – if the smaller trigger events had been identified, reported and tackled. The general lesson is that an industry that does not learn from past failures is doomed to repeat them.

In the case of the enquiry into the Bradford fire, the report highlighted that many of its safety‐related recommendations had in fact already been made in previous reports into other football‐related disasters, but had not been implemented. In other words, the industry did not learn from previous failures.

The 1987 sinking of the Herald of Free Enterprise ferry, which resulted in the loss of 193 lives, was another classic example of not learning from previous minor incidents. Prior to the disaster, several minor incidents had been noted by members of the ferry's crew – but were either not officially reported or were dismissed by management as ‘exaggerations.’ In the ensuing Sheen investigation, management failure was cited as a prime reason for the disaster.

This practice of ignoring, dismissing or marginalizing previous related incidents is part of a larger, even more troubling pattern of behaviour. In the wake of a major disaster, people take much more care, their attitude to risk changes, and policies and processes are introduced to prevent a recurrence. However, these efforts can soon dissipate, allowing things to revert to a norm in which trigger events are missed, ignored or not reported properly.

So, even though we know it makes sense to learn from failure, these cases show how hard it is to do in practice.

© 2023 Global Association of Risk Professionals