How can we measure Operational Resilience in a clear and meaningful way?



Context: the need to measure
By now firms should be acting within the transitional phase of Operational Resilience, which is set out by the PRA and FCA and runs until March 2025. Time, effort, and investment have brought us to the point where firms know their Important Business Services (IBSs), have set impact tolerances and have undertaken scenario testing. Yet even with all this work, we are not ‘done’. Now that all the key elements are in place and BAU embedding is in progress, the question remains: “How can we measure Operational Resilience in a clear and meaningful way?”
Having sufficient oversight of resilience is key for FS firms to facilitate realistic and meaningful decision making. To this end, the PRA states that “While individual Board members are not required to be technical experts on Operational Resilience, it is expected that they have the appropriate management information.” [1]. Good MI is needed so that Board members can appropriately challenge senior management on decisions that have an impact on Operational Resilience.
Furthermore, the FCA states that “supervisory authorities will consider developing further policy requirements in the future, including reporting. Operational resilience regulatory reporting would provide higher quality incident reporting and would allow the supervisory authorities to identify more effectively risks from operational failures, including IT failures.”[2] Until these further requirements are published, organisations can take a principles-based approach to establish how to make a success of measuring Operational Resilience and, in turn, enable effective decision making and derive improvement insights.
Conceptualising Operational Resilience MI and what can be done to develop it
To make this seem more relevant to everyday life, consider Operational Resilience as a car. The dashboard within a car provides the driver with key information on performance in real time, for instance the petrol level and speed. The monitoring of Operational Resilience through metrics and tolerances is the dashboard to provide a company with leading and lagging indicators of performance on a frequent basis.
Cars also require an MOT, which is more of a point-in-time assessment, used to take a step back, check roadworthiness and identify areas for attention. Comparably, the self-assessment conducted for Operational Resilience can be regarded as an MOT. Defining IBSs, setting impact tolerances and consolidating scenario testing outputs are the checks conducted as part of that MOT, in the form of a self-assessment. Furthermore, using severe but plausible scenarios will highlight any gaps or potential issues which may crop up in the future. For instance, checking your brake pads during an MOT and ensuring they work as expected is comparable to checking your recovery time objectives for system outages.
Although an MOT/self-assessment does not need to be performed daily, it is needed periodically to ensure ongoing safety and soundness. The more regular car dashboard/MI dashboard is the complementary ‘running as expected gauge’ to determine performance. Both aspects work together to help measure overall Operational Resilience.
As with many similar obstacles, when the term ‘Reporting and Management Information (MI)’ is thrown into the Operational Resilience mix, panic often comes with the idea that we must start from scratch. However, by taking a step back and looking at the topic through a different lens, we can address it by considering three broad yet key steps:
- Defining and sourcing meaningful metrics
- Setting appropriate thresholds
- Visualising information in a digestible way
Defining and Sourcing Meaningful Metrics
The key to good MI is having a set of clear and meaningful metrics. In terms of Operational Resilience, a meaningful metric is one which helps inform IBS resilience performance against impact tolerance. This doesn’t mean we need to reinvent the wheel and look at a whole new set of data or sources. In fact, in many cases, we can leverage the risk, control, operational performance and incident metrics that already exist within organisations.
Determining what is appropriate to measure can be a daunting task. However, within Operational Resilience we are lucky this has been done for us to some extent. By aligning the metrics a firm wants to measure to the Operational Resilience resource pillars (e.g. technology, facilities, people, information, third parties), firms can stay on track with their agreed Operational Resilience objectives.
Some examples of metrics which can be used across the typical pillars include:
Setting Appropriate Thresholds
When using MI to monitor Operational Resilience, using thresholds can enable an onlooker to identify – at a glance – what good looks like. This helps those wanting to track their Operational Resilience proactively, i.e., when should alarm bells start ringing prior to the impact tolerance being breached?
Thresholds should be set at the metric level and can be rolled up to resource pillar level and IBS level (with appropriate weighting/aggregation), allowing for a clear and meaningful insight into the resilience of the firm. Setting thresholds can be seen as an intimidating task, but many companies, if not all, already have a starting point in the form of service level agreements or internal policies.
A simple example of this would be looking at the number of SLA breaches for a third party within the last 12 months. The thresholds for this metric could be articulated as:
- Green – Fewer than 2 SLA breaches (within the last 12 months)
- Amber – Between 2 and 5 SLA breaches inclusive (within the last 12 months)
- Red – More than 5 SLA breaches (within the last 12 months)
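The bands above can be sketched as a small classification function. This is a minimal illustration rather than a prescribed implementation: the names `Rag` and `classify_sla_breaches` are hypothetical, and the Amber band is read as 2 to 5 breaches inclusive, which is an interpretation of the example.

```python
from enum import Enum


class Rag(Enum):
    """Red/Amber/Green status for a single metric."""
    GREEN = "Green"
    AMBER = "Amber"
    RED = "Red"


def classify_sla_breaches(breaches_last_12_months: int) -> Rag:
    """Map a 12-month SLA breach count to a RAG status.

    Bands follow the example in the text, with Amber assumed
    to cover 2 to 5 breaches inclusive.
    """
    if breaches_last_12_months < 0:
        raise ValueError("breach count cannot be negative")
    if breaches_last_12_months < 2:
        return Rag.GREEN
    if breaches_last_12_months <= 5:
        return Rag.AMBER
    return Rag.RED
```

Keeping the band boundaries in one function makes them easy to revisit when thresholds are reviewed.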
Using a time-bound measure to create trends over time provides more insight into how a resource is performing over a given period. However, it is not enough simply to have these thresholds and metrics in place. Companies need to assess them continuously to make sure they remain appropriate and still measure what they were chosen to measure. Annual metric/threshold reviews could be a good baseline frequency, and companies should be ready to make changes where needed.
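As a rough sketch of the time-bound element, a trailing-window count can be recomputed at successive dates to build a trend. The function name `breaches_in_window` and the 365-day approximation of "the last 12 months" are assumptions made for illustration only.

```python
from datetime import date, timedelta


def breaches_in_window(breach_dates, as_of, window_days=365):
    """Count breaches in the trailing window ending on `as_of`.

    The window is (as_of - window_days, as_of], so a breach dated
    exactly on `as_of` is included. 365 days is an assumed stand-in
    for "the last 12 months".
    """
    start = as_of - timedelta(days=window_days)
    return sum(1 for breach in breach_dates if start < breach <= as_of)


# Hypothetical breach dates for one third party.
example_breaches = [date(2023, 1, 10), date(2023, 6, 1), date(2024, 2, 1)]

# Recomputing the count at different reporting dates gives a simple trend.
trend = {
    as_of: breaches_in_window(example_breaches, as_of)
    for as_of in (date(2023, 3, 1), date(2024, 3, 1))
}
```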
Visualising Information in a Digestible Way
Once you have clear and meaningful metrics and have set appropriate thresholds, the next question is how to present them.
We are fortunate that as technology advances there are many visualisation tools in the market which can help us do such a job. Using Business Intelligence tools can break down the data into bitesize pieces of information and lay them out using different dashboards. This, if designed with the end user in mind, can allow reporting to be more targeted, digestible and impactful. Regardless of whether the outputs show good results or areas for improvement, the reporting enables firms to demonstrate control over how their IBSs are performing and drives management action to correct problem areas.
Yet before we get to this stage, companies must overcome the tricky part of ensuring the data they are using is complete, accurate and of good quality. Business Intelligence tools are well suited to drawing on data sets to provide multiple perspectives and enable insightful decisions. However, the ultimate product of a Business Intelligence tool is only as good as the data it leverages.
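One simple way to approach the completeness part of this is to check each metric record for missing values before it reaches a dashboard. The sketch below assumes records are plain dictionaries; the field names in `REQUIRED_FIELDS` and the function `find_incomplete` are hypothetical and would need to match a firm's own data model.

```python
# Hypothetical fields every metric record is assumed to carry.
REQUIRED_FIELDS = ("ibs", "pillar", "metric", "value", "as_of")


def find_incomplete(records):
    """Return (index, missing_fields) for each record lacking a required value.

    A field counts as missing if it is absent, None, or an empty string.
    A numeric value of 0 is treated as present.
    """
    issues = []
    for index, record in enumerate(records):
        missing = [
            field for field in REQUIRED_FIELDS
            if record.get(field) in (None, "")
        ]
        if missing:
            issues.append((index, missing))
    return issues
```

This covers completeness only; accuracy and timeliness would need separate checks against the source systems.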
Business Intelligence tools have the capability to aggregate data upwards. For instance, cutting data to show an IBS view per resource pillar would provide an IBS owner with the tools needed to help them understand how their IBS is performing. On the other hand, data can be cut to show pillar information across IBSs. This allows resource owners to know how their areas are performing and target action where required. It is useful to then go on to show the firm’s overall resilience at an IBS, pillar and consolidated level. Tracking this type of data over a period allows those at the ‘top of house’ to achieve a better understanding of how the firm is operating as a whole. This can and should then go further to be used as the basis to make more insightful decisions relating to the strategic running of the business and to feed into decisions concerning areas for investment.
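The different cuts described above can be sketched without any particular Business Intelligence tool. The example below uses a 'worst status wins' rule to roll metric-level RAG statuses up to IBS, pillar and firm level. That rule is an assumption chosen for simplicity; a firm may prefer a weighted aggregation. The IBS names, metric names and statuses are hypothetical sample data.

```python
from collections import defaultdict

# Higher number means worse status.
SEVERITY = {"Green": 0, "Amber": 1, "Red": 2}


def worst(statuses):
    """Return the most severe RAG status in an iterable of statuses."""
    return max(statuses, key=SEVERITY.__getitem__)


def roll_up(records, key_fields):
    """Group records by the given fields and keep the worst status per group."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[field] for field in key_fields)
        groups[key].append(record["status"])
    return {key: worst(statuses) for key, statuses in groups.items()}


# Hypothetical metric-level results.
records = [
    {"ibs": "Payments", "pillar": "Technology", "metric": "System availability", "status": "Green"},
    {"ibs": "Payments", "pillar": "Third parties", "metric": "SLA breaches", "status": "Amber"},
    {"ibs": "Mortgages", "pillar": "Technology", "metric": "System availability", "status": "Red"},
    {"ibs": "Mortgages", "pillar": "People", "metric": "Key role vacancies", "status": "Green"},
]

ibs_by_pillar = roll_up(records, ("ibs", "pillar"))  # view for an IBS owner
pillar_view = roll_up(records, ("pillar",))          # view for a resource owner
ibs_view = roll_up(records, ("ibs",))                # one status per IBS
firm_view = worst(r["status"] for r in records)      # consolidated view
```

A worst-status rule is conservative: a single Red metric turns the whole IBS Red, which suits early warning but can hide how many metrics are actually healthy.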
Conclusion
Measuring a company’s resilience does come with its challenges. These include making sure your data is clean and accurate, ensuring you have selected the right metrics (with appropriate thresholds) and having this information visualised in a digestible manner to facilitate decision making. However, by using the three steps outlined above, firms can start to develop clear, meaningful, digestible, and actionable Operational Resilience reporting. This method of MI development will allow firms to get their feet under the table before further regulatory reporting guidance is prescribed. And if these potential regulatory reporting requirements do come about, the approach outlined above should leave firms in a strong position to adapt and tailor what is already in place to align with them, without having to start from a blank slate.
Ultimately, well defined Operational Resilience management information will significantly help firms to measure how their IBSs are performing, provide a gauge on the resilience levels of key resources and help minimise the impact of disruptions to customers, the firm and wider market.
[1] SS1/21 ‘Operational resilience: Impact tolerances for important business services’ (bankofengland.co.uk)
[2] ‘Building operational resilience: Impact tolerances for important business services’ (bankofengland.co.uk)