Emerging from the Muddle of Matrices

Key Takeaways

  • Risk matrices: Risk matrices are primarily compliance-driven tools and often fall short of effectively managing risk.
  • Modeling risk: True risk management involves structuring uncertainties and creating models, even simple ones, to guide decision-making.
  • Practical risk model: A simple yet robust risk model is necessary for assessors to provide meaningful input and for supporting enterprise-level decisions.
  • Focus on economic loss: The primary concern is understanding economic loss, particularly its impact on the bottom line.
  • Aligning with governance structures: Proper alignment of risk assessments with governance structures ensures better resource allocation and informed decisions across the organization.

Deep Dive

In this article, Graeme Keith dives into the limitations of traditional risk matrices and presents an alternative approach to risk management. By exploring the need for a model that better aligns with real-world decision-making, Keith highlights the shortcomings of compliance-driven exercises and offers a framework that allows businesses to better assess and prioritize risks across the enterprise.

The Shift from Risk Matrices to Actionable Models

There is an alternative to risk matrices, but it's not asking everyone in our organization to start modeling their risks.

We should start doing that too, because the only way to manage risk is to structure our understanding of the uncertainties that stand between our decisions and the objectives they influence in a way that allows us to leverage data. And the only way to do that is to build a model – even if it's just a very simple one. But practical modeling is a craft; it takes time to build that expertise, and risk matrices were never about managing risks in the first place (Heaven forfend).

Risk matrices were part of a compliance-driven exercise to demonstrate that organizations:

  • Identify and understand the uncertainties that influence their objectives
  • Assign accountability for managing those uncertainties to people mandated to do so
  • Establish a framework in which that management can be assured independently
  • Assess those uncertainties in order to prioritize investments and distribute resources to manage risk across the enterprise

Unfortunately, as I argued last week, risk matrices are not remotely up to the role they're supposed to play in this undertaking. So what do we do instead?

Risk matrices were never intended to manage risk at the level of the individual risk, but only to assess individual risks as part of an exercise to articulate and manage risk at the level of the enterprise.

The only way to manage risk, at any level, is to structure our understanding of the uncertainties that stand between our decisions and the objectives they influence, i.e., to build a model. So what does this enterprise-scale model look like? And how can we use that model to construct an alternative to the risk matrix?

The Model

The challenge is to build a model that is simple enough at the level of the individual risk that the people who today assess risks using risk matrices are able to provide meaningful input to that model using the same data in the same meetings under the same workflows, while at the same time ensuring that the results of that model are robust enough to support the decisions that model is designed to support.

Scope

We want to replace risk matrices, so we're looking at uncertainties where it makes sense to ask, "How often does this happen?" or, "How likely is this to happen?" and "How bad is it when it does?" So we're only looking at uncertainties that depend on the occurrence of an event. This includes the overwhelming majority of operational risks, including cyber, compliance and HSE, but excludes parametric uncertainties like market prices, inflation, interest rates, FX, and so on.

Objectives

We're essentially looking at operational risks, and our primary concern is economic loss – additional costs or reductions in revenue – and its effect on the bottom line. So our primary objective is some measure of profit.

When considering losses, the total impact of multiple risks is found simply by adding the individual impacts. This is a key requirement for the method by which we will synthesize individual assessments to generate both an understanding of the distribution of risk across the enterprise, as well as individual risks' contribution to the overall risk profile.

We can also look at other additive metrics – lost time incidents and fatalities in a safety context, for example, number of customers in a commercial context, production in extraction and manufacturing contexts. We can further introduce additive utility scales to capture other dimensions of impact like environment and reputation. These allow us to extend the objectives under consideration to include safety and various strategic objectives, but the point of departure is the effect of risks on the bottom line of our P&L.
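Because losses are additive, the roll-up is literally a sum. A minimal Python sketch with made-up figures (and the further assumption of independence, under which variances add as well):

```python
# Hypothetical per-risk figures: expected annual loss and its variance ($M, $M^2).
risks = [
    {"name": "cyber outage", "expected_loss": 1.2, "variance": 4.0},
    {"name": "HSE incident", "expected_loss": 0.8, "variance": 1.5},
    {"name": "regulatory fine", "expected_loss": 2.5, "variance": 9.0},
]

# Expected losses always add; variances add only if the risks are independent.
total_expected = sum(r["expected_loss"] for r in risks)
total_variance = sum(r["variance"] for r in risks)
total_sd = total_variance ** 0.5

print(total_expected)        # 4.5
print(round(total_sd, 2))    # 3.81
```

With correlated risks the variance would pick up covariance cross-terms, which this simple sum ignores.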

Decisions

We are not here meddling with the management of individual risks. Our decision levers are at the enterprise level and are four-fold:

  • Budget: The distribution of resources to specific business units, product lines, functional areas, and so on, in order to spend money where it makes a difference and save money where it doesn't. For example, increasing spend on safety and reducing spend on cyber; or providing additional support in challenging theatres of operation.
  • Portfolio: Selling out of businesses whose exposure to potential loss isn't warranted by the profit they're chasing. Divesting from volatile projects. Buying into opportunities where our exposure to risk is lower than our competitors'.
  • Provisioning: Ensuring capital is available, for example, to assure project success at a portfolio level.
  • Targeted transformation: Investments in changes in practice. For example, global endeavors to improve driver safety.

Targeted transformation requires us to be able to prioritize risks coherently and correctly.

Budget, portfolio, and provisioning all require us to align our model with the governance structures in place for portfolio management and for distributing resources across, for example, business units, functional areas responsible for different risk categories, value-protection processes, or product lines. We will assume that these structures are in place and are coherent in the sense that any potential loss can be assigned to one and only one place in each structure.

Aligning enterprise-wide risk assessments to enterprise level decisions then simply becomes a question of assigning the risks to these structures and then adding all the risks belonging to, say, a business unit or a risk category or a product line together.
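As a sketch with a hypothetical three-risk register (names and figures invented for illustration), the roll-up is a pair of grouped sums:

```python
from collections import defaultdict

# Each risk is assigned to exactly one business unit and one risk category,
# as the coherence assumption above requires.
register = [
    {"unit": "Chase Copper", "category": "Compliance", "expected_loss": 3.0},
    {"unit": "Chase Copper", "category": "IT", "expected_loss": 0.5},
    {"unit": "North Oil", "category": "Compliance", "expected_loss": 1.0},
]

by_unit = defaultdict(float)
by_category = defaultdict(float)
for r in register:
    by_unit[r["unit"]] += r["expected_loss"]
    by_category[r["category"]] += r["expected_loss"]

print(dict(by_category))  # {'Compliance': 4.0, 'IT': 0.5}
print(dict(by_unit))      # {'Chase Copper': 3.5, 'North Oil': 1.0}
```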

We can track the resulting metrics through time, we can investigate how much different control strategies change exposure, we can examine differences in exposure under different scenarios and, crucially, we can compare business units, risk categories and product lines to each other in order to determine, for example, where we should invest and where we might save.

The Risk Model

The decision-actionable insight we are looking for from this model is the aggregated loss from a large number of risks, or an individual risk's contribution to that aggregated loss.

While any given risk may or may not occur and may have a wide range of loss outcomes if it does occur, when you aggregate a number of risks together, some of them occur, some of them don't, and some of those that occur come in high, and some low. As we add more and more risks, these variations tend to cancel each other out and the resulting uncertainty becomes less and less volatile (relatively) and more and more smooth. All the individual gnarly detail of each risk gets washed out in the addition and we're left with an uncertainty that is relatively accurate, smooth and completely described by the broad features of its shape: the average loss, the amount of volatility and the skewness – which roughly speaking captures how much worse bad outcomes are than ordinary outcomes.

Because the detail of individual uncertainties is lost in the aggregation wash and we are only really interested in the aggregation, we don't need to care very much about the details of individual risk assessments. We need only to capture each risk's contribution to the numbers that describe the final aggregated distribution: the average, the volatility and the skewness. There is some subtlety in choosing a very simple model, aligned with very simple questions that match the intuition of subject matter experts, that captures exactly the relevant information, but ultimately the risk model boils down to three clear and simple questions:

  • How often, on average, does the risk occur annually?
  • When the risk occurs, we recognize there can be a large range of outcome losses: where is the middle of that range?
  • And what is a credible downside to that range?

These questions replace the heat map.
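One way to turn those three answers into a working distribution – an illustrative parameterization, not one the article prescribes – is a Poisson frequency with a lognormal severity, reading the "middle of the range" as the median and the "credible downside" as the 90th percentile:

```python
import math

Z90 = 1.2815515655446004  # 90th percentile of the standard normal

def risk_from_three_numbers(rate, median_loss, downside_p90):
    """Map the three assessment answers to Poisson-frequency /
    lognormal-severity parameters (an assumed parameterization)."""
    mu = math.log(median_loss)                    # lognormal median -> mu
    sigma = (math.log(downside_p90) - mu) / Z90   # spread from the downside
    mean_severity = math.exp(mu + sigma ** 2 / 2) # lognormal mean
    return {"rate": rate, "mu": mu, "sigma": sigma,
            "annual_expected_loss": rate * mean_severity}

# A risk occurring about once every two years, with a median loss of
# $1M and a credible downside of $5M per occurrence.
r = risk_from_three_numbers(rate=0.5, median_loss=1.0, downside_p90=5.0)
print(round(r["annual_expected_loss"], 2))  # annual expected loss, about 1.1
```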

The Model in Action

Evans Enterprises (made-up company) has replaced its heat map assessments with this very simple quantitative assessment methodology. Evans is a multinational company with a number of divisions covering oil and gas exploration and production, mining, weapons manufacturing and water utilities.

[Figure: Evans Enterprises Entity Structure]


Risks are assessed across the organization and assigned both to the business unit (or division) in which they are managed and to a risk taxonomy.

[Figure: Evans Enterprises Risk Taxonomy]

To compare, for example, the risk exposure across risk categories, we add up all the risks under business disruption (in all the business units) and compare that to the sum of all the risks under IT (in all the business units) and so on.

We're actually adding uncertainties here, and even if the aggregated uncertainty is smaller (relatively) than the uncertainty on the constituent risks, the result is still itself an uncertainty. Normally, we'd use Monte Carlo simulation for this, but because of the simplicity of the model we can do this analytically – these results are all produced in vanilla Excel and the difference from a Monte Carlo result is negligible.
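As a sanity check on that claim, here is a stdlib-only sketch (parameters and distributions assumed for illustration: Poisson frequency, lognormal severity) comparing the analytical mean and variance of the compound annual loss with a Monte Carlo estimate:

```python
import math
import random

random.seed(0)

def poisson(lam):
    # Knuth's algorithm: count uniform draws until the running
    # product falls below exp(-lam).
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= threshold:
            return k
        k += 1

# One illustrative risk: 2 events/yr on average, lognormal severity.
lam, mu, sigma = 2.0, 0.0, 0.8

# Analytical moments of the compound (annual) loss:
# mean = lam * E[X], variance = lam * E[X^2].
mean_analytic = lam * math.exp(mu + sigma ** 2 / 2)
var_analytic = lam * math.exp(2 * mu + 2 * sigma ** 2)

# Monte Carlo check of the same quantities.
n = 50_000
annual = []
for _ in range(n):
    k = poisson(lam)
    annual.append(sum(random.lognormvariate(mu, sigma) for _ in range(k)))

mc_mean = sum(annual) / n
mc_var = sum((x - mc_mean) ** 2 for x in annual) / n
print(round(mean_analytic, 3), round(mc_mean, 3))  # agree to within sampling error
```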

[Figure: Comparison of losses across risk categories]


We can see in this figure the expected loss (blue dash – a probability-weighted average of all the potential loss outcomes), the range of losses (the grey blocks show an 80% confidence interval), and the average of the worst 10% of loss outcomes (red diamond), which we use to measure what a bad year looks like for each category.

We can see, for example, that although Compliance loses more than QHSE on average, its losses are more predictable and its tail risk – a bad year – isn't as bad.
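Given a sample of simulated annual losses for one category, the three chart statistics are simple to compute. The numbers and the quantile indexing below are deliberately crude, for illustration only:

```python
import statistics

# A hypothetical sample of simulated annual losses ($M) for one risk
# category, sorted ascending.
losses = sorted([0.2, 0.5, 0.9, 1.1, 1.4, 1.8, 2.3, 3.0, 4.5, 9.0])

expected = statistics.mean(losses)      # blue dash: probability-weighted average
p10, p90 = losses[1], losses[-2]        # crude endpoints of an 80% band
# red diamond: average of the worst 10% of outcomes
tail = statistics.mean(losses[-max(1, len(losses) // 10):])

print(expected, (p10, p90), tail)  # 2.47 (0.5, 4.5) 9.0
```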

Now the model takes no account of correlation between risks, so while the expected losses might be reasonably robust – at least good enough for making comparisons – we should take the volatility with a pinch of salt in absolute terms. Nonetheless, the tail risk provides important, decision-actionable information about the risks in each category.

Risks that don't happen very often, but that hit hard when they do, contribute more to volatility and tail risk than risks that happen often, but that hit mildly. This is important because we might be controlling a risk, not because it's costing us year on year, but because in the event it does go south, it would otherwise hit us very hard. Keeping an eye on the tail prevents us from saving money on controls that don't affect expected losses, but that do affect the tail.
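A two-risk toy example makes the point. Both risks have the same expected annual loss, but for an event-driven loss with fixed severity, variance scales as rate × loss², so the rare, severe risk dominates the volatility and the tail:

```python
# Two hypothetical risks with identical expected annual loss ($1M/yr):
# frequent-and-mild versus rare-and-severe.
frequent = {"rate": 10.0, "loss_per_event": 0.1}    # 10/yr at $0.1M
rare = {"rate": 0.01, "loss_per_event": 100.0}      # 1-in-100-yr at $100M

for r in (frequent, rare):
    r["expected"] = r["rate"] * r["loss_per_event"]
    # Variance of an event-count-driven loss with fixed severity:
    # rate * severity^2 (Poisson counts assumed).
    r["variance"] = r["rate"] * r["loss_per_event"] ** 2

print(frequent["expected"], rare["expected"])   # both 1.0
print(frequent["variance"], rare["variance"])   # 0.1 vs 100.0
```

Cutting controls on the rare risk would barely move the expected loss, but it would dramatically worsen a bad year.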

A way of seeing this at individual risk level is to plot the average rate at which risks occur against the average loss they incur when they do.

[Figure: Quantitative heat map for all Evans' risks]

When these are plotted logarithmically, the blue dashed lines correspond with lines along which the annual expected loss is constant. On a given line, the risks to the upper left are the high-rate, low-impact risks that move the needle on average losses in the aggregate, but not so much on the tail. Those to the lower right contribute disproportionately to the tail.
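A small sketch with hypothetical risks on the same contour shows both facts: log(rate) + log(impact) is constant along a constant-expected-loss line, while rate × impact² (a variance proxy for fixed severity) separates the needle-movers from the tail-drivers:

```python
import math

# Three hypothetical risks on the same constant-expected-loss contour
# (rate * mean impact = $1M/yr).
risks = [
    ("frequent minor", 20.0, 0.05),  # (name, events/yr, mean $M per event)
    ("mid", 1.0, 1.0),
    ("rare major", 0.05, 20.0),
]

els = [rate * impact for _, rate, impact in risks]               # all 1.0
logsums = [math.log10(r) + math.log10(i) for _, r, i in risks]   # all 0.0
vars_ = [rate * impact ** 2 for _, rate, impact in risks]        # 0.05, 1.0, 20.0

for (name, _, _), el, v in zip(risks, els, vars_):
    print(f"{name}: expected loss {el:.2f}, variance proxy {v:.2f}")
```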

This figure has all Evans' risks, but it is a simple matter to filter for a given business unit or risk category. Say I was the managing director of Chase Copper – a business unit in Evans' mining division. Filtering for Chase's risks, the quantitative heat map looks like this, with the numbers corresponding to an easily searchable and sortable list of risks.

[Figure: Quantitative heat map for Chase Copper's risks]

The comparison between risk categories now looks like this:

[Figure: Comparison of losses across risk categories for Chase Copper]
Chase clearly has some compliance issues (although it's much less sensitive to cyber risk than the company as a whole). If we wanted to see how Chase compared on Compliance with the other business units in the mining division, we can filter for Compliance risks and look across the business units:

[Figure: Comparison of Compliance losses across the mining division's business units]

Yep, Chase really has an issue with Compliance. To see where that's coming from, I can filter on Chase and look across the risk sub-categories:

[Figure: Chase Copper's Compliance losses by sub-category]
The Compliance sub-categories are regulatory breach, crime, conduct, and products. Clearly, crime doesn't pay.

All of this insight and information is coming with a couple of clicks in an Excel sheet. (The distributions are recalculated instantly because of the analytical convolution method.) Furthermore, this information is all built out of an extremely simple assessment methodology at the individual risk level, comprising just three numbers corresponding to the questions posed above.

It is a simple matter to build tools that allow SMEs to understand the significance of their choice of parameter in terms of the uncertainty they are trying to represent. For example, this interface shows the uncertainty in both the number of occurrences and the loss per occurrence and allows users to adjust the parameters until those uncertainties correspond to SME intuition on those variables.

[Figure: Example assessment screen]

Naturally, if data are available, they can be used to provide parameter estimates using a variety of techniques, but the methodology is robust enough to cope with fairly rough estimates on the part of SMEs and still provide reliable representations of relative contribution to risk exposure at the enterprise level.

Such interfaces can also show annual loss, as well as where the current assessment sits relative to other assessments in chosen categories.

[Figure: Extended assessment screen]

Doug Hubbard likes to tell the joke about two runners out in the woods who come across a bear. The first runner quickly bends down to tighten his shoelaces. The second says "Why bother? You can't outrun a bear", to which the first replies "I don't need to outrun the bear. I just need to outrun you."

We don't need to outrun the bear. We just need to outrun risk matrices.

Risk matrices leave the bar incredibly low in terms of risk assessment in the context of Enterprise Risk Management programs. Even a very simple, fully prescribed model, fed with crude SME estimates, is a vast improvement. It requires little or no expertise on the part of SMEs, just a little explanation of what the parameters mean and how to set them, and it requires very little modification of a well-structured ERM program.

This is not a replacement for using modeling to provide decision support for individual risks and, indeed, it provides a useful framework for capturing the results of those calculations. But this is a replacement – a straight swap in fact – for meaningless madness with matrices. If you're engaged in this kind of program, just a small modification in your assessment practice can turn a compliance obligation into a massive value opportunity.

GRC Report is your premier destination for the latest in governance, risk, and compliance news. As your reliable source for comprehensive coverage, we ensure you stay informed and ready to navigate the dynamic landscape of GRC. Beyond being a news source, the GRC Report represents a thriving community of professionals who, like you, are dedicated to GRC excellence. Explore our insightful articles and breaking news, and actively participate in the conversation to enhance your GRC journey.
