Reflections on the Value of Models

From time to time I revisit my various slide decks on different topics, look at what is past its sell-by date, and what my blind spots may have been in the past. The latest to undergo this treatment was a “futures deck.” One opening slide, to illustrate the purpose of my talk, had a quote with a very long history (25 years) that also felt highly relevant today.

The quote was from the statistician George E. P. Box: “All models are wrong, but some are useful.”

Now, all models, whether quantitative or qualitative (mental), are essential to making sense of the world. My problem is not with models per se, but when people become servants to their particular models. An earlier post of mine arose from an article that claimed that “comparative advantage” was the closest that economics had to a “law of physics.”

Increased polarization around economic forecasts has been a feature of public discourse over the last few years. Organization X (such as the Office for Budget Responsibility [OBR], the Treasury or the Bank of England) will frequently be lambasted for being too pessimistic by one side and too optimistic by the other.

“Inherently Political”

Yet we all live with the consequences of decisions based on the prognostications of those entrusted to tell us, for instance, when inflation will fall to 2%, or when growth will hit 2%.

For me, the reality is that all economic models are inherently political.

Let me illustrate with a contemporary example. The U.K. government has not achieved its stated aims on immigration once since 2010. So, if you were charged with making a forecast for GDP growth, what would you use as your base assumption about the size of the U.K. working population?

If your base assumption was that the government would miss its target by 10%-50%, which would be a generous margin looking at the last 14 years, then how do you deal with accusations of undermining the government?

On the flip side, if you use the government’s target as an assumption, how can you claim to be independent?

The late Sir Sam Brittan once told me at the start of my career that “when a government changes its measures, the new measures fit their narrative better than the old ones.” He was not a cynical man, but a good observer of the “real world.” So, if you look at GDP in the last few years and compare it with GDP per capita over the same period, the government would prefer the former and the opposition the latter.

The Reality of Incompleteness

I recently received another article on “was Einstein wrong?” The tone suggests that the physics-world model created between 1905 and 1915 was about to be overthrown (if only!). I would argue that among the greatest intellectual advances of the 20th century in describing “reality” is “incompleteness.”

Even when special and general relativity were first formulated, it was known that they were incomplete models. We still have not integrated gravity into the quantum world despite many brilliant minds and billions spent. Yet attempts to prove Einstein wrong have been deeply problematic.

If an organization gets its unemployment, inflation or growth forecasts “wrong,” I argue that it is better to focus on the incompleteness of the model and look at what we can learn, rather than at the specific outcomes of that model.

Need for Compromises

Let me turn to a more day-to-day example of models with which we are all familiar: weather forecasts.

If I want to forecast the weather tomorrow, even with the speed and capacity of modern computing, there is a limit to how elaborate the model can be. A model that takes 30 or more hours to run and produce outputs has no value as a forecast of tomorrow's weather. Of course, what was practically achievable 30 years ago was much more limited than now or in the future. Yet our models are still incomplete.

So, in building a forecasting model, compromises need to be made. If a giant volcano erupted in Iceland or Indonesia tonight, the impact on weather or climate over days or even years could be highly significant and indeed disruptive to our contemporary understanding of weather and climate systems.

Most people, if asked to name the largest volcanic eruption of the 19th century, would probably suggest Krakatoa in 1883. In fact, it was a much larger explosion: Tambora, in 1815. Krakatoa comes to mind quickly because the telegraph system was installed the year before, so it was the first explosion of massive scale where we knew the origin quickly. Building volcanic activity into the model, rather than treating it as an external factor, is neither realistic nor pragmatic.

If you do not know the Tambora story, see Wikipedia. If you fancy yourself as a writer of Netflix disaster series, rewrite Tambora for 2030 with a global population of 8 billion!

Basis for Evaluation

Although I understand why forecasters of economic outcomes may wish to protect IPR (intellectual property rights) in their models, that secrecy hampers our collective ability to improve forecasts. Opinion pollsters have signed up to codes of conduct on their methodologies. Those who have not, I simply ignore.

What I would like to see is what might be called a common template for evaluating differing forecasts.

First, there are the inputs. What are they, from where and how are they sourced?

Second, the modeling process. For example, how does the model evaluate the size of the labor force, including dealing with skill shortages and shifts such as “zero hours” contracts or the gig economy?

Third, the outputs. The challenge here is the sensitivity of the outputs. For instance, if a model has energy prices as input, what range is catered for? We might say that oil is priced between $60 and $100 within the model. An outcome of, say, $40 or $150 would be outside the model range.

Fourth, what factors are external to the model? That is to say that they are inherently unpredictable, computationally complex or long-tail risks.

Finally, I would argue for track record. Where has the model previously been too optimistic or pessimistic?
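To make the five-part template concrete, here is a minimal sketch of how such a report might be recorded and queried. Everything in it is an assumption for illustration: the class and field names (`ForecastReport`, `InputRange`, `out_of_range`, `mean_error`), the model name and the growth figures are hypothetical, and only the $60-$100 oil range is taken from the text above. It is not a description of how any forecasting body actually reports.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class InputRange:
    """Range of one input that the model caters for (hypothetical)."""
    name: str
    low: float
    high: float

    def covers(self, value: float) -> bool:
        return self.low <= value <= self.high


@dataclass
class ForecastReport:
    """One hypothetical reporting template per forecasting model."""
    model_name: str
    input_sources: dict[str, str]    # 1. inputs: what they are and where from
    process_notes: str               # 2. modeling process, in plain words
    input_ranges: list[InputRange]   # 3. ranges within which outputs hold
    external_factors: list[str]      # 4. factors left outside the model
    past_forecasts: list[float] = field(default_factory=list)  # 5. track record
    past_outcomes: list[float] = field(default_factory=list)

    def out_of_range(self, observed: dict[str, float]) -> list[str]:
        """Names of inputs whose observed value falls outside the model range."""
        return [r.name for r in self.input_ranges
                if r.name in observed and not r.covers(observed[r.name])]

    def mean_error(self) -> float:
        """Average of (forecast - outcome); the sign shows the direction of bias."""
        errors = [f - o for f, o in zip(self.past_forecasts, self.past_outcomes)]
        if not errors:
            raise ValueError("no track record to evaluate")
        return mean(errors)


# Illustrative values only; the growth figures are made up.
report = ForecastReport(
    model_name="Example growth model",
    input_sources={"oil_price_usd": "assumed market data feed"},
    process_notes="Labor force size taken as an explicit base assumption.",
    input_ranges=[InputRange("oil_price_usd", 60.0, 100.0)],
    external_factors=["large volcanic eruption"],
    past_forecasts=[2.0, 1.5, 1.8],
    past_outcomes=[1.4, 1.1, 1.9],
)

flagged = report.out_of_range({"oil_price_usd": 40.0})
bias = report.mean_error()
print(flagged, round(bias, 2))
```

With these made-up numbers, an oil price of $40 is flagged as outside the model range, and the positive mean error indicates growth forecasts that were, on average, too optimistic.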

Evidence and Judgments

Around the start of each year, league tables are often published comparing the track records of different forecasting bodies for the previous year. The value of these league tables could be enhanced if there were a common reporting regime that would allow those in the sector to learn from each other, but also allow users of the outputs to be better informed.

To return to the weather, where I live there is a hill a few miles to the west, and we can experience very localized conditions. I’ve had days where the weather has changed five times while driving to the nearest motorway junction six miles away.

The weather apps have improved considerably in the last few years. Providing probabilities of rain by the hour at postcode level is useful for maintaining our garden. The forecasts are very wrong only rarely. After a forecast of 90% rain for much of the day, when the outcome was dry all day, I tried an experiment and looked at the forecast for the neighboring westerly postcode. Now, when the two diverge I can make my judgment with some supporting evidence.
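The comparison of two neighboring forecasts can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `divergent_hours`, the 0.3 threshold and the hourly probabilities are all hypothetical, not values taken from any weather app.

```python
def divergent_hours(here: list[float], neighbor: list[float],
                    threshold: float = 0.3) -> list[int]:
    """Return the hour indexes where two rain probabilities differ by at
    least `threshold` (an arbitrary cut-off chosen for illustration)."""
    return [hour for hour, (p, q) in enumerate(zip(here, neighbor))
            if abs(p - q) >= threshold]


# Made-up hourly rain probabilities for two adjacent postcodes.
my_postcode = [0.9, 0.9, 0.8]
west_postcode = [0.2, 0.85, 0.3]

hours_to_question = divergent_hours(my_postcode, west_postcode)
print(hours_to_question)
```

The hours it returns are the ones where the headline forecast deserves a second look and some local judgment.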

I wish that we could do that with the outputs of economic forecasts. I don’t think we can get them “right” all the time, but we could do better than we do now. For those with a mortgage, or businesses facing higher interest rates, even modest improvements could be very helpful.

Simply attacking the OBR, for instance, does nothing to identify where model improvement should focus, and I believe such improvement is both needed and doable.

Having started with George E. P. Box, let me finish with Wittgenstein: “Pictures depict. Representations represent.”

You may need a stiff drink. Cheers!

Chris Yapp (chris_yapp@hotmail.co.uk) is a technology and policy futurist, an independent consultant and advisor, and a member of the International Futures Forum IFF Clan. He has worked in the IT sector for companies including Honeywell, ICL (now Fujitsu), HP and Cap Gemini, and was head of public sector innovation for Microsoft UK. A version of this article appeared previously on the Long Finance Pamphleteers Blog, for which Mr. Yapp writes on the future of the financial system.

Topics: Model Risk
