Tempered Expectations: The Hope and Reality of AI in Risk Management
Models driven by artificial intelligence have not yet lived up to the hype, particularly with respect to probability of default estimation. But optimism for the future is reasonable.
Friday, February 12, 2021
By Marco Folpmers
A couple of years ago, risk professionals had great expectations about artificial intelligence (AI). But the expected paradigm shift hasn't yet occurred - at least not for probability of default (PD) modeling.
The general belief was that large amounts of available data would be seamlessly fed into a machine-learning engine, which would produce near-perfect assessments of risk - e.g., for PD. Moreover, data points drawn from transactional data, commercially provided behavioral client data and even social media data were supposed to supplement small datasets of financial (e.g., income/past behavior) data.
But a few obstacles have scaled back the “great expectations” for AI-driven PD models. The accessibility and integration of large datasets (in addition to what's already in place at banks) have been hampered by many practical IT problems, as well as regulatory and ethical concerns.
What's more, the explainability of these models remains an issue, and their forecasting prowess has also been called into question. Indeed, one of the areas in which the tech-fueled models were supposed to hold a distinct advantage was in their ability to forecast losses accurately. But the gains in terms of the predictive power of AI models (such as neural networks, support-vector machines and random forests) have so far not been significant.
When attempting to figure out which type of PD model has the best predictive power, the area under the receiver operating characteristic curve (AUC) - a standard measure of how well a model separates defaulters from non-defaulters - is vital. The performance of machine-learning models was covered in a 2020 study by Joseph Breeden that evaluated the pros and cons of different predictive modeling techniques. It featured an AUC comparison across more traditional (benchmark) logistic-regression models and models that rely on machine-learning (ML) techniques.
A key takeaway from the study is that “refined logistic regression” - i.e., a traditional logistic regression model that uses optimized independent variables - currently holds an AUC edge over machine-learning models. But the study also shows that machine-learning models that can effectively handle non-linearities in the data are, overall, better equipped for PD modeling than the more traditional models.
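To make the AUC comparison concrete, here is a minimal sketch in Python. It is not taken from the Breeden study: the function name, the two "models" and the six-borrower dataset are hypothetical and serve only to show how the metric is calculated. AUC is read here as the share of defaulter/non-defaulter pairs in which the model assigns the higher PD to the defaulter, with ties counting as half.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    labels: 1 = borrower defaulted, 0 = borrower did not default
    scores: model PD estimates (higher = riskier)
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one defaulter and one non-defaulter")
    # A pair is ranked correctly when the defaulter has the higher score;
    # tied scores count as half a correct ranking.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: six borrowers, two of whom defaulted.
defaulted = [0, 0, 1, 0, 1, 0]
pd_model_a = [0.02, 0.10, 0.30, 0.05, 0.08, 0.01]  # illustrative scores only
pd_model_b = [0.03, 0.04, 0.25, 0.06, 0.20, 0.02]  # illustrative scores only

print(auc(defaulted, pd_model_a))  # 0.875
print(auc(defaulted, pd_model_b))  # 1.0
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so in this toy case the second set of scores ranks the defaulters better. The pairwise loop is quadratic in the number of borrowers; on a real portfolio, a rank-based calculation or a library routine would be the more practical choice.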
The march toward machine-learning technology, as we've previously described, has been slow, but it is expected that in the near future machine-learning techniques will dominate the credit risk modeling landscape. Indeed, in a recent Bank of England survey, 50 percent of UK-bank participants indicated that they expect the importance of machine-learning models to increase in the near future.
To be prepared for this major shift, one best practice is to set up a “lab environment.” In such a lab, a separate group of risk and coding specialists can experiment with new types of models outside of the formal hierarchy of regulatory risk modeling.
One question that is often asked today is whether such a modeling lab should be part of the first or the second line of defense. The answer is neither! In this type of environment, first- and second-line professionals must work together to develop a better comprehension of these new types of models.
To help credit risk professionals understand which types of models are the most impactful, further empirical studies on comparative PD modeling performance are needed. As more information (e.g., social media data) and results become available over time, machine learning will inevitably dominate the credit risk modeling landscape.
Dr. Marco Folpmers (FRM) is a partner for Financial Risk Management at Deloitte Netherlands and a professor of financial risk management at Tilburg University/TIAS.