Will ChatGPT Break Financial Markets?

Large language models (LLMs), such as ChatGPT, are threatening to disrupt most areas of life and work. Financial trading is no exception.

Earlier versions of machine learning (ML) and artificial intelligence (AI) have not been notably successful at trading. They have been used extensively to decide how to execute trades, but not to decide which trades to make. The basic problem is that financial prices are nearly all noise; indeed, they are very close to random walks.

Aaron Brown

Lots of smart people and algorithms conspire to eliminate any signal that can be used for profit. Deciphering financial prices is like trying to understand text that is deliberately written to be misleading. Traditional AI is more successful when signals are stronger relative to noise.

Before getting to what is different about modern LLMs, I’ll tell you why risk managers must care, even if they oversee no computerized financial trading. Trading is the foundation of finance. Even small changes in trading mechanisms exert profound effects on markets, which translate into profound economic consequences.

High-frequency trading (HFT) is a good example. Introduced in the late 1990s, HFT didn’t just link end-buyers and end-sellers more quickly and efficiently. It vastly increased trading volumes, lowered transaction costs and knocked humans out of the equity-trading business. It required fundamental re-engineering of financial regulation, and also led to zero-commission brokerages and zero-fee index funds — eliminating the revenues that brokers and asset managers had relied upon since they were first created.

What’s more, in addition to restructuring two major financial businesses, challenging regulators and cutting costs to end-investors, HFT gave us new phenomena, like flash crashes.

In the last half-century, other trading innovations have had similarly broad effects. The introduction of public futures and options traded on financial instruments in 1973 created the modern global derivatives economy, with vastly expanded leverage outside of the banking system that was difficult for regulators to monitor or control.

Program trading in the 1980s was blamed for the largest stock market crash in history in 1987, and has played a part in exaggerating every bubble and crash since. Mortgage-backed securities changed banking, Wall Street and home-buying. In the 21st century, we’ve seen dramatic effects from credit default swaps, collateralized debt obligations and exchange-traded funds.

The Rise of LLMs in Trading: Beneficial or Detrimental?

The question for risk managers today is whether LLM trading is likely to be stabilizing or destabilizing. Humans and older AI algorithms tend to focus on trends that have obvious profit implications — e.g., stocks that went up yesterday are more likely to go up today than stocks that went down yesterday. They also zero in on factors with clear economic rationales, such as the relation between oil inventories and oil prices.

But this is a negligible fraction of all possible relations. LLM algorithms can search across any markets for any sort of anomaly in financial data, such as prices, volumes, fundamental data, volatilities or correlations. LLMs used in trading could turn up huge numbers of puzzling relations without obvious profit implications or economic links, and combinations of those might support profitable trades.

The key breakthrough that may lead to LLMs succeeding in quantitative trading, where earlier efforts failed, was described in a seminal 2017 paper by Google researchers, “Attention is All You Need.”

The scientist and science-fiction writer Isaac Asimov wrote, “The most exciting phrase to hear in science, the one that heralds new discoveries, is not ‘Eureka’ but ‘That's funny...’” Insight, in other words, comes not from confirming or rejecting hypotheses, but from noticing things you ignored in the past.

Conventional science proceeds with specialists asking and answering questions that are known in advance to be interesting, given the state of prior knowledge. But imagine an alternative “Journal of That’s Funny” – one that lists puzzling observations from all fields, without filtering out the ones that seemed unimportant.

People might notice two or three of these puzzles, combine them with something they learned for themselves, and come up with dramatic cross-disciplinary discoveries. While this might be a colossal waste of time for publish-or-perish academics, computers have nothing but time to correlate millions of “that’s funny” facts – details that are too unimportant individually to interest humans.

The Google paper suggested that AI should spend less effort figuring out which funny facts were important, and more time correlating all of them. This attitude is familiar to fans of detective fiction, where the hero ponders over minor inconsistencies and irrelevancies that only reveal the murderer when assembled in sequence — while the unimaginative assistant or professionals insist on paying attention only to the facts known to be important.
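As a rough illustration of "correlating all of them," here is a minimal sketch of scaled dot-product attention, the core operation of the 2017 paper. Every query is scored against every key, and the output is a weighted average of the values, so nothing is discarded up front as unimportant. The function names and the tiny inputs are illustrative assumptions, not code from the paper or from any trading system.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.

    queries, keys: lists of vectors of equal length d.
    values: list of vectors, one per key.
    """
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Score this query against every key, not just a pre-selected few.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Blend all values according to their weights.
        width = len(values[0])
        outputs.append([sum(w * v[c] for w, v in zip(weights, values))
                        for c in range(width)])
    return outputs

# Hypothetical toy input: two identical keys get equal weight,
# so the output is the plain average of the two values.
result = attention([[1.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[1.0], [3.0]])
```

In this toy case the two keys are indistinguishable, so the result is simply the mean of the values; with real data the weights would differ for each query.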

A plausible near-future story is that LLM-flavored trading models will build large cross-asset-class portfolios. This would be similar to a strategy currently employed by global macro hedge funds, but with more leverage, more positions, more active trading and, importantly, no human to explain the thesis. (There might be an explainer module added that will give plausible-sounding theses, but there’s little reason to believe these explanations will have any relation to the reason for the positions.)

We can hope that the new price relations and correlations will better reflect economic reality, leading to more efficient allocation of capital and better real economic decisions. But hope is a poor risk management strategy. Even in that best case, the restructuring of cross-market financial relations will disrupt many business models and regulatory regimes. There could be at least as much disruption as we got from HFT, and perhaps as much as we got from public trading of financial futures and options.

Of course, there are lots of things that might happen, and LLMs like ChatGPT may not succeed in quantitative trading, or they may not have the effects I anticipate. But they deserve special attention due to the possible positive feedback.

If LLMs do disrupt the correlation structure of asset markets, the confusion will likely increase the number of “that’s funny” relations to pay attention to, and therefore the raw material for LLM trading strategies. I see no reason to assume this will evolve into a stable or good equilibrium; it could, in fact, get more and more chaotic until things fall apart.

The limiting factor on LLM effects on trading is the amount of cross-market leverage available. Traditionally, high leverage has been available only on low-risk assets or positions within a single market. You could, say, reduce the margin requirement on a portfolio of stocks by adding calibrated short positions on other stocks, or reduce the margin on a futures position in crude oil by adding an opposite position in heating oil. The same offset would not be available, however, for a bond position paired with a stock option position. Cross-market correlations have in the past been considered too unstable to affect margin decisions.

One alternative to traditional margining is “VaR margining,” which allows leverage based on the estimated tail risk of a portfolio, even if it contains positions in different asset classes. However, while I am a big fan of VaR, I would not rely on it for the kinds of many-position, cross-market portfolios I think LLM trading models might produce.
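To see why unstable cross-market correlations make VaR-based margin fragile, consider a minimal parametric (normal-distribution) VaR sketch. The two positions, their volatilities and the correlation values are all hypothetical numbers chosen for illustration, and real margin models are considerably more elaborate than this.

```python
import math

def parametric_var(positions, vols, corr, z=2.326):
    """One-day parametric VaR: z * sqrt(sum_ij x_i x_j s_i s_j rho_ij).

    positions: dollar exposures; vols: daily volatilities (as fractions);
    corr: correlation matrix; z: normal quantile (2.326 is about 99%).
    """
    n = len(positions)
    variance = 0.0
    for i in range(n):
        for j in range(n):
            variance += (positions[i] * positions[j]
                         * vols[i] * vols[j] * corr[i][j])
    return z * math.sqrt(max(variance, 0.0))

def two_asset_corr(rho):
    """Build a 2x2 correlation matrix from a single correlation."""
    return [[1.0, rho], [rho, 1.0]]

# Hypothetical cross-market book: $100 in a bond position and
# $100 in an equity-option position (assumed daily vols of 1% and 3%).
positions = [100.0, 100.0]
vols = [0.01, 0.03]

# Margin if the two markets are assumed to offset each other...
var_offsetting = parametric_var(positions, vols, two_asset_corr(-0.8))
# ...versus the risk if that correlation flips sign in a stress.
var_flipped = parametric_var(positions, vols, two_asset_corr(0.8))

print(f"VaR with rho=-0.8: {var_offsetting:.2f}")
print(f"VaR with rho=+0.8: {var_flipped:.2f}")
```

In this toy example, margin set on the assumed negative correlation covers well under two-thirds of the risk that appears once the correlation reverses, and the gap can only widen as the number of cross-market positions grows.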

The Evolution of AI – From Discriminant Algorithms to Momentum Trading

LLMs are not wholly new – they combine components used in other AI and ML applications, like autoregression, pattern matching and discrimination. These components largely try to mimic what humans do, and are embedded in many existing quantitative trading algorithms.

Consider a discriminant algorithm, like one that divides stocks into overvalued or undervalued – or one that separates bonds into those with low or high risks of default. Whether developed by humans or computers, these tend to stabilize markets.

If traders buy undervalued stocks and short overvalued ones, valuations will come better into line. If the algorithm is good, its profits will disappear. If the algorithm is bad, the people trading with it will stop — either because they’re losing money or because they’ve run out of money. Either way, the effect of a discriminant algorithm on other market participants is limited.

Compare that to an autoregression algorithm like momentum trading — which buys things that have gone up in price recently and shorts things that have gone down in price recently. This is destabilizing.

Momentum traders push up the prices of things already going up, making them even more attractive momentum plays. When prices fall, momentum traders pile in to make them fall faster, drawing in more momentum traders. Crucially, this happens whether the momentum algorithm is right or wrong. There’s no internal limit to the damage momentum traders can do to a market, although eventually bubbles always pop, and people find ways to restart things after a panic.
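The contrast between the two kinds of algorithm can be sketched with a toy simulation. In the first function, traders close part of the gap between price and fair value each step (negative feedback). In the second, traders amplify the previous price move (positive feedback). The update rules and the `strength` parameters are illustrative assumptions, not a model of any real market, and the momentum case only explodes when `strength` is above 1.

```python
def value_trading_path(price, fair_value, strength, steps):
    """Negative feedback: each step closes a fraction of the mispricing."""
    path = [price]
    for _ in range(steps):
        price += strength * (fair_value - price)
        path.append(price)
    return path

def momentum_path(price, last_move, strength, steps):
    """Positive feedback: each move is a multiple of the previous move."""
    path = [price]
    for _ in range(steps):
        last_move = strength * last_move
        price += last_move
        path.append(price)
    return path

# Hypothetical inputs: a stock 10 above fair value, with traders closing
# half of the gap each step.
stabilizing = value_trading_path(110.0, 100.0, 0.5, 5)

# Hypothetical inputs: a stock that just rose by 1, with momentum traders
# amplifying each move by 50%.
destabilizing = momentum_path(100.0, 1.0, 1.5, 5)

print("discriminant:", [round(p, 2) for p in stabilizing])
print("momentum:    ", [round(p, 2) for p in destabilizing])
```

The first path converges toward fair value and the trade's profit shrinks with it. The second path moves further each step regardless of whether the initial move was justified, which is the sense in which the feedback has no internal limit.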

A liquidity transaction is another important example of a destabilizing, positive feedback trade. Shorting newly issued U.S. 30-year Treasury bonds and buying the previous issue – which has 29.75 remaining years to maturity – is a famous example of such a trade. Although the securities are nearly identical economically, the newly issued bond is far more liquid, and generally carries a higher price. (There are similar trades in many other markets, where traders short liquid securities and buy nearly identical but less liquid ones.)

The effect in Treasuries is that the available supply of 29.75-year bonds is snapped up and held by traders, making that bond even less liquid, while lots of virtual 30-year bonds are created by the short sales, making those even more liquid. This increases the price differential, making the trade even more attractive and causing more traders to engage.

On the other hand, amid a liquidity crisis, or even a liquidity tremor, the price differentials in all markets rapidly reverse. Liquidity traders are forced under such a scenario to rush for the exits, exacerbating the liquidity crisis and often bankrupting institutions and threatening severe contagion of financial distress.

Parting Thoughts

Over the years, we’ve seen AI’s potential to act as a stabilizer or a destabilizer in trading. The specific impact of LLMs like ChatGPT in this space has yet to be determined. However, specific speculations aside, risk managers should pay attention to the adoption of LLM trading strategies due to their potential to increase cross-asset leverage and to disrupt markets.

LLMs have the ability to improve many things both in finance and in general, but they do introduce new possibilities for disruption and instability. As with many financial dangers, it is the combination of novel trading methods with excessive leverage that leads to disaster. If leverage is restricted to prudent levels, the damage of bad innovation can be limited.

Aaron Brown has worked on Wall Street since the early 1980s as a trader, portfolio manager, head of mortgage securities and risk manager for several global financial institutions. Most recently he served for 10 years as chief risk officer of the large hedge fund AQR Capital Management. He was named the 2011 GARP Risk Manager of the Year. His books on risk management include The Poker Face of Wall Street, Red-Blooded Risk, Financial Risk Management for Dummies and A World of Chance (with Reuven and Gabriel Brenner). He currently teaches finance and mathematics as an adjunct and writes columns for Bloomberg.

Topics: Risks & Risk Factors
