An interview with Marcos Lopez de Prado
Cornell University’s Professor Marcos Lopez de Prado is a recognised authority in machine learning. In addition to his outstanding academic career, for the past 20 years he has managed multibillion-dollar funds at some of the largest and most successful asset managers. He founded True Positive Technologies in 2019, with the proceeds from his sale of several patents to AQR Capital Management, where he was a Partner and its first Head of Machine Learning. Cambridge University Press is about to release his new book, “Machine Learning for Asset Managers”, and we asked him to give us a preview.
Whatever edge you aspire to gain in finance, it can only be justified in terms of someone else making a systematic mistake from which you benefit. Without a testable theory that explains your edge, the odds are that you do not have an edge at all. A historical simulation of an investment strategy’s performance (a backtest) is not a theory; it is a (likely unrealistic) simulation of a past that never happened. (You did not deploy that strategy years ago; that is why you are backtesting it!)
Only a theory can pin down the clear cause-effect mechanism that allows you to extract profits against the collective wisdom of the crowds; a testable theory that explains factual evidence as well as counterfactual cases (X implies Y, and the absence of Y implies the absence of X). Consequently, asset managers should focus their efforts on researching theories, not backtesting trading rules. Machine learning (ML) is a powerful tool for building financial theories, and the main goal of my new book is to introduce readers to essential techniques that they will need in that endeavour.
As I explained in my previous book, “Advances in Financial Machine Learning” (Wiley, 2018), backtesting is not a research tool. It is common for quantitative asset managers to confound research with backtesting. A backtest cannot prove a theory. A backtest only estimates how much a trading rule profits from an observed pattern, but it does not tell us whether the pattern is the result of signal or the result of noise. The strategy could be profiting from a statistical fluke. To answer that question, we need a theory that can be directly tested with greater depth than a mere historical simulation of a trading rule.
Academic journals are filled with papers where researchers backtest a strategy on decades (sometimes centuries!) of data, and present those results as evidence that a particular investment strategy works. Authors almost never control for selection bias, and in the absence of a theory we must assume that those findings are false, due to multiple testing.
First, authors must state their theories in clear terms. A strategy is not a theory. A strategy is an algorithm for monetising the patterns that presumably arise from a theory. For example, consider a theory that partly explains volatility as the result of market makers widening their bid-ask spreads in response to imbalanced order flow. A strategy may buy straddles whenever the order flow becomes imbalanced. Even if the strategy is profitable according to a backtest, that does not prove that the patterns are due to signal. Only a theory can establish the mechanism that causes the patterns that the strategy is presumably profiting from. Testing the theory involves evaluating its ultimate and inescapable implications. Continuing with the previous example, we could analyse FIX messages in search of evidence that market makers widen their bid-ask spreads in response to imbalanced order flow. We could also evaluate the profits of market makers who didn’t widen their bid-ask spreads under extreme order imbalance. Furthermore, we could survey market makers and ask them directly whether their response to imbalanced order flow is to withdraw from the market, and so on. In other words, testing the theory that justifies the strategy has little to do with backtesting. It has to do with the investigative task of defining the cause-effect mechanism.
Second, once the plausibility of a theory has been established, and only then, we should backtest the strategy proposed to monetise the theory. Remember, a backtest is merely a technique to assess the profitability of a trading rule. In the absence of that theory, a backtest is a data mining exercise that proves nothing. Surprisingly, much of the factor investing literature suffers from this lack of rigour. To this day, there is no strong theoretical justification for most factors, even though investors have poured hundreds of billions of dollars into them. ML can help build that economic rationale, as explained in my new book.
Black boxes cannot predict a black swan, because a model cannot predict an outcome that has never been observed before. Only a theory can do that. A theory must be general enough to explain particular cases, even if those cases are black swans. For instance, the existence of black holes was predicted by the theory of General Relativity more than five decades before the first one was observed. Black swans are extreme instances of everyday phenomena. In the earlier example, market microstructure theory explains how market makers react to order flow imbalance, leading to heightened volatility. The flash crash of May 6, 2010 was a black swan; however, its microstructure was predicted by the O’Hara-Easley PIN theory, which goes back to 1996. In conclusion, quantitative models are useful as long as they are supported by validated theories.
ML methods decouple the specification search from the variable search. What this means is that ML algorithms identify which variables are involved in a phenomenon, irrespective of the model’s specification. Once we know which variables are important, we can formulate a theory that binds them.
This is an extremely powerful property that classical statistical methods (e.g., econometrics) lack. A p-value may be high for an important variable because the researchers assumed the wrong specification, leading to a false negative. Given how complex financial phenomena are, the chances that economists can guess the right specification a priori are slim. ML is the tool of choice in most scientific disciplines, and it is time for economists to modernise their empirical toolkit.
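To make that point concrete, here is a minimal sketch in Python (an illustration on simulated data, not an excerpt from the book). A variable drives the outcome through a nonlinear relationship; a linear regression, i.e. the wrong specification, is likely to report an insignificant coefficient for it (a false negative), while a random forest’s feature importance flags it without any specification being assumed.

```python
# Minimal sketch: variable search without a specification (simulated data).
import numpy as np
import statsmodels.api as sm
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 1000
x1 = rng.normal(size=n)                      # relevant variable (nonlinear effect)
x2 = rng.normal(size=n)                      # irrelevant variable
y = x1 ** 2 + rng.normal(scale=0.5, size=n)  # true relationship is quadratic

# Wrong specification: a linear model in x1 and x2 typically misses x1
ols = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
print("OLS p-values (const, x1, x2):", np.round(ols.pvalues, 3))

# Specification-free variable search: random forest feature importance
rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(np.column_stack([x1, x2]), y)
print("RF feature importances (x1, x2):", np.round(rf.feature_importances_, 3))
```

Once the random forest has singled out x1, the researcher can go looking for the functional form and the economic mechanism that bind x1 to y.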
On the contrary, classical statistical methods are more likely to overfit, because they derive their estimation errors in-sample: The same observations used to train the model are also used to evaluate its accuracy. The reason for classical methods’ reliance on in-sample error estimates is that these methods predate the advent of computers. In contrast, ML methods apply a variety of numerical approaches to prevent overfitting: cross-validation, regularisation, ensembles, etc.
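The difference is easy to demonstrate. In the sketch below (simulated data, arbitrary parameters), an over-parameterised polynomial regression scores noticeably better on the observations it was trained on than it does under five-fold cross-validation, which exposes the overfitting.

```python
# Sketch: in-sample fit versus cross-validated fit (simulated data).
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(60, 1))
y = np.sin(3 * X[:, 0]) + rng.normal(scale=0.5, size=60)

# A degree-15 polynomial has far more flexibility than the data warrant
model = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
model.fit(X, y)
print("in-sample R^2:      ", round(model.score(X, y), 3))
print("cross-validated R^2:", round(cross_val_score(model, X, y, cv=5).mean(), 3))
```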
They may be too short for some deep neural networks, but there are plenty of ML algorithms that make more effective use of the data than classical statistical methods. For example, the random forest algorithm tends to perform better than logistic regression even on small datasets, among other reasons because it is more robust to outliers and missing data.
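As a rough illustration (simulated data; the exact numbers depend on the seed), the sketch below compares the cross-validated accuracy of the two classifiers on a small sample in which a few observations have been corrupted with extreme values.

```python
# Sketch: random forest versus logistic regression on a small, noisy sample.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
X, y = make_classification(n_samples=200, n_features=10, n_informative=4,
                           random_state=2)
# Corrupt a handful of observations with extreme values (outliers)
idx = rng.choice(len(X), size=10, replace=False)
X[idx] += rng.normal(scale=50.0, size=(10, X.shape[1]))

for name, clf in [("logistic regression", LogisticRegression(max_iter=1000)),
                  ("random forest", RandomForestClassifier(n_estimators=200,
                                                           random_state=2))]:
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: cross-validated accuracy = {acc:.3f}")
```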
Twenty years ago, one could extract alpha using Excel, as most factor investing strategies still attempt to do. You would rank stocks by P/E ratio in descending order, buy the bottom of the ranking, and sell the top. Today, those strategies are mostly dead, as a result of crowding and backtest overfitting. If a strategy is so simple that anyone can implement it, why should anyone assume that there is any alpha left in that pattern?
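For reference, the sketch below shows how little code such a strategy requires; the DataFrame and its columns ('date', 'pe_ratio', 'fwd_return') are hypothetical placeholders, not a dataset from the book.

```python
# Sketch of an "Excel-grade" value factor: long the cheapest P/E decile,
# short the most expensive one, rebalanced on every date in the sample.
import pandas as pd

def value_factor_returns(df: pd.DataFrame, n_quantiles: int = 10) -> pd.Series:
    """df has one row per stock per date, with hypothetical columns
    'date', 'pe_ratio' and 'fwd_return' (next-period return)."""
    def one_date(g: pd.DataFrame) -> float:
        q = pd.qcut(g["pe_ratio"], n_quantiles, labels=False, duplicates="drop")
        long_leg = g.loc[q == q.min(), "fwd_return"].mean()   # low P/E (cheap)
        short_leg = g.loc[q == q.max(), "fwd_return"].mean()  # high P/E (expensive)
        return long_leg - short_leg

    return df.groupby("date").apply(one_date)
```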
Whatever alpha is left in the markets is more likely to come from the analysis of complex datasets, which require sophisticated ML techniques. I call this microscopic alpha. The good news is that microscopic alpha is much more abundant than macroscopic (unsophisticated) alpha ever was. One reason is that strategies that mine microscopic alpha are very specific (sometimes even security-specific), which allows for a heterogeneous set of uncorrelated strategies. Another reason is that firms chasing macroscopic alpha are a significant source of microscopic alpha, because the simplicity of their declared strategies makes their actions somewhat predictable. Accordingly, even if the individual Sharpe ratios of microscopic-alpha strategies are low, their combined Sharpe ratio can be very high.
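The arithmetic behind that last claim is standard: for N uncorrelated strategies with equal volatility and equal individual Sharpe ratios, the combined Sharpe ratio grows roughly with the square root of N. A tiny illustrative calculation (the individual Sharpe ratio of 0.3 is an assumption for the example, not a figure from the book):

```python
# Back-of-the-envelope: combining N uncorrelated, equal-volatility strategies
# with individual Sharpe ratio sr yields a combined Sharpe of about sr * sqrt(N).
import math

sr_individual = 0.3   # modest, "microscopic" Sharpe ratio (assumed)
for n in (1, 25, 100):
    combined = sr_individual * math.sqrt(n)
    print(f"N = {n:3d} uncorrelated strategies -> combined Sharpe ~ {combined:.1f}")
```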
RenTec is an example of a firm that has been successful at mining microscopic alpha with the help of ML, and continues to do so consistently, while traditional asset managers have failed to deliver macroscopic alpha. In short, alternative data is an important ingredient for success, in combination with ML and supercomputing.
TPT helps bring asset managers into the Age of AI. We develop customised investment algorithms for institutional investors. My partners and I founded TPT by popular demand. In less than one year, we have been engaged by firms with a combined AUM that exceeds $1 trillion. This reception has surpassed our wildest expectations, so we are very pleased with the industry’s desire to modernise.
I’m more concerned about the state of affairs in academia. Economics students should be exposed to modern statistical methods, following the lead of students in other disciplines. The study of basic econometrics should be complemented with advanced courses in ML. The complexity of alternative datasets is beyond the grasp of econometrics, and I fear that students are only being trained to model (mostly irrelevant) structured data.