The rise of quantitative investing
With the dramatic increase in computational power available in recent years, quantitative methods are gaining momentum in the finance world. The results, however, are mixed. Renaissance Technologies' Medallion Fund, founded by the mathematician James Simons, has produced an average annual return of 35%, after fees, over a period of 25 years. Yet other quantitative funds have failed, sometimes spectacularly. Solid, mathematically driven investment methods are as profitable as they are scarce!
The public rarely learns about the highly successful funds, and rarely has the opportunity to invest in them. Unfortunately, the void between the public and genuinely scientific investment operations is filled by others who promote investment products with a patina of sophisticated mathematics but which, intentionally or not, are not grounded in rigorous scientific and statistical methodology.
Danger ahead: backtest overfitting
One of the most widely used (and most widely misunderstood) experimental techniques for justifying quantitative investment methods is backtesting: using historical data to simulate how a proposed strategy would have performed in the past. Although the idea of backtesting is simple, and its usage is essential in finance, its correct implementation requires advanced knowledge of mathematics and statistics. If one is not careful, backtests can easily lead to overfitting.
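To make the mechanics concrete, here is a minimal backtest sketch. The price series and the moving-average rule are entirely hypothetical illustrations of our own, not drawn from any paper discussed here: the idea is simply to replay a trading rule over historical prices and tally the hypothetical outcome.

```python
# A toy backtest: trade a simple moving-average rule over made-up prices.
prices = [100, 101, 99, 102, 104, 103, 105, 107, 106, 108]  # hypothetical history

window = 3
cash, shares = 1000.0, 0.0
for t in range(window, len(prices)):
    ma = sum(prices[t - window:t]) / window
    if prices[t] > ma and shares == 0:    # price above moving average: buy
        shares = cash / prices[t]
        cash = 0.0
    elif prices[t] < ma and shares > 0:   # price below moving average: sell
        cash = shares * prices[t]
        shares = 0.0

final_value = cash + shares * prices[-1]
print(f"Simulated final value: {final_value:.2f}")
```

Nothing about this simulation says anything about the rule's future performance; it only reports what would have happened on this one historical path, which is exactly why backtests invite the overfitting discussed next.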
Overfitting can easily occur if one tries hundreds, thousands or even millions of different combinations of parameters for a strategy (as is possible now with advanced computer technology), selecting only the best. The resulting scheme is then typically optimized only for a particular set of securities over a particular period in history, and thus often leads to disappointing performance when implemented. Indeed, backtest overfitting is arguably the most common reason that financial schemes which look great on paper fall flat in the real world.
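To see how easily selection among many trials manufactures an impressive track record, the following toy simulation (our own sketch, not code from any cited paper) generates N purely random, zero-edge "strategies" on the same one-year window and reports the best in-sample Sharpe ratio, which climbs steadily with N even though every strategy is worthless.

```python
import math
import random
import statistics

random.seed(0)
T = 252  # one year of daily returns

def annualized_sharpe(returns):
    """Sample Sharpe ratio of daily returns, annualized by sqrt(252)."""
    return statistics.mean(returns) / statistics.stdev(returns) * math.sqrt(252)

best_sharpe = {}
for n in (1, 10, 100, 1000):
    # Each "strategy" is pure noise: i.i.d. Gaussian daily returns with zero mean.
    best_sharpe[n] = max(
        annualized_sharpe([random.gauss(0.0, 0.01) for _ in range(T)])
        for _ in range(n)
    )
    print(f"best in-sample Sharpe over {n:4d} random strategies: {best_sharpe[n]:.2f}")
```

The selected "winner" owes its entire track record to the size of the search, not to any edge, and out of sample its expected Sharpe ratio is zero.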
A simple and oft-cited example is the following. Suppose an investment advisor sends letters to 10,240 (= 10 × 2^10) prospective clients, half predicting that a certain security will go up and half predicting it will go down. One month later, the advisor sends another batch of letters, this time only to the 5,120 clients who had received the correct prediction, again with half predicting a rise and half a fall. After ten repetitions, the remaining ten investors, amazed by the advisor's uncanny spot-on predictions for ten straight months, might entrust all of their money to her. The scheme is both absurd and misleading precisely because the ten remaining investors never learn of the many failed predictions.
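The arithmetic of the scheme can be checked in a few lines:

```python
# The advisor-letter scheme from the example above: 10,240 initial
# recipients, with the pool halved each month regardless of what the
# security actually does.
recipients = 10 * 2**10  # 10,240 prospective clients
for month in range(1, 11):
    # Half the letters predict "up", half "down"; whichever occurs,
    # exactly half of this month's recipients have seen only correct calls.
    recipients //= 2

print(recipients)  # -> 10 clients left with a "perfect" ten-month record
```

Every prediction was a coin flip, yet ten people have witnessed a flawless record, which is precisely the survivorship illusion that undisclosed trials create.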
Similarly, when the numbers of trials, models and backtests used to construct an investment strategy are not disclosed, investors, fund executives and even academics reading published articles have no way to assess how successful the strategy will be in practice.
Indeed, by all indications, this problem appears to be widespread in finance. Many speakers and financial columnists employ charts, graphs and predictions not based on rigorous statistical methods. Even academic financial journals often publish articles that include backtests of investment strategies without requiring that the authors report the number of models, trials and tests involved in the experiment, thus conferring a "seal of approval" on investment ideas of dubious efficacy.
Our papers on backtest overfitting
In our paper "Pseudo-mathematics and financial charlatanism," appearing in the May 2014 issue of the Notices of the American Mathematical Society, we analyze backtest overfitting in detail. We derive formulas and results showing that one can achieve almost any desired Sharpe ratio (a standard measure of risk-adjusted performance) by exploring sufficiently many parameters or variations of a strategy, or by backtesting over an insufficiently long historical dataset. We further show that overfitted strategies are not only likely to disappoint but, in the presence of memory (as real markets possess), are actually prone to lose money.
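The qualitative effect can be illustrated with a standard extreme-value approximation to the expected maximum of N independent, approximately standard-normal Sharpe estimates. The particular approximation below is our own illustrative assumption, not a formula copied from the paper; gamma denotes the Euler-Mascheroni constant.

```python
# Approximate expected best in-sample Sharpe ratio among N independent
# trials of a skill-free strategy, via an extreme-value approximation
# for the maximum of N i.i.d. standard normals (illustrative assumption).
import math
from statistics import NormalDist

GAMMA = 0.5772156649  # Euler-Mascheroni constant
z = NormalDist().inv_cdf  # standard normal quantile function

def expected_max_sharpe(n):
    """Approximate E[max of n i.i.d. N(0,1) Sharpe estimates], n >= 2."""
    return (1 - GAMMA) * z(1 - 1 / n) + GAMMA * z(1 - 1 / (n * math.e))

for n in (10, 100, 1000, 10**6):
    print(f"N = {n:>7}: expected best in-sample Sharpe ≈ {expected_max_sharpe(n):.2f}")
```

The expected "best" Sharpe ratio grows without bound as the number of trials grows, even though every trial has zero true skill, which is why disclosing the number of trials is essential to interpreting any backtested result.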
We study backtesting in even greater technical detail in a follow-on paper, "The probability of backtest overfitting."
Overfitting and reproducibility in modern science
The problem of backtest overfitting is just one instance of a growing awareness, across the larger world of scientific research, of the need for rigor and reproducibility.
For example, a March 2014 study published in Nature Neuroscience found that more than 50% of 314 articles appearing during an 18-month period “failed to take adequate measures to ensure that statistically significant study results were not, in fact, erroneous.”
Along this line, there is a growing consensus in the pharmaceutical industry that the common practice of publicly releasing only the results of highly successful clinical trials inherently introduces a bias into the field. As a result, a movement within the industry seeks to require firms to publicly disclose the results of all clinical trials — see the http://www.alltrials.net site. Johnson & Johnson has already announced that it will do this for its products.
Similarly, in the scientific and mathematical computing community, there is a growing push to ensure that computed results are numerically meaningful and reproducible by independent researchers. See this report for details.
Why the silence?
Historically, scientists have led the way in exposing those who use pseudoscience for commercial gain. In the 18th century, physicists and chemists exposed the nonsense of astrologers and alchemists.
Yet mathematicians in the 20th and 21st centuries have remained disappointingly silent with regard to those in the financial community who, knowingly or not,
- Fail to disclose the number of models tried in developing a scheme, thereby concealing potential overfitting.
- Make vague predictions that do not permit rigorous testing and falsification.
- Misuse charts and graphs: See our previous blog on the “scary chart.”
- Misuse probability theory, statistics and stochastic calculus.
- Misuse technical jargon ("stochastic oscillators," "Fibonacci ratios," "cycles," "Elliott wave," "Golden ratio," "parabolic SAR," "pivot point," "momentum," etc.).
Our silence is consent, making us accomplices in these abuses.
This blog and website were established with these concerns in mind. Nonetheless, our approach here is not one of confrontation, but one of research, to better understand and mitigate these difficulties; education, to assist other professionals in the field; and unbiased testing and analysis. So if you share our concerns, let us know and spread the word. Together we can make a difference.