Machine learning is getting better, but has much to learn

Last week, I discussed the tremendous risks of overfitting algorithms to noisy data and the potential to create seemingly profitable investment strategies due to data mining. Because machine learning and artificial intelligence (AI) applications tend to work with extremely large amounts of data, this risk seems particularly prevalent in those fields.

The promise of machine learning and AI is that they can work with unstructured data and discover nonlinear relationships in markets that are ignored by traditional regression-based statistical methods. I am convinced that markets are full of nonlinearities, and if we could develop reliable methods to identify and “predict” them, the investment world would make a giant leap forward.

But if your machine learning application is too simple, it will essentially act like a chart-technical analyst who tries to predict the future development of asset prices by identifying patterns in past prices that may or may not be there to begin with. And when this “pattern recognition” approach to investing is left unchecked, it can lead to some dangerous outcomes. Take the following anecdote from Jonathan Zittrain’s recent article for the New Yorker:

In 2011, a biologist named Michael Eisen found out, from one of his students, that the least-expensive copy of an otherwise unremarkable used book – “The Making of a Fly: The Genetics of Animal Design” – was available on Amazon for $1.7 million, plus $3.99 shipping. The second-cheapest copy cost $2.1 million. The respective sellers were well established, with thousands of positive reviews between them. When Eisen visited the book’s Amazon page several days in a row, he discovered that the prices were increasing continually, in a regular pattern. Seller A’s price was consistently 99.83 per cent that of Seller B; Seller B’s price was reset, every day, to 127.059 per cent of Seller A’s. Eisen surmised that Seller A had a copy of the book, and was seeking to undercut the next-cheapest price. Seller B, meanwhile, didn’t have a copy, and so priced the book higher; if someone purchased it, B could order it, on that customer’s behalf, from A.

Now imagine something like that happening in the stock market. Obviously, this is an extreme example, but market episodes like flash crashes may be caused, at least in part, by algorithms trading with each other without being checked by fundamental investors and other market participants.
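The arithmetic behind the book anecdote is worth spelling out, because it shows how quickly two simple rules can compound. Each daily cycle multiplies both prices by 0.9983 × 1.27059, or roughly 1.268. The sketch below uses the two multipliers from the quoted passage; the $35 starting price and the assumption that each seller reprices exactly once per day are mine, chosen only for illustration.

```python
# Sketch of the two-seller repricing loop from the anecdote.
# The multipliers come from the quoted passage; the starting price and the
# once-a-day update order are assumptions made for illustration.
UNDERCUT = 0.9983    # Seller A prices at 99.83% of Seller B
MARKUP = 1.27059     # Seller B resets to 127.059% of Seller A


def simulate(start_price_b, days):
    """Return a list of (day, price_a, price_b) tuples."""
    price_b = start_price_b
    history = []
    for day in range(1, days + 1):
        price_a = UNDERCUT * price_b   # A undercuts B
        price_b = MARKUP * price_a     # B re-marks against A's new price
        history.append((day, price_a, price_b))
    return history


def days_until(start_price_b, target):
    """Count daily cycles until Seller A's price first reaches target."""
    if start_price_b <= 0:
        raise ValueError("start_price_b must be positive")
    price_b = start_price_b
    day = 0
    while True:
        day += 1
        price_a = UNDERCUT * price_b
        if price_a >= target:
            return day
        price_b = MARKUP * price_a


daily_growth = UNDERCUT * MARKUP      # combined growth factor per cycle
history = simulate(35.0, 50)          # hypothetical $35 starting price
for day, price_a, price_b in history[::10]:
    print(f"day {day:2d}: A = ${price_a:,.2f}  B = ${price_b:,.2f}")
```

Under these assumptions, a book that starts at $35 passes the $1.7 million mark in fewer than 50 daily cycles, even though neither rule looks unreasonable on its own.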

In order to make useful predictions about stocks and markets, machine learning programmes cannot just be trained in pattern recognition but need to be trained with some fundamental input about the underlying drivers of the time series. This is one of the key recommendations of the excellent article by Keywan Rasekhschaffe and Robert Jones in the latest edition of the Financial Analysts Journal. 

They stress that machine learning can only hope to become better if users engage in “feature engineering”, i.e. use their knowledge about the fundamentals underlying a time series to provide a framework within which the machine learning algorithm searches for the best combination of factors to predict that time series. For example, when forecasting overall stock market prices, it is important to introduce general macroeconomic relationships (e.g. the influence of interest rates on stock prices) to the algorithm. When trying to predict individual stock returns, on the other hand, these macroeconomic relationships might be less useful and company fundamentals (e.g. corporate leverage ratios) might be used to train the algorithm instead. Otherwise, the machine learning algorithm can fall into the trap of mistaking correlation for causation. We have all heard the stories that the butter price in Thailand “predicts” the S&P 500 and so on. Because the data sets used by machine learning applications are so large and so opaque, feature engineering will become an absolute necessity to reduce the likelihood of overfitting the application to noisy data.

Even so, machine learning programmes have a hard time beating traditional statistical approaches like linear regressions. In 2015, Alaa Sheta and his colleagues put machine learning algorithms and linear regression analysis to the test by trying to predict the S&P 500 index. While machine learning algorithms performed better for one-day forecasts, their errors grew quickly, and for forecast horizons of several days or more they had a bigger prediction error than linear regressions.

In a more comprehensive exercise, Spyros Makridakis and his collaborators looked at eight machine learning algorithms, two neural network algorithms and eight classical statistical methods to forecast the S&P 500. The chart below is taken from their paper and shows the symmetric Mean Absolute Percentage Error (sMAPE) of all the different algorithms and statistical methods they used. Note that smaller values indicate better forecasts in this chart. All the machine learning methods were dominated by the traditional statistical methods when it came to forecasting the S&P 500. The lesson to be learned here is that, at least when it comes to univariate time series, it is probably best to start with a simple statistical method. Most likely, it will outperform machine learning methods anyway.

Forecast error of different algorithms trying to forecast the S&P 500

Source: Makridakis et al. (2018)
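For readers unfamiliar with the error measure in the chart, here is a minimal sketch of sMAPE. It uses one common form, the average of 2|y − f| / (|y| + |f|) expressed in percent. Other variants exist, so treat the exact formula as an assumption rather than a reproduction of the paper's code. The index levels and forecasts are made up.

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent.

    Uses the 2*|y - f| / (|y| + |f|) form; other variants exist.
    Pairs where both values are zero contribute zero error (a convention
    chosen here for illustration).
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if not actual:
        raise ValueError("need at least one observation")
    total = 0.0
    for y, f in zip(actual, forecast):
        denom = abs(y) + abs(f)
        if denom > 0:
            total += 2.0 * abs(y - f) / denom
    return 100.0 * total / len(actual)


# Hypothetical index levels and forecasts, for illustration only.
actual = [100.0, 102.0, 101.0, 105.0]
carry_forward = [100.0, 100.0, 102.0, 101.0]  # previous value carried forward
perfect = list(actual)

print(f"carry-forward sMAPE: {smape(actual, carry_forward):.2f}%")
print(f"perfect sMAPE:       {smape(actual, perfect):.2f}%")
```

Because the denominator uses both the actual and the forecast value, the measure is bounded between 0% and 200%, which makes it easier to compare methods across series with very different levels.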

However, machine learning methods may have an edge when it comes to forecasting large multivariate time series. In their FAJ paper, Rasekhschaffe and Jones used machine learning algorithms to predict the returns of thousands of US and international stocks. Again, they found that individual machine learning and neural network algorithms often performed no better than an ordinary linear regression. And if the number of variables in the regression approach was reduced with the help of principal component analysis and the like, standard statistical methods could perform almost as well as the most sophisticated neural networks. The difference in performance between statistical methods augmented with principal component analysis and machine learning approaches was small in practice.
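To make the idea of reducing the number of regression variables with principal component analysis concrete, here is a minimal sketch of principal component regression using NumPy. It is not the procedure from the FAJ paper. The data are synthetic (20 correlated predictors driven by 3 latent factors), and the function name `pcr_fit` and all parameter values are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 500 observations of 20 correlated predictors driven by
# 3 latent factors. Entirely made up for illustration.
n_obs, n_features, n_factors = 500, 20, 3
factors = rng.normal(size=(n_obs, n_factors))
loadings = rng.normal(size=(n_factors, n_features))
X = factors @ loadings + 0.1 * rng.normal(size=(n_obs, n_features))
y = factors @ np.array([0.5, -0.3, 0.2]) + 0.1 * rng.normal(size=n_obs)


def pcr_fit(X, y, n_components):
    """Fit a principal component regression; return a predict function."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    components = vt[:n_components].T          # (n_features, n_components)
    scores = Xc @ components                  # reduced set of predictors
    beta, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)

    def predict(X_new):
        return (X_new - x_mean) @ components @ beta + y_mean

    return predict


# Fit on the first 400 observations, evaluate on the remaining 100.
train, test = slice(0, 400), slice(400, None)
predict = pcr_fit(X[train], y[train], n_components=3)
rmse = float(np.sqrt(np.mean((predict(X[test]) - y[test]) ** 2)))
# Baseline: always predict the training-sample mean.
baseline = float(np.sqrt(np.mean((y[train].mean() - y[test]) ** 2)))
print(f"PCR out-of-sample RMSE: {rmse:.3f}  (mean-only baseline: {baseline:.3f})")
```

The regression here runs on 3 components instead of 20 raw predictors, which is the kind of dimension reduction that lets a plain linear model stay competitive on large, correlated data sets.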

However, when different machine learning algorithms were combined to enhance the forecast signal, the advantage over the statistical methods increased. And this is the second recommendation of the article: don’t just stick with one method to forecast markets. Combine many different methodologies to improve your forecasts. This is no different from what quant investors have always done, but it is worth repeating since so many investors seem to think that machine learning provides the ultimate black box that cannot be improved upon. Instead, machine learning and AI should be seen as just another approach to analysing data. As with every advancement in quantitative finance, they will help us move forward and improve our ability to understand and forecast markets. But these improvements will likely be gradual, rather than the revolution that is so often promised. And like every other advancement in quantitative finance, they will likely lead to disappointing results for investors who believe the hype and optimism of marketers and early adopters.
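The recommendation to combine methods rests on a simple statistical effect: if forecast errors are not perfectly correlated, averaging the forecasts cancels part of the noise. The sketch below is a toy illustration, not the authors' procedure. It assumes three hypothetical forecasters with independent, equally noisy errors, which is far more favourable than anything real models would deliver.

```python
import math
import random

random.seed(42)

# Synthetic setup: a "true" signal plus three forecasters whose errors are
# independent and equally noisy. Real models' errors are usually correlated,
# which shrinks the benefit shown here.
n = 5000
truth = [random.gauss(0.0, 1.0) for _ in range(n)]
forecasts = [
    [t + random.gauss(0.0, 1.0) for t in truth]
    for _ in range(3)
]


def rmse(pred, actual):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))


# Equal-weight combination: average the three forecasts point by point.
combined = [sum(values) / len(values) for values in zip(*forecasts)]

individual_rmse = [rmse(f, truth) for f in forecasts]
combined_rmse = rmse(combined, truth)

print("individual RMSEs:", [round(r, 3) for r in individual_rmse])
print("combined RMSE:   ", round(combined_rmse, 3))
```

With independent errors, the combined error shrinks by roughly the square root of the number of forecasters. The more the individual models share the same blind spots, the smaller that gain becomes, which is why combining genuinely different methodologies matters.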

Financial markets have a way of reinventing the wheel, every time with new bells and whistles. Machine learning and AI are just another shiny new wheel.