Novel Deep Learning Methods For Stock Market Prediction

Due to the enormous amount of available data and an ever-changing environment, machine learning is a natural fit for the stock market. Many classical ML methods have been applied to financial data in search of ways to give traders even a small edge. With the increased popularity of Deep Learning, many new methods are being developed for numerous applications. Often, with a little creativity, these methods can be used in financial applications.

In this article I’m going to give an overview of two papers I found interesting that take DL methods and use them in a financial context.

Deep Factor Model

The first paper is Deep Factor Model, authored by Kei Nakagawa et al. In addition to providing an example of a straightforward way to use Deep Learning to predict stock market returns, this paper also demonstrates a method for explaining the predictions of your models.

One of the primary benefits of deep learning is that it allows the modeling of complex and non-linear functions. Many classical machine learning methods are linear and can’t capture the more complex relationships between different features. As computer hardware has improved enough to allow very fast matrix operations, Deep Learning has become the best tool for many different tasks. So while deep learning can provide an enormous advantage in terms of what it can model, the downside is that it isn’t transparent or interpretable. Explaining the different weights that have been assigned to neuron inputs in a large neural network doesn’t provide useful information to a human. This is a problem if you are a bank or a hedge fund and you need to be able to justify the investments you have made to regulators or customers. Imagine if I asked you for $10,000 to invest in an equity and, when you asked why, all I told you was that I had a model that I couldn’t explain to you that predicted that it was the best choice. Hopefully, you would be skeptical.

The actual models used are pretty straightforward. They experiment with two different models, both composed of dense layers: one has hidden layers of sizes 80–50–10 and the other 80–80–50–50–10–10. They compare the results of these models to Linear Regression, Support Vector Regression (SVR) and Random Forests.

The data is made up of 16 different factors for equities in the TOPIX index on the Tokyo Stock Exchange. Each factor comes from one of the following categories: Risk, Quality, Momentum, Value and Size. Each sample is made up of the values of each factor at the start of the month and the returns over that month. To predict the return of an equity over the following month, the model is trained on the previous 60 samples (5 years of monthly data) and then a prediction is made for the current month. So to find the accuracy of the model (and the returns), a sliding window moves over the target dates: the model is trained, a prediction is made, the window moves, a new model is trained, and so on.
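The sliding window is easy to picture as a list of index splits. Here is a minimal sketch, assuming the samples are simply numbered by month; the function name `walk_forward_windows` and the 63-month example are made up for illustration and are not from the paper.

```python
def walk_forward_windows(n_months, train_len=60):
    """Yield (train_indices, test_index) pairs for a sliding window.

    Each model is trained on `train_len` consecutive months and then
    used to predict the month that immediately follows them.
    """
    for test_idx in range(train_len, n_months):
        train_idx = list(range(test_idx - train_len, test_idx))
        yield train_idx, test_idx


# Hypothetical example: 63 months of data gives 3 train/predict steps.
windows = list(walk_forward_windows(63))
for train_idx, test_idx in windows:
    print(f"train on months {train_idx[0]}-{train_idx[-1]}, predict month {test_idx}")
```

A fresh model would be fitted inside the loop for every window, which is why this kind of evaluation is slow but gives a realistic picture of out-of-sample performance.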

Two metrics are used to evaluate the models. The first is the prediction accuracy and the second is the profitability of the model over a 10 year period. The profitability is used because it evaluates the real-world application of the model. If you had a model that had high accuracy for most cases but missed many of the most profitable opportunities, it wouldn’t be very useful. A long-short trading strategy is employed where the equities are ranked by predicted return, the equities in the top quintile are bought and the equities in the bottom quintile are sold.
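To make the long-short idea concrete, here is a rough sketch of one month of such a strategy. Equal weighting inside each quintile and the dictionary inputs are my assumptions, and the tickers and returns are invented; the paper may construct its portfolio differently.

```python
def long_short_return(predicted, realized):
    """Return of an equal-weighted long-short quintile portfolio for one month.

    predicted: dict mapping ticker -> predicted return (used only for ranking)
    realized:  dict mapping ticker -> return actually realized that month
    """
    ranked = sorted(predicted, key=predicted.get, reverse=True)
    n = len(ranked) // 5  # number of equities in one quintile
    if n == 0:
        raise ValueError("need at least 5 equities to form quintiles")
    longs, shorts = ranked[:n], ranked[-n:]
    long_ret = sum(realized[t] for t in longs) / n
    short_ret = sum(realized[t] for t in shorts) / n
    # Profit from the long side plus profit from shorting the bottom side.
    return long_ret - short_ret


# Hypothetical predictions and realized returns for ten tickers.
predicted = {f"T{i}": p for i, p in enumerate(
    [0.05, 0.04, 0.03, 0.02, 0.01, 0.00, -0.01, -0.02, -0.03, -0.04])}
realized = {f"T{i}": r for i, r in enumerate(
    [0.03, 0.06, 0.01, 0.00, 0.02, -0.01, 0.01, -0.02, -0.01, -0.05])}
result = long_short_return(predicted, realized)
print(result)
```

Note that only the ranking of the predictions matters here, not their absolute values, which is one reason a model with mediocre accuracy can still be profitable.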

Model 1 is the 80–50–10 DL model and Model 2 is the 80–80–50–50–10–10 model.

You can see some of their results above. Both Deep Factor Models outperform the other models for the average returns, Sharpe Ratio, MAE and RMSE. In non-financial terms this means that the Deep Factor Models made more money and offered better returns for the risk. While they don’t compare it to any other trading strategies commonly used in finance, you can see the benefits compared to other ML methods.
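For readers who have not met these metrics before, here is a small sketch of how they can be computed. The zero risk-free rate and the annualization by the square root of 12 are my simplifying assumptions, and the monthly returns are invented; I am not claiming this is exactly how the paper computes its numbers.

```python
import math
import statistics


def mae(y_true, y_pred):
    """Mean absolute error between realized and predicted returns."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error; penalizes large misses more than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def annualized_sharpe(monthly_returns, periods_per_year=12):
    """Mean return divided by its standard deviation, scaled to a year.

    Assumes a risk-free rate of zero, which is a simplification.
    """
    mean = statistics.mean(monthly_returns)
    std = statistics.stdev(monthly_returns)  # sample standard deviation
    return mean / std * math.sqrt(periods_per_year)


# Hypothetical monthly strategy returns, for illustration only.
monthly = [0.02, -0.01, 0.03, 0.01, -0.02, 0.04]
sharpe = annualized_sharpe(monthly)
print(sharpe)
```

MAE and RMSE measure how close the predicted returns are to the realized ones, while the Sharpe Ratio measures how much return the strategy earned per unit of volatility.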

What I thought was really interesting was that they show why their models made the decisions they did using a method called Layerwise Relevance Propagation. The method is actually very simple but provides useful insight into which factors were important to their model.

The same image can be found in the paper.

Using the network above, Layerwise Relevance Propagation works as follows:

The relevance of any output neuron is its output. Working back through the network, relevance is calculated with the following formula:

Looking at the relevance of neuron R4, w46 is the learned weight connecting the output of R4 to R6 and z4 is the output of R4. The rest of the notation is the same but for the respective nodes. In simple terms, a neuron’s relevance is the share of the next-layer neuron’s input that it provides, scaled by the relevance of that next-layer neuron. In the table below you can see the relevance of all of the neurons in the graph above. By propagating backwards through the network, you can find the percentage contribution of each factor to each output neuron for each prediction.

Each relevance is just a share of the contribution to the output, so every layer has the same total relevance, which sums to the value of the output neuron you are analyzing.
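The backward pass described above can be sketched in a few lines. This is a minimal illustration under my own assumptions: a made-up network with 3 inputs, 2 hidden neurons and 1 output, no biases, ReLU on the hidden layer, and the basic rule where neuron i receives the fraction z_i * w_ij / sum_k(z_k * w_kj) of the relevance of neuron j. The weights are invented and this is not the network from the paper.

```python
def forward(inputs, weights):
    """Return the activations of every layer (biases omitted for simplicity)."""
    activations = [list(inputs)]
    for layer, w in enumerate(weights):
        prev = activations[-1]
        pre = [sum(prev[i] * w[i][j] for i in range(len(prev)))
               for j in range(len(w[0]))]
        last = layer == len(weights) - 1
        # ReLU on hidden layers, linear output layer.
        activations.append(pre if last else [max(0.0, p) for p in pre])
    return activations


def lrp(activations, weights):
    """Propagate relevance from the output layer back to the inputs."""
    relevance = list(activations[-1])  # relevance of an output neuron is its output
    per_layer = [relevance]
    for z, w in zip(reversed(activations[:-1]), reversed(weights)):
        lower = [0.0] * len(z)
        for j, r_j in enumerate(relevance):
            total = sum(z[i] * w[i][j] for i in range(len(z)))
            if total == 0.0:
                continue  # neuron j received no input, so nothing to share out
            for i in range(len(z)):
                lower[i] += z[i] * w[i][j] / total * r_j
        relevance = lower
        per_layer.insert(0, relevance)
    return per_layer


# Hypothetical weights: weights[layer][i][j] connects neuron i to neuron j.
weights = [
    [[0.5, 0.1], [0.2, 0.4], [0.1, 0.3]],  # 3 inputs -> 2 hidden
    [[1.0], [0.5]],                        # 2 hidden -> 1 output
]
activations = forward([1.0, 2.0, 3.0], weights)
relevances = lrp(activations, weights)
for name, layer in zip(["input", "hidden", "output"], relevances):
    print(name, layer, "total:", sum(layer))
```

Each layer’s relevances add up to the output value, which is the conservation property mentioned above.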

(It’s worth noting this is a different method from the one LIME uses to find which pixels led to an image classification. LIME changes the value of pixels in the image to see which have the greatest effect on the predicted class.)

For analysis, the factors were grouped into categories and the contribution of each category was found. Below you can see the contribution of the factors to the predictions and that some were much more important than others. This is averaged over all of the equities used, and you can immediately see that the most important factor categories are Quality, Value and Risk. In a real-world situation, breaking it down even further would allow you to tell a potential client exactly what indicators you were using to invest their money.

Time Series to Image Conversion

The second paper I wanted to discuss is titled Algorithmic Financial Trading with Deep Convolutional Neural Networks: Time Series to Image Conversion Approach and, as the name implies, the authors take financial time series data and transform it into an image.

Not very meaningful to the human eye.

For training data, the raw data is made up of 15 different daily financial features for 30 different DJIA equities and the daily opening equity price. Each day is then assigned a label of buy, sell or hold. A buy label is assigned if that day has the lowest price in an 11 day window centered on it, and a sell label if it has the highest price in that window. All other days are labeled hold. Then, for each day, the training image is built by taking a 15 day window that starts at the sample day and goes back 14 days, taking the resulting matrix of days and features, normalizing each feature to fall between 0 and 1 over the training period, and using the normalized feature values as pixel values. The result is a 15x15 greyscale image. For example, with a zero-indexed array, the fifth feature on the sixth day would be pixel[4,5]. You can see an example above.
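Here is a rough sketch of the labeling and image-building steps just described. The function names, the synthetic sine-wave data, and the choice to leave days without a full 11 day window as hold are all my assumptions for illustration; the paper’s own preprocessing may differ in the details.

```python
import math


def label_days(prices, window=11):
    """Label each day buy/sell/hold from a window of prices centered on it."""
    half = window // 2
    labels = ["hold"] * len(prices)  # days without a full window stay "hold"
    for t in range(half, len(prices) - half):
        win = prices[t - half: t + half + 1]
        if prices[t] == min(win):
            labels[t] = "buy"
        elif prices[t] == max(win):
            labels[t] = "sell"
    return labels


def min_max(features):
    """Per-feature minimum and maximum over the training period."""
    columns = list(zip(*features))
    return [min(c) for c in columns], [max(c) for c in columns]


def to_image(features, day, mins, maxs, size=15):
    """Build a size x size image: image[f][d] is feature f on window day d."""
    if day < size - 1:
        raise ValueError("not enough history to build an image for this day")
    window = features[day - size + 1: day + 1]  # sample day plus the 14 days before
    image = []
    for f in range(size):
        span = maxs[f] - mins[f]
        image.append([(window[d][f] - mins[f]) / span if span else 0.0
                      for d in range(size)])
    return image


# Synthetic data for illustration: features[d][f] is feature f on day d.
n_days, n_features = 40, 15
features = [[math.sin(0.2 * d + f) for f in range(n_features)] for d in range(n_days)]
prices = [100 + 5 * math.sin(0.4 * d) for d in range(n_days)]

mins, maxs = min_max(features)
labels = label_days(prices)
image = to_image(features, day=20, mins=mins, maxs=maxs)
print(len(image), len(image[0]), labels[20])
```

With this layout, `image[4][5]` is the fifth feature on the sixth day of the window, matching the indexing in the example above.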

They then take the images and their labels and use them to train the network.

Training is done on a sliding 6 year window which you can see above. They train on the first 5 years and use the last year as validation data, then move everything up a year and continue training the model. There isn’t an explanation for why they do this, but it may be because trading models don’t work as well the further you go into the future: other people discover the patterns your model is using and they become less profitable. Training with a sliding window lets you train your model on the early years and fine-tune it on more recent years, so you can take advantage of more training data.
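The 5 years of training plus 1 year of validation, moved forward a year at a time, can be written down as a simple generator. The years below are placeholders I picked for the example, not the date range used in the paper.

```python
def yearly_windows(first_year, last_year, train_years=5, val_years=1):
    """Yield (training_years, validation_years) for each sliding window."""
    span = train_years + val_years
    start = first_year
    while start + span - 1 <= last_year:
        train = list(range(start, start + train_years))
        val = list(range(start + train_years, start + span))
        yield train, val
        start += 1  # move everything up a year


# Hypothetical date range, for illustration only.
splits = list(yearly_windows(2000, 2007))
for train, val in splits:
    print(f"train {train[0]}-{train[-1]}, validate {val[0]}")
```

Because the same model keeps training as the window moves, the later windows act as a fine-tuning step on more recent data.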

The model used is a simple CNN that classifies the images. The architecture can be seen above. They use a 3x3 filter to catch more local detail at the cost of missing wider correlations. This means that the order of the features is important: once two features are more than 2 columns apart, a single filter can no longer see both of them, so no correlation between them can be found.
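To see why a 3x3 filter is so local, here is a bare-bones sketch of a single filter sliding over a 15x15 image. Leaving out padding, and the all-ones image and kernel, are my assumptions for the sake of a tiny example; the actual network may pad its inputs and of course learns its kernel values.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` without padding and return the feature map.

    Like most CNN libraries, this computes a cross-correlation
    (the kernel is not flipped).
    """
    k = len(kernel)
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - k + 1):
        out_row = []
        for c in range(cols - k + 1):
            # Each output value only sees a k x k patch of neighbouring pixels.
            out_row.append(sum(image[r + i][c + j] * kernel[i][j]
                               for i in range(k) for j in range(k)))
        out.append(out_row)
    return out


image = [[1.0] * 15 for _ in range(15)]
kernel = [[1.0] * 3 for _ in range(3)]
feature_map = conv2d_valid(image, kernel)
print(len(feature_map), len(feature_map[0]))  # prints: 13 13
```

Every output value is computed from just 3 adjacent rows and 3 adjacent columns, so pixels that sit further apart than that never meet inside one filter.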

Similar to the first paper, two methods of evaluation are used for testing. First they evaluate whether the CNN is making the correct prediction for individual days. You can see above that the accuracy is 58% and, in the confusion matrix, the vast majority of the errors are holds being labeled as buy or sell. While higher accuracy would be better, many of the mislabeled days occur at points very close to a buy or sell day, so it isn’t as bad as it looks. If the days before and after a peak are labeled sell, it will still work for a trading strategy; it will just be slightly less profitable. In addition, the day after a peak being labeled a sell won’t ever be acted on, as the peak should also have the sell label. In general, buys following buys and sells following sells won’t affect your overall returns, so inaccuracy is less harmful than it would normally be. Some of these mislabeled days could be the result of an unbalanced training set, as 95% of days are holds. They don’t say in the paper whether they do anything to balance the training data, so that may be one way you could potentially improve upon these results.
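One common way to deal with an unbalanced training set is to weight each class inversely to how often it appears, so that rare buy and sell days count for more in the loss. The sketch below is just that idea on a made-up label list with 95% holds; it is my suggestion, not something the paper describes.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {label: n / (k * count) for label, count in counts.items()}


# Hypothetical label distribution with 95% holds.
labels = ["hold"] * 950 + ["buy"] * 25 + ["sell"] * 25
weights = inverse_frequency_weights(labels)
print(weights)
```

Most deep learning libraries accept per-class weights like these when computing the training loss, though whether it actually helps here would need to be tested.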

To evaluate how the model performs when used to invest, they run the model on a test year using buy-sell pairs. You can see some of the results in the table above, along with the difference in 10 year returns for Travelers Insurance when buying and selling based on the model predictions compared to a buy and hold strategy. For the 28 stocks evaluated, the Time Series to Image method was the best performer on 19, sometimes by quite a bit, and for three of the other stocks it was very close. In general the model performed well, particularly for how simple it is.
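A buy-sell pair strategy can be simulated with a very small loop. This sketch assumes you invest all of your cash on the first buy signal, sell everything on the next sell signal, ignore repeated signals, and pay no transaction costs; those are my simplifications, and the prices and labels are invented.

```python
def backtest_pairs(prices, labels, cash=10_000.0):
    """Final portfolio value from trading buy-sell pairs on predicted labels."""
    shares = 0.0
    for price, label in zip(prices, labels):
        if label == "buy" and shares == 0.0:
            shares = cash / price  # go all in; later buys are ignored
            cash = 0.0
        elif label == "sell" and shares > 0.0:
            cash = shares * price  # close the position; later sells are ignored
            shares = 0.0
    # Value any position still open at the last available price.
    return cash + shares * prices[-1]


# Hypothetical prices and model predictions.
prices = [10.0, 8.0, 9.0, 12.0, 11.0, 9.0, 10.0]
labels = ["hold", "buy", "buy", "sell", "sell", "buy", "hold"]
final_value = backtest_pairs(prices, labels)
buy_and_hold = 10_000.0 / prices[0] * prices[-1]
print(final_value, buy_and_hold)
```

The second buy and the second sell change nothing, which is why buys following buys and sells following sells are harmless for this kind of strategy.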


I hope you enjoyed reading about the two papers and the methods they used. Both are great examples of taking techniques developed for other uses and applying them to finance. It can often be difficult to find real-world, cutting-edge techniques for financial predictions (because the people who develop them use them to trade rather than share them), but finding clever ways to use existing tools is an exciting way to apply Deep Learning to the stock market.