Verifying LSTM Stock Price Prediction Effectiveness Using TQuant Lab (Part 2)

Photo by Alina Grubnyak on Unsplash

Summary of Key Points

  • Article Difficulty: ★★★★★
  • Combining fundamental, technical, and sentiment data for LSTM stock price prediction, then backtesting the resulting strategy.
  • Reading Recommendation: This article uses an RNN architecture for time series forecasting, so a basic understanding of time series or deep learning is recommended. For more on LSTM model construction, refer to [Data Science] LSTM.

Introduction

In the first article, Verifying LSTM Stock Price Prediction Effectiveness Using TQuant Lab (Part 1), we compared predicted prices with actual prices for an initial evaluation of two trained models (for stocks 2618 and 8215). The results were promising. The detailed analysis is in Part 1 and is not repeated here.

In this second article, we move beyond theoretical discussion and apply the models to out-of-sample data. Based on the prediction results, we determine entry points and test whether actual performance matches our expectations.

Editing Environment and Module Requirements

This article uses macOS as the operating system and VS Code as the editor.

Applying the LSTM Model to Out-of-Sample Data

The in-sample data for our two LSTM stock price prediction models spans from July 1, 2012, to July 1, 2021. Therefore, to avoid overlap with the training period, the backtesting period will be from January 1, 2021, to June 30, 2024.

Loading External Packages

import os
import time
import tejapi
import talib as ta
from talib import abstract
import numpy as np
import pandas as pd
...

Loading Internal Packages

The ML_stock() class is custom-built for pre-processing data. It handles key tasks such as loading the API_KEY, price-volume data, fundamental data, and technical indicators. Finally, it sets the start and end dates for the backtesting sample. Additionally, we define the model variable to load the pre-trained model for use.

Note: A friendly reminder to input your own API_KEY in the config.ini file before use to ensure everything works smoothly!
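
As a minimal sketch of the key-loading step, the snippet below reads an API key from an INI file with Python's standard configparser module. The section name [API], the helper name load_api_key, and the placeholder key are assumptions made for illustration. The actual layout of config.ini and the way ML_stock() consumes the key are defined in the source repository, not here.

```python
import configparser
import os
import tempfile


def load_api_key(path, section="API", option="API_KEY"):
    """Read an API key from an INI file; section/option names are assumed."""
    parser = configparser.ConfigParser()
    # parser.read() silently skips missing files, so check what was loaded.
    if not parser.read(path, encoding="utf-8"):
        raise FileNotFoundError(f"config file not found: {path}")
    key = parser.get(section, option).strip()
    if not key:
        raise ValueError("API_KEY is empty; fill it in before running")
    return key


# Demonstration with a throwaway file and a placeholder key.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "config.ini")
    with open(demo_path, "w", encoding="utf-8") as fh:
        fh.write("[API]\nAPI_KEY = your-key-here\n")
    demo_key = load_api_key(demo_path)
```

Failing early on a missing or empty key tends to give a clearer error than a rejected API request later in the notebook.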

Next, we retain only the necessary features to complete the LSTM stock price prediction data preprocessing.

[Figure: Data preprocessing]

Creating Time Series Data

As we did during training, we will convert the data into a time series format. For detailed steps, refer to the previous article. Once the conversion is complete, we will use the predict function to apply the model for forecasting.
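
The conversion described above can be sketched as a sliding window over the chronologically ordered feature rows. This is only an illustration under stated assumptions: the helper name make_windows, the lookback of 3 days, and the toy values are hypothetical, and the window length used for the actual models is defined in the previous article. Plain lists are used to keep the sketch dependency-free.

```python
def make_windows(rows, lookback):
    """Split a chronological list of feature rows into overlapping windows.

    Returns (X, last_index): X[k] holds rows k .. k+lookback-1 and
    last_index[k] is the position of the final row in that window.
    """
    if lookback < 1 or lookback > len(rows):
        raise ValueError("lookback must be between 1 and len(rows)")
    X, last_index = [], []
    for start in range(len(rows) - lookback + 1):
        end = start + lookback
        X.append([list(r) for r in rows[start:end]])
        last_index.append(end - 1)
    return X, last_index


# Toy data: 6 days, 2 features per day (values are made up).
rows = [[10.0, 1.0], [11.0, 1.2], [12.0, 0.9],
        [11.5, 1.1], [12.5, 1.3], [13.0, 1.0]]
X, last_index = make_windows(rows, lookback=3)
# X is laid out as (samples, time steps, features) = (4, 3, 2).
```

In practice the nested list would be converted with np.asarray(X) before being passed to the model, and last_index helps align each prediction with the date of the last observed row.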

prediction = model.predict(X)

[Figure: LSTM Stock Price Prediction Results for 2618]
[Figure: LSTM Stock Price Prediction Results for 8215]

We can observe that the LSTM model, trained on data from the past ten years up to 2022, shows a prediction performance that is largely consistent with actual stock prices after 2022. Aside from the model’s difficulty in fully capturing the magnitude of price fluctuations—something already noted during validation—the overall results are satisfactory.

Importing Prediction Results into the Pipeline

The CustomDataset allows for the integration of database content into the Pipeline, facilitating future backtesting. In this example, we use it to import the predicted values recorded in the Pred column into the Pipeline. Below is an excerpt of the code:

[Figure: Importing Prediction Results into the Pipeline (2618)]

Creating the Pipeline Function

Since the LSTM Stock Price Prediction model only predicts the next day’s closing price, determining the exact entry points, timing, and conditions requires more detailed configuration. To achieve this, we need to design custom factors.

Creating Custom Factors

The CustomFactor allows users to design their own custom factors as needed. In this case, we use it to handle:

  • Daily return relative to the previous trading day (Return)
  • Average True Range (ATR)

[Figure: Pipeline Output (2618)]
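
To make the two factors listed above concrete, here is a small standalone sketch of the underlying arithmetic, written outside the Pipeline. It is not the article's CustomFactor code. The function names, the 3-day period, and the toy prices are assumptions, and the ATR here is a plain moving average of the true range. TA-Lib's ATR applies Wilder's smoothing, so its values will differ somewhat.

```python
def daily_returns(closes):
    """Simple return of each day relative to the previous close."""
    return [None] + [closes[i] / closes[i - 1] - 1.0
                     for i in range(1, len(closes))]


def true_ranges(highs, lows, closes):
    """True range per day; the first day has no previous close."""
    out = [highs[0] - lows[0]]
    for i in range(1, len(closes)):
        prev_close = closes[i - 1]
        out.append(max(highs[i] - lows[i],
                       abs(highs[i] - prev_close),
                       abs(lows[i] - prev_close)))
    return out


def simple_atr(highs, lows, closes, period):
    """ATR as a plain moving average of true range (not Wilder smoothing)."""
    tr = true_ranges(highs, lows, closes)
    atr = [None] * len(tr)
    for i in range(period - 1, len(tr)):
        window = tr[i - period + 1:i + 1]
        atr[i] = sum(window) / period
    return atr


# Made-up prices for four trading days.
highs = [10.5, 11.0, 11.2, 10.9]
lows = [9.8, 10.2, 10.6, 10.1]
closes = [10.0, 10.8, 10.7, 10.3]
rets = daily_returns(closes)
atr = simple_atr(highs, lows, closes, period=3)
```

The first entries are None because a return needs a previous close and the ATR needs a full window of true ranges.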

Creating the initialize Function

The initialize() function runs once before the backtest starts and sets up the trading environment. In this case, we configure:

  • Slippage costs
  • The transaction fee model for the Taiwan stock market
  • The Weighted Return Index (IR0001) as the market benchmark
  • The strategy factors designed in the Pipeline, integrated into the trading process
  • The context.stop_loss variable, which records the stop-loss point during backtesting
  • The context.last_price variable, which tracks the last buy/sell price for monitoring stop-losses

Creating the handle_data Function

The handle_data() function is crucial for building the LSTM stock price prediction strategy, as it is called daily during the backtest. Its primary tasks include setting the trading strategy, placing orders, and recording trade information.

For detailed trading rules of this strategy, please refer to backtest_2618.ipynb / backtest_8215.ipynb.

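        # Enter long when the predicted and the actual return are both positive
        # and the cash position is not negative.
        if return_pred > 0 and cash_position >= 0 and returns > 0:
            order_percent(i, 0.48)  # order 48% of portfolio value in asset i
            buy = True
            record(**{f'buy_{sym}': buy})  # log the buy signal for later charts
            # Initial stop-loss: 1.25 ATR below the current price.
            context.stop_loss = price - atr * 1.25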

In this example, return_pred holds the predicted return and returns holds the actual return relative to the previous trading day. If both are positive, we enter a position. Exits are based on a combination of take-profit and trailing stop-loss mechanisms. This strategy backtests long positions only. Those interested can explore an LSTM stock price prediction strategy that incorporates both long and short positions.
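
The entry rule and stop-loss level described above can be isolated into small pure functions, which makes them easy to test outside the backtest. This is a sketch under assumptions rather than the notebook's logic. The 1.25 ATR multiplier comes from the excerpt, while the trailing update in trail_stop, the take-profit threshold, and all function names are hypothetical. The actual exit rules are in backtest_2618.ipynb / backtest_8215.ipynb.

```python
ATR_MULTIPLIER = 1.25  # multiplier used in the excerpt above


def should_enter(return_pred, returns, cash_position):
    """Entry rule: predicted and actual returns both positive, cash not negative."""
    return return_pred > 0 and returns > 0 and cash_position >= 0


def initial_stop(price, atr):
    """Stop-loss level set at entry, as in the excerpt."""
    return price - atr * ATR_MULTIPLIER


def trail_stop(current_stop, price, atr):
    """Hypothetical trailing update: raise the stop with price, never lower it."""
    return max(current_stop, price - atr * ATR_MULTIPLIER)


def should_exit(price, stop, entry_price, take_profit_pct):
    """Exit when price hits the stop or an assumed take-profit threshold."""
    return price <= stop or price >= entry_price * (1 + take_profit_pct)


# Walk through a made-up trade: entry at 100 with ATR = 2.
stop = initial_stop(100.0, 2.0)          # 97.5
stop = trail_stop(stop, 104.0, 2.0)      # price rises, stop moves up to 101.5
stop = trail_stop(stop, 102.0, 2.0)      # price falls, stop stays at 101.5
exit_now = should_exit(101.0, stop, entry_price=100.0, take_profit_pct=0.10)
```

Because the stop only ever moves up, a pullback after a rally closes the position above the original stop level, which is the usual motivation for a trailing stop.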

Creating the analyze Function

The analyze() function helps us generate custom charts. In this example, we use analyze() to observe the differences between predicted and actual stock prices, visualize the entry and exit points of the strategy, and monitor capital utilization.

For detailed trading rules of this strategy, please refer to backtest_2618.ipynb / backtest_8215.ipynb.

Running the LSTM Stock Price Prediction Strategy

Use the run_algorithm() function to execute the LSTM stock price prediction strategy configured above. Set the trading period from start_dt (2021-01-01) to end_dt (2024-06-28), and import the custom_loader. The dataset used is tquant, with an initial capital of 1 million TWD. The output, results, will include the daily performance metrics and detailed transaction records.

[Figure: Analyze Charts (2618)]
[Figure: Analyze Charts (8215)]

The two charts above demonstrate that the strategy achieved strong performance. Specifically, the backtest for stock 2618 consistently outperformed the market throughout the entire period, while stock 8215 initially led the market but was slightly overtaken towards the end.
Note: In the second small chart, the blue line represents the predicted stock price, the red line shows the actual stock price, red triangles indicate buy signals, and green triangles indicate sell signals.

Performance Evaluation Using Pyfolio

[Figure: LSTM Stock Price Prediction Backtest Performance vs. Market (2618)]
[Figure: LSTM Stock Price Prediction Backtest Performance vs. Market (8215)]

Conclusion

In the backtests for the two stocks above, we can see that the LSTM stock price prediction strategy yielded promising results. Both backtests achieved a Sharpe ratio above 0.5, and the Alpha values were also at a favorable level. Notably, the backtest for stock 2618 was particularly impressive, with an annualized return of 29.6% over 40 months and a cumulative return of nearly 138%.
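
As a quick sanity check on how the annualized and cumulative figures for 2618 relate, the sketch below compounds the annualized return over the 40-month period. It assumes simple geometric compounding with 12 months per year, and the function names are illustrative. Performance tools may count periods slightly differently.

```python
def cumulative_from_annualized(annual_return, months):
    """Compound an annualized return over a period given in months."""
    return (1.0 + annual_return) ** (months / 12.0) - 1.0


def annualized_from_cumulative(cumulative_return, months):
    """Inverse of cumulative_from_annualized."""
    return (1.0 + cumulative_return) ** (12.0 / months) - 1.0


cum_2618 = cumulative_from_annualized(0.296, 40)
ann_check = annualized_from_cumulative(cum_2618, 40)
print(f"cumulative: {cum_2618:.1%}, annualized back: {ann_check:.1%}")
```

Compounding 29.6% over 40 months gives a cumulative return of about 137%, in line with the reported figure of nearly 138%. The small gap is consistent with rounding in the annualized number.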

Looking more closely at the characteristics of the strategy, since it is based on predicting the next day’s closing price as a target for entry, the return curve closely mirrors the actual stock price trend. In other words, when the stock price rises, the strategy rises, and when the price falls, the strategy follows suit. Therefore, controlling stop-loss and take-profit mechanisms is crucial and demands careful attention.

Additionally, during strategy development, we noticed that the rule of “entering when the predicted return (return_pred) and the actual return (returns) are both positive” does not apply to all models. For some stocks, it may be more profitable to enter when both are negative, or when the two diverge in sign. Otherwise, performance fluctuations could be significant.

We attribute this issue to the inherent lag in time series models. This lag can result in buying at a peak (since the peak may have occurred the previous day, and the model reflects it only the next day), leading to higher costs, or selling at a low (for the same reason), which can worsen performance. As a result, this strategy requires further adjustment of parameters to find the optimal solution for LSTM Stock Price Prediction.
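
One simple way to check for the lag described above is to compare the predictions with same-day prices and with the previous day's prices. If the predictions sit closer to yesterday's close than to today's, the model is largely echoing the last observed value. The sketch below uses made-up numbers, and the function names are illustrative. It is a diagnostic idea, not part of the article's notebooks.

```python
def mean_abs_error(a, b):
    """Mean absolute difference between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def lag_check(pred, actual):
    """Compare predictions with same-day prices and with previous-day prices.

    pred[t] is the forecast for day t; actual[t] is the realized close.
    Returns (error vs. actual[t], error vs. actual[t-1]).
    """
    same_day = mean_abs_error(pred[1:], actual[1:])
    prev_day = mean_abs_error(pred[1:], actual[:-1])
    return same_day, prev_day


# Made-up series where each forecast is roughly yesterday's close.
actual = [100.0, 103.0, 101.0, 106.0, 104.0, 109.0]
pred = [100.0, 100.5, 102.5, 101.5, 105.5, 104.5]
same_day, prev_day = lag_check(pred, actual)
lagging = prev_day < same_day  # True here: forecasts track yesterday's price
```

Running the same comparison on the Pred column against actual closes would indicate how much of the apparent fit in the prediction charts comes from lag.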

Investors are welcome to explore this approach, and we will continue to introduce methods for constructing various indicators using the TEJ database and backtesting their performance. For readers interested in various trading backtests, we encourage you to consider purchasing related plans from TQuant Lab to leverage high-quality databases and develop personalized trading strategies.

Important Reminder: This analysis is for reference only and does not constitute any product or investment advice.

Source code

Click Here to GitHub

Extended Reading

Verifying LSTM Stock Price Prediction Effectiveness Using TQuant Lab (Part 1)

LSTM Trading Signal Detection

LSTM
