Feb 08 2022

Table of Contents

- Difficulty: ★★★☆☆
- Stock Price Forecasting by Time Series Model
- Reminder: In this article, we apply a time series model to trend forecasting and do not cover the preprocessing steps. If you are not familiar with the fundamentals of time series, please read 【Data Analysis(10)】ARIMA-GARCH Model(Part 1) first.

First, we walk through model construction so that you can see how the Python packages are applied; to keep the article concise, we skip the hypothesis tests. Next, we compute the forecasted returns and prices. Finally, we plot the predictions against the actual trend to assess the result of ARMA-GARCH.

Note: this article uses “ARMA”, not the “ARIMA” used in the previous one. The difference is that ARIMA adds a differencing step, which lets it handle non-stationary data. We used ARIMA previously to give a fuller picture of time series modeling; here, ARMA shows an alternative time series model.
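To make the differencing remark concrete, here is a minimal sketch on synthetic data (not TEJ data). It builds a random-walk price level and shows that first differencing, the "I" step of ARIMA with d = 1, recovers the underlying shocks. The series modeled in this article is the daily return, which is already a change-type series; that is presumably why ARMA can be applied to it directly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic white-noise daily changes (an assumption, not real market data).
shocks = rng.normal(0.0, 1.0, size=500)

# A random walk: the level wanders, so the series is non-stationary.
prices = 100.0 + np.cumsum(shocks)

# First difference (d = 1 in ARIMA) turns the level back into the shocks.
diffed = np.diff(prices)

print(np.allclose(diffed, shocks[1:]))
```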

MacOS & Jupyter Notebook

`import numpy as np`

`import pandas as pd`

`import matplotlib.pyplot as plt`

`%matplotlib inline`

`import seaborn as sns`

`sns.set()`

`import tejapi`

`tejapi.ApiConfig.api_key = 'Your Key'`

`tejapi.ApiConfig.ignoretz = True`

Security Transaction Data Table: listed securities with unadjusted prices and index. The table code is ‘TWN/APRCD’.

**Step 1. Data Selection, 0050.TW**

`data = tejapi.get('TWN/APRCD', # security transaction data: closing price`

`coid= '0050', # Taiwan 50 ETF`

`mdate={'gte': '2003-01-01', 'lte':'2021-12-31'},`

`opts={'columns': ['mdate', 'close_d', 'roi']},`

`chinese_column_name=True,`

`paginate=True)`

`data['年月日'] = pd.to_datetime(data['年月日'])`

`data = data.set_index('年月日')`

`data = data.rename(columns = {'收盤價(元)':'收盤價', '報酬率％':'日報酬率(%)'})`

**Step 2. Data Split**

`train_date = data.index.get_level_values('年月日') <= '2020-12-31'`

`train_data = data[train_date].drop(columns = ['收盤價'])`

`test_data = data[~train_date].copy()`

`# Keep the closing price in test_data so it can be compared with the model's predictions`
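The split logic can be sketched on a small synthetic frame. The column names `close` and `ret_pct` are hypothetical stand-ins for the TEJ columns, and the dates are calendar days chosen only for illustration. Taking a `.copy()` of the test slice is a defensive choice so that columns can be added to it later without pandas warning about writing to a view.

```python
import pandas as pd

# Synthetic stand-in for the TEJ table; column names are hypothetical.
idx = pd.date_range("2020-12-28", "2021-01-06", freq="D")
df = pd.DataFrame({"close": range(len(idx)), "ret_pct": 0.1}, index=idx)

cutoff = pd.Timestamp("2020-12-31")

# Training set: everything up to the cutoff, returns only.
train = df.loc[df.index <= cutoff].drop(columns=["close"])

# Test set: everything after the cutoff, price kept for later comparison.
test = df.loc[df.index > cutoff].copy()

print(len(train), len(test))
```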

**Step 3. Selection of ARMA’s parameters**

Here, we select the parameters with statsmodels, whereas the previous article used pmdarima.

`import statsmodels.api as sm`

`# AIC and BIC criteria`

`sm.tsa.stattools.arma_order_select_ic(train_data, ic=["aic", "bic"])`

The BIC criterion selects (p,q) = (0,0), because BIC penalizes each additional parameter more heavily than AIC does and therefore favors smaller models. As in the previous article, we follow AIC, which gives (p,q) = (2,2), to construct the ARMA model.
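The disagreement between the two criteria follows from their definitions, AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L. The sketch below uses the standard formulas with hypothetical log-likelihood values (they are not taken from the fitted models) to show how the same pair of candidates can be ranked differently. The sample size is the training length reported later in the article.

```python
import math

def aic(log_likelihood: float, k: int) -> float:
    """Akaike information criterion: penalty of 2 per parameter."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood: float, k: int, n: int) -> float:
    """Bayesian information criterion: penalty of ln(n) per parameter."""
    return k * math.log(n) - 2 * log_likelihood

n = 4333  # training-sample length reported in the article

# Hypothetical candidates: a small model and a larger one that fits a bit better.
small = {"loglik": -7000.0, "k": 2}
large = {"loglik": -6990.0, "k": 6}

aic_prefers_large = aic(large["loglik"], large["k"]) < aic(small["loglik"], small["k"])
bic_prefers_large = bic(large["loglik"], large["k"], n) < bic(small["loglik"], small["k"], n)

print(aic_prefers_large, bic_prefers_large)
```

With n = 4333 the per-parameter penalty is ln(n), roughly 8.4 under BIC versus 2 under AIC, so a modest gain in likelihood can justify extra parameters under AIC but not under BIC.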

**Step 4. ARMA Model**

`from statsmodels.tsa.arima_model import ARMA`

`model = ARMA(train_data, order = (2, 2))`

`arma = model.fit()`

`print(arma.summary())`

**Step 5. GARCH Model**

`# Get the residuals of the ARMA model`

`arma_resid = list(arma.resid)`

`from arch import arch_model`

`mdl_garch = arch_model(arma_resid, vol = 'GARCH', p = 1, q = 1)`

`garch = mdl_garch.fit()`

`print(garch.summary())`

We use ARMA to forecast the mean return and GARCH to set the width of the prediction interval.
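For reference, the combined model can be written in the standard textbook form below. The symbols are our own notation and do not appear in the package output: $r_t$ is the daily return, $\varepsilon_t$ the ARMA residual, and $\sigma_t^2$ its conditional variance.

```latex
\begin{aligned}
r_t &= c + \phi_1 r_{t-1} + \phi_2 r_{t-2}
       + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2}
       && \text{ARMA(2,2) mean equation} \\
\varepsilon_t &= \sigma_t z_t, \qquad z_t \sim \text{i.i.d.}(0, 1) \\
\sigma_t^2 &= \omega + \alpha\, \varepsilon_{t-1}^2 + \beta\, \sigma_{t-1}^2
       && \text{GARCH(1,1) variance equation}
\end{aligned}
```

The interval used in the next steps is the ARMA mean forecast plus or minus one forecasted $\sigma_t$.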

**Step 1. Average Return Forecasting**

`# len(train_data) = 4333, len(data) = 4577`

`forecast_mu = arma.predict(start = 4333, end = 4576)`

`# The end index of predict is inclusive, so we use 4577 - 1 = 4576.`

According to the above chart, the forecasted average return gradually approaches 0, and the fluctuations occur mainly in the initial periods.
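This flattening is a general property of multi-step forecasts from a stationary ARMA model: as the horizon grows, the forecast converges to the unconditional mean, which for daily returns is close to 0. A minimal sketch with an AR(1) model, used here only because its h-step forecast has a simple closed form; the parameter values are hypothetical.

```python
import numpy as np

mu, phi = 0.05, 0.6   # hypothetical unconditional mean (%) and AR coefficient
last_obs = 1.5        # hypothetical last observed return (%)

h = np.arange(1, 31)  # forecast horizons 1..30

# Closed-form h-step-ahead AR(1) forecast: the gap to the mean shrinks by phi each step.
forecast = mu + phi ** h * (last_obs - mu)

gap = np.abs(forecast - mu)
print(forecast[:5].round(4), gap[-1])
```

Only the first few horizons carry information from the most recent observations, which matches the early fluctuations seen in the chart.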

**Step 2. Volatility Forecasting**

`garch_forecast = []`

`for i in range(len(test_data)):`

`    train = arma_resid[:-(len(test_data)-i)]`

`    model = arch_model(train, vol = 'GARCH', p = 1, q = 1)`

`    garch_fit = model.fit(disp = 'off')`

`    prediction = garch_fit.forecast(horizon=1)`

`    garch_forecast.append(np.sqrt(prediction.variance.values[-1, 0]))`

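We use a rolling forecast to predict one period at a time: the GARCH model is refit inside the loop on a window that grows by one observation per iteration, and each one-step-ahead volatility is stored in a list. Note that `arma_resid` contains only training-period residuals, so the window runs over the last len(test_data) observations of the training sample. We then add the forecasted values to the “test_data” table and compute the upper and lower limits of the interval.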
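To show what each one-step-ahead forecast inside the loop represents, the sketch below applies the GARCH(1,1) variance recursion by hand. The parameters are hypothetical, not the fitted ones, and initialising the recursion at the sample variance is our assumption; the `arch` package estimates the parameters and handles initialisation itself.

```python
import numpy as np

def garch11_next_variance(resid, omega, alpha, beta):
    """One-step-ahead conditional variance from the GARCH(1,1) recursion."""
    resid = np.asarray(resid, dtype=float)
    sigma2 = resid.var()  # starting variance (an assumption for this sketch)
    for e in resid:
        # sigma2 for the next period depends on today's squared shock and today's variance.
        sigma2 = omega + alpha * e ** 2 + beta * sigma2
    return sigma2

# Hypothetical parameters and residuals, for illustration only.
omega, alpha, beta = 0.02, 0.08, 0.90
resid = np.array([0.5, -1.2, 0.3, 2.0, -0.4])

vol_forecast = np.sqrt(garch11_next_variance(resid, omega, alpha, beta))
print(vol_forecast)
```

A large recent residual raises the next-period variance through the `alpha` term, while `beta` controls how slowly that effect fades.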

`test_data['ARMA預測報酬(%)'] = list(forecast_mu)`

`test_data['GARCH預測波動度'] = (garch_forecast)`

`test_data['預測區間上限'] = test_data['ARMA預測報酬(%)'] + test_data['GARCH預測波動度']`

`test_data['預測區間下限'] = test_data['ARMA預測報酬(%)'] - test_data['GARCH預測波動度']`

The above chart shows that most actual returns fall within the interval. However, individual returns on highly volatile days are not predicted accurately.
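The visual impression can be turned into a number by computing the share of actual returns that fall inside the interval. The values below are hypothetical; with the real data, the three arrays would be the actual-return, forecasted-return and forecasted-volatility columns of `test_data`.

```python
import numpy as np

# Hypothetical daily values in percent, for illustration only.
actual = np.array([0.4, -1.1, 2.9, 0.2, -0.3])
mean_fc = np.zeros(5)                        # forecasted mean return
vol_fc = np.array([1.0, 1.2, 1.1, 1.3, 1.0])  # forecasted volatility

upper = mean_fc + vol_fc
lower = mean_fc - vol_fc

# True where the realised return lies inside the one-sigma band.
inside = (actual >= lower) & (actual <= upper)
coverage = inside.mean()

print(coverage)
```

If the residuals were normally distributed, a band of plus or minus one standard deviation would contain about 68% of observations, and a 95% band would use a multiplier of about 1.96 instead of 1.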

**Step 3. Price Forecasting**

`# The price column was dropped from train_data, so back out the closing price of the last trading day before the test period`

`first_price = test_data['收盤價'].iloc[0] / (1+test_data['日報酬率(%)'].iloc[0]*0.01)`

`# Compute the first-period forecast`

`test_data['ARMA預測價格'] = first_price * (1 + test_data['ARMA預測報酬(%)']*0.01)`

`test_data['預測價格區間上限'] = first_price * (1 + test_data['預測區間上限']*0.01)`

`test_data['預測價格區間下限'] = first_price * (1 + test_data['預測區間下限']*0.01)`

`# Compute the remaining forecast periods by compounding from the previous period`

`for i in range(1, len(test_data)):`

`    test_data.loc[test_data.index[i], 'ARMA預測價格'] = test_data['ARMA預測價格'].iloc[i-1] * (1 + test_data['ARMA預測報酬(%)'].iloc[i]*0.01)`

`    test_data.loc[test_data.index[i], '預測價格區間上限'] = test_data['預測價格區間上限'].iloc[i-1] * (1 + test_data['預測區間上限'].iloc[i]*0.01)`

`    test_data.loc[test_data.index[i], '預測價格區間下限'] = test_data['預測價格區間下限'].iloc[i-1] * (1 + test_data['預測區間下限'].iloc[i]*0.01)`

`# Compute the midpoint of the interval`

`test_data['預測平均價格'] = (test_data['預測價格區間上限'] + test_data['預測價格區間下限']) / 2`
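The period-by-period compounding can also be written without an explicit loop. The sketch below, on hypothetical numbers, checks that a cumulative product of (1 + return) gives the same price path as the loop; the same pattern would apply to the upper and lower interval columns.

```python
import numpy as np

first_price = 100.0                         # hypothetical starting price
ret_pct = np.array([1.0, -0.5, 0.2, 0.8])   # hypothetical forecasted returns (%)

# Loop version: compound one period at a time.
loop_price = np.empty_like(ret_pct)
prev = first_price
for i, r in enumerate(ret_pct):
    prev = prev * (1 + r * 0.01)
    loop_price[i] = prev

# Vectorised version: cumulative product of the gross returns.
vec_price = first_price * np.cumprod(1 + ret_pct * 0.01)

print(np.allclose(loop_price, vec_price))
```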

The above chart shows that the interval widens dramatically as time elapses, so the full-period chart is not a reliable basis for assessing the prediction. We therefore show only the first two months before drawing the conclusion.

`new_date = test_data.index.get_level_values('年月日') <= '2021-03-01'`

`new_test = test_data[new_date]`

Over the first two months, the difference between the prediction and the actual data is easy to observe. First, the interval average tracks the actual price trend more closely than the ARMA prediction does. As for the interval itself, the actual price does not fall inside it until late January.
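One possible contributor to both observations is the compounding itself. The sketch below assumes, purely for illustration, a zero mean forecast and a constant daily volatility of 1%. Under those assumptions the ARMA price path stays flat at the starting price, while the compounded upper bound grows faster than the lower bound shrinks. The interval therefore widens every period, and its midpoint drifts upward even though no upward return is forecasted.

```python
import numpy as np

p0, s = 100.0, 0.01    # hypothetical start price and constant daily volatility (1%)
n = np.arange(1, 41)   # 40 trading days, roughly two months

upper = p0 * (1 + s) ** n   # compounded upper bound
lower = p0 * (1 - s) ** n   # compounded lower bound

midpoint = (upper + lower) / 2
width = upper - lower

print(midpoint[-1], width[-1])
```

In a rising market this upward drift may bring the interval average closer to the actual price than the flat ARMA path, without implying that the average is a better forecast.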

Based on this result, the performance of the ARMA-GARCH model on 0050 is not reliable enough, despite the well-fitted model summary. The increasingly wide interval is to be expected, since forecasts should be more conservative the farther ahead they reach. Nevertheless, the first-two-month chart shows the actual price outside the predicted interval, which indicates that the forecast is unreliable even in the initial period. A possible cause is that the model does not account for seasonality or exogenous variables. If you are interested in related topics, keep reading our articles. You are also welcome to purchase the plans offered in TEJ E Shop and use the complete database to build your own predictions.

- 【Data Analysis(10)】ARIMA-GARCH Model(Part 1)
- 【Quant(14)】Which industries did three primary institutional investors invest in Taiwan?
