PERFORMANCE OF KUALA LUMPUR COMPOSITE INDEX STOCK MARKET

The Financial Times Stock Exchange (FTSE) Bursa Malaysia Kuala Lumpur Composite Index (KLCI) comprises the 30 largest companies listed on the Bursa Malaysia Main Market. All FTSE Bursa Malaysia data are calculated and disseminated every 15 seconds in real time. It is believed that the volatility of the stock market has a negative impact on real economic recovery. This paper aims to describe the underlying structure and the phenomenon of the sequence of observations in the series. The information obtained is then used to assess how well time series models fit the data series from January 2002 until December 2018. Autoregressive Integrated Moving Average (ARIMA) and Generalized Autoregressive Conditional Heteroscedasticity (GARCH) models have been shown to capture the trend of volatility. The objectives of this paper are to determine the overall trend of the KLCI stock return and to investigate the performance of GARCH and ARIMA models based on the KLCI stock return. Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE) are used to measure forecast accuracy. The results show that the best ARIMA model is ARIMA(1,1,1), while the best GARCH model is GARCH(1,1).


Introduction
The Financial Times Stock Exchange (FTSE) Bursa Malaysia Kuala Lumpur Composite Index (KLCI) consists of the 30 largest companies listed on the Bursa Malaysia Main Market by full market capitalization that meet the eligibility criteria of the FTSE Bursa Malaysia ground rules (FTSE Russell, 2018). The index is calculated with a value-weighted formula and adjusted by a free float factor, using Bursa Malaysia's real-time and closing prices. Policy makers are interested in the impact of volatility on real economic activity, while market participants are concerned about the impact of stock market volatility on asset pricing. At the same time, the KLCI time series retains the historical movements of the Malaysian stock market.

The Data Set
This study uses secondary data obtained from Yahoo Finance (https://finance.yahoo.com). The data consist of stock returns in Malaysia from January 2002 to December 2018. Monthly data are used because time series models generally produce better outcomes with quarterly or monthly observations than with annual ones, and because monthly data are easily obtained.

Time Series Models
Data collected at regular intervals are known as time series data, that is, a sequence of successive quantitative observations (Kenton, 2018). Time series forecasting predicts potential outcomes from the established characteristics of current and past data (Brownlee, 2017). This is linked to the research conducted in China by Chen & Pan (2016) and in Korea by Han et al. (2015). High stock market volatility can affect equity investments. As a result, shareholders face considerable investment risk, however large their holdings may be at the time.

GARCH Model
The GARCH model is a heteroscedasticity model, that is, a model in which the variance is not constant. It is an extension of the Autoregressive Conditional Heteroscedasticity (ARCH) model, introduced by Bollerslev in 1986, which adds a "smoothing-averaging" term to create a more parsimonious specification. Both models account for volatility clustering and are used to forecast time-varying, high-frequency financial data (Islam, 2013). The model is written as GARCH (q, p), where q is the number of moving average (MA) terms and p is the number of autoregressive (AR) terms. The general GARCH (q, p) model is given by the equations below.
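$$y_t = \mu + \varepsilon_t \qquad (1)$$
where $\mu$ is the smooth underlying process and $\varepsilon_t$ represents the apparent irregularities in the process, mainly known as noise. Meanwhile,
$$h_t = \alpha + \sum_{i=1}^{q} \beta_i \varepsilon_{t-i}^{2} + \sum_{i=1}^{p} \gamma_i h_{t-i} \qquad (2)$$
where $h_t$ represents the conditional variance, $h_{t-i}$ the past conditional variance, $\varepsilon_{t-i}^{2}$ the past squared residual return, and $\alpha > 0$, $\beta_i \ge 0$, $\gamma_i \ge 0$.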
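The conditional variance recursion can be illustrated with a minimal sketch for the GARCH(1,1) case. It assumes the common form h_t = alpha + beta1 * e_{t-1}^2 + gamma1 * h_{t-1}; the function name, the parameter values and the residuals are made up for illustration and are not estimates from the KLCI data.

```python
def garch11_variance(residuals, alpha, beta1, gamma1, h0=None):
    """Return the conditional variances h_t of a GARCH(1,1) process.

    Assumed recursion: h_t = alpha + beta1 * e_{t-1}^2 + gamma1 * h_{t-1}.
    """
    if alpha <= 0 or beta1 < 0 or gamma1 < 0:
        raise ValueError("need alpha > 0, beta1 >= 0 and gamma1 >= 0")
    if h0 is None:
        # Assumption: start the recursion at the mean squared residual.
        h0 = sum(e * e for e in residuals) / len(residuals)
    h = [h0]
    for t in range(1, len(residuals)):
        h.append(alpha + beta1 * residuals[t - 1] ** 2 + gamma1 * h[t - 1])
    return h


# Hypothetical residuals and parameters, chosen only to show the mechanics.
residuals = [0.5, -1.2, 3.0, -0.4, 0.1]
h = garch11_variance(residuals, alpha=0.1, beta1=0.2, gamma1=0.7, h0=1.0)
print(h)
```

In this toy run the large shock of 3.0 raises the next conditional variance, and the following values decay only gradually, which is the volatility clustering behaviour the model is meant to capture.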

Box-Jenkins ARIMA Model
The Box-Jenkins models combine Autoregressive Moving Average (ARMA) and ARIMA models (Sivakumar & Mohandas, 2009). The ARIMA model is a statistical model for analysing and forecasting time series data. The standard notation is ARIMA (p, d, q), where the parameters are replaced by integers to indicate the particular model used. Here p is the number of lag observations included in the model (AR), d is the number of times the raw observations are differenced (I) and q is the size of the moving average window (MA). With the specified number and type of terms, a linear regression model is constructed, and the data are differenced to make them stationary, which removes the trend and seasonal structures that negatively affect the regression model. To fulfil the assumptions of ARIMA models, the variance of the data should be constant. A general ARIMA (p, d, q) model such as ARIMA (1,1,1), with p = 1, d = 1 and q = 1, is written as:
$$w_t = \mu + \phi_1 w_{t-1} + \varepsilon_t - \theta_1 \varepsilon_{t-1} \qquad (3)$$
where $w_t = y_t - y_{t-1}$ represents the first differences of the series and is assumed stationary, and $\phi_1$ and $\theta_1$ are the AR and MA coefficients.
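As a rough illustration of how an ARIMA(1,1,1) model produces forecasts, the sketch below differences a series once and then applies the recursion w_t = mu + phi1 * w_{t-1} + e_t - theta1 * e_{t-1} one step ahead. The sign convention for the MA term, the zero starting shock, the function names and all numbers are assumptions made for the example; the coefficients are not the ones estimated in EViews.

```python
def difference(series):
    """First differences: w_t = y_t - y_{t-1}."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


def arima111_one_step(series, mu, phi1, theta1):
    """One-step-ahead level forecasts from an ARIMA(1,1,1) recursion.

    forecasts[k] is the forecast of series[k + 2], made with data up to
    series[k + 1]. The pre-sample shock is assumed to be zero.
    """
    w = difference(series)
    forecasts = []
    prev_error = 0.0
    for t in range(1, len(w)):
        w_hat = mu + phi1 * w[t - 1] - theta1 * prev_error
        forecasts.append(series[t] + w_hat)  # undo the differencing
        prev_error = w[t] - w_hat            # shock used in the next step
    return forecasts


# Hypothetical index levels and coefficients.
levels = [100.0, 102.0, 101.0, 105.0]
forecasts = arima111_one_step(levels, mu=0.0, phi1=0.5, theta1=0.5)
print(difference(levels))  # [2.0, -1.0, 4.0]
print(forecasts)           # [103.0, 101.5]
```

The forecast of the differenced series is added back to the last observed level, which is how the "I" part of the model is reversed when forecasting the index itself.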

Unit Root Test Procedure
The non-stationarity of a series can be determined either by simple observation of the plotted data or, more accurately, by a statistical test procedure. In this paper, the Augmented Dickey-Fuller (ADF) test is used. The ADF test is performed using the model:
$$\Delta y_t = c + \lambda t + \delta y_{t-1} + \sum_{j=1}^{J} \psi_j \Delta y_{t-j} + u_t \qquad (4)$$
where $J$ is the number of lags of $\Delta y_{t-j}$, with $\Delta y_t = y_t - y_{t-1}$, and $t$ is the time variable. Normally, $J$ is chosen small in order to save degrees of freedom but large enough to ensure that $u_t$ is white noise, where $u_t$ is independently and identically distributed with mean zero and variance $\sigma^2$. The ADF test examines the null hypothesis that the series has a unit root (is not stationary, $\delta = 0$) against the alternative that it is stationary ($\delta < 0$). If the data are not stationary, they must be transformed by taking the first difference, which is the change in the data from one period to the next. Plotting the first-differenced data reveals whether the series has become stationary. If the data are still not stationary after the first differencing, a second differencing is required.
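The idea behind the test can be sketched with the simplest Dickey-Fuller regression, which keeps a constant but drops the trend and the lagged difference terms. This is a simplification of the ADF model used in the paper, not the EViews procedure; the function name and the sample series are hypothetical, and the critical value of about -2.86 at the 5% level is the large-sample figure for the constant-only case, so it is only a rough guide for short series.

```python
import math


def dickey_fuller_tstat(y):
    """OLS of (y_t - y_{t-1}) on a constant and y_{t-1}.

    Returns (delta, t_stat), where delta is the coefficient on y_{t-1}.
    A strongly negative t_stat is evidence against a unit root.
    """
    x = y[:-1]
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    n = len(dy)
    x_bar = sum(x) / n
    dy_bar = sum(dy) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (di - dy_bar) for xi, di in zip(x, dy))
    delta = sxy / sxx
    const = dy_bar - delta * x_bar
    ssr = sum((di - const - delta * xi) ** 2 for xi, di in zip(x, dy))
    se_delta = math.sqrt(ssr / (n - 2) / sxx)
    return delta, delta / se_delta


# Hypothetical, strongly mean-reverting series (clearly stationary).
series = [1.0, -0.9, 1.1, -1.0, 0.8, -1.2, 1.0, -0.9, 1.1, -1.0]
delta, t_stat = dickey_fuller_tstat(series)
print(delta, t_stat)
print("reject unit root at about 5%:", t_stat < -2.86)
```

Note that the t statistic must be compared with Dickey-Fuller critical values rather than the usual Student t table, because its distribution under the null hypothesis is not standard.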

Model Performance
The Akaike Information Criterion (AIC) is used to select the best fitting model: the smaller the AIC value, the better the model. AIC is defined as:
$$\mathrm{AIC} = -2 \ln L(\hat{\Theta} \mid \text{data}) + 2k \qquad (5)$$
where $\Theta$ is the set (vector) of model parameters, $L(\hat{\Theta} \mid \text{data})$ is the likelihood of the candidate model given the data when evaluated at the maximum likelihood estimate of $\Theta$, and $k$ is the number of estimated parameters in the candidate model.
The Bayesian Information Criterion (BIC), proposed by Schwarz, is also known as the Schwarz Information Criterion (SC). The difference between the BIC and the AIC is that the former imposes a greater penalty than the latter for the number of parameters. BIC is defined as:
$$\mathrm{BIC} = -2 \ln L(\hat{\Theta} \mid \text{data}) + k \ln N \qquad (6)$$
where $\Theta$, $L(\hat{\Theta} \mid \text{data})$ and $k$ are as defined for the AIC and $N$ is the number of observations.
One of the primary aims of forecasting is to estimate potential outcomes using the correct technique. The precision of a predictive model is evaluated by calculating forecast accuracy criteria, and RMSE and MAPE are often used for this purpose. They are defined as:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} (y_t - \hat{y}_t)^2} \qquad (7)$$
and
$$\mathrm{MAPE} = \frac{100}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \qquad (8)$$
where $y_t$ is the actual value, $\hat{y}_t$ is the forecast value and $n$ is the number of periods. The better model is the one with the smaller values of RMSE and MAPE.
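The four criteria are simple to compute once a model's log-likelihood and forecasts are available. The sketch below assumes the usual definitions AIC = -2 ln L + 2k, BIC = -2 ln L + k ln N, and RMSE and MAPE (in percent) over the evaluation periods; the function names and the numbers in the example are hypothetical. Statistical packages sometimes report these criteria divided by the number of observations, so values may not match software output directly.

```python
import math


def aic(log_likelihood, k):
    """Akaike Information Criterion: smaller is better."""
    return -2.0 * log_likelihood + 2.0 * k


def bic(log_likelihood, k, n_obs):
    """Bayesian (Schwarz) criterion: penalises parameters by ln(n_obs)."""
    return -2.0 * log_likelihood + k * math.log(n_obs)


def rmse(actual, forecast):
    """Root mean square error of the forecasts."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    n = len(actual)
    return 100.0 / n * sum(abs((a - f) / a) for a, f in zip(actual, forecast))


# Hypothetical log-likelihood, parameter count and forecasts.
print(aic(-100.0, 3))        # 206.0
print(bic(-100.0, 3, 144))   # larger than the AIC because ln(144) > 2
actual = [100.0, 200.0]
forecast = [110.0, 190.0]
print(rmse(actual, forecast))  # 10.0
print(mape(actual, forecast))  # approximately 7.5
```

For any sample with more than seven observations ln N exceeds 2, so the BIC penalty per parameter is heavier than the AIC penalty, which is why the BIC tends to favour smaller models.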

Results
Figure 1 shows the line graph of the KLCI data series. Trend components are found to exist in the data series. However, no irregular, seasonal or cyclical components are apparent in the series. In this graph, the data set is seen as volatile, fluctuating between about 600 and 1500 points from 1/1/2002 until 1/1/2012, where the trend component can clearly be seen. From 1/1/2012 until 9/1/2018, the trend still exists but the data fluctuate within a narrower range of about 1400 to 1800 points.

Data Investigation
The data consist of 204 observations from January 2002 until December 2018. The estimation part consists of 144 observations and the evaluation part of the remaining 60 observations. Table 1 and Figure 2 show that the composite index has a unit root. The data are not stationary since the probability value of the Augmented Dickey-Fuller test in Table 1 is 0.6076, which is greater than the significance level, alpha = 0.05. This indicates that non-seasonal differencing is needed to achieve stationarity. Meanwhile, the correlogram in Figure 2 shows large ACF values that decay slowly. Moreover, the PACF has one significant spike at lag 1. The probability values at all lags are significant, being less than 0.05.

Performing Non-Seasonal Differencing
Non-seasonal differencing is used to achieve stationarity by removing systematic patterns or trend components from the data. Differencing is not limited to the first order; it can be applied repeatedly until the data are stationary. Table 2 shows that the data are already stationary after first-order differencing: the probability of the Augmented Dickey-Fuller test statistic is 0.000, which is less than alpha = 0.05. In the correlogram in Figure 3, the ACF values are small, while the PACF has one significant spike at lag 10. The models involved are then identified from the sample PACF, which suggests p, and the sample ACF, which suggests q. The ARIMA model is denoted by ARIMA (p, d, q). The order of differencing, d, is equal to 1 since the first difference of the series is taken. The corresponding value of p is 1 and the value of q is 1. EViews software is used to obtain the parameter values for each model.
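The correlogram quantities used for identification can be computed directly. The following sketch, which is an illustration and not the EViews routine, computes the sample ACF and obtains the sample PACF from it with the Durbin-Levinson recursion; the approximate 95% band of plus or minus 1.96 divided by the square root of n is the usual rule of thumb for judging whether a spike is significant. The function names and the toy series are assumptions.

```python
import math


def sample_acf(x, max_lag):
    """Sample autocorrelations r_0..r_max_lag (r_0 is always 1)."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


def sample_pacf(x, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = sample_acf(x, max_lag)
    pacf = [1.0]
    phi_prev = []  # coefficients of the AR(k-1) fit
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [
            phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
        ] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


def significance_band(n):
    """Approximate 95% band for a white-noise correlogram."""
    return 1.96 / math.sqrt(n)


# Toy series, used only to check the arithmetic.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
acf = sample_acf(x, 2)
pacf = sample_pacf(x, 2)
print(acf)
print(pacf)
print(significance_band(204))
```

A PACF that cuts off after lag p points to an AR(p) component, while an ACF that cuts off after lag q points to an MA(q) component, which is the reasoning behind reading p and q off the correlogram.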

Data Investigation
Since the KLCI data are not stationary, first differencing of the data is needed. For the GARCH model, the first-differenced series is plotted as a line graph and a histogram. A GARCH model is appropriate only when the data are volatile. Figure 4 shows the histogram for Kuala Lumpur Composite Index data at first differencing level. The line graph shows that the KLCI data were volatile in 2008. This is supported by the histogram and descriptive analysis in Figure 5.

Histogram for First Differencing Level of KLCI
The kurtosis of the series is 5.825304, which is greater than 3 and indicates a distribution with heavier tails than the normal distribution; the distribution is also skewed to the left. From the output, the maximum value is 12.70322 and the minimum value is -16.51417. The mean of the series is 0.426069 with a standard deviation of 3.634473. The next step is to identify the model involved from the sample PACF, which suggests p, and the sample ACF, which suggests q. The GARCH model is denoted by GARCH (q, p). The corresponding values of p and q are 1 and 2.
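The descriptive statistics quoted here can be reproduced from the central moments of the series. The sketch below assumes the moment-based definitions (skewness as the third standardized moment and kurtosis as the fourth, both dividing by n, so a normal distribution has kurtosis 3); the function name and the sample data are hypothetical, and software packages may apply small-sample corrections that change the figures slightly.

```python
def describe(x):
    """Mean, standard deviation, skewness and kurtosis from central moments."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return {
        "mean": mean,
        "std": m2 ** 0.5,            # population standard deviation
        "skewness": m3 / m2 ** 1.5,  # negative means a longer left tail
        "kurtosis": m4 / m2 ** 2,    # 3 for a normal distribution
        "min": min(x),
        "max": max(x),
    }


# Symmetric toy data: skewness is zero and kurtosis is below 3.
stats = describe([-2.0, -1.0, 0.0, 1.0, 2.0])
print(stats)

# One large negative value produces left skew and heavier tails.
skewed = describe([-10.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0])
print(skewed["skewness"], skewed["kurtosis"])
```

A kurtosis above 3 together with a negative skewness, as reported for the differenced KLCI series, is typical of financial returns with occasional large drops and supports the use of a volatility model.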

Comparison between ARIMA Models
The evaluation part is used to compare the performance of all the ARIMA models. The performance measures used are AIC, SC and the number of significant variables. To compare the ARIMA models more effectively, error measures such as RMSE and MAPE are also analysed. Table 3 shows the values of all the statistical measures. The AIC, SC, RMSE and MAPE show almost the same figures for each model, so these measures alone do not separate the models clearly. The number of significant variables is therefore considered, and ARIMA (1,1,1) is chosen as the best model since 1 out of its 2 variables is significant.

Comparison between GARCH Models
Further analysis was done to compare the GARCH models, as shown in Table 4. The AIC, SC, number of significant variables and error measures were used to achieve the objectives. The lowest AIC and SC values are obtained for GARCH (1,1). The RMSE for GARCH (1,1) is 40.90108 and the MAPE is 1.647338. Hence, GARCH (1,1) was chosen as the best model after considering all the statistical values and conditions.