A COMPARATIVE STUDY OF UNIVARIATE TIME SERIES MODELLING FOR NATURAL RUBBER PRODUCTION IN MALAYSIA

Malaysia is one of the top countries that produces natural rubber and was ranked sixth place globally. The earnings from natural rubber products are making billions of ringgit for the country. However, over the past years the natural rubber production in Malaysia has been inconsistent and the deficiencies in the production can affect Malaysia’s economy. Therefore, it is important for relevant agencies and departments to understand the patterns and trends of natural rubber production in Malaysia besides having the ability to forecast. Hence, the integrated autoregressive moving average (ARIMA), seasonal autoregressive moving average (SARIMA) and the seasonal Holt-Winter’s model were being considered for the purpose of modelling and forecasting this study. The forecast accuracy criteria used to evaluate the performance of the models are the root mean square error (RMSE) and mean absolute percentage error (MAPE). The results showed that the seasonal Holt-Winter’s model appeared to be the best model as it yielded the lowest RMSE and MAPE values. The seasonal HoltWinter’s model, however, is not a good choice of model as it was unable to forecast six months ahead values. On the other hand, the SARIMA model had a better forecast ability when forecasting the values for the same duration. Therefore, the SARIMA model is taken to be the model in forecasting the natural rubber production in Malaysia for that period. This study has shown that the best fit model that fulfil all the forecast accuracy criteria may not have the best forecast ability.


Introduction
Natural rubber is a hydrocarbon polymerized milky colloidal suspension liquid substance known as latex (Hays, 2015) where the natural and synthetic rubber are two common types of rubber (Arumugam and Anithakumari, 2013).Since Malaysia have suitable climate and soils, rubber tree can be planted almost anywhere in Malaysia (Krishnan, 2011).Malaysia was one of the top countries that produced natural rubber and ranked the sixth place globally in the natural rubber production behind Thailand, Indonesia, Vietnam, India and China (Economic History Malaya, 2016).However, Malaysia still imports natural rubber from other countries.According to the Rubber Journal Asia (2017), one of the important contributors to Malaysia's economy is natural rubber.The earnings from natural rubber products are still making billions of ringgit and also manage to provide employment to society from urban to rural areas.The Malaysian Government was targeting to expand more in the manufacturing sector of natural rubber to ensure Malaysia's standing at the international level for rubber industry (Matade, 2016).Hence, the Malaysian Government has made the rubber plantation as one of the priority in its economic development.Rubber industry could provide RM 52.9 billion to Malaysia's Gross National Income (GNI) by 2020 based on current projections.This shows that natural rubber is essential to the growth of Malaysia's economy and industries.Therefore, in order to make predictions, we use time series models as the purpose of time series analysis is to understand the patterns that are evolving over time and exert these patterns to predict future behaviour (Farrelly, 2017).
The integrated autoregressive moving average (ARIMA) models which have become popular in recent times and had been used in various fields like gold analysis (Guha and Bandyopadhyay, 2016), modelling of rice production (Biswas &Bhattacharyya, 2013 andHamjah, 2014) and natural rubber price analysis (Khin, Chong, Mohammed and Shamsudin, 2007).The ARIMA models had also been applied in modelling the natural rubber production in India and Thailand by Chawla and Jha (2009) and Kosanan and Kantanantha (2014) respectively.The ARIMA models with a seasonal component which is the integrated seasonal autoregressive moving average (SARIMA) models are useful for data series that have seasonal effects (Hipel and McLeod, 1994).Wang, Feng and Liu (2013) as well as Yamoah, Saeed and Karim (2016) applied SARIMA models in the study of precipitation in China and Ghana respectively while Chang and Liao (2010) forecasted tourist arrivals in Taiwan using SARIMA model.The application of the SARIMA model to forecast natural rubber production was done by Arumugan and Anithakumari (2013) in India.Besides the SARIMA models, another model that can handle seasonality effect is the seasonal Holt-Winter's model by Holt and Winter (Hyndman and Anthanasopoulos, 2012).This model is one of the best techniques for time series data that has both seasonal and trend components.Some of its recent applications were found in the study of cloud resource provisioning (Shahin, 2016) and to forecast short-term electricity demand in England and Wales (Ruiter, 2017) as well as in Ireland (Kavanagh, 2017).The seasonal Holt-Winter's model was also applied in the study to forecast natural rubber production in India (Arumugam and Anithakumari, 2013).
Although natural rubber production in Malaysia is an important economic contributor to the country, the inconsistency in the production level is a major concern to the exportation activities in the agriculture sector.In the year 2017, the fluctuation of rubber production was obvious between April and July where the production dropped by 3.4% in July compared to the growth of 8.7% in the previous month (Rubber Journal Asia, 2017).Besides, the drop in the natural rubber stocks in April 2017 by 9.7% has resulted in a decline in the agriculture sector of the Malaysia's exportation activities.As such, in order to meet the continuous demand in natural rubber globally, it is important for the relevant agencies and departments to understand the patterns and trends besides having the ability to forecast rubber production in Malaysia.
A previous study done by Khin, Chong, Mohammed and Shamsudin (2008) had used econometric analysis to model and forecast natural rubber production in Malaysia.However there are still not many studies done in the similar field.Most of the studies done were related to the modelling of natural rubber focusing on the natural rubber prices like the those found in Mdludin, Applanaidu and Abdullah (2016), Khin and Thambiah (2015) and In (2012).Therefore, in this study, we consider modelling and forecasting the natural rubber production in Malaysia using some univariate time series models, namely, the ARIMA, SARIMA and seasonal Holt-Winter's models.The methodology which includes the time series modelling procedures will be discussed in Section 2 while the results of the study will be presented in Section 3. Finally, the conclusions are contained in Section 4.

The Data Set
The statistics of natural rubber production in Malaysia were collected by the Department of Statistics Malaysia (Department of Statistic Malaysia, 2017).The data of natural rubber production (in tonnes) were recorded from all rubber plantations in Malaysia.The monthly data of natural rubber production from the year 2006 to 2016 that consist of 132 observations were obtained from Malaysian Rubber Board.The time series plot of the data is shown in Figure 1.

Time Series Modelling Procedures
For the purpose of time series modelling, all the 132 observations were divided into two parts where the first 120 observations (from January 2006 to December 2015) were used to fit the univariate times series models of ARIMA, SARIMA and seasonal Holt-Winter's while the remaining 12 observations were kept for the post sample forecast accuracy check.The process of model fitting was done using the computer software R programming language.The accuracy of the forecasts in this study were evaluated using the root mean square error (RMSE) and mean absolute percentage error (MAPE) as given in Eq. ( 1) and Eq. ( 2) respectively where   are the actual observed values and  ̂ are the predicted values while  is the number of predicted values.

ARIMA Model
ARIMA models are combined functions of the autoregressive (AR) and moving average (MA) models.This model is formulated when the stationarity assumption of the variable is not met (Lazim, 2016).The general term of the model is ARIMA (p, d, q), where p and q are the orders of the AR process and MA process respectively while d is the order of integration or differencing process.The ARIMA model is as given in Eq. (3).∅()(1 − )    = ()  (3) where {Zt} is an uncorrelated random variable with zero mean and constant variance denoted by {  } ~ WN(0, 2 ) while ∅() and () are polynomials of degrees p and q, respectively.The function ∅() ≠ 0 for |z| ≤ 1 and B is backward shift operator (Brockwell and Davis, 2002) The forecast package in the R programming language was used to model up the best ARIMA model.A function called "auto.arima" was used to find the best ARIMA model for the time series with some order constraints where the order constraints consist of the order of first differencing, restriction to find a model with non-seasonality, information criterion to be used in model selection, the instruction to use stepwise selection and Box-Cox transformation parameter.The order of first differencing was set to be 1 in order to remove the trend.The information criterion used in this model was the AICC (bias-corrected Akaike's information criterion) statistic while the Box-Cox transformation parameter was set at 0 in order to stabilize the variability.The other order constraints that were not set returned as default.The coefficient of the ARIMA model can be obtained from the summary function (Ayoade et. al, 2018).The best model was then used to forecast the subsequent 12 values using the forecast function with the order constraints where the number of steps ahead of forecast values was set at 12 and the prediction boundaries set at 95%.The prediction boundaries were used to check the validity of the model.

SARIMA Model
SARIMA models deal with both the seasonal and non-seasonal differencing, d and D respectively (Lazim, 2016).SARIMA (p,d,q) (P,D,Q)s processes with positive integer d and D. The SARIMA model defined is as shown in Eq. ( 4).
The process of fitting the SARIMA model is similar to that of the ARIMA model but with an additional constraint which is the restriction to find the models with seasonality.

Seasonal Holt-Winter's Model
A simple exponential smoothing was extended to allow forecasting of data with a trend series by Holt (1957) while Winters (1960) extended Holt's model to capture seasonality in the data.There are two types of seasonal Holt-Winter's model which are the addictive and multiplicative method.The addictive method is preferred if the seasonal variations are constant throughout the series while the multiplicative method is chosen when the variations in seasonal changed proportional to the level of the series (Hyndman & Athanasopoulos, 2012).The seasonal Holt-Winter's equation consist of three smoothing components which are level  ̂, trend,  ̂ and seasonal, ̂ +ℎ components.The generalization of the forecast function is shown in Eq. ( 5) (Brockwell & Davis, 2002).
+ℎ =  ̂ +  ̂ℎ + ̂ +ℎ , h = 1, 2, … (5) The function called "Holt Winters" is used to find the Holt-Winter's model for time series in R language.The seasonal variety "addictive" is used to handle the seasonality in the time series.The model with the best coefficients is used to forecast the subsequent 12 values using the forecast function with the order constraints set at 12 and prediction boundaries set at 95%.

RESULTS
The monthly forecast from January to December 2016 of natural rubber production in Malaysia using the ARIMA model, SARIMA model and seasonal Holt-Winter's model are presented in this section.

ARIMA Model
The ARIMA models together with their respective AICC statistics obtain using the R programming language are as displayed in Figure 2. The ARIMA model with the lowest AICC value is ARIMA (3, 1, 1) given by Eq. ( 6).  = 0.4916 −1 − 0.2872  −2 − 0.1865  −3 +   − 0.7434  −1 (6) where {  } ~  (0,0.02956) and AICC value = 73.712.The forecasted values and their 95% confidence intervals together with the actual values for the year 2016 were tabulated in Table 1 while the plot of these values are shown in Figure 3.It is clear from Table 1 and Figure 3 that not all actual values were within the 95% forecast intervals as the actual values for the month of May 2016 as highlighted in bold is found below the lower bound of the 95% forecast intervals.
Figure 2: Results of the ARIMA models and their AICC values obtained using R.

SARIMA Model
The list of SARIMA models together with their AICC values obtained from R programming language are as displayed in Figure 4.The best SARIMA model which has the lowest AICC value is SARIMA (0, 1, 1) (0, 1, 1)12.The model is given by Eq. ( 7).
where {  } ~  (0,0.02022) and AICC value = 100.644.Table 2 shows the monthly actual and forecast values of the natural rubber production from January to December 2016 together with their 95% confidence intervals while the plot of these values are shown in Figure 5.All the actual natural rubber production fall within the 95% forecast boundaries.
Figure 4: Results of the SARIMA models and their AICC values obtained using R.
+ℎ =  ̂ +  ̂ℎ + ̂ +ℎ , ℎ = 1, 2, … (8) where  ̂+1 = 0.352( +1 − ̂ +1− ) + 0.648( ̂ +  ̂ )  ̂+1 =  ̂ ̂ +1 = 0.481( +1 −  ̂+1 ) + 0.519 ĉ +1− The value for alpha which is 0.352 is close to zero.This indicates that the forecasts are based on both the recent and less recent observations.The beta value is zero shows that the estimate of the slope  ̂ of the trend component is not updated over the time series thus, it is set to the original value.This means that when the level change over time series, the slope remains roughly the same.The gamma value of 0.481 which is near to zero indicates that the older data values weigh more than the new ones.The monthly forecasts for the natural rubber production in Malaysia from January 2016 to December 2016 are tabulated in Table 3.The actual and predicted values together with 95% forecast intervals are shown in Figure 7. Table 3 and Figure 7 clearly show that all the actual values for the natural rubber production are within the 95% forecast boundaries.The results in Section 3.1 shows that the ARIMA (3, 1, 1) is unable to forecast the natural rubber production in Malaysia well as one of the actual is found outside the 95% forecast boundaries.Therefore, in this section, we only compare the RMSE and MAPE values of the SARIMA (0, 1, 1) (0,1,1)12 and seasonal Holt-Winter's models in order to find the better model.
From Table 4, it is obvious that between the two models, the seasonal Holt-Winter's model is a better model as it yields the lower RMSE and MAPE values which are highlighted in bold.Next, we compare the forecast ability of the SARIMA and seasonal Holt-Winter's model where the monthly forecasts from January to June 2017 of natural rubber production in Malaysia using the SARIMA (0, 1, 1) (0, 1, 1)12 and seasonal Holt-Winter's model are tabulated in Table 5 and 6 respectively.Table 5 shows that all the actual natural rubber production values fall within the 95% forecast boundaries of the SARIMA (0, 1, 1) (0, 1, 1)12 model but the actual value for the month of March 2017 (highlighted in bold) is found above the upper boundary of the 95% forecast boundaries when using the seasonal Holt-Winter's model as tabulated in Table 6.This clearly indicates that the SARIMA model has a better forecast ability.

Conclusions
The aim of this study is to model and forecast the natural rubber production in Malaysia using univariate time series models, namely the ARIMA, SARIMA and seasonal Holt-Winter's model.The results show that all the actual monthly natural rubber production for the year 2016 were within the 95% forecast boundaries of the SARIMA and seasonal Holt-Winter's models but not the ARIMA model.Between the two models, the seasonal Holt-Winter's model appeared to be the better model as it yielded the smaller RMSE and MAPE values.However, when the seasonal Holt-Winter's model is used to forecast the six months ahead values (from January to June 2017), one actual values was found outside the 95% forecast boundaries.
Instead, the SARIMA model could forecast well for the same duration.From this study, it can be concluded although the ARIMA model is one of the basic models in time series analysis it is not necessarily suitable for all kinds of dataset.Besides, this study also shows that the best fit model that fulfil all the forecast accuracy criteria may not have the best forecast ability.
Hence the SARIMA model is the alternative model in this study.Since this study only considers univariate time series models which are very sensitive towards sudden changes, therefore it is important to update the models when new data become available.The use of the intervention time series and multiple time series can be considered for further study as these models consider other variables besides time alone in forecasting the production of natural rubber in Malaysia.

Figure 1 :
Figure 1: The time series plot from January 2006 to December 2016.

Figure 3 :
Figure 3: The forecasted and actual values of monthly natural rubber production in Malaysia from January to December 2016.

Figure 7 :
Figure 7: The forecasted and actual values of monthly natural rubber production from January to December 2016.

Table 1 :
The monthly forecasts of natural rubber production in Malaysia from January to December 2016 using the ARIMA (3, 1, 1).

Table 6 :
The monthly forecasts of natural rubber production in Malaysia from January to June 2017 using the seasonal Holt-Winter's model.