GOLDEN EXPONENTIAL SMOOTHING: A SELF-ADJUSTED METHOD FOR IDENTIFYING OPTIMUM ALPHA

Conventional double exponential smoothing is a forecasting method that burdens the forecaster with a wide choice of values for its parameter, alpha, and this choice greatly influences the accuracy of prediction. In this paper, an integrated forecasting method named Golden Exponential Smoothing (GES) is proposed to solve the problem of choosing the optimum alpha. The conventional method needs human intervention: the forecaster must determine the most suitable alpha, or else the prediction accuracy will suffer. Here the method is reformulated and combined with Golden Section Search so that an optimum alpha can be identified during the algorithm training process. Numerical simulations on four sets of time series data are employed to test the efficiency of the GES model. The findings show that the GES model adjusts itself to the data and converges quickly in the algorithm training process. The optimum alpha identified in the algorithm training stage demonstrates good performance in the model testing and usage stages.

The exponential smoothing methods can be regarded as a large forecasting family (Muhamad & Din, 2017; Wu et al., 2016) whose members are the single, double and triple exponential smoothing methods. Single exponential smoothing is used for time series data that have no trend or seasonal component (Crevits & Croux, 2016; Ostertagová & Ostertag, 2011). The double exponential smoothing method, which comprises Brown's one-parameter linear method and Holt's two-parameter method, can handle time series data with trend (Feiyan et al., 2012; Shastri, 2015; Wu et al., 2016). Triple exponential smoothing, Winters' three-parameter trend and seasonality method, is meant for data that contain both trend and seasonality (Crevits & Croux, 2016; Wu et al., 2016).
All of the exponential smoothing methods have smoothing parameters (Crevits & Croux, 2016; Feiyan et al., 2012). Single exponential smoothing and Brown's method are forecasting equations with one smoothing parameter (Muhamad & Din, 2017). Holt's method has two smoothing parameters (Muhamad & Din, 2017) and Winters' method has three. These parameters are adjustable constants that have a significant impact on forecasting accuracy (Ostertagová & Ostertag, 2011). Unfortunately, identifying these parameters is difficult because no specific rule or guidance exists (Feiyan et al., 2012; Wu et al., 2016). Most of the time, the decision is based on the subjective experience of the forecaster (Feiyan et al., 2012; Wu et al., 2016) or on brute-force search (Muhamad & Din, 2017).
Many research works have sought to improve the accuracy of the exponential smoothing method. Wu et al. used the grey accumulating generation operator to smooth random interference in the data before applying double exponential smoothing (Wu et al., 2016). Crevits & Croux introduced a robust alternative to the forecasting equations by using robust maximum likelihood parameter estimation (Crevits & Croux, 2016). Feiyan et al. introduced a new smoothing parameter β into the conventional double exponential smoothing, which has only one parameter α (Feiyan et al., 2012). Yet there is little research on reducing the difficulty of choosing and identifying appropriate parameters. As a complement to the existing research, the present study introduces Golden Exponential Smoothing, in which Golden Section Search is incorporated into a double exponential smoothing method. The Golden Exponential Smoothing method is able to forecast with its own self-adjusted optimum parameter.

Golden Ratio and Golden Section Search
Ancient mathematicians believed that proportionality is the main contributor to the elegance of an object (Akhtaruzzaman & Shafie, 2012; Kalajdzievski, 2008). They defined proportionality in a very explicit way: if there is a point X on a line segment (Figure 1) (Akhtaruzzaman & Shafie, 2012; Kalajdzievski, 2008) such that Eq.(1) is true (Che et al., 2014), the ratios in Eq.(1) are called the golden ratio and the point X is called the golden cut. By algebraic manipulation (Eq.(2) to Eq.(4)), the golden ratio is obtained as the positive root of the resulting quadratic equation (Hughes, 2011), traditionally denoted by the Greek letter φ (Akhtaruzzaman & Shafie, 2012; Kalajdzievski, 2008); the negative root is disregarded. It should be noted that both (√5 − 1)/2 ≈ 0.618 and its inverse (√5 + 1)/2 ≈ 1.618 are used to express the golden ratio (Stakhov, 1989).
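The two values can be recovered with a short derivation. This is only a sketch: the labels a (longer part) and b (shorter part) for the two pieces of the segment are introduced here for illustration and are not necessarily the notation of Eq.(1) to Eq.(4).

```latex
\frac{a+b}{a} = \frac{a}{b} = \varphi
\;\Longrightarrow\; 1 + \frac{1}{\varphi} = \varphi
\;\Longrightarrow\; \varphi^{2} - \varphi - 1 = 0
\;\Longrightarrow\; \varphi = \frac{1+\sqrt{5}}{2} \approx 1.618,
\qquad
\frac{1}{\varphi} = \varphi - 1 = \frac{\sqrt{5}-1}{2} \approx 0.618 .
```

The other root of the quadratic, (1 − √5)/2, is negative and so cannot be a ratio of two lengths.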
The Golden Section Search uses the concept of the golden ratio (Couriol et al., 1998; Hughes, 2011) to narrow the search interval of a unimodal function (Cai et al., 2010; Ding et al., 2017; Scherrer et al., 2013; Shao et al., 2014). A unimodal function is a function that has a single extremum within the search interval (Akira et al., 2014; Yeom et al., 2010). The Golden Section Search is one of the search techniques (Hughes, 2011) that is derivative-free (Shao et al., 2014; Vieira et al., 2012), robust (Shao et al., 2014) and able to locate the extremum quickly (Cai et al., 2010; Ding et al., 2017; Scherrer et al., 2013; Shao et al., 2014). Its simplicity has made it one of the preferred options for researchers solving real-life optimization problems. For instance, Scherrer et al. applied the Golden Section Search to optimize the time threshold of a time-out power management policy (Scherrer et al., 2013). Tsai et al. used the Golden Section Search to determine a good shape parameter for meshless collocation methods (Tsai et al., 2010). Ding et al. proposed a hybrid of the Golden Section Search and a loss model to enhance the traction system of an electric vehicle (Ding et al., 2017). Cai et al. used the Golden Section Search algorithm in nonlinear isoconversional calculation to determine activation energy (Cai et al., 2010). Akira et al. used the Golden Section Search to adjust the sensitivity of a tunable antenna (Akira et al., 2014). Su & Qi used the Golden Section Search to classify the chaotic cloud particle swarm and so improve the speed and precision of Available Transfer Capability (ATC) optimization (Su & Qi, 2014).
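The interval-narrowing idea can be illustrated with a minimal, generic sketch of Golden Section Search for minimizing a unimodal function. The function name `golden_section_minimize`, its arguments and the tolerance are hypothetical choices made for this illustration; this is the textbook procedure, not code from any of the cited works.

```python
import math


def golden_section_minimize(f, lower, upper, tol=1e-6):
    """Approximate the minimizer of a unimodal function f on [lower, upper]."""
    r = (math.sqrt(5) - 1) / 2  # about 0.618
    # Two interior trial points placed by the golden ratio.
    x1 = upper - r * (upper - lower)
    x2 = lower + r * (upper - lower)
    f1, f2 = f(x1), f(x2)
    while upper - lower > tol:
        if f1 < f2:
            # Minimum lies in [lower, x2]; the old x1 becomes the new x2.
            upper, x2, f2 = x2, x1, f1
            x1 = upper - r * (upper - lower)
            f1 = f(x1)
        else:
            # Minimum lies in [x1, upper]; the old x2 becomes the new x1.
            lower, x1, f1 = x1, x2, f2
            x2 = lower + r * (upper - lower)
            f2 = f(x2)
    return (lower + upper) / 2


# Usage: the minimum of (x - 2)^2 on [0, 5] is at x = 2.
x_min = golden_section_minimize(lambda x: (x - 2) ** 2, 0.0, 5.0)
```

Because one of the two trial points is reused at every step, only one new function evaluation is needed per iteration, and the interval shrinks by the same factor of about 0.618 each time.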

The Double Exponential Smoothing Method
The double exponential smoothing method is a typical forecasting equation (Feiyan et al., 2012) that predicts future values from historical time series data (Shastri, 2015; Wu et al., 2016). The accuracy of the prediction depends on the parameter, alpha, in the equation (Feiyan et al., 2012; Shastri, 2015). Alpha is not a variable but an adjustable parameter that lies in (0,1) (Singh, 2015; Wu et al., 2016). This means that the practitioner has to determine an appropriate value for this parameter, or the accuracy of the prediction will be affected (Feiyan et al., 2012). The parameter must be selected again every time a new set of time series data is acquired. A naïve approach to parameter selection is simply to list candidate alpha values in (0,1) and pick the best one based on the evaluation of the objective function (Beasley, 1992). However, this approach of complete enumeration is inefficient and impractical given the enormous number of possible alpha values in the range (Beasley, 1992). In this research, the Golden Exponential Smoothing (GES) model was developed by modifying the conventional exponential smoothing method and integrating it with the Golden Section Search. The GES was built to solve the parameter selection problem, so that the parameter can be located efficiently during the algorithm training process.
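For reference, the textbook form of Brown's one-parameter linear method is sketched below. The symbols S'_t, S''_t, a_t and b_t are the usual textbook names and are assumptions here; the modified equations actually used by GES (Eq.(7)) may differ in detail, for example in how the smoothed values are initialised.

```latex
S'_{t}  = \alpha\, x_{t} + (1-\alpha)\, S'_{t-1}, \qquad
S''_{t} = \alpha\, S'_{t} + (1-\alpha)\, S''_{t-1},
\\
a_{t} = 2 S'_{t} - S''_{t}, \qquad
b_{t} = \frac{\alpha}{1-\alpha}\,\bigl(S'_{t} - S''_{t}\bigr), \qquad
F_{t+m} = a_{t} + m\, b_{t}, \qquad 0 < \alpha < 1 .
```

The single parameter alpha controls both smoothing passes, which is why its choice has such a direct effect on the forecast F.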

The Proposed Method
The procedure of GES is described in Figure 2. In stage 1, the algorithm training process commenced by setting the initial lower bound (L) and upper bound (U) of alpha to 0 and 1, respectively. Then, two new alpha values were produced by Eq.(5) based on the golden ratio (Eq. (6)). The produced alpha values were checked against the stopping criterion (Eq. (9)). If the stopping criterion was not met, the algorithm training process was executed. The parameters were introduced into the modified exponential smoothing equations (Eq. (7)) to simulate the time series data. The total deviation between the simulated data (Fit+1) and the real data (xt+1), that is, the error, was measured by the objective function, Mean Absolute Error (MAE) (Eq. (8)). The MAEs of the two distinct parameters were compared to determine whether the lower bound (L) or the upper bound (U) would be updated. In addition, Mean Absolute Percentage Error (MAPE) (Eq. (11)) was used to measure the percentage of deviation. Even though MAE was the objective function of the algorithm training process, MAPE is a better option for interpreting the accuracy of the GES model. As a relative error measurement, MAPE tells the end user of the forecast the percentage by which the simulated data deviate from the actual data.
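The two error measures are commonly defined as follows. This is a sketch using the standard definitions, with n the number of one-step forecasts being compared; the exact indexing of Eq.(8) and Eq.(11) is an assumption.

```latex
\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} \bigl| x_{t+1} - F_{t+1} \bigr| ,
\qquad
\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{x_{t+1} - F_{t+1}}{x_{t+1}} \right| .
```

MAE is expressed in the units of the data (for example USD/kg), whereas MAPE is unit-free, which is what makes it comparable across series with different price levels.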
The algorithm training process was repeated successively so that the interval between the lower bound and the upper bound shrank and moved closer to the optimum alpha value (Scherrer et al., 2013; Shao et al., 2014). The training process of stage 1 was terminated once the stopping criterion (Eq.(9)), which served as the threshold, had been met. Next, in stage 2, the optimum alpha was identified (Eq.(10)) from the final iteration. This optimum alpha was used in stage 3 (model testing) and stage 4 (usage). In both stages, the data were simulated (Eq. (7)) by using this optimum alpha. These simulated data were the desired predictions, and they were tested against the real data by using Eq. (8).
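The training loop described above can be sketched as follows. Several details are assumptions, because Eq.(5) to Eq.(10) are not reproduced in the text: the forecasts come from the textbook form of Brown's method with both smoothed values initialised to the first observation, the two trial alphas are placed at U − r(U − L) and L + r(U − L), training stops when the gap between them falls below the threshold, and the optimum is taken as the trial alpha with the lower MAE in the final iteration. All function names are hypothetical.

```python
import math

R = (math.sqrt(5) - 1) / 2  # about 0.618


def brown_forecasts(series, alpha):
    """One-step-ahead forecasts of series[1:] by Brown's linear method."""
    s1 = s2 = series[0]  # assumed initialisation
    forecasts = []
    for x in series[:-1]:
        s1 = alpha * x + (1 - alpha) * s1    # first smoothing pass
        s2 = alpha * s1 + (1 - alpha) * s2   # second smoothing pass
        level = 2 * s1 - s2
        trend = alpha / (1 - alpha) * (s1 - s2)
        forecasts.append(level + trend)      # forecast of the next value
    return forecasts


def mae(series, alpha):
    """Mean absolute error of the one-step forecasts for a given alpha."""
    forecasts = brown_forecasts(series, alpha)
    actual = series[1:]
    return sum(abs(x - f) for x, f in zip(actual, forecasts)) / len(actual)


def golden_exponential_smoothing(series, lower=0.0, upper=1.0, tol=1e-4):
    """Return (optimum alpha, its MAE, number of iterations)."""
    iteration = 1
    while True:
        a1 = upper - R * (upper - lower)
        a2 = lower + R * (upper - lower)
        e1, e2 = mae(series, a1), mae(series, a2)
        if a2 - a1 < tol:        # assumed stopping criterion
            break
        if e1 < e2:
            upper = a2           # optimum lies in [lower, a2]
        else:
            lower = a1           # optimum lies in [a1, upper]
        iteration += 1
    return (a1, e1, iteration) if e1 <= e2 else (a2, e2, iteration)


# Usage on a synthetic trending series (not the commodity data of the study).
demo = [10 + 0.5 * t + math.sin(t) for t in range(60)]
alpha_opt, mae_opt, n_iter = golden_exponential_smoothing(demo)
```

For clarity the sketch recomputes both MAEs in every iteration; a tighter version would reuse the value already known for the trial alpha that is carried over.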

Results and Discussion
The proposed GES model was evaluated and tested by numerical simulation. Four sets of time series data were employed in this study. They were commodity data obtained from The World Bank website (http://www.worldbank.org/): the monthly prices of chicken (USD/kg), beef (USD/kg), maize (USD/mt), and coffee (USD/kg) from Jan 1997 to Mar 2017. Each set contained 243 observations. As shown in Figure 3, the first 228 observations served as an initialization set and were used in the algorithm training process and model testing. The last 15 observations were reserved as a test set and were compared against the predictions.
Since the lower bound and upper bound of the GES models were 0 and 1 respectively, the initial two alphas generated by Eq.(5) were the same for all data sets. However, during the algorithm training process, these alphas were adjusted and corrected by selecting the alpha that minimized the objective function MAE. Graph (A) in each of Figure 4 to Figure 7 clearly shows that the pattern of correcting the alphas differed from one data set to another. The threshold of the GES model was set to 0.0001 in the experiments. In the algorithm training process, this threshold and the golden ratio (Eq.(6)) standardised the rate at which the width of the interval shrank. Hence, the training process terminated at iteration 18 for all data sets. Nonetheless, this did not prevent the model from identifying the optimum alpha. In all the data sets, the alphas took around ten iterations to get close to the optimum value (Figure 4 to Figure 7, Graph (A)), and the models took around six iterations to produce simulated data that minimized the objective function (Figure 4 to Figure 7, Graph (B)). The other measurement, MAPE (Figure 4 to Figure 7, Graph (C)), also shows that the models took around six iterations to produce simulated data that minimized the percentage error.
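The common stopping point can be checked by hand under two assumptions that the text does not confirm: that Eq.(5) places the two trial alphas at U − r(U − L) and L + r(U − L) with r = (√5 − 1)/2, and that Eq.(9) stops training once the gap between the two trial alphas falls below the threshold. With L = 0 and U = 1 initially, the gap at iteration n then depends only on n and not on the data:

```latex
\alpha_{2}^{(n)} - \alpha_{1}^{(n)} = (2r - 1)\, r^{\,n-1} = r^{\,n+2},
\qquad r = \frac{\sqrt{5}-1}{2} \approx 0.618,
\\
r^{\,n+2} < 10^{-4}
\;\Longleftrightarrow\; n + 2 > \frac{\ln 10^{-4}}{\ln r} \approx 19.14
\;\Longleftrightarrow\; n \ge 18 .
```

Under this reading the criterion is first satisfied at iteration 18, which agrees with the reported termination point. A criterion on the interval width U − L instead would stop a few iterations later.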
The time series plots (Graph (D)) in Figure 4 to Figure 7 show that all the data sets contained some trend component (the straight line in each time series graph exhibits the trend of the real price). For the chicken data set (Graph (D) in Figure 4), which had trend and little noise, the identified optimum alpha was 0.99986, with an MAE of 0.01509 and a MAPE of 0.93069%. In Figure 5 (Graph (D)), the beef data set also exhibited trend but fluctuated slightly more than the chicken data set, and the optimum alpha was 0.75566. The beef data set's MAE of 0.10166 and MAPE of 3.41105% were both higher than those of the chicken data set.
In Figure 6 and Figure 7 (Graph (D)), the maize and coffee data sets showed more fluctuation. Their identified optimum alpha values were much lower than those of the previous two data sets, and their MAPEs were higher. The optimum alpha of maize was 0.62421, with an MAE of 8.33478 and a MAPE of 5.07720%. The highest MAPE was found in the coffee data set, at 5.49284%; the optimum alpha and MAE of this data set were 0.57769 and 0.16737 respectively (Graph (D) in Figure 7). Nonetheless, the overall performance of the GES models was considered excellent in the model testing stage since the MAPEs of all data sets were less than 6%.
After the model testing stage, the identified optimum alphas were applied to the reserved test sets so that the accuracy of the predicted data against the real data could be verified. The accuracy of the GES models was again checked by error analysis. In Table 1, the percentages of deviation show that the GES models demonstrated strong forecasting ability. The optimum alphas identified earlier maintained their performance in the model usage stage. None of the percentages of deviation (MAPE) exceeded 5%, which shows that these optimum alphas were able to produce predictions close to the real data. The lowest percentage of deviation was found in the chicken test set, at 0.83374%, followed by beef at 2.519718%. The third was the coffee test set, with a percentage deviation of 3.89231%. The highest MAPE was that of the maize test set, at 4.27514%. The results show that the pre-identified optimum alphas preserved their good forecasting record.

Conclusion
The numerical simulations have shown that the proposed GES model gives favourable results in searching for the optimum alpha and in making predictions. The predefined threshold value limits the number of iterations of the algorithm training process, but the self-correcting and fast-converging nature of the model allowed the optimum alpha to be located within those iterations.
In these four sets of time series data, all of which contained trend components, the value of alpha was close to one when the data exhibited little noise (fluctuation), and it decreased as the fluctuation of the time series increased. The effectiveness of the GES model in making predictions in the model testing and model usage stages also depended on the amount of noise in the data set. The accuracy of prediction was high when there was little noise and slightly lower when the fluctuation increased. Nevertheless, the deviation of the predicted data from the actual data remained in the excellent range, since the MAPEs were less than 6%. For future work, researchers may investigate an adaptive optimum alpha whose value is adjusted by small increments over time so that the accuracy of the model increases further.