The constant b is relatively stable in each segment of the series, but may change slowly over time. If this model is appropriate, then one way to isolate the true value of b, and thus the systematic or predictable part of the series, is to compute a kind of moving average in which the current and immediately preceding ("younger") observations are assigned greater weight than the respective older observations. Simple exponential smoothing accomplishes exactly such weighting: exponentially smaller weights are assigned to older observations.
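The specific formula for simple exponential smoothing is:

$S_t = \alpha X_t + (1 - \alpha) S_{t-1}$

where $X_t$ denotes the observation at time t, $S_t$ the smoothed value at time t, and α (alpha) the smoothing parameter.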
When applied recursively to each successive observation in the series, each new smoothed value (forecast) is computed as the weighted average of the current observation and the previous smoothed observation; the previous smoothed observation was computed in turn from the previous observed value and the smoothed value before the previous observation, and so on. Thus, in effect, each smoothed value is the weighted average of the previous observations, where the weights decrease exponentially depending on the value of the parameter α (alpha). If α is equal to 1 (one), then the previous observations are ignored entirely; if α is equal to 0 (zero), then the current observation is ignored entirely, and the smoothed value consists entirely of the previous smoothed value (which in turn is computed from the smoothed observation before it, and so on; thus all smoothed values will be equal to the initial smoothed value $S_0$).
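The recursion described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `simple_exp_smooth`, the argument names, and the sample series are all made up for the example, and the starting value `s0` is simply passed in by the caller.

```python
def simple_exp_smooth(x, alpha, s0):
    """Return the smoothed values S_1..S_n for observations x, starting from s0."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    smoothed = []
    prev = s0  # previous smoothed value, initially S_0
    for obs in x:
        # Weighted average of the current observation and the previous smoothed value.
        prev = alpha * obs + (1.0 - alpha) * prev
        smoothed.append(prev)
    return smoothed


series = [10.0, 12.0, 11.0, 13.0]  # hypothetical data
print(simple_exp_smooth(series, 1.0, s0=10.0))  # [10.0, 12.0, 11.0, 13.0]
print(simple_exp_smooth(series, 0.0, s0=10.0))  # [10.0, 10.0, 10.0, 10.0]
print(simple_exp_smooth(series, 0.5, s0=10.0))  # [10.0, 11.0, 11.0, 12.0]
```

The first two calls reproduce the limiting cases: with alpha equal to 1 the smoothed series is just the observations, and with alpha equal to 0 every smoothed value stays at the initial value.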
Values of α in between will produce intermediate results. Even though significant work has been done to study the theoretical properties of simple and complex exponential smoothing e. For example, empirical research by Makridakis et al. Thus, regardless of the theoretical model for the process underlying the observed time series, simple exponential smoothing will often produce quite accurate forecasts. Gardner discusses various theoretical and empirical arguments for selecting an appropriate smoothing parameter.
Obviously, looking at the formula presented above, α should fall into the interval between 0 (zero) and 1 although, see Brenner et al. Gardner reports that among practitioners, an α smaller than. However, in the study by Makridakis et al. After reviewing the literature on this topic, Gardner concludes that it is best to estimate an optimum α from the data (see below), rather than to "guess" and set an artificially low value.

Estimating the best α value from the data. In a grid search, different values of α are tried in turn; α is then chosen so as to produce the smallest sums of squares (or mean squares) for the residuals (i.e., the observed values minus the one-step-ahead forecasts).
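Choosing α by the smallest sum of squared residuals can be sketched with a plain grid search. The grid of candidate values (0.1 to 0.9), the choice of the first observation as the starting value, the sample series, and all function names are assumptions made for this illustration only.

```python
def one_step_sse(x, alpha, s0):
    """Sum of squared one-step-ahead errors for simple exponential smoothing.

    The forecast for each observation is the smoothed value computed
    from everything before it (s0 for the first observation).
    """
    sse = 0.0
    prev = s0
    for obs in x:
        err = obs - prev  # observed minus one-step-ahead forecast
        sse += err * err
        prev = alpha * obs + (1.0 - alpha) * prev
    return sse


def grid_search_alpha(x, s0, candidates):
    """Return the candidate alpha with the smallest sum of squared errors."""
    return min(candidates, key=lambda a: one_step_sse(x, a, s0))


series = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0, 13.0, 15.0]  # hypothetical data
grid = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9 (assumed grid)
best = grid_search_alpha(series, s0=series[0], candidates=grid)
print(best, one_step_sse(series, best, series[0]))
```

Dividing the returned sum by the number of observations gives the corresponding mean squared error; both criteria select the same α for a fixed series.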
The most straightforward way of evaluating the accuracy of the forecasts based on a particular α value is to simply plot the observed values and the one-step-ahead forecasts. This plot can also include the residuals (scaled against the right Y-axis), so that regions of better or worse fit can also easily be identified.
This visual check of the accuracy of forecasts is often the most powerful method for determining whether or not the current exponential smoothing model fits the data. In addition, besides the ex post MSE criterion (see previous paragraph), there are other statistical measures of error that can be used to determine the optimum α parameter (see Makridakis, Wheelwright, and McGee):
Mean error: The mean error (ME) value is simply computed as the average error value (average of observed minus one-step-ahead forecast). Obviously, a drawback of this measure is that positive and negative error values can cancel each other out, so this measure is not a very good indicator of overall fit.

Mean absolute error: The mean absolute error (MAE) value is computed as the average absolute error value. If this value is 0 (zero), the fit (forecast) is perfect. As compared to the mean squared error value, this measure of fit will "de-emphasize" outliers, that is, unique or rare large error values will affect the MAE less than the MSE value.
Sum of squared error (SSE), mean squared error (MSE).
These values are computed as the sum (or average) of the squared error values. This is the most commonly used lack-of-fit indicator in statistical fitting procedures.

Percentage error (PE). All the above measures rely on the actual error value. It may seem more reasonable to express the lack of fit in terms of the relative deviation of the one-step-ahead forecasts from the observed values, that is, relative to the magnitude of the observed values. For example, when trying to predict monthly sales that may fluctuate widely e. In other words, the absolute errors may be not so much of interest as are the relative errors in the forecasts.
To assess the relative error, various indices have been proposed (see Makridakis, Wheelwright, and McGee). The first one, the percentage error value, is computed as:

$PE_t = 100 \cdot (X_t - F_t) / X_t$

where $X_t$ is the observed value at time t and $F_t$ is the one-step-ahead forecast for time t.

Mean absolute percentage error (MAPE). As is the case with the mean error value (ME, see above), a mean percentage error near 0 (zero) can be produced by large positive and negative percentage errors that cancel each other out.
Thus, a better measure of relative overall fit is the mean absolute percentage error. Also, this measure is usually more meaningful than the mean squared error.

Automatic search for best parameter. A quasi-Newton function minimization procedure (the same as in ARIMA) is used to minimize either the mean squared error, mean absolute error, or mean absolute percentage error. In most cases, this procedure is more efficient than the grid search (particularly when more than one parameter must be determined), and the optimum α parameter can quickly be identified.
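The error measures discussed above can be computed side by side from a list of observed values and the matching one-step-ahead forecasts. The sketch below is illustrative only: the function name `error_measures`, the dictionary keys, and the numbers are invented for the example, and the percentage measures assume that no observed value is zero.

```python
def error_measures(observed, forecast):
    """Return ME, MAE, SSE, MSE, MPE and MAPE for paired observations and forecasts."""
    errors = [o - f for o, f in zip(observed, forecast)]  # observed minus forecast
    n = len(errors)
    # Percentage errors are relative to the observed values (must be nonzero).
    pct = [100.0 * e / o for e, o in zip(errors, observed)]
    sse = sum(e * e for e in errors)
    return {
        "ME": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "SSE": sse,
        "MSE": sse / n,
        "MPE": sum(pct) / n,
        "MAPE": sum(abs(p) for p in pct) / n,
    }


observed = [100.0, 200.0, 100.0, 200.0]  # hypothetical data
forecast = [110.0, 180.0, 90.0, 220.0]   # hypothetical forecasts
m = error_measures(observed, forecast)
print(m["ME"], m["MAE"], m["SSE"], m["MSE"])  # 0.0 15.0 1000.0 250.0
print(m["MPE"], m["MAPE"])                    # 0.0 10.0
```

In this made-up example the mean error and the mean percentage error are both zero even though every forecast is off by 10 percent, which is the cancellation problem that the absolute measures avoid.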
The first smoothed value $S_0$. A final issue that we have neglected up to this point is the problem of the initial value, or how to start the smoothing process. Depending on the choice of the α parameter (in particular, when α is close to zero), the initial value can influence the forecasts for many observations. As with most other aspects of exponential smoothing, it is recommended to choose the initial value that produces the best forecasts. On the other hand, in practice, when there are many leading observations prior to a crucial actual forecast, the initial value will not affect that forecast by much, since its effect will have long "faded" from the smoothed series (due to the exponentially decreasing weights, the older an observation, the less it will influence the forecast).
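How quickly the initial value fades can be made explicit by unrolling the recursion. The expansion below assumes the recursion in the form $S_t = \alpha X_t + (1 - \alpha) S_{t-1}$, with $X_1, \dots, X_t$ denoting the observations; the symbol names are notation chosen for this sketch.

```latex
\begin{aligned}
S_t &= \alpha X_t + (1-\alpha)\, S_{t-1} \\
    &= \alpha X_t + \alpha (1-\alpha)\, X_{t-1} + (1-\alpha)^2 S_{t-2} \\
    &= \alpha \sum_{k=0}^{t-1} (1-\alpha)^k X_{t-k} \;+\; (1-\alpha)^t S_0
\end{aligned}
```

The weight on $S_0$ after t observations is therefore $(1-\alpha)^t$. As a rough illustration, with α = 0.3 that weight is about 0.0008 after 20 observations, whereas with α = 0.05 it is still about 0.36, which is why the starting value matters most when α is small.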
The discussion above in the context of simple exponential smoothing introduced the basic procedure for identifying a smoothing parameter, and for evaluating the goodness-of-fit of a model. In addition to simple exponential smoothing, more complex models have been developed to accommodate time series with seasonal and trend components. The general idea here is that forecasts are not only computed from consecutive previous observations (as in simple exponential smoothing), but an independent (smoothed) trend and seasonal component can be added.
Gardner discusses the different models in terms of seasonality (none, additive, or multiplicative) and trend (none, linear, exponential, or damped).

Additive and multiplicative seasonality. Many time series data follow recurring seasonal patterns. For example, annual sales of toys will probably peak in the months of November and December, and perhaps during the summer (with a much smaller peak) when children are on their summer break. This pattern will likely repeat every year; however, the relative amount of increase in sales during December may slowly change from year to year. Thus, it may be useful to smooth the seasonal component independently with an extra parameter, usually denoted as δ (delta).
Seasonal components can be additive in nature or multiplicative. For example, during the month of December the sales for a particular toy may increase by 1 million dollars every year. Thus, we could add to our forecasts for every December the amount of 1 million dollars over the respective annual average to account for this seasonal fluctuation.
In this case, the seasonality is additive. Alternatively, the December sales may increase by a constant percentage, that is, by a certain factor. Thus, when the sales for the toy are generally weak, then the absolute dollar increase in sales during December will be relatively weak (but the percentage will be constant); if the sales of the toy are strong, then the absolute dollar increase in sales will be proportionately greater. Again, in this case the sales increase by a certain factor, and the seasonal component is thus multiplicative in nature i. In plots of the series, the distinguishing characteristic between these two types of seasonal components is that in the additive case, the series shows steady seasonal fluctuations, regardless of the overall level of the series; in the multiplicative case, the size of the seasonal fluctuations varies, depending on the overall level of the series.
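The difference between the two kinds of seasonality can be seen in a small numeric sketch. The baseline levels (2 and 10, in millions of dollars), the additive bump of 1, the multiplicative factor of 1.5, and the helper name `seasonal_peak` are all hypothetical values chosen for illustration.

```python
def seasonal_peak(level, additive=0.0, factor=1.0):
    """Peak-season value for a given baseline level of the series."""
    return level * factor + additive


for level in (2.0, 10.0):  # weak and strong overall sales (hypothetical)
    add_peak = seasonal_peak(level, additive=1.0)  # additive seasonality
    mul_peak = seasonal_peak(level, factor=1.5)    # multiplicative seasonality
    print(
        level,
        add_peak - level,                    # absolute increase, additive case
        mul_peak - level,                    # absolute increase, multiplicative case
        100.0 * (mul_peak - level) / level,  # percentage increase, multiplicative case
    )
# prints:
# 2.0 1.0 1.0 50.0
# 10.0 1.0 5.0 50.0
```

In the additive case the seasonal increase is the same at both levels; in the multiplicative case the absolute increase grows with the level while the percentage stays constant.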
The seasonal smoothing parameter δ. In general, the one-step-ahead forecasts are computed as follows (for no-trend models; for linear and exponential trend models a trend component is added to the model; see below):

Additive model: $\text{Forecast}_t = S_t + I_{t-p}$

Multiplicative model: $\text{Forecast}_t = S_t \cdot I_{t-p}$

In this formula, $S_t$ stands for the (simple) exponentially smoothed value of the series at time t, and $I_{t-p}$ stands for the smoothed seasonal factor at time t minus p (the length of the season).
Thus, compared to simple exponential smoothing, the forecast is "enhanced" by adding or multiplying the simple smoothed value by the predicted seasonal component. This seasonal component is derived analogously to the $S_t$ value from simple exponential smoothing as:

Additive model: $I_t = I_{t-p} + \delta (1 - \alpha)\, e_t$

Multiplicative model: $I_t = I_{t-p} + \delta (1 - \alpha)\, e_t / S_t$

Put into words, the predicted seasonal component at time t is computed as the respective seasonal component in the last seasonal cycle plus a portion of the error ($e_t$; the observed minus the forecast value at time t).
Considering the formulas above, it is clear that parameter δ can assume values between 0 and 1. If it is zero, then the seasonal component for a particular point in time is predicted to be identical to the predicted seasonal component for the respective time during the previous seasonal cycle, which in turn is predicted to be identical to that from the previous cycle, and so on. Thus, if δ is zero, a constant unchanging seasonal component is used to generate the one-step-ahead forecasts.
If the δ parameter is equal to 1, then the seasonal component is modified "maximally" at every step by the respective forecast error (times (1 − α), which we will ignore for the purpose of this brief introduction). In most cases, when seasonality is present in the time series, the optimum δ parameter will fall somewhere between 0 (zero) and 1 (one).
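The additive seasonal update can be sketched as follows. This is a rough illustration under stated assumptions, not the procedure of any particular package: the level is updated in error-correction form (level plus α times the error), the one-step-ahead forecast is formed from the level and seasonal index available before the observation arrives, and the initial level and initial seasonal indices are supplied by the caller. All names are hypothetical.

```python
def additive_seasonal_smooth(x, alpha, delta, s0, seasonal0):
    """One-step-ahead forecasts for additive seasonality without trend.

    seasonal0 holds one initial seasonal index per position in the cycle,
    so the season length p equals len(seasonal0).
    Returns the forecasts and the final seasonal indices.
    """
    p = len(seasonal0)
    level = s0
    seasonal = list(seasonal0)
    forecasts = []
    for t, obs in enumerate(x):
        idx = t % p
        forecast = level + seasonal[idx]  # level plus index from the last cycle
        err = obs - forecast              # e_t: observed minus forecast
        level = level + alpha * err       # assumed level update (error-correction form)
        # Seasonal index from the previous cycle plus a portion of the error.
        seasonal[idx] = seasonal[idx] + delta * (1.0 - alpha) * err
        forecasts.append(forecast)
    return forecasts, seasonal


x = [12.0, 9.0, 13.0, 8.0, 14.0, 9.0]  # hypothetical series with season length 2
fc, final_idx = additive_seasonal_smooth(x, alpha=0.5, delta=0.0, s0=10.0, seasonal0=[1.0, -1.0])
print(final_idx)  # [1.0, -1.0]: with delta = 0 the seasonal indices never change
```

With δ set to 0 the seasonal indices stay at their initial values for the whole series, matching the constant seasonal component described above; larger values of δ let the indices follow the forecast errors more closely.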
Linear, exponential, and damped trend. To remain with the toy example above, the sales for a toy can show a linear upward trend e. Each type of trend leaves a clear "signature" that can usually be identified in the series; shown below in the brief discussion of the different models are icons that illustrate the general patterns.