ETS (Error, Trend, Seasonal) is a family of exponential smoothing models for time series forecasting. Its starting point is the naive assumption that the best prediction for time $T+h$ is the last observed value at time $T$, i.e., $\hat{y}_{T+h|T} = y_T$. ETS refines this by modeling the time series as a combination of trend, seasonal, and error components. See exponential smoothing and ETS in statsmodels for more details.
First, ETS starts from a weighted average of the latest observation and the previous forecast, $\hat{y}_{t+1|t} = \alpha y_t + (1-\alpha) \hat{y}_{t|t-1}$, where $0 \le \alpha \le 1$ is the smoothing parameter. In component form, with $l_t$ denoting the level, the forecast is $\hat{y}_{t+h|t} = l_t$ and the level is updated as $l_t = \alpha y_t + (1-\alpha) l_{t-1}$.
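The level recursion can be sketched in a few lines of plain Python. This is a minimal illustration, not the statsmodels implementation: the function name `simple_exponential_smoothing`, the choice to initialise the level with the first observation, and the sample data are all assumptions made for the example.

```python
def simple_exponential_smoothing(y, alpha, initial_level=None):
    """Return the smoothed levels l_1..l_T for the series y."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Assumption: start the recursion from the first observation.
    level = y[0] if initial_level is None else initial_level
    levels = []
    for obs in y:
        # l_t = alpha * y_t + (1 - alpha) * l_{t-1}
        level = alpha * obs + (1 - alpha) * level
        levels.append(level)
    return levels


y = [10.0, 12.0, 11.0, 13.0]  # made-up data
levels = simple_exponential_smoothing(y, alpha=0.5)
forecast = levels[-1]  # the forecast is flat: the same value for every horizon h
# levels == [10.0, 11.0, 11.0, 12.0]
```

With $\alpha = 1$ the level simply tracks the latest observation, which recovers the naive forecast; smaller values of $\alpha$ give more weight to older observations.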
Holt’s linear trend method is used for the trend.
In Holt’s linear trend method, $\hat{y}_{t+h|t} = l_t + h b_t$, where $b_t$ is the trend component. That is, the prediction for time $t+h$ is the level at time $t$ plus the trend component multiplied by the horizon $h$.
The trend is updated as $b_t = \beta (l_t - l_{t-1}) + (1-\beta) b_{t-1}$, meaning that the trend at time $t$ is a weighted average of the difference between the level at time $t$ and the level at time $t-1$, and the previous trend at time $t-1$.
The level is updated as $l_t = \alpha y_t + (1-\alpha)(l_{t-1} + b_{t-1})$. The $\alpha y_t$ term is the same as before, while $l_{t-1} + b_{t-1}$, the sum of the previous level and the previous trend, replaces $l_{t-1}$ (i.e., the previous one-step forecast $\hat{y}_{t|t-1}$) from the earlier level update equation.
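The forecast, level, and trend equations of Holt’s method fit together as in the sketch below. The function name `holt_linear` and the initialisation (level from the first observation, trend from the first difference) are assumptions for illustration; real implementations usually estimate the initial states together with $\alpha$ and $\beta$.

```python
def holt_linear(y, alpha, beta, horizon):
    """Forecast `horizon` steps ahead with Holt's linear trend method."""
    if len(y) < 2:
        raise ValueError("need at least two observations")
    # Assumption: simple heuristic initial states.
    level = y[0]
    trend = y[1] - y[0]
    for obs in y[1:]:
        prev_level = level
        # l_t = alpha * y_t + (1 - alpha) * (l_{t-1} + b_{t-1})
        level = alpha * obs + (1 - alpha) * (prev_level + trend)
        # b_t = beta * (l_t - l_{t-1}) + (1 - beta) * b_{t-1}
        trend = beta * (level - prev_level) + (1 - beta) * trend
    # y_hat_{t+h|t} = l_t + h * b_t
    return [level + h * trend for h in range(1, horizon + 1)]


# On a perfectly linear series the method continues the line.
forecasts = holt_linear([1.0, 2.0, 3.0, 4.0, 5.0], alpha=0.5, beta=0.5, horizon=3)
# forecasts == [6.0, 7.0, 8.0]
```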
Damped trend is a variation of the trend component. It is used when the trend is expected to slow down over time, or to avoid overshooting in long-horizon forecasts. The form is similar to $y = l + b$, but a damping parameter $\phi$ (with $0 < \phi < 1$) is applied to the previous trend in the trend update equation, i.e., $b_t = \beta (l_t - l_{t-1}) + (1-\beta) \phi b_{t-1}$. In the standard formulation the same factor also damps the trend in the level update, $l_t = \alpha y_t + (1-\alpha)(l_{t-1} + \phi b_{t-1})$, and in the forecast, $\hat{y}_{t+h|t} = l_t + (\phi + \phi^2 + \dots + \phi^h) b_t$.
We do not cover the seasonal component on this page, but it works much like the trend component. The main change is that a seasonal component enters the level update equation.
ETS models come in two types: additive and multiplicative. Everything above was additive, e.g., $y$ is the sum of level and trend. A multiplicative model, on the other hand, has the form $y = l \times b$.
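A damped version of Holt’s method can be sketched as follows. It assumes the standard damped formulation, in which $\phi$ multiplies the previous trend in both the level and trend updates and the $h$-step forecast is $l_t + (\phi + \phi^2 + \dots + \phi^h) b_t$. The function name `damped_holt` and the heuristic initial states are illustrative assumptions.

```python
def damped_holt(y, alpha, beta, phi, horizon):
    """Forecast `horizon` steps ahead with a damped additive trend."""
    if len(y) < 2:
        raise ValueError("need at least two observations")
    # Assumption: simple heuristic initial states.
    level = y[0]
    trend = y[1] - y[0]
    for obs in y[1:]:
        prev_level = level
        # l_t = alpha * y_t + (1 - alpha) * (l_{t-1} + phi * b_{t-1})
        level = alpha * obs + (1 - alpha) * (prev_level + phi * trend)
        # b_t = beta * (l_t - l_{t-1}) + (1 - beta) * phi * b_{t-1}
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend
    forecasts = []
    damp_sum = 0.0
    for h in range(1, horizon + 1):
        damp_sum += phi ** h  # phi + phi^2 + ... + phi^h
        forecasts.append(level + damp_sum * trend)
    return forecasts


series = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up data
undamped = damped_holt(series, alpha=0.5, beta=0.5, phi=1.0, horizon=3)
damped = damped_holt(series, alpha=0.5, beta=0.5, phi=0.9, horizon=3)
```

With $\phi = 1$ the sketch reduces to the undamped linear trend. With $\phi < 1$ each additional step adds a smaller increment, so the forecast flattens out toward $l_t + \frac{\phi}{1-\phi} b_t$ as $h$ grows.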
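To make the contrast concrete, here is a sketch of a multiplicative-trend counterpart of Holt’s method, in which the trend is a growth ratio rather than a difference. It assumes the usual multiplicative-trend recursions $l_t = \alpha y_t + (1-\alpha) l_{t-1} b_{t-1}$, $b_t = \beta \, l_t / l_{t-1} + (1-\beta) b_{t-1}$, and $\hat{y}_{t+h|t} = l_t b_t^h$. The function name `holt_multiplicative` and the initial states are illustrative assumptions, and the series must be strictly positive.

```python
def holt_multiplicative(y, alpha, beta, horizon):
    """Forecast `horizon` steps ahead with a multiplicative trend."""
    if len(y) < 2 or min(y) <= 0:
        raise ValueError("need at least two strictly positive observations")
    # Assumption: level from the first observation, trend from the first ratio.
    level = y[0]
    trend = y[1] / y[0]
    for obs in y[1:]:
        prev_level = level
        # l_t = alpha * y_t + (1 - alpha) * (l_{t-1} * b_{t-1})
        level = alpha * obs + (1 - alpha) * (prev_level * trend)
        # b_t = beta * (l_t / l_{t-1}) + (1 - beta) * b_{t-1}
        trend = beta * (level / prev_level) + (1 - beta) * trend
    # y_hat_{t+h|t} = l_t * b_t ** h
    return [level * trend ** h for h in range(1, horizon + 1)]


# On a series that doubles every step, the forecast keeps doubling.
growth = holt_multiplicative([1.0, 2.0, 4.0, 8.0], alpha=0.5, beta=0.5, horizon=2)
# growth == [16.0, 32.0]
```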
To demonstrate model performance, we show the model's prediction results for the air passengers dataset. The cross-validation process identified the best transformation to make the time series stationary and the optimal hyperparameters. The Root Mean Squared Error (RMSE) of the one-step-ahead forecast was used to determine the best model.
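The exact cross-validation procedure is not spelled out here, so the following is only one plausible scheme: an expanding-window (rolling-origin) evaluation that scores one-step-ahead forecasts by RMSE and picks the smoothing parameter with the lowest score. The helper names `rolling_origin_rmse` and `ses_forecaster`, the candidate grid, and the data are all hypothetical.

```python
import math


def rolling_origin_rmse(y, forecast_one_step, min_train=3):
    """RMSE of one-step-ahead forecasts over expanding training windows."""
    errors = []
    for split in range(min_train, len(y)):
        train, actual = y[:split], y[split]
        errors.append(actual - forecast_one_step(train))
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def ses_forecaster(alpha):
    """Build a one-step forecaster based on simple exponential smoothing."""
    def forecast(train):
        level = train[0]  # assumption: initialise with the first observation
        for obs in train:
            level = alpha * obs + (1 - alpha) * level
        return level
    return forecast


series = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0, 13.0, 15.0]  # made-up data
candidates = [0.2, 0.5, 0.8]  # hypothetical grid of smoothing parameters
scores = {a: rolling_origin_rmse(series, ses_forecaster(a)) for a in candidates}
best_alpha = min(scores, key=scores.get)
```

Each split only uses observations before the forecast date, which avoids leaking future information into the evaluation.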
The chart below illustrates: