ARIMA (Autoregressive Integrated Moving Average) is a popular statistical method for time series forecasting. It combines three components: AR (Autoregressive), I (Integrated), and MA (Moving Average). ARIMA is widely used because it can handle many kinds of time series, including series with trends and, through its seasonal extension SARIMA, series with seasonality.
The AR (Autoregressive) model, specifically $AR(p)$, uses the previous (lagged) $p$ values to predict the current value.
Let's say the current bitcoin price at time $t$ is $Y_t$. The $AR(p)$ model states that $Y_t = \mu + a_1 Y_{t-1} + a_2 Y_{t-2} + \cdots + a_p Y_{t-p} + \epsilon_t$, where $\mu$ is a constant term (for a stationary process, the mean is $\mu / (1 - a_1 - \cdots - a_p)$), and $\epsilon_t$ is the error term at time $t$. In other words, the current value is a linear combination of the previous $p$ values, plus a constant and an error.
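To make the recursion concrete, here is a minimal sketch that generates an $AR(p)$ series with only the Python standard library. The function name `simulate_ar`, the coefficient 0.2, the seed, and the choice to treat values before the start of the series as 0 are all assumptions made for illustration.

```python
import random


def simulate_ar(coeffs, noise, mu=0.0):
    """Generate Y_t = mu + a_1*Y_{t-1} + ... + a_p*Y_{t-p} + eps_t.

    coeffs is [a_1, ..., a_p] and noise is the list of eps_t values.
    Assumption: values before the start of the series are treated as 0.
    """
    series = []
    for t, eps in enumerate(noise):
        value = mu + eps
        for lag, a in enumerate(coeffs, start=1):
            if t - lag >= 0:
                value += a * series[t - lag]
        series.append(value)
    return series


rng = random.Random(0)  # fixed seed so the run is repeatable
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = simulate_ar([0.2], noise)  # AR(1) with a_1 = 0.2
print(y[:5])
```

With a small coefficient such as 0.2, each value depends only weakly on the previous one, so the series looks close to the noise itself.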
For example, the $AR(1)$ process $Y_t = 0.2 Y_{t-1} + \epsilon_t$ is shown below:
The MA (Moving Average) model, specifically $MA(q)$, uses the previous $q$ errors to predict the current value.
Let's say the current bitcoin price is $Y_t$. The $MA(q)$ model states that $Y_t = \mu + \epsilon_t + b_1 \epsilon_{t-1} + b_2 \epsilon_{t-2} + \cdots + b_q \epsilon_{t-q}$, where $\mu$ is the mean of the process, and $\epsilon_t$ is the error term at time $t$. In other words, the current value is the mean plus a linear combination of the current error and the previous $q$ errors. If we consider errors as shocks to the system, the $MA(q)$ model describes how the shocks propagate through the system over time.
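The idea of a shock propagating through the system can be sketched by feeding a single unit error into an $MA(1)$ recursion. The helper name `simulate_ma` and the coefficient 0.2 are illustrative assumptions, and errors before the start of the series are taken to be 0.

```python
def simulate_ma(coeffs, noise, mu=0.0):
    """Generate Y_t = mu + eps_t + b_1*eps_{t-1} + ... + b_q*eps_{t-q}.

    coeffs is [b_1, ..., b_q] and noise is the list of eps_t values.
    Assumption: errors before the start of the series are treated as 0.
    """
    series = []
    for t, eps in enumerate(noise):
        value = mu + eps
        for lag, b in enumerate(coeffs, start=1):
            if t - lag >= 0:
                value += b * noise[t - lag]
        series.append(value)
    return series


# A single unit shock at t = 0 and no further errors.
shock = [1.0, 0.0, 0.0, 0.0]
print(simulate_ma([0.2], shock))  # [1.0, 0.2, 0.0, 0.0]
```

The shock affects the series for exactly $q$ further steps and then disappears, which is the finite memory that distinguishes an MA model from an AR model.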
For example, the $MA(1)$ process $Y_t = \epsilon_t + 0.2 \epsilon_{t-1}$ is shown below:
The ARMA (Autoregressive Moving Average) model, specifically $ARMA(p, q)$, uses the previous $p$ values and the previous $q$ errors to predict the current value.
To simplify the notation, let's define the back-shift operator $B$ by $B Y_t = Y_{t-1}$. Using this operator and ignoring the constant $\mu$, we can write $AR(1)$ and $MA(1)$ as follows:
$AR(1)$: $(1 - a_1 B) Y_t = \epsilon_t$
$MA(1)$: $Y_t = (1 + b_1 B) \epsilon_t$
As an example, the $ARMA(1, 1)$ model $(1-0.2B)Y_t = (1+0.3B)\epsilon_t$ means $Y_t = 0.2 Y_{t-1} + 0.3 \epsilon_{t-1} + \epsilon_{t}$.
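The expanded $ARMA(1, 1)$ recursion can be checked with a few lines of Python. This is only a sketch: the function name `simulate_arma11` is hypothetical, and both $Y$ and $\epsilon$ are assumed to be 0 before the first step.

```python
def simulate_arma11(phi, theta, noise):
    """Generate Y_t = phi*Y_{t-1} + theta*eps_{t-1} + eps_t.

    Assumption: Y and eps are both 0 before the first time step.
    """
    series = []
    prev_y, prev_eps = 0.0, 0.0
    for eps in noise:
        y = phi * prev_y + theta * prev_eps + eps
        series.append(y)
        prev_y, prev_eps = y, eps
    return series


# Response to a single unit shock for (1 - 0.2B) Y_t = (1 + 0.3B) eps_t.
response = simulate_arma11(0.2, 0.3, [1.0, 0.0, 0.0, 0.0])
print(response)
```

The response is approximately 1, 0.5, 0.1, 0.02: the MA term only contributes at lag 1 ($0.2 + 0.3 = 0.5$), after which the AR term shrinks the effect by a factor of 0.2 per step.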
The ARIMA (Autoregressive Integrated Moving Average) model, specifically $ARIMA(p, d, q)$, takes the difference of the time series $d$ times and then applies the $ARMA(p, q)$ model to the differenced series.
If $d=1$, the difference of the time series is defined as $(1-B)Y_t = Y_t - Y_{t-1}$, and $ARIMA(1, 1, 1)$ is then $(1-\phi_1 B)(1-B)Y_t = (1+\theta_1B)\epsilon_t$, where $\phi_1$ and $\theta_1$ play the roles of the AR and MA coefficients $a_1$ and $b_1$ used earlier.
Consider $(1-0.2B)(1-B)Y_t = (1+0.3B)\epsilon_t$. It can be expanded as:
$(Y_t - Y_{t-1}) - 0.2(Y_{t-1} - Y_{t-2}) = \epsilon_t + 0.3\epsilon_{t-1}$
$Y_t - Y_{t-1} = 0.2(Y_{t-1} - Y_{t-2}) + 0.3\epsilon_{t-1} + \epsilon_{t}$
$Y_t = Y_{t-1} + 0.2(Y_{t-1} - Y_{t-2}) + 0.3\epsilon_{t-1} + \epsilon_{t}$
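As a quick sanity check on this expansion, the sketch below generates the series in two ways: directly from the last line of the derivation, and by building an $ARMA(1, 1)$ series of differences and summing it up. The function names, the seed, and the zero starting values are assumptions for illustration.

```python
import random


def arima111_direct(phi, theta, noise):
    """Y_t = Y_{t-1} + phi*(Y_{t-1} - Y_{t-2}) + theta*eps_{t-1} + eps_t.

    Assumption: Y and eps are 0 before the first time step.
    """
    y = []
    for t, eps in enumerate(noise):
        y1 = y[t - 1] if t >= 1 else 0.0
        y2 = y[t - 2] if t >= 2 else 0.0
        e1 = noise[t - 1] if t >= 1 else 0.0
        y.append(y1 + phi * (y1 - y2) + theta * e1 + eps)
    return y


def arima111_via_differences(phi, theta, noise):
    """Build W_t = phi*W_{t-1} + theta*eps_{t-1} + eps_t, then set Y_t = Y_{t-1} + W_t."""
    y = []
    w_prev, e_prev, level = 0.0, 0.0, 0.0
    for eps in noise:
        w = phi * w_prev + theta * e_prev + eps  # ARMA(1, 1) on the differences
        level += w                               # integrate (cumulative sum)
        y.append(level)
        w_prev, e_prev = w, eps
    return y


rng = random.Random(1)  # fixed seed so the run is repeatable
eps = [rng.gauss(0.0, 1.0) for _ in range(100)]
direct = arima111_direct(0.2, 0.3, eps)
summed = arima111_via_differences(0.2, 0.3, eps)
max_gap = max(abs(a - b) for a, b in zip(direct, summed))
print(max_gap)  # only floating-point rounding separates the two
```

The two constructions agree up to rounding, which is the sense in which "Integrated" means the ARMA model is applied to the differences and the result is summed back.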
An example chart of this process is shown below:
Differencing is useful when the time series is not stationary. See stationary time series for more details.
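A small example of why differencing helps: applying $(1-B)$ to a series with a linear trend leaves a constant. The helper name `difference` and the toy numbers are made up for illustration.

```python
def difference(series, lag=1):
    """Apply (1 - B^lag) to the series, returning Y_t - Y_{t-lag}."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


trend = [3 + 2 * t for t in range(6)]   # 3, 5, 7, 9, 11, 13
print(difference(trend))                # [2, 2, 2, 2, 2]
print(difference(difference(trend)))    # [0, 0, 0, 0]
```

Each round of differencing shortens the series by `lag` values, because the first `lag` observations have nothing to be subtracted from.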
The SARIMA (Seasonal Autoregressive Integrated Moving Average) model, specifically $ARIMA(p, d, q)(P, D, Q)_s$, models a time series as a combination of a non-seasonal part, $ARIMA(p, d, q)$, and a seasonal part, $(P, D, Q)_s$, with seasonal period $s$.
Let's consider the simple form of $ARIMA(1, 1, 1)(1, 1, 1)_4$. $ARIMA(1, 1, 1)$ deals with $Y_t - Y_{t-1}$, while $(1, 1, 1)_4$ handles $Y_t - Y_{t-4}$.
As a result, it is defined as $(1-\phi_1 B)(1 - \Phi_1 B^4)(1-B)(1-B^4)Y_t = (1+\theta_1B)(1+\Theta_1B^4)\epsilon_t$.
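To see which lags the left-hand side actually involves, the product of the four factors can be multiplied out numerically. This is a minimal sketch: the helper names and the values $\phi_1 = 0.2$ and $\Phi_1 = 0.5$ are hypothetical and chosen only for illustration.

```python
def poly_mul(a, b):
    """Multiply two polynomials in B; index k holds the coefficient of B^k."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def one_minus(c, power):
    """Coefficients of the factor (1 - c * B^power), for power >= 1."""
    p = [0.0] * (power + 1)
    p[0] = 1.0
    p[power] = -c
    return p


def sarima_ar_side(phi1, Phi1, s):
    """Coefficients of (1 - phi1*B)(1 - Phi1*B^s)(1 - B)(1 - B^s)."""
    result = [1.0]
    factors = (one_minus(phi1, 1), one_minus(Phi1, s),
               one_minus(1.0, 1), one_minus(1.0, s))
    for factor in factors:
        result = poly_mul(result, factor)
    return result


coeffs = sarima_ar_side(0.2, 0.5, 4)  # illustrative parameter values
for lag, c in enumerate(coeffs):
    if abs(c) > 1e-12:
        print(f"B^{lag}: {c:+.3f}")
```

With both AR coefficients set to 0, only the differencing factors remain and the result reduces to $1 - B - B^4 + B^5$, i.e. $Y_t - Y_{t-1} - Y_{t-4} + Y_{t-5}$.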
For the demonstration of ARIMA, we're going to use the Air Passengers dataset. As for the model, we use $ARIMA(1, 1, 0)(0, 1, 0)_{12}$ on the log-transformed air passenger data. Since ARIMA handles differencing internally, the data was not differenced beforehand.
In the chart, the predictions for the very first value and for a point about a year later are significantly off. They are caused by differencing: ARIMA uses an initial value of 0 where no earlier observation is available to take the difference from. Otherwise, we see that ARIMA fits the data well and also performs well when forecasting into the future.
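To show what $ARIMA(1, 1, 0)(0, 1, 0)_{12}$ on log data does under the hood, here is a rough standard-library sketch. It does not use the Air Passengers data: the series is synthetic and invented purely for illustration, the helper names are hypothetical, and the coefficient is estimated by simple least squares rather than by the maximum likelihood fit a statistics library would typically perform.

```python
import math
import random


def difference(series, lag=1):
    """Apply (1 - B^lag) to the series, returning x_t - x_{t-lag}."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def fit_ar1(z):
    """Least-squares estimate of phi in z_t = phi*z_{t-1} + eps_t (no constant)."""
    num = sum(z[t] * z[t - 1] for t in range(1, len(z)))
    den = sum(z[t - 1] ** 2 for t in range(1, len(z)))
    return num / den


def forecast_next(x, phi, s=12):
    """One-step forecast from (1 - phi*B)(1 - B)(1 - B^s) x_t = eps_t."""
    n = len(x)
    # Last doubly differenced value: z_t = x_t - x_{t-1} - x_{t-s} + x_{t-s-1}.
    z_last = x[n - 1] - x[n - 2] - x[n - 1 - s] + x[n - 2 - s]
    z_hat = phi * z_last  # AR(1) forecast of the next differenced value
    # Undo both differences to get back to the level of x.
    return z_hat + x[n - 1] + x[n - s] - x[n - s - 1]


# Synthetic monthly series (NOT real data): growth, yearly cycle, small noise.
rng = random.Random(42)
synthetic = [
    100.0 * math.exp(0.01 * t + 0.2 * math.sin(2 * math.pi * t / 12)
                     + rng.gauss(0.0, 0.02))
    for t in range(120)
]

log_y = [math.log(v) for v in synthetic]                 # log transform
z = difference(difference(log_y, lag=12), lag=1)         # D = 1 (s = 12), then d = 1
phi_hat = fit_ar1(z)
prediction = math.exp(forecast_next(log_y, phi_hat))     # back to the original scale
print(phi_hat, prediction)
```

Note that the doubly differenced series is 13 values shorter than the input, so a model like this has no differenced history for the earliest observations.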