Intro to ARIMA models

6 Apr 2023

Big picture

Let’s imagine that we can describe our data as a combination of the mean trend (\(m_t\)) and error.

\[x_t = m_t + e_t\]

Fisheries biologists and ecologists often want to know \(m_t\). In fact, estimating it is often our main, or even sole, goal.

But let’s say we don’t care about \(m_t\). Our only goal is to predict \(x_{t+1}\).

How could we do this?

The Box-Jenkins approach (ARIMA models) tackles this in a totally different way: it never estimates \(m_t\) at all.

Keep differencing the data until you get a new, transformed, stationary time series \(\Delta^d x_t\)
Any stationary time series can be modeled as an ARMA process (Wold Decomposition). So now fit an ARMA model to \(\Delta^d x_t\)
Using the estimated ARMA, predict \(\Delta^d x_{t+1}\)
Using \(\Delta^d x_{t+1}\), \(x_t\), \(x_{t-1}\), \(x_{t-2}\), etc., you can compute \(x_{t+1}\)
That’s the prediction!
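The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a real ARIMA fit: it assumes an AR(1) with intercept, estimated by least squares, as a stand-in for the full ARMA model, and the function names `difference`, `fit_ar1`, and `forecast_next` are made up for this example. It needs at least \(d+2\) data points.

```python
def difference(x, d=1):
    """Apply the first-difference operator d times."""
    for _ in range(d):
        x = [b - a for a, b in zip(x, x[1:])]
    return x


def fit_ar1(y):
    """Least-squares estimate of (c, phi) in y[t] = c + phi * y[t-1] + z[t].

    Assumption: AR(1) stands in for the ARMA model of the slides.
    """
    prev, curr = y[:-1], y[1:]
    n = len(curr)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    sxy = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    phi = sxy / sxx if sxx > 0 else 0.0  # constant series: no AR signal
    return mean_curr - phi * mean_prev, phi


def forecast_next(x, d=1):
    """One-step forecast of x[t+1]: difference, fit, predict, undo differencing."""
    # Keep every level of differencing so we can add the prediction back up.
    levels = [list(x)]
    for _ in range(d):
        levels.append(difference(levels[-1]))

    c, phi = fit_ar1(levels[-1])
    pred = c + phi * levels[-1][-1]  # predicted Delta^d x_{t+1}

    # Undo differencing: Delta^{k} x_{t+1} = Delta^{k} x_t + Delta^{k+1} x_{t+1}
    for k in range(d - 1, -1, -1):
        pred = levels[k][-1] + pred
    return pred


# A noise-free linear series 1, 3, 5, ..., 19: the next value is 21.
print(forecast_next([2 * t + 1 for t in range(10)], d=1))  # 21.0
```

Note that nothing in the code estimates a trend. The trend only shows up implicitly, through the last observed values that are added back when the differencing is undone.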

\[\Delta^1 x_t = x_t - x_{t-1}\] \[\Delta^2 x_t = \Delta^1 x_t - \Delta^1 x_{t-1}\] \[\Delta^3 x_t = \Delta^2 x_t - \Delta^2 x_{t-1}\]
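Reading these definitions in reverse shows how a predicted difference is turned back into a prediction of \(x_{t+1}\). A short worked version for \(d=1\) and \(d=2\), where the hats mark predicted quantities (this hat notation is an assumption added here for illustration):

\[\Delta^2 x_{t+1} = \Delta^1 x_{t+1} - \Delta^1 x_{t} = x_{t+1} - 2x_t + x_{t-1}\]

\[d=1: \quad \hat{x}_{t+1} = x_t + \widehat{\Delta^1 x}_{t+1}\]

\[d=2: \quad \hat{x}_{t+1} = 2x_t - x_{t-1} + \widehat{\Delta^2 x}_{t+1}\]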

In this approach to predicting \(x_{t+1}\), we remove \(m_t\) from our data using differencing.

We don’t have to worry about a model for \(m_t\) because we have removed it!!

You can remove essentially any smooth wiggly trend with enough differencing: each round of differencing lowers the degree of a polynomial trend by one.
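A quick way to see the effect of repeated differencing is to try it on a noise-free polynomial trend. The sketch below uses a made-up cubic trend and a hypothetical helper `difference`; real data would also carry noise, which differencing does not remove.

```python
def difference(x, d=1):
    """Apply the first-difference operator d times."""
    for _ in range(d):
        x = [b - a for a, b in zip(x, x[1:])]
    return x


# Hypothetical cubic trend m_t = t^3 - 2 t^2 + t + 4, with no error term.
trend = [t**3 - 2 * t**2 + t + 4 for t in range(8)]

d1 = difference(trend, 1)  # quadratic in t: still curved
d2 = difference(trend, 2)  # linear in t
d3 = difference(trend, 3)  # constant (3! times the leading coefficient)
d4 = difference(trend, 4)  # all zeros: the trend is gone

print(d1)  # [0, 2, 10, 24, 44, 70, 102]
print(d2)  # [2, 8, 14, 20, 26, 32]
print(d3)  # [6, 6, 6, 6, 6]
print(d4)  # [0, 0, 0, 0]
```

Each difference also shortens the series by one point, so heavy differencing costs data as well as interpretability.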

The error structure of \(\Delta^d x_{t+1}\) is NOT the same as \(e_t\).

\[\Delta^d x_{t} = \phi_1\Delta^d x_{t-1} + \phi_2\Delta^d x_{t-2} + \dots + z_t\]

\(z_t\) is the error of the differences. And the \(\phi\)’s in the AR part are coefficients for the differences, not for the original \(x_t\).
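A small worked example shows why the error of the differences is not \(e_t\). Assume, purely for illustration, a linear trend \(m_t = a + bt\) and uncorrelated errors \(e_t\) with variance \(\sigma^2\) (neither assumption comes from the model above). Then

\[\Delta^1 x_t = (m_t - m_{t-1}) + (e_t - e_{t-1}) = b + e_t - e_{t-1}\]

\[\text{Var}(e_t - e_{t-1}) = 2\sigma^2, \qquad \text{Cov}(e_t - e_{t-1},\, e_{t-1} - e_{t-2}) = -\sigma^2\]

So the differenced series has twice the error variance of \(x_t\) and a lag-1 autocorrelation of \(-1/2\), even though the original \(e_t\) were uncorrelated.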

But remember, the objective was to predict \(x_{t+1}\) not to fit a model with a biological interpretation.

ARIMA models are one approach for fitting data that have underlying trends.

Other approaches

Regression (we won’t cover this)
Dynamic Linear Regression (we will cover this)
Stochastic level models (we will do a lot of variants of this in class)
ARMAX models: \(x_t = b x_{t-1} + \beta \text{covariates} + \text{error}\) (we will do some of this)