5.8 Estimating the ARMA orders
We will use the auto.arima()
function in forecast. This function will estimate the level of differencing needed to make our data stationary and estimate the AR and MA orders using AICc (or BIC if we choose).
5.8.1 Example: model selection for AR(2) data
::auto.arima(ar2) forecast
Series: ar2
ARIMA(2,0,2) with non-zero mean
Coefficients:
ar1 ar2 ma1 ma2 mean
0.2795 0.5938 0.4861 -0.0943 0.9553
s.e. 1.1261 1.0413 1.1284 0.1887 0.3398
sigma^2 estimated as 0.9848: log likelihood=-1409.57
AIC=2831.15 AICc=2831.23 BIC=2860.59
Works with missing data too though might not estimate very close to the true model form.
::auto.arima(ar2miss) forecast
Series: ar2miss
ARIMA(0,1,0)
sigma^2 estimated as 1.066: log likelihood=-82.07
AIC=166.15 AICc=166.19 BIC=168.72
5.8.2 Fitting to 100 simulated data sets
Let’s fit to 100 simulated data sets and see how often the true (generating) model form is selected.
<- rep(NA, 100)
save.fits for (i in 1:100) {
<- arima.sim(n = 100, model = list(ar = c(0.8, 0.1)))
a2 <- auto.arima(a2, seasonal = FALSE, max.d = 0, max.q = 0)
fit <- paste0(fit$arma[1], "-", fit$arma[2])
save.fits[i]
}table(save.fits)
save.fits
1-0 2-0 3-0
71 22 7
auto.arima()
uses AICc for selection by default. You can change that to AIC or BIC using ic="aic"
or ic="bic"
.
Repeat the simulation using AIC and BIC to see how the choice of the information criteria affects the model that is selected.
5.8.3 Trace=TRUE
We can set Trace=TRUE
to see what models auto.arima()
fit.
::auto.arima(ar2, trace = TRUE) forecast
Fitting models using approximations to speed things up...
ARIMA(2,0,2) with non-zero mean : 2824.88
ARIMA(0,0,0) with non-zero mean : 4430.868
ARIMA(1,0,0) with non-zero mean : 2842.785
ARIMA(0,0,1) with non-zero mean : 3690.512
ARIMA(0,0,0) with zero mean : 4602.31
ARIMA(1,0,2) with non-zero mean : 2827.422
ARIMA(2,0,1) with non-zero mean : 2825.235
ARIMA(3,0,2) with non-zero mean : 2830.176
ARIMA(2,0,3) with non-zero mean : 2826.503
ARIMA(1,0,1) with non-zero mean : 2825.438
ARIMA(1,0,3) with non-zero mean : 2829.358
ARIMA(3,0,1) with non-zero mean : Inf
ARIMA(3,0,3) with non-zero mean : 2825.766
ARIMA(2,0,2) with zero mean : 2829.536
Now re-fitting the best model(s) without approximations...
ARIMA(2,0,2) with non-zero mean : 2831.232
Best model: ARIMA(2,0,2) with non-zero mean
Series: ar2
ARIMA(2,0,2) with non-zero mean
Coefficients:
ar1 ar2 ma1 ma2 mean
0.2795 0.5938 0.4861 -0.0943 0.9553
s.e. 1.1261 1.0413 1.1284 0.1887 0.3398
sigma^2 estimated as 0.9848: log likelihood=-1409.57
AIC=2831.15 AICc=2831.23 BIC=2860.59
5.8.4 stepwise=FALSE
We can set stepwise=FALSE
to use an exhaustive search. The model may be different than the result from the non-exhaustive search.
::auto.arima(ar2, trace = TRUE, stepwise = FALSE) forecast
Fitting models using approximations to speed things up...
ARIMA(0,0,0) with zero mean : 4602.31
ARIMA(0,0,0) with non-zero mean : 4430.868
ARIMA(0,0,1) with zero mean : 3815.931
ARIMA(0,0,1) with non-zero mean : 3690.512
ARIMA(0,0,2) with zero mean : 3425.037
ARIMA(0,0,2) with non-zero mean : 3334.754
ARIMA(0,0,3) with zero mean : 3239.347
ARIMA(0,0,3) with non-zero mean : 3170.541
ARIMA(0,0,4) with zero mean : 3114.265
ARIMA(0,0,4) with non-zero mean : 3059.938
ARIMA(0,0,5) with zero mean : 3042.136
ARIMA(0,0,5) with non-zero mean : 2998.531
ARIMA(1,0,0) with zero mean : 2850.655
ARIMA(1,0,0) with non-zero mean : 2842.785
ARIMA(1,0,1) with zero mean : 2830.652
ARIMA(1,0,1) with non-zero mean : 2825.438
ARIMA(1,0,2) with zero mean : 2832.668
ARIMA(1,0,2) with non-zero mean : 2827.422
ARIMA(1,0,3) with zero mean : 2834.675
ARIMA(1,0,3) with non-zero mean : 2829.358
ARIMA(1,0,4) with zero mean : 2835.539
ARIMA(1,0,4) with non-zero mean : 2829.825
ARIMA(2,0,0) with zero mean : 2828.987
ARIMA(2,0,0) with non-zero mean : 2823.774
ARIMA(2,0,1) with zero mean : 2829.952
ARIMA(2,0,1) with non-zero mean : 2825.235
ARIMA(2,0,2) with zero mean : 2829.536
ARIMA(2,0,2) with non-zero mean : 2824.88
ARIMA(2,0,3) with zero mean : 2831.461
ARIMA(2,0,3) with non-zero mean : 2826.503
ARIMA(3,0,0) with zero mean : 2831.057
ARIMA(3,0,0) with non-zero mean : 2826.236
ARIMA(3,0,1) with zero mean : Inf
ARIMA(3,0,1) with non-zero mean : Inf
ARIMA(3,0,2) with zero mean : 2834.788
ARIMA(3,0,2) with non-zero mean : 2830.176
ARIMA(4,0,0) with zero mean : 2833.323
ARIMA(4,0,0) with non-zero mean : 2828.759
ARIMA(4,0,1) with zero mean : 2827.798
ARIMA(4,0,1) with non-zero mean : 2823.853
ARIMA(5,0,0) with zero mean : 2835.315
ARIMA(5,0,0) with non-zero mean : 2830.501
Now re-fitting the best model(s) without approximations...
Best model: ARIMA(2,0,0) with non-zero mean
Series: ar2
ARIMA(2,0,0) with non-zero mean
Coefficients:
ar1 ar2 mean
0.7684 0.1387 0.9561
s.e. 0.0314 0.0314 0.3332
sigma^2 estimated as 0.9832: log likelihood=-1409.77
AIC=2827.54 AICc=2827.58 BIC=2847.17
5.8.5 Fit to the anchovy data
<- auto.arima(anchovyts)
fit fit
Series: anchovyts
ARIMA(0,1,1) with drift
Coefficients:
ma1 drift
-0.6685 0.0542
s.e. 0.1977 0.0142
sigma^2 estimated as 0.04037: log likelihood=5.39
AIC=-4.79 AICc=-3.65 BIC=-1.13
Note arima()
writes a MA model like:
\[x_t = e_t + b_1 e_{t-1} + b_2 e_{t-2}\]
while many authors use this notation:
\[x_t = e_t - \theta_1 e_{t-1} - \theta_2 e_{t-2}\]
so the MA parameters reported by auto.arima()
will be NEGATIVE of that reported in Stergiou and Christou (1996) who analyze these same data. Note, in Stergiou and Christou, the model is written in backshift notation on page 112. To see the model as the equation above, I translated from backshift to non-backshift notation.