In this chapter, you will practice fitting an ARIMA model to catch data using the Box-Jenkins method. You will also prepare simple forecasts using the forecast package.
Load the catch landings from Greek waters (landings.RData) and the Chinook landings from Washington (chinook.RData).
load("landings.RData")
load("chinook.RData")
Ensure you have the necessary packages.
library(ggplot2)
library(gridExtra)
library(reshape2)
library(tseries)
library(urca)
library(forecast)
The Box-Jenkins method involves three main steps:
A. Model form selection
B. Parameter estimation
C. Model checking
The standard algorithms for fitting ARIMA models assume stationarity, and we will be using those algorithms, so it is important to test your data and, if needed, transform them (via differencing) to ensure stationarity. It is possible to fit ARMA models without transforming the data, and we will cover that in later chapters, but it is not commonly done in the literature on forecasting with ARMA models, certainly not in the literature on catch forecasting.
Keep in mind also that ARMA models describe stationary processes, and you do not want to get into the situation of trying to fit an incompatible process model to your data. We will see examples of this when we start fitting models to non-stationary data and random walks.
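As a quick preview, here is a hedged sketch (the series rw.demo is made up for this illustration, not part of the lab data) of what typically happens when you fit a stationary AR(1) model to a random walk: the AR coefficient is estimated at or very near 1, the stationarity boundary, and the fit may produce warnings.
rw.demo <- cumsum(rnorm(100))  # a random walk (non-stationary)
# fitting a stationary AR(1) with a mean; expect ar1 near 1 and possibly warnings
forecast::Arima(rw.demo, order=c(1,0,0), include.constant=TRUE)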
We will start by looking at white noise and a stationary AR(1) process using simulated data. White noise is simply a string of random numbers drawn from a Normal distribution; rnorm() will return random numbers drawn from a Normal distribution. Use ?rnorm to understand what the function requires.
TT=100
y = rnorm(TT, mean=0, sd=1) # 100 random numbers
op=par(mfrow=c(1,2))
plot(y, type="l")
acf(y)
par(op)
Here we use ggplot()
to plot 10 white noise time series.
dat = data.frame(t=1:TT, y=y)
p1 = ggplot(dat, aes(x=t, y=y)) + geom_line() +
ggtitle("1 white noise time series") + xlab("") + ylab("value")
ys = matrix(rnorm(TT*10),TT,10)
ys = data.frame(ys)
ys$id = 1:TT
ys2=melt(ys, id.var="id")
p2 = ggplot(ys2, aes(x=id,y=value,group=variable)) +
geom_line() + xlab("") + ylab("value") +
ggtitle("10 white noise processes")
grid.arrange(p1, p2, ncol = 1)
These are stationary because the variance and the mean (level) do not change over time.
An AR(1) process is also stationary.
theta=0.8
nsim=10
ar1=arima.sim(TT, model=list(ar=theta))
plot(ar1)
We can use ggplot to plot 10 AR(1) time series, but we need to change the data to a data frame.
dat = data.frame(t=1:TT, y=ar1)
p1 = ggplot(dat, aes(x=t, y=y)) + geom_line() +
ggtitle("AR-1") + xlab("") + ylab("value")
ys = matrix(0,TT,nsim)
for(i in 1:nsim) ys[,i]=as.vector(arima.sim(TT, model=list(ar=theta)))
ys = data.frame(ys)
ys$id = 1:TT
ys2=melt(ys, id.var="id")
p2 = ggplot(ys2, aes(x=id,y=value,group=variable)) +
geom_line() + xlab("") + ylab("value") +
ggtitle("The variance of an AR-1 process is steady")
grid.arrange(p1, p2, ncol = 1)
## Don't know how to automatically pick scale for object of type ts. Defaulting to continuous.
Fluctuating around a linear trend is a very common type of stationarity used in ARMA modeling and forecasting. This is just a stationary process, like white noise or AR(1), around a linear trend up or down.
intercept = .5
trend=.1
sd=0.5
TT=20
wn=rnorm(TT, sd=sd) #white noise
wni = wn+intercept # white noise with intercept
wnti = wn + trend*(1:TT) + intercept
See how the white noise with trend is just the white noise overlaid on a linear trend.
op=par(mfrow=c(1,3))
plot(wn, type="l")
plot(trend*1:TT)
plot(wnti, type="l")
par(op)
We can make a similar plot with ggplot.
dat = data.frame(t=1:TT, wn=wn, wni=wni, wnti=wnti)
p1 = ggplot(dat, aes(x=t, y=wn)) + geom_line() + ggtitle("White noise")
p2 = ggplot(dat, aes(x=t, y=wni)) + geom_line() + ggtitle("with non-zero mean")
p3 = ggplot(dat, aes(x=t, y=wnti)) + geom_line() + ggtitle("with linear trend")
grid.arrange(p1, p2, p3, ncol = 3)
We can make a similar plot with AR(1) data. Ignore the warnings about not knowing how to pick the scale.
beta1 <- 0.8
ar1 <- arima.sim(TT, model=list(ar=beta1), sd=sd)
ar1i <- ar1 + intercept
ar1ti <- ar1 + trend*(1:TT) + intercept
dat = data.frame(t=1:TT, ar1=ar1, ar1i=ar1i, ar1ti=ar1ti)
p4 = ggplot(dat, aes(x=t, y=ar1)) + geom_line() + ggtitle("AR1")
p5 = ggplot(dat, aes(x=t, y=ar1i)) + geom_line() + ggtitle("with non-zero mean")
p6 = ggplot(dat, aes(x=t, y=ar1ti)) + geom_line() + ggtitle("with linear trend")
grid.arrange(p4, p5, p6, ncol = 3)
## Don't know how to automatically pick scale for object of type ts. Defaulting to continuous.
## Don't know how to automatically pick scale for object of type ts. Defaulting to continuous.
## Don't know how to automatically pick scale for object of type ts. Defaulting to continuous.
We will look at the anchovy data. Notice the two == in the subset call, not one =. We will use the Greek data before 1989 for the lab.
anchovy <- subset(landings, Species=="Anchovy" & Year <= 1989)$log.metric.tons
anchovyts <- ts(anchovy, start=1964)
Plot the data.
plot(anchovyts, ylab="log catch")
The first question to ask is whether the data are stationary.
The null hypothesis for the Dickey-Fuller test is that the data are non-stationary. We want to REJECT the null hypothesis for this test, so we want a p-value of less than 0.05 (or smaller).
Let’s start by doing the test on data that we know are stationary, white noise. We will use adf.test()
from tseries. Use ?adf.test
to read about this function. The default values are generally fine.
TT <- 100
wn <- rnorm(TT) # white noise
tseries::adf.test(wn)
## Warning in tseries::adf.test(wn): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: wn
## Dickey-Fuller = -5.3417, Lag order = 4, p-value = 0.01
## alternative hypothesis: stationary
The null hypothesis is rejected.
Try the test on white noise with a trend and intercept.
intercept <- 1
wnt <- wn + 1:TT + intercept
tseries::adf.test(wnt)
## Warning in tseries::adf.test(wnt): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: wnt
## Dickey-Fuller = -5.3417, Lag order = 4, p-value = 0.01
## alternative hypothesis: stationary
The null hypothesis is still rejected. adf.test()
uses a model that allows an intercept and trend.
Let’s try the test on a random walk (nonstationary).
rw <- cumsum(rnorm(TT))
tseries::adf.test(rw)
##
## Augmented Dickey-Fuller Test
##
## data: rw
## Dickey-Fuller = -1.1281, Lag order = 4, p-value = 0.9143
## alternative hypothesis: stationary
The null hypothesis is NOT rejected as the p-value is greater than 0.05.
tseries::adf.test(anchovyts)
##
## Augmented Dickey-Fuller Test
##
## data: anchovyts
## Dickey-Fuller = -1.6851, Lag order = 2, p-value = 0.6923
## alternative hypothesis: stationary
The p-value is greater than 0.05. We cannot reject the null hypothesis. The null hypothesis is that the data are non-stationary.
The null hypothesis for the KPSS test is that the data are stationary. For this test, we do NOT want to reject the null hypothesis. In other words, we want the p-value to be greater than 0.05 not less than 0.05.
Let’s try the KPSS test on white noise with a trend. The default is a null hypothesis with no trend. We will change this to null="Trend".
tseries::kpss.test(wnt, null="Trend")
## Warning in tseries::kpss.test(wnt, null = "Trend"): p-value greater than
## printed p-value
##
## KPSS Test for Trend Stationarity
##
## data: wnt
## KPSS Trend = 0.032007, Truncation lag parameter = 4, p-value = 0.1
The p-value is greater than 0.05. The null hypothesis of stationarity around a trend is not rejected.
Let’s try the KPSS test on white noise with a trend but let’s use the default of stationary with no trend.
tseries::kpss.test(wnt, null="Level")
## Warning in tseries::kpss.test(wnt, null = "Level"): p-value smaller than
## printed p-value
##
## KPSS Test for Level Stationarity
##
## data: wnt
## KPSS Level = 2.099, Truncation lag parameter = 4, p-value = 0.01
The p-value is less than 0.05. The null hypothesis of stationarity around a level is rejected. This is white noise around a trend, so it is definitely a stationary process, but it has a trend. This illustrates that you need to be thoughtful when applying stationarity tests.
Let’s try the anchovy data.
kpss.test(anchovyts, null="Trend")
##
## KPSS Test for Trend Stationarity
##
## data: anchovyts
## KPSS Trend = 0.14779, Truncation lag parameter = 2, p-value =
## 0.04851
The null is rejected (p-value less than 0.05). Again stationarity is not supported.
The ur.df() Augmented Dickey-Fuller test in the urca package gives us a bit more information on and control over the test.
The Dickey-Fuller test is testing if \(\phi=1\) in this model of the data: \[z_t = \alpha + \beta t + \phi z_{t-1} + e_t\] which is written in difference form as \[\Delta z_t = z_t-z_{t-1}= \alpha + \beta t + \gamma z_{t-1} + e_t\] where \(z_t\) is your data and \(\gamma = \phi - 1\). I used \(z\) since ur.df() denotes the data as z. It is written in the difference form so we can do a linear regression test,
lm(z.diff ~ z.lag.1 + 1 + tt)
and test if the coefficient on z.lag.1 (that is, \(\gamma\)) is 0.
The Augmented Dickey-Fuller test allows for higher-order autoregressive processes by including \(\Delta z_{t-p}\) in the model. But our test is still if \(\gamma = 0\). \[\Delta z_t = \alpha + \beta t + \gamma z_{t-1} + \delta_1 \Delta z_{t-1} + \delta_2 \Delta z_{t-2} + \dots\]
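To make the regression connection concrete, here is a hedged sketch (not part of the lab) of the ADF regression with one lagged difference written out as an explicit lm() call. The names z.diff, z.lag.1, and z.diff.lag mimic the ones ur.df() uses; the trend term tt is constructed slightly differently inside ur.df(), so coefficients other than z.lag.1 may not match exactly.
z <- rnorm(100)                          # example data (white noise)
z.diff <- diff(z)[-1]                    # Delta z_t for t = 3..100
z.lag.1 <- z[2:(length(z)-1)]            # z_{t-1}
z.diff.lag <- diff(z)[1:(length(z)-2)]   # Delta z_{t-1}
tt <- seq_along(z.diff)                  # time trend
summary(lm(z.diff ~ z.lag.1 + tt + z.diff.lag))
# compare the z.lag.1 row to summary(urca::ur.df(z, type="trend", lags=1))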
Let’s first do the test on data we know are stationary: white noise. We have to choose the type and lags. If you have no particular reason to exclude an intercept and trend, use type="trend"; this allows both an intercept and a trend. Next you need to choose the lags. We will use lags=0 to do the Dickey-Fuller test. Note that the number of lags you can test will depend on the amount of data that you have.
lags=0 fits the following model to the data, and you are testing if the effect for z.lag.1 is 0.
z.diff = gamma * z.lag.1 + intercept + trend * tt
If you see *** or ** in the coefficients list for z.lag.1, it indicates that the effect of z.lag.1 is significantly different from 0, and this supports the assumption of stationarity. The intercept and tt estimates indicate whether there is a non-zero level (intercept) or a linear trend (tt).
wn <- rnorm(TT)
test <- urca::ur.df(wn, type="trend", lags=0)
summary(test)
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression trend
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.94739 -0.64530 0.01189 0.70947 2.11466
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.295317 0.178081 1.658 0.1005
## z.lag.1 -1.022750 0.097482 -10.492 <2e-16 ***
## tt -0.006192 0.003107 -1.993 0.0491 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.8745 on 96 degrees of freedom
## Multiple R-squared: 0.5347, Adjusted R-squared: 0.5251
## F-statistic: 55.17 on 2 and 96 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -10.4916 36.8007 55.1693
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau3 -4.04 -3.45 -3.15
## phi2 6.50 4.88 4.16
## phi3 8.73 6.49 5.47
The coefficient part of the summary indicates that z.lag.1 is different from 0 (so stationary) and shows no support for an intercept or trend. Notice also that the test statistic is LESS than the critical value for tau3 at 5 percent. This means the null hypothesis is rejected at \(\alpha=0.05\), a standard level for significance testing.
If you remove the trend (and/or level) from your data, the ur.df() test allows you to increase the power of the test by removing the trend and/or level from the test model. In addition, ur.df() allows you to select the number of lags to use via model selection. That way you don’t use a stationarity test that is too complex to be supported by your data length.
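For example, ur.df() has a selectlags argument (see ?ur.df) that picks the number of lagged differences, up to the maximum set by lags, using AIC or BIC. A minimal sketch using the white noise series from above:
test <- urca::ur.df(wn, type="trend", lags=8, selectlags="AIC")
summary(test)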
The anchovy data have failed both tests for stationarity, the Augmented Dickey-Fuller test and the KPSS test. How do we fix this? The standard approach is to use differencing.
Let’s see how this works with random walk data. A random walk is non-stationary but the difference is white noise so is stationary:
\[x_t - x_{t-1} = e_t, e_t \sim N(0,\sigma)\]
adf.test(diff(rw))
##
## Augmented Dickey-Fuller Test
##
## data: diff(rw)
## Dickey-Fuller = -3.94, Lag order = 4, p-value = 0.01503
## alternative hypothesis: stationary
kpss.test(diff(rw))
## Warning in kpss.test(diff(rw)): p-value greater than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: diff(rw)
## KPSS Level = 0.33583, Truncation lag parameter = 3, p-value = 0.1
If we difference random walk data, the null is rejected for the ADF test and not rejected for the KPSS test. This is what we want.
Let’s try a single difference with the anchovy data. A single difference means dat(t)-dat(t-1). We get this using diff(anchovyts).
diff1dat <- diff(anchovyts)
adf.test(diff1dat)
##
## Augmented Dickey-Fuller Test
##
## data: diff1dat
## Dickey-Fuller = -3.2718, Lag order = 2, p-value = 0.09558
## alternative hypothesis: stationary
kpss.test(diff1dat)
## Warning in kpss.test(diff1dat): p-value greater than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: diff1dat
## KPSS Level = 0.089671, Truncation lag parameter = 2, p-value = 0.1
If a first difference were not enough, we would try a second difference which is the difference of a first difference.
diff2dat <- diff(diff1dat)
adf.test(diff2dat)
## Warning in adf.test(diff2dat): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: diff2dat
## Dickey-Fuller = -4.8234, Lag order = 2, p-value = 0.01
## alternative hypothesis: stationary
The null hypothesis of a random walk is now rejected, so you might think that a 2nd difference is needed for the anchovy data. However, the actual problem is that the default for adf.test() includes a trend, but we removed the trend with our first difference. Thus we included an unneeded trend parameter in our test. Our data are not that long and this affects the result.
Let’s repeat the test without the trend and we’ll see that the null hypothesis is rejected. The number of lags is set to what would be used by adf.test(). See ?adf.test.
k <- trunc((length(diff1dat)-1)^(1/3))
test <- urca::ur.df(diff1dat, type="drift", lags=k)
summary(test)
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression drift
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.37551 -0.13887 0.04753 0.13277 0.28223
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.11062 0.06165 1.794 0.08959 .
## z.lag.1 -2.16711 0.64900 -3.339 0.00365 **
## z.diff.lag1 0.58837 0.47474 1.239 0.23113
## z.diff.lag2 0.13273 0.25299 0.525 0.60623
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.207 on 18 degrees of freedom
## Multiple R-squared: 0.7231, Adjusted R-squared: 0.677
## F-statistic: 15.67 on 3 and 18 DF, p-value: 2.918e-05
##
##
## Value of test-statistic is: -3.3391 5.848
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau2 -3.75 -3.00 -2.63
## phi1 7.88 5.18 4.12
As an alternative to trying many different differences and remembering whether to include the trend or level, you can use the ndiffs() function in the forecast package. This automates finding the number of differences needed.
forecast::ndiffs(anchovyts, test="kpss")
## [1] 1
forecast::ndiffs(anchovyts, test="adf")
## [1] 1
One difference is required to pass both the ADF and KPSS stationarity tests.
The basic stationarity diagnostics are the following:
A. adf.test(): the p-value should be less than 0.05 (reject the null).
B. kpss.test(): the p-value should be greater than 0.05 (do not reject the null).
C. Estimate the number of differences needed using ndiffs() in the forecast package, or manually try different differences.
Let's start with fitting to simulated data.
Simulate AR(2) data and add a mean level so that the data are not mean 0.
\[x_t = 0.8 x_{t-1} + 0.1 x_{t-2} + e_t\] \[y_t = x_t + m\]
m <- 1
ar2 <- arima.sim(n=100, model=list(ar=c(.8,.1))) + m
To see info on arima.sim()
, type ?arima.sim
.
This is not this model
\[y_t = m + 0.8 y_{t-1} + 0.1 y_{t-2} + e_t\] It is this model
\[y_t = (m-0.8 m-0.1 m) + 0.8 y_{t-1} + 0.1 y_{t-2} + e_t\]
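A quick check of the relationship between the level \(m\) and the regression-style constant: with \(m=1\) and AR coefficients 0.8 and 0.1, the constant in the second equation is \(m(1-0.8-0.1)=0.1\). A minimal sketch:
ars <- c(0.8, 0.1)
const <- m * (1 - sum(ars))  # m - 0.8*m - 0.1*m
const                        # 0.1 when m = 1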
Fit an AR(2) model with level to the data.
forecast::Arima(ar2, order=c(2,0,0), include.constant=TRUE)
## Series: ar2
## ARIMA(2,0,0) with non-zero mean
##
## Coefficients:
## ar1 ar2 mean
## 0.7227 0.0073 1.6324
## s.e. 0.0997 0.1015 0.3590
##
## sigma^2 estimated as 1.008: log likelihood=-141.13
## AIC=290.25 AICc=290.67 BIC=300.67
We could also use arima().
arima(ar2, order=c(2,0,0), include.mean=TRUE)
##
## Call:
## arima(x = ar2, order = c(2, 0, 0), include.mean = TRUE)
##
## Coefficients:
## ar1 ar2 intercept
## 0.7227 0.0073 1.6324
## s.e. 0.0997 0.1015 0.3590
##
## sigma^2 estimated as 0.9773: log likelihood = -141.13, aic = 290.25
However, we will not be using arima() directly because, for differenced data, it will not allow us to include an estimated mean level. Unless we have transformed our differenced data in a way that ensures it is mean zero, we want to include a mean.
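A hedged illustration of the difference (the series rwd below is made up for this example, not part of the lab data): with differencing, stats::arima() ignores include.mean, while forecast::Arima() with include.constant=TRUE estimates a drift term.
rwd <- cumsum(rnorm(100, mean=0.1))  # random walk with drift
stats::arima(rwd, order=c(0,1,0), include.mean=TRUE)         # no mean/drift term appears
forecast::Arima(rwd, order=c(0,1,0), include.constant=TRUE)  # a 'drift' term is estimated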
Try increasing the length of the simulated data (from 100 to 1000, say) and see how that affects your parameter estimates. Run the simulation a few times.
Now simulate AR(1) data:
\[x_t = 0.8 x_{t-1} + e_t\]
ar1 = arima.sim(n=100, model=list(ar=c(.8)))+m
forecast::Arima(ar1, order=c(1,0,0), include.constant=TRUE)
## Series: ar1
## ARIMA(1,0,0) with non-zero mean
##
## Coefficients:
## ar1 mean
## 0.6785 0.7866
## s.e. 0.0722 0.3131
##
## sigma^2 estimated as 1.077: log likelihood=-144.9
## AIC=295.8 AICc=296.05 BIC=303.62
Simulate ARMA(1,2) \[x_t = 0.8 x_{t-1} + e_t + 0.8 e_{t-1} + 0.2 e_{t-2}\]
arma12 = arima.sim(n=100, model=list(ar=c(0.8), ma=c(0.8, 0.2)))+m
forecast::Arima(arma12, order=c(1,0,2), include.constant=TRUE)
## Series: arma12
## ARIMA(1,0,2) with non-zero mean
##
## Coefficients:
## ar1 ma1 ma2 mean
## 0.7943 0.8436 0.1448 0.8376
## s.e. 0.0731 0.1271 0.1144 0.8198
##
## sigma^2 estimated as 0.8141: log likelihood=-131.05
## AIC=272.09 AICc=272.73 BIC=285.12
Note that models with an MA component take a lot of data to estimate well; to get good estimates you may want to up the number of data points to 1000. Models with MA order greater than 1 are not very practical for fisheries data for that reason.
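As a hedged check of that claim, you could re-simulate the same ARMA(1,2) with 1000 points and refit; the MA estimates should generally land closer to the true values (0.8 and 0.2) with smaller standard errors.
arma12.long <- arima.sim(n=1000, model=list(ar=0.8, ma=c(0.8, 0.2))) + m
forecast::Arima(arma12.long, order=c(1,0,2), include.constant=TRUE)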
Create some AR(2) data and then add missing values (NA).
ar2miss = arima.sim(n=100, model=list(ar=c(.8,.1)))
ar2miss[sample(100,50)]=NA
plot(ar2miss, type="l")
title("many missing values")
Fit an AR(2) model to the data with the missing values:
fit <- forecast::Arima(ar2miss, order=c(2,0,0))
fit
## Series: ar2miss
## ARIMA(2,0,0) with non-zero mean
##
## Coefficients:
## ar1 ar2 mean
## 1.2337 -0.2955 -1.3749
## s.e. 0.1526 0.1552 1.0104
##
## sigma^2 estimated as 0.2342: log likelihood=-69.63
## AIC=147.26 AICc=147.68 BIC=157.68
Note that fitted() does not return the expected value at time \(t\); it returns the expected value of \(y_t\) given the data up to time \(t-1\).
plot(ar2miss, type="l")
title("many missing values")
lines(fitted(fit), col="blue")
It is easy enough to get the expected value of \(y_t\) for all the missing values but we’ll learn to do that when we learn the MARSS package and can apply the Kalman Smoother in that package.
We will use the auto.arima()
function in forecast. This function will estimate the level of differencing needed to make our data stationary and estimate the AR and MA orders using AICc (or BIC if we choose).
forecast::auto.arima(ar2)
## Series: ar2
## ARIMA(1,0,0) with non-zero mean
##
## Coefficients:
## ar1 mean
## 0.7278 1.6344
## s.e. 0.0694 0.3553
##
## sigma^2 estimated as 0.9973: log likelihood=-141.13
## AIC=288.26 AICc=288.51 BIC=296.07
auto.arima() works with missing data too, though it might not estimate a model very close to the true model form.
forecast::auto.arima(ar2miss)
## Series: ar2miss
## ARIMA(2,0,0) with zero mean
##
## Coefficients:
## ar1 ar2
## 1.2289 -0.2715
## s.e. 0.1553 0.1575
##
## sigma^2 estimated as 0.2369: log likelihood=-70.29
## AIC=146.58 AICc=146.83 BIC=154.4
Let’s fit to 100 simulated data sets and see how often the true (generating) model form is selected.
save.fits = rep(NA,100)
for(i in 1:100){
a2 = arima.sim(n=100, model=list(ar=c(.8,.1)))
fit = auto.arima(a2, seasonal=FALSE, max.d=0, max.q=0)
save.fits[i] = paste0(fit$arma[1], "-", fit$arma[2])
}
table(save.fits)
## save.fits
## 1-0 2-0 3-0 4-0
## 72 21 6 1
auto.arima() uses AICc for selection by default. You can change that to AIC or BIC using ic="aic" or ic="bic".
Repeat the simulation using AIC and BIC to see how the choice of the information criteria affects the model that is selected.
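A minimal sketch of that exercise, assuming you keep the rest of the loop the same and only change the information criterion via the ic argument (here BIC):
save.fits.bic <- rep(NA, 100)
for(i in 1:100){
  a2 <- arima.sim(n=100, model=list(ar=c(.8, .1)))
  fit <- forecast::auto.arima(a2, seasonal=FALSE, max.d=0, max.q=0, ic="bic")
  save.fits.bic[i] <- paste0(fit$arma[1], "-", fit$arma[2])
}
table(save.fits.bic)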
We can set trace=TRUE to see what models auto.arima() fits.
forecast::auto.arima(ar2, trace=TRUE)
##
## ARIMA(2,0,2) with non-zero mean : 293.1098
## ARIMA(0,0,0) with non-zero mean : 358.9859
## ARIMA(1,0,0) with non-zero mean : 288.5059
## ARIMA(0,0,1) with non-zero mean : 317.1314
## ARIMA(0,0,0) with zero mean : 447.349
## ARIMA(2,0,0) with non-zero mean : 290.6719
## ARIMA(1,0,1) with non-zero mean : 290.6729
## ARIMA(2,0,1) with non-zero mean : 291.6257
## ARIMA(1,0,0) with zero mean : 294.7307
##
## Best model: ARIMA(1,0,0) with non-zero mean
## Series: ar2
## ARIMA(1,0,0) with non-zero mean
##
## Coefficients:
## ar1 mean
## 0.7278 1.6344
## s.e. 0.0694 0.3553
##
## sigma^2 estimated as 0.9973: log likelihood=-141.13
## AIC=288.26 AICc=288.51 BIC=296.07
We can set stepwise=FALSE to use an exhaustive search. The model may be different from the result of the non-exhaustive search.
forecast::auto.arima(ar2, trace=TRUE, stepwise=FALSE)
##
## ARIMA(0,0,0) with zero mean : 447.349
## ARIMA(0,0,0) with non-zero mean : 358.9859
## ARIMA(0,0,1) with zero mean : 378.1637
## ARIMA(0,0,1) with non-zero mean : 317.1314
## ARIMA(0,0,2) with zero mean : 338.6311
## ARIMA(0,0,2) with non-zero mean : 298.313
## ARIMA(0,0,3) with zero mean : 328.9559
## ARIMA(0,0,3) with non-zero mean : 297.192
## ARIMA(0,0,4) with zero mean : 314.3129
## ARIMA(0,0,4) with non-zero mean : 292.1388
## ARIMA(0,0,5) with zero mean : 314.3914
## ARIMA(0,0,5) with non-zero mean : 294.4504
## ARIMA(1,0,0) with zero mean : 294.7307
## ARIMA(1,0,0) with non-zero mean : 288.5059
## ARIMA(1,0,1) with zero mean : 295.9157
## ARIMA(1,0,1) with non-zero mean : 290.6729
## ARIMA(1,0,2) with zero mean : 298.0796
## ARIMA(1,0,2) with non-zero mean : 292.1413
## ARIMA(1,0,3) with zero mean : 298.6499
## ARIMA(1,0,3) with non-zero mean : 294.4062
## ARIMA(1,0,4) with zero mean : 300.7984
## ARIMA(1,0,4) with non-zero mean : 294.4465
## ARIMA(2,0,0) with zero mean : 295.9174
## ARIMA(2,0,0) with non-zero mean : 290.6719
## ARIMA(2,0,1) with zero mean : 297.0567
## ARIMA(2,0,1) with non-zero mean : 291.6257
## ARIMA(2,0,2) with zero mean : 299.2656
## ARIMA(2,0,2) with non-zero mean : 293.1098
## ARIMA(2,0,3) with zero mean : 299.8459
## ARIMA(2,0,3) with non-zero mean : 295.338
## ARIMA(3,0,0) with zero mean : 298.0854
## ARIMA(3,0,0) with non-zero mean : 291.9799
## ARIMA(3,0,1) with zero mean : 299.2678
## ARIMA(3,0,1) with non-zero mean : 293.1577
## ARIMA(3,0,2) with zero mean : Inf
## ARIMA(3,0,2) with non-zero mean : Inf
## ARIMA(4,0,0) with zero mean : 299.2006
## ARIMA(4,0,0) with non-zero mean : 294.184
## ARIMA(4,0,1) with zero mean : 300.6301
## ARIMA(4,0,1) with non-zero mean : 295.4663
## ARIMA(5,0,0) with zero mean : 301.4643
## ARIMA(5,0,0) with non-zero mean : 295.9326
##
##
##
## Best model: ARIMA(1,0,0) with non-zero mean
## Series: ar2
## ARIMA(1,0,0) with non-zero mean
##
## Coefficients:
## ar1 mean
## 0.7278 1.6344
## s.e. 0.0694 0.3553
##
## sigma^2 estimated as 0.9973: log likelihood=-141.13
## AIC=288.26 AICc=288.51 BIC=296.07
fit <- auto.arima(anchovyts)
fit
## Series: anchovyts
## ARIMA(0,1,1) with drift
##
## Coefficients:
## ma1 drift
## -0.6685 0.0542
## s.e. 0.1977 0.0142
##
## sigma^2 estimated as 0.04037: log likelihood=5.39
## AIC=-4.79 AICc=-3.65 BIC=-1.13
Note that arima() writes an MA model like:
\[x_t = e_t + b_1 e_{t-1} + b_2 e_{t-2}\]
while many authors use this notation:
\[x_t = e_t - \theta_1 e_{t-1} - \theta_2 e_{t-2}\]
so the MA parameters reported by auto.arima() will be the NEGATIVE of those reported in Stergiou and Christou (1996), who analyze these same data. Note: in Stergiou and Christou, the model is written in backshift notation on page 112. To see the model as the equation above, I translated from backshift to non-backshift notation.
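So if you want to compare with papers that use the minus-sign convention, flip the sign of the reported MA coefficients, e.g.:
-coef(fit)[["ma1"]]  # the MA(1) coefficient in the minus-sign convention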
We can do a test of autocorrelation of the residuals with Box.test(), with fitdf adjusted for the number of parameters estimated in the fit; in our case, the MA(1) and drift parameters.
res <- resid(fit)
Box.test(res, type="Ljung-Box", lag=12, fitdf=2)
##
## Box-Ljung test
##
## data: res
## X-squared = 5.1609, df = 10, p-value = 0.8802
checkresiduals() in the forecast package will automate this test and show some standard diagnostic plots.
forecast::checkresiduals(fit)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,1,1) with drift
## Q* = 1.0902, df = 3.2, p-value = 0.8087
##
## Model df: 2. Total lags used: 5.2
We can create a forecast from our anchovy ARIMA model using forecast(). The shading is the 80% and 95% prediction intervals.
fr <- forecast::forecast(fit, h=10)
plot(fr)
The chinook data are monthly and start in January 1990. To make this into a ts object do
chinookts <- ts(chinook$log.metric.tons, start=c(1990,1),
frequency=12)
start is the year and month, and frequency is the number of months in the year. Use ?ts to see more examples of how to set up ts objects.
plot(chinookts)
auto.arima() also works for seasonal ts objects. It will recognize that our data have a seasonal component and will fit a seasonal ARIMA model by default. Let's define the training data up to 1998 and use 1999 as the test data.
traindat <- window(chinookts, c(1990,10), c(1998,12))
testdat <- window(chinookts, c(1999,1), c(1999,12))
fit <- forecast::auto.arima(traindat)
fit
## Series: traindat
## ARIMA(1,0,0)(0,1,0)[12] with drift
##
## Coefficients:
## ar1 drift
## 0.3676 -0.0320
## s.e. 0.1335 0.0127
##
## sigma^2 estimated as 0.758: log likelihood=-107.37
## AIC=220.73 AICc=221.02 BIC=228.13
Use ?window to understand how subsetting a ts object works.
Forecasting works the same, using the forecast() function.
fr <- forecast::forecast(fit, h=12)
plot(fr)
points(testdat)
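To compare the forecast against the held-out 1999 data numerically, you can use forecast::accuracy(); a hedged one-liner:
forecast::accuracy(fr, testdat)  # test-set error measures (RMSE, MAE, etc.)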
Problems
1. Do a Dickey-Fuller test using ur.df() and adf.test(). What do you need to set the lags to to achieve that?
2. Do an Augmented Dickey-Fuller test using ur.df(). How did you choose to set the lags?
3. Do a KPSS test using kpss.test().
4. Repeat the stationarity tests and differencing analysis with the chub mackerel data (1964-1989). You can prepare the data with:
datdf <- subset(landings, Species=="Chub.mackerel" & Year<=1989)
dat <- ts(datdf$log.metric.tons, start=1964)
5. Repeat with the full anchovy time series:
load("landings.RData")
datdf <- subset(landings, Species=="Anchovy")
dat <- ts(datdf$log.metric.tons, start=1964)
a. Do the conclusions regarding stationarity and the amount of differencing needed change?
b. Repeat with only the data after 1987. Are the conclusions different for only the recent data?
c. Fit using `auto.arima()` to the full data (1964-2007) and the recent data (1987 to present). Do the selected models change?
6. Fit each of the models listed in the auto.arima() trace using Arima() and show that you can produce the same AICc value that is shown in the trace table: forecast::auto.arima(anchovy, trace=TRUE)
7. Use auto.arima() with stepwise=FALSE to find the top 3 models (according to AICc) for the anchovy.
8. Use Arima() to fit the top 3 models to the data found with auto.arima().