5.3 Dickey-Fuller and Augmented Dickey-Fuller tests
5.3.1 Dickey-Fuller test
The Dickey-Fuller test is testing if \(\phi=1\) in this model of the data: \[y_t = \alpha + \beta t + \phi y_{t-1} + e_t\] which is written as \[\Delta y_t = y_t-y_{t-1}= \alpha + \beta t + \gamma y_{t-1} + e_t\] where \(y_t\) is your data and \(\gamma = \phi - 1\). It is written this way so we can do a linear regression of \(\Delta y_t\) against \(t\) and \(y_{t-1}\) and test if \(\gamma\) is different from 0. If \(\gamma=0\) (that is, \(\phi=1\)), then we have a random walk process. If not and \(-1<1+\gamma<1\), then we have a stationary process.
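The link between the two forms of the model can be checked numerically. The following is a minimal sketch in base R, assuming a simulated stationary AR(1) series; the seed, the AR coefficient of 0.5 and all variable names are illustrative choices, not taken from the text. It fits both regressions with lm() and compares the coefficient on \(y_{t-1}\).

```r
set.seed(123)                      # illustrative seed
TT <- 100
# simulate y_t = 0.5 * y_{t-1} + e_t (a stationary AR(1), assumed for illustration)
y <- as.numeric(stats::filter(rnorm(TT), filter = 0.5, method = "recursive"))

y_t   <- y[2:TT]                   # y_t
y_lag <- y[1:(TT - 1)]             # y_{t-1}
tt    <- 2:TT                      # time index t
dy    <- y_t - y_lag               # Delta y_t

fit_phi   <- lm(y_t ~ tt + y_lag)  # levels form: coefficient on y_lag estimates phi
fit_gamma <- lm(dy ~ tt + y_lag)   # differenced form: coefficient on y_lag estimates gamma

phi_hat   <- unname(coef(fit_phi)["y_lag"])
gamma_hat <- unname(coef(fit_gamma)["y_lag"])
c(phi = phi_hat, gamma = gamma_hat, phi_minus_1 = phi_hat - 1)
```

The last two entries of the printed vector should agree, since the differenced form only subtracts \(y_{t-1}\) from both sides of the levels form.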
5.3.2 Augmented Dickey-Fuller test
The Augmented Dickey-Fuller test allows for higher-order autoregressive processes by including lagged differences \(\Delta y_{t-p}\) in the model. The test is still whether \(\gamma = 0\). \[\Delta y_t = \alpha + \beta t + \gamma y_{t-1} + \delta_1 \Delta y_{t-1} + \delta_2 \Delta y_{t-2} + \dots + e_t\]
The null hypothesis for both tests is that the data are non-stationary. We want to REJECT the null hypothesis for this test, so we want a p-value of less than 0.05 (or smaller).
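The augmented regression can also be set up by hand. This is a rough sketch in base R, assuming a simulated random walk and two lagged differences; the seed, the choice k = 2 and the variable names are illustrative. It uses embed() to line up \(\Delta y_t\) with its lags and reads off the t value for \(\gamma\).

```r
set.seed(42)                       # illustrative seed
TT <- 100
y  <- cumsum(rnorm(TT))            # random walk, so gamma = 0 holds
k  <- 2                            # number of lagged differences (delta terms)

dy <- diff(y)                      # Delta y_2, ..., Delta y_TT
n  <- length(dy)
z  <- embed(dy, k + 1)             # columns: Delta y_t, Delta y_{t-1}, ..., Delta y_{t-k}

dy_t   <- z[, 1]                   # response
dy_lag <- z[, -1, drop = FALSE]    # the k lagged differences
y_lag  <- y[(k + 1):n]             # y_{t-1}, aligned with dy_t
tt     <- (k + 1):n                # linear trend

fit <- lm(dy_t ~ tt + y_lag + dy_lag)
adf_stat <- summary(fit)$coefficients["y_lag", "t value"]
adf_stat
```

The t value on y_lag is the test statistic, but it has to be compared against Dickey-Fuller critical values rather than the usual t distribution, so the p-value printed by summary(fit) should not be used for the test.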
5.3.3 ADF test using adf.test()
The adf.test() function from the tseries package will do an Augmented Dickey-Fuller test (Dickey-Fuller if we set lags equal to 0) with a trend and an intercept. Use ?adf.test to read about this function. The function is
adf.test(x, alternative = c("stationary", "explosive"),
k = trunc((length(x)-1)^(1/3)))
x are your data. alternative="stationary" means that \(-2<\gamma<0\) (\(-1<\phi<1\)) and alternative="explosive" means that \(\gamma\) is outside these bounds. k is the number of \(\delta\) lags. For a Dickey-Fuller test, which allows only up to AR(1) time dependency in our stationary process, we set k=0 so we have no \(\delta\)’s in our test. Being able to control the lags in our test allows us to avoid a stationarity test that is too complex to be supported by our data.
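To see what the default k works out to for a given series length, the expression from the function signature can be evaluated directly. A small sketch in base R; the helper name default_k and the example lengths are made up for illustration.

```r
# default lag order, copied from the adf.test() signature shown above
default_k <- function(n) trunc((n - 1)^(1/3))

n <- c(30, 50, 100, 500)           # example series lengths (illustrative)
data.frame(n = n, k = default_k(n))
# k is 3, 3, 4 and 7 for these lengths
```

The default grows slowly with the length of the series, so short series get only a few lagged differences.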
5.3.3.1 Test on white noise
Let’s start by doing the test on data that we know are stationary, white noise. We will use an Augmented Dickey-Fuller test where we use the default number of lags (amount of time-dependency) in our test. For a time-series of 100, this is 4.
TT <- 100
wn <- rnorm(TT) # white noise
tseries::adf.test(wn)
Warning in tseries::adf.test(wn): p-value smaller than printed p-value
Augmented Dickey-Fuller Test
data: wn
Dickey-Fuller = -4.8309, Lag order = 4, p-value = 0.01
alternative hypothesis: stationary
The null hypothesis is rejected.
Try a Dickey-Fuller test. This tests against an alternative hypothesis of AR(1) stationarity, whereas the default k used above allowed for higher-order autoregressive stationarity (four lagged differences).
tseries::adf.test(wn, k = 0)
Warning in tseries::adf.test(wn, k = 0): p-value smaller than printed p-value
Augmented Dickey-Fuller Test
data: wn
Dickey-Fuller = -10.122, Lag order = 0, p-value = 0.01
alternative hypothesis: stationary
Notice that the test-statistic is smaller (more negative). This is a more restrictive test, and the null hypothesis is rejected with stronger evidence.
5.3.3.2 Test on white noise with trend
Try the test on white noise with a trend and intercept.
intercept <- 1
wnt <- wn + 1:TT + intercept
tseries::adf.test(wnt)
Warning in tseries::adf.test(wnt): p-value smaller than printed p-value
Augmented Dickey-Fuller Test
data: wnt
Dickey-Fuller = -4.8309, Lag order = 4, p-value = 0.01
alternative hypothesis: stationary
The null hypothesis is still rejected. adf.test() uses a model that allows an intercept and trend.
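The test statistic for wnt is identical to the one for wn because a linear trend added to the data is absorbed by the intercept and trend terms of the regression. A minimal sketch of this in base R, assuming a Dickey-Fuller regression with no lagged differences; the helper df_stat and the seed are illustrative and this is not the adf.test() code itself.

```r
# t value for gamma in the regression of Delta y_t on intercept, trend and y_{t-1}
df_stat <- function(y) {
  n     <- length(y)
  dy    <- diff(y)                 # Delta y_t
  y_lag <- y[-n]                   # y_{t-1}
  tt    <- seq_along(dy)           # linear trend
  fit   <- lm(dy ~ tt + y_lag)
  summary(fit)$coefficients["y_lag", "t value"]
}

set.seed(1)                        # illustrative seed
TT  <- 100
wn  <- rnorm(TT)                   # white noise
wnt <- wn + 1:TT + 1               # same noise plus trend and intercept

c(wn = df_stat(wn), wnt = df_stat(wnt))
```

The two values should match up to floating-point error.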
5.3.3.3 Test on random walk
Let’s try the test on a random walk (nonstationary).
rw <- cumsum(rnorm(TT))
tseries::adf.test(rw)
Augmented Dickey-Fuller Test
data: rw
Dickey-Fuller = -2.3038, Lag order = 4, p-value = 0.4508
alternative hypothesis: stationary
The null hypothesis is NOT rejected as the p-value is greater than 0.05.
Try a Dickey-Fuller test.
tseries::adf.test(rw, k = 0)
Augmented Dickey-Fuller Test
data: rw
Dickey-Fuller = -1.7921, Lag order = 0, p-value = 0.6627
alternative hypothesis: stationary
Notice that the test-statistic is larger (less negative), and the null hypothesis is again not rejected.
5.3.3.4 Test the anchovy data
tseries::adf.test(anchovyts)
Augmented Dickey-Fuller Test
data: anchovyts
Dickey-Fuller = -1.6851, Lag order = 2, p-value = 0.6923
alternative hypothesis: stationary
The p-value is greater than 0.05. We cannot reject the null hypothesis. The null hypothesis is that the data are non-stationary.
5.3.4 ADF test using ur.df()
The ur.df() Augmented Dickey-Fuller test in the urca package gives us a bit more information on and control over the test.
ur.df(y, type = c("none", "drift", "trend"), lags = 1,
selectlags = c("Fixed", "AIC", "BIC"))
The ur.df() function allows us to specify whether to test stationarity around a zero-mean with no trend, around a non-zero mean with no trend, or around a trend with an intercept. This can be useful when we know that our data have no trend, for example if you have removed the trend already. ur.df() allows us to specify the lags or select them using model selection.
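The three choices correspond to three regressions that differ only in their deterministic terms. The following is a sketch in base R with no lagged differences, assuming simulated white noise; the seed and the variable names dy, y_lag and tt are illustrative, and the code mimics the idea rather than calling ur.df().

```r
set.seed(7)                        # illustrative seed
y     <- rnorm(100)                # white noise
dy    <- diff(y)                   # Delta y_t
y_lag <- y[-length(y)]             # y_{t-1}
tt    <- seq_along(dy)             # linear trend

fits <- list(
  none  = lm(dy ~ y_lag - 1),      # zero mean, no trend
  drift = lm(dy ~ y_lag + 1),      # non-zero mean (intercept), no trend
  trend = lm(dy ~ y_lag + 1 + tt)  # intercept and trend
)

# t value on y_lag in each regression: the statistic for testing gamma = 0
tau <- sapply(fits, function(f) summary(f)$coefficients["y_lag", "t value"])
tau
```

Each statistic has its own set of critical values, because the null distribution changes with the deterministic terms in the regression.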
5.3.4.1 Test on white noise
Let’s first do the test on data we know is stationary, white noise. We have to choose the type and lags. If you have no particular reason to not include an intercept and trend, then use type="trend". This allows both intercept and trend. When might you have a particular reason not to use "trend"? When you have removed the trend and/or intercept.
Next you need to choose the lags. We will use lags=0 to do the Dickey-Fuller test. Note the number of lags you can test will depend on the amount of data that you have. adf.test() used a default of trunc((length(x)-1)^(1/3)) for the lags, but ur.df() requires that you pass in a value or use a fixed default of 1.
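One way to choose the number of lags is to fit the regression for several candidate values and compare an information criterion. This is a by-hand sketch in base R, assuming a simulated random walk and a maximum of 4 lags; the helper fit_adf, the seed and the maximum are all illustrative, and the sketch is not claimed to reproduce what ur.df() does internally when selectlags is set.

```r
# ADF regression with p lagged differences, always fit to the same rows
# (those available when pmax lags are used) so the AIC values are comparable
fit_adf <- function(y, p, pmax) {
  dy <- diff(y)
  n  <- length(dy)
  z  <- embed(dy, pmax + 1)        # Delta y_t and its first pmax lags
  d  <- data.frame(dy_t = z[, 1], y_lag = y[(pmax + 1):n], tt = (pmax + 1):n)
  if (p > 0) {
    lags <- z[, 2:(p + 1), drop = FALSE]
    colnames(lags) <- paste0("dy_lag", seq_len(p))
    d <- cbind(d, lags)
  }
  lm(dy_t ~ ., data = d)
}

set.seed(11)                       # illustrative seed
y    <- cumsum(rnorm(100))         # random walk
pmax <- 4                          # largest number of lags considered
aic  <- sapply(0:pmax, function(p) AIC(fit_adf(y, p, pmax)))
best <- (0:pmax)[which.min(aic)]   # lag count with the smallest AIC
data.frame(lags = 0:pmax, AIC = aic)
best
```

With short series, keeping pmax small leaves enough observations for the regression.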
lags=0 is fitting the following model to the data:
z.diff = gamma * z.lag.1 + intercept + trend * tt
z.diff means \(\Delta y_t\) and z.lag.1 is \(y_{t-1}\). You are testing if the effect for z.lag.1 is 0.
When you use summary() for the output from ur.df(), you will see the estimated values for \(\gamma\) (denoted z.lag.1), intercept and trend. If you see *** or ** on the coefficients list for z.lag.1, it suggests that the effect of z.lag.1 is significantly different from 0 and this supports the assumption of stationarity. However, the test level shown is for independent data not time series data. The correct test levels (critical values) are shown at the bottom of the summary output.
wn <- rnorm(TT)
test <- urca::ur.df(wn, type = "trend", lags = 0)
urca::summary(test)
###############################################
# Augmented Dickey-Fuller Test Unit Root Test #
###############################################
Test regression trend
Call:
lm(formula = z.diff ~ z.lag.1 + 1 + tt)
Residuals:
Min 1Q Median 3Q Max
-2.2170 -0.6654 -0.1210 0.5311 2.6277
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0776865 0.2037709 0.381 0.704
z.lag.1 -1.0797598 0.1014244 -10.646 <2e-16 ***
tt 0.0004891 0.0035321 0.138 0.890
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.004 on 96 degrees of freedom
Multiple R-squared: 0.5416, Adjusted R-squared: 0.532
F-statistic: 56.71 on 2 and 96 DF, p-value: < 2.2e-16
Value of test-statistic is: -10.646 37.806 56.7083
Critical values for test statistics:
1pct 5pct 10pct
tau3 -4.04 -3.45 -3.15
phi2 6.50 4.88 4.16
phi3 8.73 6.49 5.47
Note urca:: in front of summary() is needed if you have not loaded the urca package with library(urca).
We need to look at the information at the bottom of the summary output for the test statistics and critical values. It is the part that looks like this:
Value of test-statistic is: #1 #2 #3
Critical values for test statistics:
1pct 5pct 10pct
tau3 xxx xxx xxx
...
The first test statistic number is for \(\gamma=0\) and will be labeled tau1, tau2 or tau3, depending on the type.
In our example with white noise, notice that the test statistic is LESS than the critical value for tau3 at 5 percent. This means the null hypothesis is rejected at \(\alpha=0.05\), a standard level for significance testing.
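A small simulation can show why the tau3 critical values, and not the usual t-test levels, are the right yardstick. This is a sketch in base R under stated assumptions: series of length 100, no lagged differences, 2000 simulated series, and an illustrative seed; the helper df_stat is hypothetical and is not part of urca. The cutoff -3.45 is the 5pct tau3 value from the summary output above.

```r
# Dickey-Fuller t statistic for gamma (intercept and trend, no lagged differences)
df_stat <- function(y) {
  dy  <- diff(y)
  X   <- cbind(1, seq_along(dy), y[-length(y)])   # intercept, trend, y_{t-1}
  fit <- lm.fit(X, dy)
  s2  <- sum(fit$residuals^2) / fit$df.residual   # residual variance
  se  <- sqrt(s2 * solve(crossprod(X))[3, 3])     # std. error of gamma
  unname(fit$coefficients[3] / se)
}

set.seed(2024)                     # illustrative seed
nsim <- 2000
TT   <- 100
stat_rw <- replicate(nsim, df_stat(cumsum(rnorm(TT))))  # null is true
stat_wn <- replicate(nsim, df_stat(rnorm(TT)))          # null is false

q05 <- quantile(stat_rw, 0.05)     # simulated 5 percent point under the null
q05
mean(stat_rw < -3.45)              # how often a random walk is (wrongly) rejected
mean(stat_wn < -3.45)              # how often white noise is (rightly) rejected
```

Under the null, the simulated 5 percent point should sit far below the value of about -1.66 that a one-sided t-test with this many degrees of freedom would use, which is why the *** markers in the coefficient table are misleading for this test.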
5.3.4.2 When you might want to use ur.df()
If you remove the trend (and/or level) from your data, the ur.df() test allows you to increase the power of the test by removing the trend and/or level from the model.
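The gain in power can be illustrated with a simulation. This is a rough sketch in base R, assuming stationary AR(1) data with coefficient 0.8, no trend, length 100 and 500 replicates; all of these and the helper df_stat are illustrative. The cutoff -3.45 is the 5pct tau3 value shown earlier. The cutoff -2.89 is assumed to be the corresponding 5 percent Dickey-Fuller value for the intercept-only model at about this sample size; it does not appear in the text above.

```r
# Dickey-Fuller t statistic for gamma, with or without a trend term
df_stat <- function(y, trend) {
  dy    <- diff(y)
  y_lag <- y[-length(y)]
  tt    <- seq_along(dy)
  fit   <- if (trend) lm(dy ~ y_lag + tt) else lm(dy ~ y_lag)
  summary(fit)$coefficients["y_lag", "t value"]
}

set.seed(99)                       # illustrative seed
nsim <- 500
sims <- replicate(nsim, {
  y <- as.numeric(arima.sim(model = list(ar = 0.8), n = 100))  # stationary, no trend
  c(drift = df_stat(y, trend = FALSE) < -2.89,   # assumed 5 percent cutoff, intercept only
    trend = df_stat(y, trend = TRUE)  < -3.45)   # 5 percent cutoff, intercept and trend
})

power <- rowMeans(sims)            # fraction of series where the null is rejected
power
```

Because the data have no trend, the model without the trend term should reject the false null more often than the model that spends information estimating a trend that is not there.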