Covariates
Why include covariates?
Multivariate linear regression on time series data
Covariates in MARSS models
Seasonality in MARSS models
Missing covariates
21 Feb 2019
What about ETS and covariates? Wouldn’t make sense.
Can you do a linear regression with time series data (response and predictors)? Yes, but you need to be careful. Read Chapter 5 in Hyndman and Athanasopoulos 2018.
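One reason care is needed is that ordinary least squares assumes uncorrelated errors, which time series often violate. The following is a minimal sketch on simulated data, not on any dataset from these slides; the variable names, the AR(1) coefficient of 0.8 and the regression coefficients are all assumptions chosen for illustration. It fits lm() and then tests the residuals for autocorrelation with the base R Box.test().

```r
# Simulated example (all values are illustrative assumptions)
set.seed(1)
n  <- 200
d  <- rnorm(n)                                              # hypothetical covariate
nu <- as.numeric(arima.sim(model = list(ar = 0.8), n = n))  # AR(1) errors
y  <- 1 + 2 * d + nu                                        # response with autocorrelated errors

# Ordinary linear regression ignores the autocorrelation in nu
fit_lm <- lm(y ~ d)

# Ljung-Box test on the residuals; a small p-value indicates
# that the residuals are serially correlated
lb <- Box.test(residuals(fit_lm), lag = 10, type = "Ljung-Box")
lb$p.value
```

A small p-value here signals that the standard errors reported by lm() should not be trusted as they stand.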
Imagine that your data look like this, where the line is the data and the color represents your covariate.
The xreg argument in Arima() and arima() allows you to fit linear regressions with autocorrelated errors. Read Chapter 9 in Hyndman and Athanasopoulos 2018 on Dynamic Regression.
For example, a linear regression with autocorrelated AR(2) errors is:
\[y_t = \alpha + \beta d_t + \nu_t \\ \nu_t = \theta_1 \nu_{t-1} + \theta_2 \nu_{t-2} + e_t\]
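To make the equation concrete, here is a minimal sketch that simulates data from this model and then recovers the parameters with the base R arima() function and its xreg argument. The parameter values (alpha = 1, beta = 2, theta_1 = 0.5, theta_2 = 0.2), the series length and the simulated covariate are assumptions chosen for illustration.

```r
# Simulate y_t = alpha + beta * d_t + nu_t with AR(2) errors nu_t
# (all numeric values below are illustrative assumptions)
set.seed(123)
n     <- 500
alpha <- 1
beta  <- 2
d     <- rnorm(n)                                                   # hypothetical covariate
nu    <- as.numeric(arima.sim(model = list(ar = c(0.5, 0.2)), n = n))
y     <- alpha + beta * d + nu

# Regression with AR(2) errors: order = c(2, 0, 0) describes the errors,
# xreg holds the covariate (passed as a named one-column matrix)
fit <- arima(y, xreg = cbind(d = d), order = c(2, 0, 0))

# Estimates of the two AR coefficients, the intercept (alpha)
# and the covariate effect (beta)
coef(fit)
```

The estimates should land near the values used in the simulation, with the gap shrinking as n grows.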
Arima()
fit <- Arima(y, xreg=d, order=c(1,1,0))
auto.arima()
fit <- auto.arima(y, xreg=d)
library(fpp2) # provides the uschange data and loads the forecast package
y <- uschange[,"Consumption"]
d <- uschange[,"Income"]
fit <- lm(y~d)
checkresiduals(fit)
##
##  Breusch-Godfrey test for serial correlation of order up to 10
##
## data:  Residuals
## LM test = 27.584, df = 10, p-value = 0.002104
fit <- Arima(y, xreg=d, order=c(1,0,0))
checkresiduals(fit)
##
##  Ljung-Box test
##
## data:  Residuals from Regression with ARIMA(1,0,0) errors
## Q* = 20.485, df = 5, p-value = 0.001013
##
## Model df: 3.   Total lags used: 8
Use auto.arima() to find the best ARMA model for the errors:
fit <- auto.arima(y, xreg=d) # It finds that ARIMA(1,0,2) errors are best.
checkresiduals(fit)
##
##  Ljung-Box test
##
## data:  Residuals from Regression with ARIMA(1,0,2) errors
## Q* = 5.8916, df = 3, p-value = 0.117
##
## Model df: 5.   Total lags used: 8
This is a big issue. If you are thinking about stepwise variable selection, do a literature search on the issue. Read Chapter 6 in Holmes 2018, on catch forecasting models using multivariate regression, for a discussion.
We are trying to explain the ERRORS with our covariates.
\[\mathbf{x}_t = \mathbf{B} \mathbf{x}_{t-1} + \mathbf{C} \mathbf{c}_t + \mathbf{w}_t \\ \mathbf{y}_t = \mathbf{Z} \mathbf{x}_{t} + \mathbf{D} \mathbf{d}_t + \mathbf{v}_t\]
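To show how the pieces of these two equations fit together, here is a minimal base R sketch that simulates from the model. It is a simulation only, not a MARSS fit. The dimensions, the matrix values, the noise standard deviations and the choice to use the same seasonal covariate for both c_t and d_t are all assumptions made for illustration.

```r
# Simulate x_t = B x_{t-1} + C c_t + w_t and y_t = Z x_t + D d_t + v_t
# (dimensions and all parameter values are illustrative assumptions)
set.seed(42)
TT <- 100   # number of time steps
m  <- 2     # number of states
n  <- 3     # number of observed series
p  <- 1     # number of covariates

B <- diag(0.8, m)                         # state transition matrix
C <- matrix(c(0.5, -0.3), m, p)           # covariate effects on the states
Z <- matrix(c(1, 0, 1, 0, 1, 1), n, m)    # maps states to observations
D <- matrix(c(0.2, 0, -0.1), n, p)        # covariate effects on the observations

# One hypothetical covariate with a 12-step cycle, used as both c_t and d_t
covar <- matrix(sin(2 * pi * (1:TT) / 12), p, TT)

x <- matrix(0, m, TT)   # states
y <- matrix(0, n, TT)   # observations
x_prev <- rep(0, m)     # initial state
for (t in 1:TT) {
  w <- rnorm(m, sd = 0.1)   # process errors
  v <- rnorm(n, sd = 0.2)   # observation errors
  x[, t] <- B %*% x_prev + C %*% covar[, t, drop = FALSE] + w
  y[, t] <- Z %*% x[, t] + D %*% covar[, t, drop = FALSE] + v
  x_prev <- x[, t]
}

dim(y)   # n rows (series) by TT columns (time steps)
```

Covariates entering through C change the underlying states and so carry forward in time through B, while covariates entering through D only shift the observations at time t.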