Univariate response
Stochastic level & growth
Dynamic Regression
Dynamic Regression with fixed season
Forecasting with a DLM
Model diagnostics
Multivariate response
25 April 2023
You can find the R code for these lecture notes and other related exercises here.
Let’s begin with a linear regression model
\[ y_i = \alpha + \beta x_i + e_i ~ \text{with} ~ e_i \sim \text{N}(0,\sigma^2) \]
The index \(i\) has no explicit meaning: shuffling the (\(y_i,x_i\)) pairs has no effect on parameter estimation
We can write the model in matrix form

\[ y_i = \mathbf{X}^{\top}_i \boldsymbol{\theta} + e_i ~ \text{with} ~ e_i \sim \text{N}(0,\sigma^2) \]

with

\(\mathbf{X}^{\top}_i = \begin{bmatrix} 1 & x_i \end{bmatrix}\) and \(\boldsymbol{\theta} = \begin{bmatrix} \alpha & \beta \end{bmatrix}^{\top}\)
In a dynamic linear model, the regression parameters change over time, so we write \[ y_i = \mathbf{X}^{\top}_i \boldsymbol{\theta} + e_i ~~~~~~~ \text{(static)} \]
as
\[ y_t = \mathbf{X}^{\top}_t \boldsymbol{\theta}_t + e_t ~~~~~~~ \text{(dynamic)} \]
There are 2 important points here:

\[ y_\boxed{t} = \mathbf{X}^{\top}_t \boldsymbol{\theta}_t + e_t \]

The subscript \(t\) explicitly acknowledges the implicit information in the time ordering of the data in \(\mathbf{y}\)

\[ y_t = \mathbf{X}^{\top}_t \boldsymbol{\theta}_\boxed{t} + e_t \]

The relationship between \(\mathbf{y}\) and \(\mathbf{X}\) is unique for every \(t\)
Close examination of the DLM reveals an apparent problem for parameter estimation
\[ y_t = \mathbf{X}^{\top}_t \boldsymbol{\theta}_t + e_t \]
We only have 1 data point per time step (i.e., \(y_t\) is a scalar)
Thus, we can only estimate 1 parameter (with no uncertainty)!
To address this issue, we’ll constrain the regression parameters at time \(t\) to depend on those at time \(t-1\)
\[ \boldsymbol{\theta}_t = \mathbf{G}_t \boldsymbol{\theta}_{t-1} + \mathbf{w}_t ~ \text{with} ~ \mathbf{w}_t \sim \text{MVN}(\mathbf{0}, \mathbf{Q}) \]
In practice, we often make \(\mathbf{G}_t\) time invariant
\[ \boldsymbol{\theta}_t = \mathbf{G} \boldsymbol{\theta}_{t-1} + \mathbf{w}_t \]
or assume \(\mathbf{G}_t\) is an \(m \times m\) identity matrix \(\mathbf{I}_m\)
\[ \begin{align} \boldsymbol{\theta}_t &= \mathbf{I}_m \boldsymbol{\theta}_{t-1} + \mathbf{w}_t \\ &= \boldsymbol{\theta}_{t-1} + \mathbf{w}_t \end{align} \]
In the latter case, the parameters follow a random walk over time
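To build intuition, here is a toy R simulation (not part of the lecture code) of two parameters following a random walk with \(\mathbf{G} = \mathbf{I}_2\) and a made-up process variance:

```r
## a toy simulation of m = 2 parameters evolving as a random walk (G = I_2)
set.seed(123)
TT <- 40
theta <- matrix(0, 2, TT)                          # rows: alpha_t, beta_t
for (t in 2:TT) {
  theta[, t] <- theta[, t - 1] + rnorm(2, 0, 0.1)  # w_t ~ MVN(0, 0.01 I_2)
}
matplot(t(theta), type = "l", ylab = expression(theta[t]))
```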
Observation model relates the covariates \(\mathbf{X}\) to the data
\[ y_t = \mathbf{X}^{\top}_t \boldsymbol{\theta}_t + e_t \]
State model determines how parameters “evolve” over time
\[ \boldsymbol{\theta}_t = \mathbf{G} \boldsymbol{\theta}_{t-1} + \mathbf{w}_t \]
where
\(\mathbf{Z}_t = \mathbf{X}^{\top}_t, \mathbf{x}_t = \boldsymbol{\theta}_t, v_t = e_t, \mathbf{B} =\mathbf{G}\)
Note: DLMs include covariate effects in the observation equation quite differently than other forms of MARSS models
DLM: \(\mathbf{Z}_t\) is covariates, \(\mathbf{x}_t\) is parameters
\[ y_t = \boxed{\mathbf{Z}_t \mathbf{x}_t} + v_t \\ \]
Others: \(\mathbf{d}_t\) is covariates, \(\mathbf{D}\) is parameters
\[ y_t = \mathbf{Z}_t \mathbf{x}_t + \boxed{\mathbf{D} \mathbf{d}_t} +v_t \\ \]
The regression model is but one type of DLM. Others include:
stochastic “level” (intercept)
stochastic “growth” (trend, bias)
seasonal effects (fixed, harmonic)
Stochastic “level” \(\alpha_t\) only

\[ y_t = \alpha_t + e_t \\ \alpha_t = \alpha_{t-1} + w_t \\ \Downarrow \\ y_t = x_t + v_t \\ x_t = x_{t-1} + w_t \]

This is just a random walk observed with error
Stochastic “level” \(\alpha_t\) with deterministic “growth” \(\eta\)

\[ y_t = \alpha_t + e_t \\ \alpha_t = \alpha_{t-1} + \eta + w_t \\ \Downarrow \\ y_t = x_t + v_t \\ x_t = x_{t-1} + u + w_t \]
This is just a random walk with bias \(u\)
Stochastic “level” \(\alpha_t\) with stochastic “growth” \(\eta_t\)
\[ \begin{align} y_t &= \alpha_t + e_t \\ \alpha_t &= \alpha_{t-1} + \eta_{t-1} + w_{\alpha,t} \\ \eta_t &= \eta_{t-1} + w_{\eta,t} \\ \end{align} \]
Now the “growth” term \(\eta_t\) evolves as well
How do we make this work in practice? Rewrite the equations with explicit coefficients on \(\alpha_{t-1}\) and \(\eta_{t-1}\)

\[ \begin{align} \alpha_t &= \alpha_{t-1} + \eta_{t-1} + w_{\alpha,t} \\ \eta_t &= \eta_{t-1} + w_{\eta,t} \\ & \Downarrow \\ \alpha_t &= \underline{1} \alpha_{t-1} + \underline{1} \eta_{t-1} + w_{\alpha,t} \\ \eta_t &= \underline{0} \alpha_{t-1} + \underline{1} \eta_{t-1} + w_{\eta,t} \\ & \Downarrow \\ \underbrace{\begin{bmatrix} \alpha_t \\ \eta_t \end{bmatrix}}_{\boldsymbol{\theta}_t} &= \underbrace{\begin{bmatrix} \underline{1} & \underline{1} \\ \underline{0} & \underline{1} \end{bmatrix}}_{\mathbf{G}} \underbrace{\begin{bmatrix} \alpha_{t-1} \\ \eta_{t-1} \end{bmatrix}}_{\boldsymbol{\theta}_{t-1}} + \underbrace{\begin{bmatrix} w_{\alpha,t} \\ w_{\eta,t} \end{bmatrix}}_{\mathbf{w}_t} \end{align} \]
and in MARSS notation

\[ \begin{align} x_{1,t} &= x_{1,t-1} + x_{2,t-1} + w_{1,t} \\ x_{2,t} &= x_{2,t-1} + w_{2,t} \\ & \Downarrow \\ x_{1,t} &= \underline{1} x_{1,t-1} + \underline{1} x_{2,t-1} + w_{1,t} \\ x_{2,t} &= \underline{0} x_{1,t-1} + \underline{1} x_{2,t-1} + w_{2,t} \\ & \Downarrow \\ \underbrace{\begin{bmatrix} x_{1,t} \\ x_{2,t} \end{bmatrix}}_{\mathbf{x}_t} &= \underbrace{\begin{bmatrix} \underline{1} & \underline{1} \\ \underline{0} & \underline{1} \end{bmatrix}}_{\mathbf{B}} \underbrace{\begin{bmatrix} x_{1,t-1} \\ x_{2,t-1} \end{bmatrix}}_{\mathbf{x}_{t-1}} + \underbrace{\begin{bmatrix} w_{1,t} \\ w_{2,t} \end{bmatrix}}_{\mathbf{w}_t} \end{align} \]
Again, rewrite the observation equation with explicit coefficients on \(\alpha_t\) and \(\eta_t\)

\[ \begin{align} y_t &= \alpha_t + v_t \\ & \Downarrow \\ y_t &= \underline{1} \alpha_t + \underline{0} \eta_t + v_t \\ & \Downarrow \\ y_t &= \underbrace{\begin{bmatrix} \underline{1} & \underline{0} \end{bmatrix}}_{\mathbf{X}^{\top}_t} \underbrace{\begin{bmatrix} \alpha_t \\ \eta_t \end{bmatrix}}_{\boldsymbol{\theta}_t} + v_t \end{align} \]

and in MARSS notation

\[ \begin{align} y_t &= x_t + v_t \\ & \Downarrow \\ y_t &= \underline{1} x_{1,t} + \underline{0} x_{2,t} + v_t \\ & \Downarrow \\ y_t &= \underbrace{\begin{bmatrix} \underline{1} & \underline{0} \end{bmatrix}}_{\mathbf{Z}_t} \underbrace{\begin{bmatrix} x_{1,t} \\ x_{2,t} \end{bmatrix}}_{\mathbf{x}_t} + v_t \end{align} \]
For the dynamic regression model, rewrite the observation equation with explicit coefficients for \(\alpha_t\) & \(\beta_t\)

\[ \begin{align} y_t &= \alpha_t + \beta_t x_t + v_t \\ & \Downarrow \\ y_t &= \underline{1} \alpha_t + \underline{x_t} \beta_t + v_t \\ & \Downarrow \\ y_t &= \underbrace{\begin{bmatrix} \underline{1} & \underline{x_t} \end{bmatrix}}_{\mathbf{X}^{\top}_t} \underbrace{\begin{bmatrix} \alpha_t \\ \beta_t \end{bmatrix}}_{\boldsymbol{\theta}_t} + v_t \end{align} \]

and in MARSS notation

\[ \begin{align} y_t &= x_{1,t} + x_{2,t} z_{2,t} + v_t \\ & \Downarrow \\ y_t &= \underline{1} x_{1,t} + \underline{z_{2,t}} x_{2,t} + v_t \\ & \Downarrow \\ y_t &= \underbrace{\begin{bmatrix} \underline{1} & \underline{z_{2,t}} \end{bmatrix}}_{\mathbf{Z}_t} \underbrace{\begin{bmatrix} x_{1,t} \\ x_{2,t} \end{bmatrix}}_{\mathbf{x}_t} + v_t \end{align} \]
Both regression parameters follow a random walk over time

\[ \begin{align} \alpha_t &= \alpha_{t-1} + w_{\alpha,t} \\ \beta_t &= \beta_{t-1} + w_{\beta,t} \\ & \Downarrow \\ \underbrace{\begin{bmatrix} \alpha_t \\ \beta_t \end{bmatrix}}_{\boldsymbol{\theta}_t} &= \underbrace{\begin{bmatrix} \alpha_{t-1} \\ \beta_{t-1} \end{bmatrix}}_{\boldsymbol{\theta}_{t-1}} + \underbrace{\begin{bmatrix} w_{\alpha,t} \\ w_{\beta,t} \end{bmatrix}}_{\mathbf{w}_t} \end{align} \]

and in MARSS notation

\[ \begin{align} x_{1,t} &= x_{1,t-1} + w_{1,t} \\ x_{2,t} &= x_{2,t-1} + w_{2,t} \\ & \Downarrow \\ \underbrace{\begin{bmatrix} x_{1,t} \\ x_{2,t} \end{bmatrix}}_{\mathbf{x}_t} &= \underbrace{\begin{bmatrix} x_{1,t-1} \\ x_{2,t-1} \end{bmatrix}}_{\mathbf{x}_{t-1}} + \underbrace{\begin{bmatrix} w_{1,t} \\ w_{2,t} \end{bmatrix}}_{\mathbf{w}_t} \end{align} \]
\[ y_t = \alpha_t + \beta_t x_t + \gamma_{qtr} + e_t \\ \gamma_{qtr} = \begin{cases} \gamma_{1} & \text{if } qtr = 1 \\ \gamma_{2} & \text{if } qtr = 2 \\ \gamma_{3} & \text{if } qtr = 3 \\ \gamma_{4} & \text{if } qtr = 4 \end{cases} \]
Rewrite the equation with explicit coefficients on the parameters

\[ y_t = \alpha_t + \beta_t x_t + \gamma_{qtr} + e_t \\ \Downarrow \\ y_t = \begin{bmatrix} 1 & x_t & 1 \end{bmatrix} \begin{bmatrix} \alpha_t \\ \beta_t \\ \gamma_{qtr} \end{bmatrix} + e_t \]
How should we model the fixed effect of \(\gamma_{qtr}\)?

\[ \begin{bmatrix} \alpha_t \\ \beta_t \\ \gamma_{qtr} \end{bmatrix} = \begin{bmatrix} \alpha_{t-1} \\ \beta_{t-1} \\ ? \end{bmatrix} + \begin{bmatrix} w_{\alpha,t} \\ w_{\beta,t} \\ ? \end{bmatrix} \]

We don’t want \(\gamma_{qtr}\) to evolve as a function of the previous \(t\), so it stays fixed and gets no process error

\[ \begin{bmatrix} \alpha_t \\ \beta_t \\ \gamma_{qtr} \end{bmatrix} = \begin{bmatrix} \alpha_{t-1} \\ \beta_{t-1} \\ \gamma_{qtr} \end{bmatrix} + \begin{bmatrix} w_{\alpha,t} \\ w_{\beta,t} \\ 0 \end{bmatrix} \]
OK, so how do we select the right quarterly effect?
Separate out the quarterly effects
\[ y_t = \alpha_t + \beta_t x_t + \gamma_{qtr} + e_t \\ \Downarrow \\ y_t = \alpha_t + \beta_t x_t + \gamma_1 + \gamma_2 + \gamma_3 + \gamma_4 + e_t \]
Rewrite quarterly effects in matrix notation
\[ y_t = \alpha_t + \beta_t x_t + \gamma_1 + \gamma_2 + \gamma_3 + \gamma_4 + e_t \\ \Downarrow \\ y_t = \begin{bmatrix} 1 & x_t & 1 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} \alpha_t \\ \beta_t \\ \gamma_1 \\ \gamma_2 \\ \gamma_3 \\ \gamma_4 \end{bmatrix} + e_t \]
But how do we select only the current quarter?
We could set some values in \(\mathbf{X}^{\top}_t\) to 0 (\(qtr\) = 1)
\[ y_t = \begin{bmatrix} 1 & x_t & 1 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} \alpha_t \\ \beta_t \\ \gamma_1 \\ \gamma_2 \\ \gamma_3 \\ \gamma_4 \end{bmatrix} \\ \Downarrow \\ y_t = \alpha_t + \beta_t x_t + \gamma_1 + e_t \\ \]
We could set some values in \(\mathbf{X}^{\top}_t\) to 0 (\(qtr\) = 2)
\[ y_t = \begin{bmatrix} 1 & x_t & 0 & 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} \alpha_t \\ \beta_t \\ \gamma_1 \\ \gamma_2 \\ \gamma_3 \\ \gamma_4 \end{bmatrix} \\ \Downarrow \\ y_t = \alpha_t + \beta_t x_t + \gamma_2 + e_t \\ \]
But how would we set the correct 0/1 values?
\[ \mathbf{X}^{\top}_t = \begin{bmatrix} 1 & x_t & ? & ? & ? & ? \end{bmatrix} \]
We could instead reorder the \(\gamma_i\) within \(\boldsymbol{\theta}_t\) (\(qtr\) = 1)
\[ y_t = \begin{bmatrix} 1 & x_t & 1 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} \alpha_t \\ \beta_t \\ \gamma_1 \\ \gamma_2 \\ \gamma_3 \\ \gamma_4 \end{bmatrix} \\ \Downarrow \\ y_t = \alpha_t + \beta_t x_t + \gamma_1 + e_t \\ \]
We could instead reorder the \(\gamma_i\) within \(\boldsymbol{\theta}_t\) (\(qtr\) = 2)
\[ y_t = \begin{bmatrix} 1 & x_t & 1 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} \alpha_t \\ \beta_t \\ \gamma_2 \\ \gamma_3 \\ \gamma_4 \\ \gamma_1 \end{bmatrix} \\ \Downarrow \\ y_t = \alpha_t + \beta_t x_t + \gamma_2 + e_t \\ \]
But how would we shift the \(\gamma_i\) within \(\boldsymbol{\theta}_t\)?
\[ \boldsymbol{\theta}_t = \begin{bmatrix} \alpha_t \\ \beta_t \\ ? \\ ? \\ ? \\ ? \end{bmatrix} \]
We can use a non-diagonal submatrix in the lower right of \(\mathbf{G}\) to get the correct quarter effect
\[ \mathbf{G} = \left[ \begin{array}{cc|cccc} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ \hline 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 & 0 \end{array} \right] \]
For example, with \(qtr\) = 1:

\[ \underbrace{\begin{bmatrix} \alpha_t \\ \beta_t \\ \hline \gamma_1 \\ \gamma_2 \\ \gamma_3 \\ \gamma_4 \end{bmatrix}}_{\boldsymbol{\theta}_t} = \underbrace{ \left[ \begin{array}{cc|cccc} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ \hline 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 & 0 \end{array} \right]}_{\mathbf{G}} \underbrace{\begin{bmatrix} \alpha_{t-1} \\ \beta_{t-1} \\ \hline \gamma_4 \\ \gamma_1 \\ \gamma_2 \\ \gamma_3 \end{bmatrix}}_{\boldsymbol{\theta}_{t-1}} + \underbrace{\begin{bmatrix} w_{\alpha,t} \\ w_{\beta,t} \\ \hline 0 \\0 \\ 0 \\ 0 \end{bmatrix}}_{\mathbf{w}_t} \]
With \(qtr\) = 2:

\[ \underbrace{\begin{bmatrix} \alpha_t \\ \beta_t \\ \hline \gamma_2 \\ \gamma_3 \\ \gamma_4 \\ \gamma_1 \end{bmatrix}}_{\boldsymbol{\theta}_t} = \underbrace{ \left[ \begin{array}{cc|cccc} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ \hline 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 & 0 \end{array} \right]}_{\mathbf{G}} \underbrace{\begin{bmatrix} \alpha_{t-1} \\ \beta_{t-1} \\ \hline \gamma_1 \\ \gamma_2 \\ \gamma_3 \\ \gamma_4 \end{bmatrix}}_{\boldsymbol{\theta}_{t-1}} + \underbrace{\begin{bmatrix} w_{\alpha,t} \\ w_{\beta,t} \\ \hline 0 \\0 \\ 0 \\ 0 \end{bmatrix}}_{\mathbf{w}_t} \]
With \(qtr\) = 3:

\[ \underbrace{\begin{bmatrix} \alpha_t \\ \beta_t \\ \hline \gamma_3 \\ \gamma_4 \\ \gamma_1 \\ \gamma_2 \end{bmatrix}}_{\boldsymbol{\theta}_t} = \underbrace{ \left[ \begin{array}{cc|cccc} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ \hline 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 & 0 \end{array} \right]}_{\mathbf{G}} \underbrace{\begin{bmatrix} \alpha_{t-1} \\ \beta_{t-1} \\ \hline \gamma_2 \\ \gamma_3 \\ \gamma_4 \\ \gamma_1 \end{bmatrix}}_{\boldsymbol{\theta}_{t-1}} + \underbrace{\begin{bmatrix} w_{\alpha,t} \\ w_{\beta,t} \\ \hline 0 \\0 \\ 0 \\ 0 \end{bmatrix}}_{\mathbf{w}_t} \]
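To make this concrete, here is a minimal R sketch of \(\mathbf{G}\) for the 2-parameter, 4-quarter model above (the name `GG` is just illustrative):

```r
## the 6 x 6 evolution matrix G: upper-left block is I_2 for (alpha_t, beta_t);
## the lower-right block rotates the quarterly effects each time step
GG <- matrix(0, 6, 6)
GG[1, 1] <- GG[2, 2] <- 1              # alpha & beta follow random walks
GG[3, 4] <- GG[4, 5] <- GG[5, 6] <- 1  # shift each gamma up one position
GG[6, 3] <- 1                          # ...and wrap the first to the end
GG
```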
DLMs are often used in a forecasting context where we want a prediction for time \(t\) based on the data up through time \(t-1\):

1. get an estimate of today’s parameters from yesterday’s
2. make a prediction based on today’s parameters & covariates
3. get the observation for today
4. update the parameters and repeat
Step 1: Define the parameters at time \(t = 0\)

\[ \boldsymbol{\theta}_0 | y_0 = \boldsymbol{\pi} + \mathbf{w}_1 ~ \text{with} ~ \mathbf{w}_1 \sim \text{MVN}(\mathbf{0}, \mathbf{\Lambda}) \\ \Downarrow \\ \text{E}(\boldsymbol{\theta}_0) = \boldsymbol{\pi} \\ \text{and} \\ \text{Var}(\boldsymbol{\theta}_0) = \text{Var}(\boldsymbol{\pi}) + \text{Var}(\mathbf{w}_1) = \mathbf{0} + \mathbf{\Lambda} = \mathbf{\Lambda} \\ \Downarrow \\ \boldsymbol{\theta}_0 | y_0 \sim \text{MVN}(\boldsymbol{\pi}, \mathbf{\Lambda}) \]
Step 2: Define the parameters at time \(t = 1\)

\[ \boldsymbol{\theta}_1 | y_0 = \mathbf{G} \boldsymbol{\theta}_0 + \mathbf{w}_1 ~ \text{with} ~ \mathbf{w}_1 \sim \text{MVN}(\mathbf{0}, \mathbf{Q}) \\ \Downarrow \\ \text{E}(\boldsymbol{\theta}_1) = \mathbf{G} \text{E}(\boldsymbol{\theta}_0) = \mathbf{G} \boldsymbol{\pi} \\ \text{and} \\ \text{Var}(\boldsymbol{\theta}_1) = \mathbf{G} \text{Var}(\boldsymbol{\theta}_0) \mathbf{G}^{\top} + \text{Var}(\mathbf{w}_1) = \mathbf{G} \mathbf{\Lambda} \mathbf{G}^{\top} + \mathbf{Q} \\ \Downarrow \\ \boldsymbol{\theta}_1 | y_0 \sim \text{MVN}(\mathbf{G} \boldsymbol{\pi}, ~ \mathbf{G} \mathbf{\Lambda} \mathbf{G}^{\top} + \mathbf{Q}) \]
Step 3: Make a forecast of \(y_t\) at time \(t = 1\)

\[ y_1 | y_0 = \mathbf{X}^{\top}_1 \boldsymbol{\theta}_1 + e_1 ~ \text{with} ~ e_1 \sim \text{N}(0, R) \\ \Downarrow \\ \text{E}(y_1) = \mathbf{X}^{\top}_1 \text{E}(\boldsymbol{\theta}_1) = \mathbf{X}^{\top}_1 \mathbf{G} \boldsymbol{\pi} \\ \text{and} \\ \text{Var}(y_1) = \mathbf{X}^{\top}_1 \text{Var}(\boldsymbol{\theta}_1) \mathbf{X}_1 + \text{Var}(e_1) = \mathbf{X}^{\top}_1 [\mathbf{G} \mathbf{\Lambda} \mathbf{G}^{\top} + \mathbf{Q}] \mathbf{X}_1 + R \\ \Downarrow \\ y_1 | y_0 \sim \text{N}(\mathbf{X}^{\top}_1 [\mathbf{G} \boldsymbol{\pi}], ~ \mathbf{X}^{\top}_1 [\mathbf{G} \mathbf{\Lambda} \mathbf{G}^{\top} + \mathbf{Q}] \mathbf{X}_1 + R) \]
Putting it all together
\[ \begin{align} \boldsymbol{\theta}_0 | y_0 & \sim \text{MVN}(\boldsymbol{\pi}, \mathbf{\Lambda}) \\ \boldsymbol{\theta}_t | y_{t-1} & \sim \text{MVN}(\mathbf{G} \boldsymbol{\pi}, \mathbf{G} \mathbf{\Lambda} \mathbf{G}^{\top} + \mathbf{Q}) \\ y_t | y_{t-1} & \sim \text{N}(\mathbf{X}^{\top}_t [\mathbf{G} \boldsymbol{\pi}], \mathbf{X}^{\top}_t [\mathbf{G} \mathbf{\Lambda} \mathbf{G}^{\top} + \mathbf{Q}] \mathbf{X}_t + R) \end{align} \]
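As a worked example, here is a hand-rolled version of the one-step forecast for \(t = 1\), using made-up values for \(\boldsymbol{\pi}\), \(\mathbf{\Lambda}\), \(\mathbf{Q}\), \(R\), and the covariate:

```r
## one-step forecast of y_1, computed by hand under assumed values
GG <- diag(2)                          # G = I_2
pi_0 <- matrix(0, 2, 1)                # pi = E(theta_0)
LL <- diag(10, 2)                      # Lambda = Var(theta_0)
QQ <- diag(0.1, 2)                     # Q = Var(w_t)
RR <- 0.5                              # R = Var(e_t)
X1 <- matrix(c(1, 0.3), 1, 2)          # X_1^T = [1  x_1]

(E_y1 <- X1 %*% GG %*% pi_0)                                 # E(y_1)
(V_y1 <- X1 %*% (GG %*% LL %*% t(GG) + QQ) %*% t(X1) + RR)   # Var(y_1)
```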
Using MARSS() will make this easy to do
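For example, here is a minimal sketch of fitting a univariate dynamic regression DLM with MARSS(), using simulated stand-ins for the response and covariate:

```r
library(MARSS)

## hypothetical data: response y_t and covariate x_t
TT <- 40
cvt <- rnorm(TT)                       # covariate x_t
dat <- matrix(rnorm(TT), nrow = 1)     # response y_t (1 x TT)

m <- 2                                 # parameters: alpha_t & beta_t
ZZ <- array(NA, c(1, m, TT))           # time-varying Z_t = X_t^T
ZZ[1, 1, ] <- 1                        # coefficient on alpha_t
ZZ[1, 2, ] <- cvt                      # coefficient on beta_t

mod_list <- list(
  B = diag(m),                         # G = I_m: random-walk parameters
  U = matrix(0, m, 1),                 # no drift in theta_t
  Q = "diagonal and unequal",          # Var(w_t)
  Z = ZZ,
  A = matrix(0),                       # no extra offset
  R = matrix("r")                      # Var(e_t)
)

dlm_fit <- MARSS(dat, model = mod_list)
```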
Just as with other models, we’d like to know if our fitted DLM meets its underlying assumptions
We can calculate the forecast error \(e_t\) as
\[ e_t = y_t - \hat{y}_t \] and check if
\[ \begin{align} (1) & ~~ e_t \sim \text{N}(0, \sigma^2) \\ (2) & ~~ \text{Cov}(e_t, e_{t-1}) = 0 \end{align} \]
with a QQ-plot (1) and an ACF (2)
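A short sketch of those checks in R, assuming `dlm_fit` is the marssMLE object from above (the `"ytt1"` fitted type requires a recent version of MARSS):

```r
## forecast-error diagnostics for a fitted DLM
fr <- fitted(dlm_fit, type = "ytt1")   # one-step-ahead predictions
err <- fr$y - fr$.fitted               # forecast errors e_t

qqnorm(err); qqline(err)               # (1) normality check
acf(err, na.action = na.pass)          # (2) autocorrelation check
```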
We can expand our DLM to have a multivariate response
For example, \(n\) time series could share a common stochastic level \(\alpha_t\)

\[ \begin{matrix} \mathbf{y}_t = \mathbf{Z} \alpha_t + \mathbf{v}_t & ~~ \mathbf{y}_t ~ \text{is } n \times 1 \\ \alpha_t = \alpha_{t-1} + w_t & ~~ \alpha_t ~ \text{is } 1 \times 1 \end{matrix} \]
with
\[ \mathbf{Z} = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} \]
and in MARSS notation

\[ \begin{matrix} \mathbf{y}_t = \mathbf{Z} x_t + \mathbf{v}_t & ~~ \mathbf{y}_t ~ \text{is } n \times 1 \\ x_t = x_{t-1} + w_t & ~~ x_t ~ \text{is } 1 \times 1 \end{matrix} \]
with
\[ \mathbf{Z} = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} \]
Alternatively, each of the \(n\) time series could have its own stochastic level

\[ \begin{matrix} \mathbf{y}_t = \mathbf{Z} \boldsymbol{\alpha}_t + \mathbf{v}_t & ~~ \mathbf{y}_t ~ \text{is } n \times 1 \\ \boldsymbol{\alpha}_t = \boldsymbol{\alpha}_{t-1} + \mathbf{w}_t & ~~ \boldsymbol{\alpha}_t ~ \text{is } n \times 1 \\ \end{matrix} \]
with
\[ \mathbf{Z} = \mathbf{I}_n = \begin{bmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \ddots & 0 \\ \vdots & \ddots & \ddots & 0 \\ 0 & \dots & 0 & 1 \\ \end{bmatrix} \]
and in MARSS notation

\[ \begin{matrix} \mathbf{y}_t = \mathbf{Z} \mathbf{x}_t + \mathbf{v}_t & ~~ \mathbf{y}_t ~ \text{is } n \times 1 \\ \mathbf{x}_t = \mathbf{x}_{t-1} + \mathbf{w}_t & ~~ \mathbf{x}_t ~ \text{is } n \times 1 \\ \end{matrix} \]
with
\[ \mathbf{Z} = \mathbf{I}_n = \begin{bmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \ddots & 0 \\ \vdots & \ddots & \ddots & 0 \\ 0 & \dots & 0 & 1 \\ \end{bmatrix} \]
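In R, the two forms of \(\mathbf{Z}\) are simply (a sketch for \(n = 4\)):

```r
## the two Z forms above, for n = 4 time series
n <- 4
Z_shared <- matrix(1, n, 1)            # one common level x_t for all series
Z_indep <- diag(n)                     # a separate level for each series
```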
Our univariate model
\[ y_t = \mathbf{X}^{\top}_t \boldsymbol{\theta}_t + e_t ~ \text{with} ~ e_t \sim \text{N}(0,R) \]
becomes
\[ \mathbf{y}_t = (\mathbf{X}^{\top}_t \otimes \mathbf{I}_n) \boldsymbol{\theta}_t + \mathbf{e}_t ~ \text{with} ~ \mathbf{e}_t \sim \text{MVN}(\mathbf{0},\mathbf{R}) \]
where \(\otimes\) is the Kronecker product
If \(\mathbf{A}\) is an \(m \times n\) matrix and \(\mathbf{B}\) is a \(p \times q\) matrix
then \(\mathbf{A} \otimes \mathbf{B}\) will be an \(mp \times nq\) matrix
\[ \mathbf{A} \otimes \mathbf{B} = \begin{bmatrix} a_{11} \mathbf{B} & \dots & a_{1n} \mathbf{B} \\ \vdots & \ddots & \vdots \\ a_{m1} \mathbf{B} & \dots & a_{mn} \mathbf{B} \\ \end{bmatrix} \]
For example
\[ \mathbf{A} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} ~ \text{and} ~ \mathbf{B} = \begin{bmatrix} 2 & 4 \\ 6 & 8 \end{bmatrix} \]
so
\[ \mathbf{A} \otimes \mathbf{B} = \begin{bmatrix} 1 \begin{bmatrix} 2 & 4 \\ 6 & 8 \end{bmatrix} & 2 \begin{bmatrix} 2 & 4 \\ 6 & 8 \end{bmatrix} \\ 3 \begin{bmatrix} 2 & 4 \\ 6 & 8 \end{bmatrix} & 4 \begin{bmatrix} 2 & 4 \\ 6 & 8 \end{bmatrix} \end{bmatrix} = \begin{bmatrix} 2 & 4 & 4 & 8\\ 6 & 8 & 12 & 16\\ 6 & 12 & 8 & 16\\ 18 & 24 & 24 & 32 \end{bmatrix} \]
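We can verify this example with R’s built-in Kronecker product:

```r
## checking the worked example above
AA <- matrix(c(1, 3, 2, 4), 2, 2)      # A = [1 2; 3 4] (R fills column-wise)
BB <- matrix(c(2, 6, 4, 8), 2, 2)      # B = [2 4; 6 8]
kronecker(AA, BB)                      # equivalently AA %x% BB
```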
\[ \begin{align} \mathbf{y}_t &= (\mathbf{X}^{\top}_t \otimes \mathbf{I}_n) \boldsymbol{\theta}_t + \mathbf{e}_t \\ &\Downarrow \\ \begin{bmatrix} y_{1,t} \\ y_{2,t} \end{bmatrix} &= \left( \begin{bmatrix} 1 & x_t \end{bmatrix} \otimes \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \right) \begin{bmatrix} \alpha_{1,t} \\ \alpha_{2,t} \\ \beta_{1,t} \\ \beta_{2,t} \end{bmatrix} + \begin{bmatrix} e_{1,t} \\ e_{2,t} \end{bmatrix} \end{align} \]
\[ \begin{bmatrix} y_{1,t} \\ y_{2,t} \end{bmatrix} = \left( \begin{bmatrix} 1 & x_t \end{bmatrix} \otimes \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \right) \begin{bmatrix} \alpha_{1,t} \\ \alpha_{2,t} \\ \beta_{1,t} \\ \beta_{2,t} \end{bmatrix} + \begin{bmatrix} e_{1,t} \\ e_{2,t} \end{bmatrix} \\ \Downarrow \\ \begin{bmatrix} y_{1,t} \\ y_{2,t} \end{bmatrix} = \begin{bmatrix} 1 & 0 & x_t & 0 \\ 0 & 1 & 0 & x_t \end{bmatrix} \begin{bmatrix} \alpha_{1,t} \\ \alpha_{2,t} \\ \beta_{1,t} \\ \beta_{2,t} \end{bmatrix} + \begin{bmatrix} e_{1,t} \\ e_{2,t} \end{bmatrix} \]
What form should we choose for \(\mathbf{R}\)? Perhaps diagonal, with equal or unequal variances

\[ \mathbf{R} \stackrel{?}{=} \begin{bmatrix} \sigma & 0 & 0 & 0 \\ 0 & \sigma & 0 & 0 \\ 0 & 0 & \sigma & 0 \\ 0 & 0 & 0 & \sigma \end{bmatrix} ~\text{or}~~ \mathbf{R} \stackrel{?}{=} \begin{bmatrix} \sigma_1 & 0 & 0 & 0 \\ 0 & \sigma_2 & 0 & 0 \\ 0 & 0 & \sigma_3 & 0 \\ 0 & 0 & 0 & \sigma_4 \end{bmatrix} \]

or perhaps something with non-zero covariances

\[ \mathbf{R} \stackrel{?}{=} \begin{bmatrix} \sigma & \gamma & \gamma & \gamma \\ \gamma & \sigma & \gamma & \gamma \\ \gamma & \gamma & \sigma & \gamma \\ \gamma & \gamma & \gamma & \sigma \end{bmatrix} ~\text{or}~~ \mathbf{R} \stackrel{?}{=} \begin{bmatrix} \sigma_1 & 0 & 0 & 0 \\ 0 & \sigma_2 & 0 & \gamma_{2,4} \\ 0 & 0 & \sigma_3 & 0 \\ 0 & \gamma_{2,4} & 0 & \sigma_4 \end{bmatrix} \]
\[ \boldsymbol{\theta}_t = \mathbf{G} \boldsymbol{\theta}_{t-1} + \mathbf{w}_t ~ \text{with} ~ \mathbf{w}_t \sim \text{MVN}(\mathbf{0}, \mathbf{Q}) \]
becomes
\[ \boldsymbol{\theta}_t = \left( \mathbf{G} \otimes \mathbf{I}_n \right) \boldsymbol{\theta}_{t-1} + \mathbf{w}_t ~ \text{with} ~ \mathbf{w}_t \sim \text{MVN}(\mathbf{0}, \mathbf{Q}) \]
If we have 2 regression parameters and \(n = 2\), then
\[ \boldsymbol{\theta}_t = \begin{bmatrix} \alpha_{1,t} \\ \alpha_{2,t} \\ \beta_{1,t} \\ \beta_{2,t} \\ \end{bmatrix} ~~ \text{and} ~~ \mathbf{G} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \mathbf{I}_2 \]
\[ \boldsymbol{\theta}_t = \left( \mathbf{G} \otimes \mathbf{I}_n \right) \boldsymbol{\theta}_{t-1} + \mathbf{w}_t \\ \Downarrow \\ \boldsymbol{\theta}_t = \left( \mathbf{I}_2 \otimes \mathbf{I}_2 \right) \boldsymbol{\theta}_{t-1} + \mathbf{w}_t \]
\[ \begin{align} \mathbf{I}_2 \otimes \mathbf{I}_2 &= \begin{bmatrix} 1 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} & 0 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \\ 0 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} & 1 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \\ \end{bmatrix} \\ &= \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \end{align} \]
\[ \begin{align} \boldsymbol{\theta}_t &= \left( \mathbf{G} \otimes \mathbf{I}_n \right) \boldsymbol{\theta}_{t-1} + \mathbf{w}_t \\ &\Downarrow \\ \begin{bmatrix} \alpha_{1,t} \\ \alpha_{2,t} \\ \beta_{1,t} \\ \beta_{2,t} \\ \end{bmatrix} &= \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \alpha_{1,t-1} \\ \alpha_{2,t-1} \\ \beta_{1,t-1} \\ \beta_{2,t-1} \\ \end{bmatrix} + \begin{bmatrix} w_{\alpha_1,t} \\ w_{\alpha_2,t} \\ w_{\beta_1,t} \\ w_{\beta_2,t} \end{bmatrix} \\ &\Downarrow \\ \boldsymbol{\theta}_t &= \boldsymbol{\theta}_{t-1} + \mathbf{w}_t \end{align} \]
\[ \boldsymbol{\theta}_t = \boldsymbol{\theta}_{t-1} + \mathbf{w}_t ~ \text{with} ~ \mathbf{w}_t \sim \text{MVN}(\mathbf{0}, \underline{\mathbf{Q}}) \]
What form should we choose for \(\mathbf{Q}\)?
\[ \begin{bmatrix} \boldsymbol{\alpha}_t \\ \boldsymbol{\beta}_t \end{bmatrix} \sim ~ \text{MVN} \left( \begin{bmatrix} \mathbf{0} \\ \mathbf{0} \end{bmatrix} , \begin{bmatrix} \mathbf{Q}_\alpha & \mathbf{0} \\ \mathbf{0} & \mathbf{Q}_\beta \end{bmatrix} \right) \]

Diagonal and equal (IID)

\[ \mathbf{Q}_{(\cdot)} = \begin{bmatrix} q_{(\cdot)} & 0 & \dots & 0 \\ 0 & q_{(\cdot)} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & q_{(\cdot)} \end{bmatrix} \]

Diagonal and unequal

\[ \mathbf{Q}_{(\cdot)} = \begin{bmatrix} q_{(\cdot)1} & 0 & \dots & 0 \\ 0 & q_{(\cdot)2} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & q_{(\cdot)n} \end{bmatrix} \]

Unconstrained

\[ \mathbf{Q}_{(\cdot)} = \begin{bmatrix} q_{(\cdot)1,1} & q_{(\cdot)1,2} & \dots & q_{(\cdot)1,n} \\ q_{(\cdot)2,1} & q_{(\cdot)2,2} & \dots & q_{(\cdot)2,n} \\ \vdots & \vdots & \ddots & \vdots \\ q_{(\cdot)n,1} & q_{(\cdot)n,2} & \dots & q_{(\cdot)n,n} \end{bmatrix} \]
In practice, keep \(\mathbf{Q}\) as simple as possible
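For reference, these three forms correspond to built-in text shortcuts in MARSS() model specifications:

```r
## text shortcuts for Q in a MARSS() model list
Q_iid <- "diagonal and equal"          # one shared variance
Q_diag <- "diagonal and unequal"       # separate variances
Q_full <- "unconstrained"              # full variance-covariance matrix
```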