9.1 Overview
We begin our description of DLMs with a static regression model, wherein the \(i^{th}\) observation (response variable) is a linear function of an intercept, predictor variable(s), and a random error term. For example, if we had one predictor variable (\(f\)), we could write the model as \[\begin{equation} \tag{9.1} y_i = \alpha + \beta f_i + v_i, \end{equation}\] where \(\alpha\) is the intercept, \(\beta\) is the regression slope, \(f_i\) is the predictor variable matched to the \(i^{th}\) observation (\(y_i\)), and \(v_i \sim \text{N}(0,r)\). It is important to note here that there is no implicit ordering of the index \(i\). That is, we could shuffle any/all of the \((y_i, f_i)\) pairs in our dataset with no effect on our ability to estimate the model parameters.
We can write Equation (9.1) using matrix notation, as
\[\begin{align} \tag{9.2} y_i &= \begin{bmatrix}1&f_i\end{bmatrix} \begin{bmatrix}\alpha\\ \beta\end{bmatrix} + v_i \nonumber\\ &= \mathbf{F}_i^{\top}\boldsymbol{\theta} + v_i, \end{align}\]
where \(\mathbf{F}_i^{\top} = \begin{bmatrix}1&f_i\end{bmatrix}\) and \(\boldsymbol{\theta} = \begin{bmatrix}\alpha\\ \beta\end{bmatrix}\).
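To make the matrix form concrete, here is a minimal numerical sketch in Python with NumPy (the choice of language and all simulated values are our illustrative assumptions, not part of the text). We simulate data from Equation (9.1) and recover \(\boldsymbol{\theta}\) by ordinary least squares using the design matrix whose rows are \(\mathbf{F}_i^{\top}\).

```python
import numpy as np

# Simulate the static regression of Equation (9.1); all numeric
# values here are illustrative assumptions.
rng = np.random.default_rng(42)
n = 50
alpha, beta, r = 1.0, 0.5, 0.2            # intercept, slope, error variance
f = rng.uniform(0, 10, size=n)            # predictor variable f_i
y = alpha + beta * f + rng.normal(0, np.sqrt(r), size=n)  # v_i ~ N(0, r)

# Design matrix with rows F_i^T = [1, f_i], as in Equation (9.2)
F = np.column_stack([np.ones(n), f])

# Least-squares estimate of theta = [alpha, beta]^T
theta_hat, *_ = np.linalg.lstsq(F, y, rcond=None)
print(theta_hat)   # close to the true values [1.0, 0.5]
```

Note that permuting the rows of `F` and `y` together leaves `theta_hat` unchanged, which is exactly the point made above about the index \(i\) carrying no ordering information.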
In a DLM, however, the regression parameters are dynamic in that they “evolve” over time. For a single observation at time \(t\), we can write
\[\begin{equation} \tag{9.3} y_t = \mathbf{F}_{t}^{\top}\boldsymbol{\theta}_t + v_t, \end{equation}\]
where \(\mathbf{F}_t\) is a column vector of predictor variables (covariates) at time \(t\), \(\boldsymbol{\theta}_t\) is a column vector of regression parameters at time \(t\), and \(v_{t}\sim\,\text{N}(0,r)\). This formulation presents two features that distinguish it from Equation (9.2). First, the observed data are explicitly time ordered (i.e., \(\mathbf{y}=\lbrace{y_1,y_2,y_3,\dots,y_T}\rbrace\)), which means we expect them to contain implicit information about how the underlying process changes over time. Second, the relationship between the observed datum and the predictor variables is unique at every time \(t\) (i.e., \(\boldsymbol{\theta}=\lbrace{\boldsymbol{\theta}_1,\boldsymbol{\theta}_2,\boldsymbol{\theta}_3,\dots,\boldsymbol{\theta}_T}\rbrace\)).
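To see the structure of Equation (9.3) in code, here is a minimal sketch, again in Python with NumPy (all values are illustrative assumptions); the linearly drifting \(\boldsymbol{\theta}_t\) path is just a placeholder, since we have not yet said how the parameters evolve.

```python
import numpy as np

# Observation equation (9.3): y_t = F_t^T theta_t + v_t.
# The drifting theta_t path below is a placeholder assumption;
# the evolution model for theta_t is introduced next.
rng = np.random.default_rng(1)
T, r = 30, 0.2
f = rng.uniform(0, 10, size=T)                        # covariate at each time t
F = np.column_stack([np.ones(T), f])                  # row t holds F_t^T = [1, f_t]
theta = np.column_stack([np.linspace(1.0, 2.0, T),    # alpha_t (placeholder)
                         np.linspace(0.5, -0.5, T)])  # beta_t  (placeholder)
v = rng.normal(0.0, np.sqrt(r), size=T)               # v_t ~ N(0, r)
y = (F * theta).sum(axis=1) + v                       # one y_t per time step
```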
However, closer examination of Equation (9.3) reveals an apparent complication for parameter estimation. With only one datum at each time step \(t\), we could, at best, estimate only one regression parameter, and even then, the 1:1 correspondence between data and parameters would preclude any estimation of parameter uncertainty. To address this shortcoming, we return to the time ordering of model parameters. Rather than assume the regression parameters are independent from one time step to another, we instead model them as an autoregressive process where
\[\begin{equation} \tag{9.4} \boldsymbol{\theta}_t = \mathbf{G}_t\boldsymbol{\theta}_{t-1} + \mathbf{w}_t, \end{equation}\]
\(\mathbf{G}_t\) is the parameter “evolution” matrix, and \(\mathbf{w}_t\) is a vector of process errors, such that \(\mathbf{w}_t \sim \,\text{MVN}(\mathbf{0},\mathbf{Q})\). The elements of \(\mathbf{G}_t\) may be known and fixed a priori, or unknown and estimated from the data. Although we could allow \(\mathbf{G}_t\) to be time varying, we will typically assume it is time invariant (\(\mathbf{G}_t = \mathbf{G}\)); often we will simply set \(\mathbf{G} = \mathbf{I}_m\), the \(m \times m\) identity matrix.
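As a minimal sketch of Equation (9.4), here we simulate the parameter evolution with \(\mathbf{G}_t = \mathbf{I}_2\), the time-invariant identity case just mentioned (the values in \(\mathbf{Q}\) are illustrative assumptions):

```python
import numpy as np

# Parameter evolution, Equation (9.4), with G_t = I_2: the regression
# parameters follow a random walk. The Q values are illustrative.
rng = np.random.default_rng(7)
T, m = 100, 2
G = np.eye(m)                          # time-invariant evolution matrix
Q = np.diag([0.01, 0.005])             # process error covariance

theta = np.empty((T, m))
theta[0] = [1.0, 0.5]                  # initial [alpha_1, beta_1]
for t in range(1, T):
    w = rng.multivariate_normal(np.zeros(m), Q)  # w_t ~ MVN(0, Q)
    theta[t] = G @ theta[t - 1] + w              # Equation (9.4)
```

With \(\mathbf{G} = \mathbf{I}_m\), each parameter is a simple random walk; other choices of \(\mathbf{G}\) (e.g., diagonal values less than 1) would instead pull the parameters back toward zero.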
The idea is that the evolution matrix \(\mathbf{G}_t\) deterministically maps the parameter space from one time step to the next, so the parameters at time \(t\) are temporally related to those before and after. However, the process is corrupted by stochastic error, which amounts to a degradation of information over time. If the diagonal elements of \(\mathbf{Q}\) are relatively large, then the parameters can vary widely from \(t\) to \(t+1\). If \(\mathbf{Q} = \mathbf{0}\), then \(\boldsymbol{\theta}_1=\boldsymbol{\theta}_2=\cdots=\boldsymbol{\theta}_T\) and we are back to the static model in Equation (9.1).
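We can check this limiting case directly in the same illustrative setup as above: setting \(\mathbf{Q} = \mathbf{0}\) makes every \(\mathbf{w}_t\) exactly zero, so the simulated parameters never move.

```python
import numpy as np

# With Q = 0 there is no process error, so theta stays at its initial
# value and the DLM collapses to the static model of Equation (9.1).
rng = np.random.default_rng(7)
T, m = 100, 2
G = np.eye(m)
Q = np.zeros((m, m))                             # no process error

theta = np.empty((T, m))
theta[0] = [1.0, 0.5]
for t in range(1, T):
    w = rng.multivariate_normal(np.zeros(m), Q)  # always the zero vector
    theta[t] = G @ theta[t - 1] + w

assert np.allclose(theta, theta[0])              # theta_1 = theta_2 = ... = theta_T
```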