2.9 Solving for the parameters for Form 2*

Solving for the parameters when the model is written in Form 2 is not straight-forward. We could re-write the model in Form 1, or another approach is to use Kronecker products and permutation matrices.

To solve for $\alpha$ and $\beta$, we need our parameters in a column matrix like so $\left[ \begin{smallmatrix}\alpha\\\beta\end{smallmatrix} \right]$. We start by moving the intercept matrix, $\mathbf{a}$ into $\mathbf{Z}$. \[\begin{equation} \tag{2.24} \begin{bmatrix}stack.loss_1\\stack.loss_2\\stack.loss_3\\stack.loss_4\end{bmatrix} = \begin{bmatrix} \alpha&\beta&0&0&0\\ \alpha&0&\beta&0&0\\ \alpha&0&0&\beta&0\\ \alpha&0&0&0&\beta \end{bmatrix} \begin{bmatrix}1\\air_1\\air_2\\air_3\\air_4\end{bmatrix} + \begin{bmatrix}e_1\\e_2\\e_3\\e_4\end{bmatrix} = \mathbf{Z}\mathbf{x} + \mathbf{e}. \end{equation}\]

Then we rewrite $\mathbf{Z}\mathbf{x}$ in Equation (2.24) in `vec’ form: if $\mathbf{Z}$ is a $n \times m$ matrix and $\mathbf{x}$ is a matrix with 1 column and $m$ rows, then $\mathbf{Z}\mathbf{x} = (\mathbf{x}^\top \otimes \mathbf{I}_n)\,\text{vec}(\mathbf{Z})$. The symbol $\otimes$ means Kronecker product and just ignore it since you’ll never see it again in our course (or google ‘kronecker product’ if you are curious). The “vec” of a matrix is that matrix rearranged as a single column: \[\begin{equation*} \,\text{vec} \begin{bmatrix} 1&2\\ 3&4 \end{bmatrix} = \begin{bmatrix} 1\\3\\2\\4 \end{bmatrix} \end{equation*}\] Notice how you just take each column one by one and stack them under each other. In R, the vec is

A = matrix(1:6, nrow = 2, byrow = TRUE)
vecA = matrix(A, ncol = 1)

$\mathbf{I}_n$ is a $n \times n$ identity matrix, a diagonal matrix with all 0s on the off-diagonals and all 1s on the diagonal. In R, this is simply diag(n).

To show how we solve for $\alpha$ and $\beta$, let’s use an example with only 3 data points so Equation (2.24) becomes: \[\begin{equation} \tag{2.25} \begin{bmatrix}stack.loss_1\\stack.loss_2\\stack.loss_3\end{bmatrix} = \begin{bmatrix} \alpha&\beta&0&0\\ \alpha&0&\beta&0\\ \alpha&0&0&\beta \end{bmatrix} \begin{bmatrix}1\\air_1\\air_2\\air_3\end{bmatrix} + \begin{bmatrix}e_1\\e_2\\e_3\end{bmatrix} \end{equation}\] Using $\mathbf{Z}\mathbf{x} = (\mathbf{x}^\top \otimes \mathbf{I}_n)\,\text{vec}(\mathbf{Z})$, this means \[\begin{equation} \begin{bmatrix} \alpha&\beta&0&0\\ \alpha&0&\beta&0\\ \alpha&0&0&\beta \end{bmatrix} \begin{bmatrix}1\\air_1\\air_2\\air_3\end{bmatrix} =\big(\begin{bmatrix}1&air_1&air_2& air_3\end{bmatrix} \otimes \begin{bmatrix}1&0&0\\ 0&1&0 \\ 0&0&1 \end{bmatrix} \bigr) \begin{bmatrix} \alpha\\ \alpha\\ \alpha\\ \beta\\ 0\\ 0\\ 0\\ \beta\\ 0\\ 0\\ 0\\ \beta \end{bmatrix} \end{equation}\] We need to rewrite the $\,\text{vec}(\mathbf{Z})$ as a `permutation’ matrix times $\left[ \begin{smallmatrix}\alpha\\\beta\end{smallmatrix} \right]$: \[\begin{equation} \begin{bmatrix} \alpha\\ \alpha\\ \alpha\\ \beta\\ 0\\ 0\\ 0\\ \beta\\ 0\\ 0\\ 0\\ \beta \end{bmatrix} = \begin{bmatrix} 1&0\\ 1&0\\ 1&0\\ 0&1\\ 0&0\\ 0&0\\ 0&0\\ 0&1\\ 0&0\\ 0&0\\ 0&0\\ 0&1\\ \end{bmatrix} \begin{bmatrix} \alpha\\ \beta \end{bmatrix} = \mathbf{P}\mathbf{p} \end{equation}\] where $\mathbf{P}$ is the permutation matrix and $\mathbf{p}=\left[ \begin{smallmatrix}\alpha\\\beta\end{smallmatrix} \right]$. Thus, \[\begin{equation} \tag{2.26} \mathbf{y}=\mathbf{Z}\mathbf{x}+\mathbf{e} = (\mathbf{x}^\top \otimes \mathbf{I}_n)\mathbf{P}\begin{bmatrix}\alpha\\ \beta\end{bmatrix} = \mathbf{M}\mathbf{p} + \mathbf{e} \end{equation}\] where $\mathbf{M}=(\mathbf{x}^\top \otimes \mathbf{I}_n)\mathbf{P}$. We can solve for $\mathbf{p}$, the parameters, using \[(\mathbf{M}^\top\mathbf{M})^{-1}\mathbf{M}^\top\mathbf{y}\] as before.

2.9.0.1 Code to solve for parameters in Form 2

In the homework, you will use the R code in this section to solve for the parameters in Form 2.

#make your y and x matrices
y=matrix(dat$stack.loss, ncol=1)
x=matrix(c(1,dat$Air.Flow),ncol=1)
#make the Z matrix
n=nrow(dat) #number of rows in our data file
k=1
#Z has n rows and 1 col for intercept, and n cols for the n air data points
#a list matrix allows us to combine "characters" and numbers
Z=matrix(list(0),n,k*n+1) 
Z[,1]="alpha"
diag(Z[1:n,1+1:n])="beta" 
#this function creates that permutation matrix for you
P=MARSS:::convert.model.mat(Z)$free[,,1]
M=kronecker(t(x),diag(n))%*%P
solve(t(M)%*%M)%*%t(M)%*%y

             [,1]
alpha -11.6159170
beta    0.6412918

coef(lm(dat$stack.loss ~ dat$Air.Flow))

 (Intercept) dat$Air.Flow 
 -11.6159170    0.6412918

Go through this code line by line at the R command line. Look at Z. It is a list matrix that allows you to combine numbers (the 0s) with character string (names of parameters). Notice that class(Z[1,3])="numeric" while class(Z[1,2])="character". This is important. 0 in R is a number while "0" would be a character (the name of a parameter). Look at the permutation matrix P. Try MARSS:::convert.model.mat(Z)$free and see that it returns a 3D matrix, which is why the [,,1] appears (to get us a 2D matrix). To use more data points, you can redefine dat to say dat=stackloss to use all 21 data points.

Here’s another example. Rewrite the model with multiple intercepts (Equation (2.14) ) as \[\begin{equation} \tag{2.27} \begin{bmatrix}stack.loss_1\\ stack.loss_2\\ stack.loss_3\\ stack.loss_4\end{bmatrix} = \begin{bmatrix} \alpha_n&\beta&0&0&0\\ \alpha_s&0&\beta&0&0\\ \alpha_n&0&0&\beta&0\\ \alpha_s&0&0&0&\beta \end{bmatrix}\begin{bmatrix}1\\air_1\\air_2\\air_3\\air_4\end{bmatrix} + \begin{bmatrix}e_1\\e_2\\e_3\\e_4\end{bmatrix}=\mathbf{Z}\mathbf{x}+\mathbf{a}+\mathbf{e} \end{equation}\] To estimate the parameters, we need to be able to write a list matrix that looks like $\mathbf{Z}$ in Equation (2.27). We can use the same code as above with $\mathbf{Z}$ changed to look like that in Equation (2.27).

y = matrix(dat$stack.loss, ncol = 1)
x = matrix(c(1, dat$Air.Flow), ncol = 1)
n = nrow(dat)
k = 1
# list matrix allows us to combine numbers and character
# strings
Z = matrix(list(0), n, k * n + 1)
Z[seq(1, n, 2), 1] = "alphanorth"
Z[seq(2, n, 2), 1] = "alphasouth"
diag(Z[1:n, 1 + 1:n]) = "beta"
P = MARSS:::convert.model.mat(Z)$free[, , 1]
M = kronecker(t(x), diag(n)) %*% P
solve(t(M) %*% M) %*% t(M) %*% y

                 [,1]
alphanorth -2.0257880
alphasouth -5.5429799
beta        0.5358166

Similarly to estimate the parameters for Equation (2.18), we change the $\beta$’s in our $\mathbf{Z}$ list matrix to have owner designations:

Z = matrix(list(0), n, k * n + 1)
Z[seq(1, n, 2), 1] = "alphanorth"
Z[seq(2, n, 2), 1] = "alphasouth"
diag(Z[1:n, 1 + 1:n]) = rep(c("beta.s", "beta.a"), n)[1:n]
P = MARSS:::convert.model.mat(Z)$free[, , 1]
M = kronecker(t(x), diag(n)) %*% P
solve(t(M) %*% M) %*% t(M) %*% y

            [,1]
alphanorth -38.0
alphasouth  -3.0
beta.s       1.0
beta.a       0.5

The parameters estimates are the same as with the model in Form 1, though $\beta$’s are given in reversed order simply due to the way convert.model.mat() is ordering the columns in Form 2’s $\mathbf{Z}$.