2.9 Solving for the parameters for Form 2*

Solving for the parameters when the model is written in Form 2 is not straight-forward. We could re-write the model in Form 1, or another approach is to use Kronecker products and permutation matrices.

To solve for \(\alpha\) and \(\beta\), we need our parameters in a column matrix like so \(\left[ \begin{smallmatrix}\alpha\\\beta\end{smallmatrix} \right]\). We start by moving the intercept matrix, \(\mathbf{a}\) into \(\mathbf{Z}\). \[\begin{equation} \tag{2.24} \begin{bmatrix}stack.loss_1\\stack.loss_2\\stack.loss_3\\stack.loss_4\end{bmatrix} = \begin{bmatrix} \alpha&\beta&0&0&0\\ \alpha&0&\beta&0&0\\ \alpha&0&0&\beta&0\\ \alpha&0&0&0&\beta \end{bmatrix} \begin{bmatrix}1\\air_1\\air_2\\air_3\\air_4\end{bmatrix} + \begin{bmatrix}e_1\\e_2\\e_3\\e_4\end{bmatrix} = \mathbf{Z}\mathbf{x} + \mathbf{e}. \end{equation}\]

Then we rewrite \(\mathbf{Z}\mathbf{x}\) in Equation (2.24) in `vec’ form: if \(\mathbf{Z}\) is a \(n \times m\) matrix and \(\mathbf{x}\) is a matrix with 1 column and \(m\) rows, then \(\mathbf{Z}\mathbf{x} = (\mathbf{x}^\top \otimes \mathbf{I}_n)\,\text{vec}(\mathbf{Z})\). The symbol \(\otimes\) means Kronecker product and just ignore it since you’ll never see it again in our course (or google ‘kronecker product’ if you are curious). The “vec” of a matrix is that matrix rearranged as a single column: \[\begin{equation*} \,\text{vec} \begin{bmatrix} 1&2\\ 3&4 \end{bmatrix} = \begin{bmatrix} 1\\3\\2\\4 \end{bmatrix} \end{equation*}\] Notice how you just take each column one by one and stack them under each other. In R, the vec is

A = matrix(1:6, nrow = 2, byrow = TRUE)
vecA = matrix(A, ncol = 1)

\(\mathbf{I}_n\) is a \(n \times n\) identity matrix, a diagonal matrix with all 0s on the off-diagonals and all 1s on the diagonal. In R, this is simply diag(n).

To show how we solve for \(\alpha\) and \(\beta\), let’s use an example with only 3 data points so Equation (2.24) becomes: \[\begin{equation} \tag{2.25} \begin{bmatrix}stack.loss_1\\stack.loss_2\\stack.loss_3\end{bmatrix} = \begin{bmatrix} \alpha&\beta&0&0\\ \alpha&0&\beta&0\\ \alpha&0&0&\beta \end{bmatrix} \begin{bmatrix}1\\air_1\\air_2\\air_3\end{bmatrix} + \begin{bmatrix}e_1\\e_2\\e_3\end{bmatrix} \end{equation}\] Using \(\mathbf{Z}\mathbf{x} = (\mathbf{x}^\top \otimes \mathbf{I}_n)\,\text{vec}(\mathbf{Z})\), this means \[\begin{equation} \begin{bmatrix} \alpha&\beta&0&0\\ \alpha&0&\beta&0\\ \alpha&0&0&\beta \end{bmatrix} \begin{bmatrix}1\\air_1\\air_2\\air_3\end{bmatrix} =\big(\begin{bmatrix}1&air_1&air_2& air_3\end{bmatrix} \otimes \begin{bmatrix}1&0&0\\ 0&1&0 \\ 0&0&1 \end{bmatrix} \bigr) \begin{bmatrix} \alpha\\ \alpha\\ \alpha\\ \beta\\ 0\\ 0\\ 0\\ \beta\\ 0\\ 0\\ 0\\ \beta \end{bmatrix} \end{equation}\] We need to rewrite the \(\,\text{vec}(\mathbf{Z})\) as a `permutation’ matrix times \(\left[ \begin{smallmatrix}\alpha\\\beta\end{smallmatrix} \right]\): \[\begin{equation} \begin{bmatrix} \alpha\\ \alpha\\ \alpha\\ \beta\\ 0\\ 0\\ 0\\ \beta\\ 0\\ 0\\ 0\\ \beta \end{bmatrix} = \begin{bmatrix} 1&0\\ 1&0\\ 1&0\\ 0&1\\ 0&0\\ 0&0\\ 0&0\\ 0&1\\ 0&0\\ 0&0\\ 0&0\\ 0&1\\ \end{bmatrix} \begin{bmatrix} \alpha\\ \beta \end{bmatrix} = \mathbf{P}\mathbf{p} \end{equation}\] where \(\mathbf{P}\) is the permutation matrix and \(\mathbf{p}=\left[ \begin{smallmatrix}\alpha\\\beta\end{smallmatrix} \right]\). Thus, \[\begin{equation} \tag{2.26} \mathbf{y}=\mathbf{Z}\mathbf{x}+\mathbf{e} = (\mathbf{x}^\top \otimes \mathbf{I}_n)\mathbf{P}\begin{bmatrix}\alpha\\ \beta\end{bmatrix} = \mathbf{M}\mathbf{p} + \mathbf{e} \end{equation}\] where \(\mathbf{M}=(\mathbf{x}^\top \otimes \mathbf{I}_n)\mathbf{P}\). We can solve for \(\mathbf{p}\), the parameters, using \[(\mathbf{M}^\top\mathbf{M})^{-1}\mathbf{M}^\top\mathbf{y}\] as before.

2.9.0.1 Code to solve for parameters in Form 2

In the homework, you will use the R code in this section to solve for the parameters in Form 2.

#make your y and x matrices
y=matrix(dat$stack.loss, ncol=1)
x=matrix(c(1,dat$Air.Flow),ncol=1)
#make the Z matrix
n=nrow(dat) #number of rows in our data file
k=1
#Z has n rows and 1 col for intercept, and n cols for the n air data points
#a list matrix allows us to combine "characters" and numbers
Z=matrix(list(0),n,k*n+1) 
Z[,1]="alpha"
diag(Z[1:n,1+1:n])="beta" 
#this function creates that permutation matrix for you
P=MARSS:::convert.model.mat(Z)$free[,,1]
M=kronecker(t(x),diag(n))%*%P
solve(t(M)%*%M)%*%t(M)%*%y
             [,1]
alpha -11.6159170
beta    0.6412918
coef(lm(dat$stack.loss ~ dat$Air.Flow))
 (Intercept) dat$Air.Flow 
 -11.6159170    0.6412918 

Go through this code line by line at the R command line. Look at Z. It is a list matrix that allows you to combine numbers (the 0s) with character string (names of parameters). Notice that class(Z[1,3])="numeric" while class(Z[1,2])="character". This is important. 0 in R is a number while "0" would be a character (the name of a parameter). Look at the permutation matrix P. Try MARSS:::convert.model.mat(Z)$free and see that it returns a 3D matrix, which is why the [,,1] appears (to get us a 2D matrix). To use more data points, you can redefine dat to say dat=stackloss to use all 21 data points.

Here’s another example. Rewrite the model with multiple intercepts (Equation (2.14) ) as \[\begin{equation} \tag{2.27} \begin{bmatrix}stack.loss_1\\ stack.loss_2\\ stack.loss_3\\ stack.loss_4\end{bmatrix} = \begin{bmatrix} \alpha_n&\beta&0&0&0\\ \alpha_s&0&\beta&0&0\\ \alpha_n&0&0&\beta&0\\ \alpha_s&0&0&0&\beta \end{bmatrix}\begin{bmatrix}1\\air_1\\air_2\\air_3\\air_4\end{bmatrix} + \begin{bmatrix}e_1\\e_2\\e_3\\e_4\end{bmatrix}=\mathbf{Z}\mathbf{x}+\mathbf{a}+\mathbf{e} \end{equation}\] To estimate the parameters, we need to be able to write a list matrix that looks like \(\mathbf{Z}\) in Equation (2.27). We can use the same code as above with \(\mathbf{Z}\) changed to look like that in Equation (2.27).

y = matrix(dat$stack.loss, ncol = 1)
x = matrix(c(1, dat$Air.Flow), ncol = 1)
n = nrow(dat)
k = 1
# list matrix allows us to combine numbers and character
# strings
Z = matrix(list(0), n, k * n + 1)
Z[seq(1, n, 2), 1] = "alphanorth"
Z[seq(2, n, 2), 1] = "alphasouth"
diag(Z[1:n, 1 + 1:n]) = "beta"
P = MARSS:::convert.model.mat(Z)$free[, , 1]
M = kronecker(t(x), diag(n)) %*% P
solve(t(M) %*% M) %*% t(M) %*% y
                 [,1]
alphanorth -2.0257880
alphasouth -5.5429799
beta        0.5358166

Similarly to estimate the parameters for Equation (2.18), we change the \(\beta\)’s in our \(\mathbf{Z}\) list matrix to have owner designations:

Z = matrix(list(0), n, k * n + 1)
Z[seq(1, n, 2), 1] = "alphanorth"
Z[seq(2, n, 2), 1] = "alphasouth"
diag(Z[1:n, 1 + 1:n]) = rep(c("beta.s", "beta.a"), n)[1:n]
P = MARSS:::convert.model.mat(Z)$free[, , 1]
M = kronecker(t(x), diag(n)) %*% P
solve(t(M) %*% M) %*% t(M) %*% y
            [,1]
alphanorth -38.0
alphasouth  -3.0
beta.s       1.0
beta.a       0.5

The parameters estimates are the same as with the model in Form 1, though \(\beta\)’s are given in reversed order simply due to the way convert.model.mat() is ordering the columns in Form 2’s \(\mathbf{Z}\).