## 8.8 Homework data and discussion

For these problems, use the following code to load in 1980-1994 phytoplankton data, covariates, and z-score all the data. Run the code below and use `dat`

and `covars`

directly in your code.

```
library(MARSS)
<- c("Cryptomonas", "Diatoms", "Greens", "Unicells", "Other.algae",
spp "Daphnia")
<- lakeWAplanktonTrans[, "Year"] %in% 1980:1994
yrs <- t(lakeWAplanktonTrans[yrs, spp])
dat # z-score the data
<- apply(dat, 1, mean, na.rm = TRUE)
avg <- sqrt(apply(dat, 1, var, na.rm = TRUE))
sd <- (dat - avg)/sd
dat rownames(dat) = spp
# always check that the mean and variance are 1 after
# z-scoring
apply(dat, 1, mean, na.rm = TRUE) #this should be 0
apply(dat, 1, var, na.rm = TRUE) #this should be 1
```

For the covariates, you’ll use temperature and TP.

```
<- rbind(Temp = lakeWAplanktonTrans[yrs, "Temp"], TP = lakeWAplanktonTrans[yrs,
covars "TP"])
<- apply(covars, 1, mean)
avg <- sqrt(apply(covars, 1, var, na.rm = TRUE))
sd <- (covars - avg)/sd
covars rownames(covars) <- c("Temp", "TP")
# always check that the mean and variance are 1 after
# z-scoring
apply(covars, 1, mean, na.rm = TRUE) #this should be 0
apply(covars, 1, var, na.rm = TRUE) #this should be 1
```

Here are some guidelines to help you answer the questions:

- Use a MARSS model that allows for both observation and process error.
- Assume that the observation errors are independent and identically distributed with known variance of 0.10.
- Assume that the process errors are independent from one another, but the variances differ by taxon.
- Assume that each group is an observation of its own process. This means
`Z="identity"`

. - Use
`B="diagonal and unequal"`

. This implies that each of the taxa are operating under varying degrees of density-dependence, and they are not allowed to interact. - All the data have been de-meaned and \(\mathbf{Z}\) is identity, therefore use
`U="zero"`

and`A="zero"`

. Make sure to check that the means of the data are 0 and the variance is 1. - Use
`tinitx=1`

. It makes \(\mathbf{B}\) estimation more stable. It goes in your model list. - Include a plot of residuals versus time and acf of residuals for each question. You only need to show these for the top (best) model if the question involves comparing multiple models.
- Use AICc to compare models.
- Some of the models may not converge, however use for the purpose of the homework, use the unconverged models. Thus use the output from
`MARSS()`

without any additional arguments. If you want, you can try using`control=list(maxit=1000)`

to increase the number of iterations. Or you can try`method="BFGS"`

in your`MARSS()`

call. This will use the BFGS optimization method, however it may throw an error for these data.