```{r uss-setup, include=FALSE, purl=FALSE}
knitr::opts_knit$set(unnamed.chunk.label = "uss-")
```
# Univariate state-space models {#chap-univariate-state-space}
\chaptermark{Univariate state-space models}
This chapter will show you how to fit some basic univariate state-space models using the **MARSS** package, the `StructTS()` function, and JAGS code. This chapter will also introduce you to the idea of writing AR(1) models in state-space form.
A script with all the R code in the chapter can be downloaded [here](./Rcode/fitting-univariate-state-space.R). The Rmd for this chapter can be downloaded [here](./Rmds/fitting-univariate-state-space.Rmd).
### Data and packages {-}
All the data used in the chapter are in the **MARSS** package. The other required packages are **stats** (normally loaded by default when starting R), **datasets** and **forecast**. Install the packages, if needed, and load:
```{r uss-loadpackages, results='hide', message=FALSE, warning=FALSE}
library(stats)
library(MARSS)
library(forecast)
library(datasets)
```
To run the JAGS code example (optional), you will also need [JAGS](http://mcmc-jags.sourceforge.net/) installed and the **R2jags**, **rjags** and **coda** R packages. To run the Stan code example (optional), you will need the **rstan** package.
## Fitting a state-space model with MARSS {#sec-uss-fitting-a-state-space-model-with-marss}
The **MARSS** package fits multivariate auto-regressive models of this form:
\begin{equation}
\begin{gathered}
\mathbf{x}_t = \mathbf{B} \mathbf{x}_{t-1}+\mathbf{u}+\mathbf{w}_t \text{ where } \mathbf{w}_t \sim \,\text{N}(0,\mathbf{Q}) \\
\mathbf{y}_t = \mathbf{Z}\mathbf{x}_t+\mathbf{a}+\mathbf{v}_t \text{ where } \mathbf{v}_t \sim \,\text{N}(0,\mathbf{R}) \\
\mathbf{x}_0 = \boldsymbol{\mu}
\end{gathered}
(\#eq:uss-marss)
\end{equation}
To fit your time series model with the **MARSS** package, you need to put your model into the form above. The $\mathbf{B}$, $\mathbf{Z}$, $\mathbf{u}$, $\mathbf{a}$, $\mathbf{Q}$, $\mathbf{R}$ and $\boldsymbol{\mu}$ are parameters that are (potentially) estimated. The $\mathbf{y}$ are your data. The $\mathbf{x}$ are the hidden state(s). Everything in bold is a matrix; if it is a small bolded letter, it is a matrix with 1 column.
*Important: In the state-space model equation, $\mathbf{y}$ is always the data and $\mathbf{x}$ is a hidden random walk estimated from the data.*
A basic ``MARSS()`` call looks like
``fit=MARSS(y, model=list(...))``.
The argument ```model``` tells the function what form the parameters take. The list has the elements with the names: ```B```, ```U```, ```Q```, etc. The names correspond to the parameters with the same names in Equation \@ref(eq:uss-marss) except that $\boldsymbol{\mu}$ is called ``x0``. ``tinitx`` indicates whether the initial $\mathbf{x}$ is specified at $t=0$ so $\mathbf{x}_0$ or $t=1$ so $\mathbf{x}_1$.
Here's an example. Let's say we want to fit a univariate AR(1) model observed with error. Here is that model:
\begin{equation}
\begin{gathered}
x_t = b x_{t-1} + w_t \text{ where } \mathbf{w}_t \sim \,\text{N}(0,q) \\
y_t = x_t+v_t \text{ where } v_t \sim \,\text{N}(0,r) \\
x_0 = \mu
\end{gathered}
(\#eq:uss-ar1witherror)
\end{equation}
To fit this with ``MARSS()``, we need to write Equation \@ref(eq:uss-ar1witherror) as Equation \@ref(eq:uss-marss). Equation \@ref(eq:uss-marss) is in MATRIX form. In the model list, the parameters must be written EXACTLY like they would be written for Equation \@ref(eq:uss-marss). For example, ``1`` is the number 1 in R. It is not a matrix:
```{r uss-bad.1}
class(1)
```
If you need a 1 (or 0) in your model, you need to pass in the parameter as a $1 \times 1$ matrix: ``matrix(1)``.
With that mind, our model list for Equation \@ref(eq:uss-ar1witherror) is:
```{r uss-mod.list}
mod.list <- list(
B = matrix(1), U = matrix(0), Q = matrix("q"),
Z = matrix(1), A = matrix(0), R = matrix("r"),
x0 = matrix("mu"), tinitx = 0
)
```
We can simulate some AR(1) plus error data like so
```{r uss-set-seed-invisible, echo=FALSE}
set.seed(123)
```
```{r uss-ar1-w-error}
q <- 0.1
r <- 0.1
n <- 100
y <- cumsum(rnorm(n, 0, sqrt(q))) + rnorm(n, 0, sqrt(r))
```
And then fit with ``MARSS()`` using ``mod.list`` above:
```{r uss-ar1-fit}
fit <- MARSS(y, model = mod.list)
```
If we wanted to fix $q=0.1$, then $\mathbf{Q}=[0.1]$ (a $1 \times 1$ matrix with 0.1). We just change ``mod.list$Q`` and re-fit:
```{r uss-mod.list.1, results='hide'}
mod.list$Q <- matrix(0.1)
fit <- MARSS(y, model = mod.list)
```
## Examples using the Nile river data {#sec-uss-examples-using-the-nile-river-data}
We will use the data from the Nile River (Figure \@ref(fig:uss-plotdata)). We will fit different flow models to the data and compare the models with AIC.
```{r uss-data}
library(datasets)
dat <- Nile
```
(ref:uss-plotdata) The Nile River flow volume 1871 to 1970 (```Nile``` dataset in R).
```{r uss-plotdata, echo=FALSE, fig=TRUE, fig.cap='(ref:uss-plotdata)'}
plot(Nile, ylab = "Flow volume", xlab = "Year")
```
### Flat level model {#sec-uss-flat-level-model}
We will start by modeling these data as a simple average river flow with variability around some level $\mu$.
\begin{equation}
y_t = \mu + v_t \text{ where } v_t \sim \,\text{N}(0,r)
(\#eq:uss-simple-model)
\end{equation}
where $y_t$ is the river flow volume at year $t$.
We can write this model as a univariate state-space model as follows. We use $x_t$ to model the average flow level. $y_t$ is just an observation of this flat $x_t$. Work through $x_1$, $x_2$, $\dots$ starting from $x_0$ to convince yourself that $x_t$ will always equal $\mu$.
\begin{equation}
\begin{gathered}
x_t = 1 \times x_{t-1}+ 0 + w_t \text{ where } w_t \sim \,\text{N}(0,0) \\
y_t = 1 \times x_t + 0 + v_t \text{ where } v_t \sim \,\text{N}(0,r) \\
x_0 = \mu
\end{gathered}
(\#eq:uss-marss-model-0)
\end{equation}
The model is specified as a list as follows:
```{r uss-mod.nile.0, eval=TRUE}
mod.nile.0 <- list(
B = matrix(1), U = matrix(0), Q = matrix(0),
Z = matrix(1), A = matrix(0), R = matrix("r"),
x0 = matrix("mu"), tinitx = 0
)
```
We then fit the model:
```{r uss-fit.data.0, eval=TRUE, results='hide'}
kem.0 <- MARSS(dat, model = mod.nile.0)
```
Output not shown, but here are the estimates and AICc.
```{r uss-coef-mod0}
c(coef(kem.0, type = "vector"), LL = kem.0$logLik, AICc = kem.0$AICc)
```
### Linear trend in flow model {#sec-uss-linear-trend-in-flow-model}
Figure \@ref(fig:uss-plotfit) shows the fit for the flat average river flow model. Looking at the data, we might expect that a declining average river flow would be better. In MARSS form, that model would be:
\begin{equation}
\begin{gathered}
x_t = 1 \times x_{t-1}+ u + w_t \text{ where } w_t \sim \,\text{N}(0,0) \\
y_t = 1 \times x_t + 0 + v_t \text{ where } v_t \sim \,\text{N}(0,r) \\
x_0 = \mu
\end{gathered}
(\#eq:uss-marss-model-1)
\end{equation}
where $u$ is now the average per-year decline in river flow volume. The model is specified as follows:
```{r uss-mod.nile.1, eval=TRUE}
mod.nile.1 <- list(
B = matrix(1), U = matrix("u"), Q = matrix(0),
Z = matrix(1), A = matrix(0), R = matrix("r"),
x0 = matrix("mu"), tinitx = 0
)
```
We then fit the model:
```{r uss-fit.data.1, eval=TRUE, results='hide'}
kem.1 <- MARSS(dat, model = mod.nile.1)
```
Here are the estimates, log-likelihood and AICc:
```{r uss-fit.data.1.coef, eval=TRUE}
c(coef(kem.1, type = "vector"), LL = kem.1$logLik, AICc = kem.1$AICc)
```
Figure \@ref(fig:uss-plotfit) shows the fits for the two models with deterministic models (flat and declining) for mean river flow along with their AICc values (smaller AICc is better). The AICc for the model with a declining river flow is lower by over 20 (which is a lot).
### Stochastic level model {#sec-uss-stochastic-level-model}
Looking at the flow levels, we might suspect that a model that allows the average flow to change would model the data better and we might suspect that there have been sudden, and anomalous, changes in the river flow level.
We will now model the average river flow at year $t$ as a random walk, specifically an autoregressive process which means that average river flow is year $t$ is a function of average river flow in year $t-1$.
\begin{equation}
\begin{gathered}
x_t = x_{t-1}+w_t \text{ where } w_t \sim \,\text{N}(0,q) \\
y_t = x_t+v_t \text{ where } v_t \sim \,\text{N}(0,r) \\
x_0 = \mu
\end{gathered}
(\#eq:uss-random-walk-w-noise)
\end{equation}
As before, $y_t$ is the river flow volume at year $t$. $x_t$ is the mean level.
The model is specified as:
```{r uss-mod.nile.2, eval=TRUE}
mod.nile.2 <- list(
B = matrix(1), U = matrix(0), Q = matrix("q"),
Z = matrix(1), A = matrix(0), R = matrix("r"),
x0 = matrix("mu"), tinitx = 0
)
```
We could also use the text shortcuts to specify the model. Because $\mathbf{R}$ and $\mathbf{Q}$ are $1 \times 1$ matrices, "unconstrained", "diagonal and unequal", "diagonal and equal" and "equalvarcov" will all lead to a $1 \times 1$ matrix with one estimated element. For $\mathbf{a}$ and $\mathbf{u}$, the following shortcut could be used:
```{r uss-mod.nile.not.used, eval=FALSE}
A <- "zero"
U <- "zero"
```
Because $\mathbf{x}_0$ is $1 \times 1$, it could be specified as "unequal", "equal" or "unconstrained".
```{r uss-fit.data.2, eval=TRUE, results='hide'}
kem.2 <- MARSS(dat, model = mod.nile.2)
```
Here are the estimates, log-likelihood and AICc:
```{r uss-fit.data.2.coef, eval=TRUE}
c(coef(kem.2, type = "vector"), LL = kem.2$logLik, AICc = kem.2$AICc)
```
### Stochastic level model with drift {#sec-uss-stochastic-level-model-with-drift}
We can add a drift to term to our random walk; the $u$ in the process model ($x$) is the drift term. This causes the random walk to tend to trend up or down.
\begin{equation}
\begin{gathered}
x_t = x_{t-1}+u+w_t \text{ where } w_t \sim \,\text{N}(0,q) \\
y_t = x_t+v_t \text{ where } v_t \sim \,\text{N}(0,r) \\
x_0 = \mu
\end{gathered}
(\#eq:uss-random-walk-w-noise-w-drift)
\end{equation}
The model is then specified by changing ``U`` to indicate that a $u$ is estimated:
```{r uss-mod.nile.3, eval=TRUE}
mod.nile.3 <- list(
B = matrix(1), U = matrix("u"), Q = matrix("q"),
Z = matrix(1), A = matrix(0), R = matrix("r"),
x0 = matrix("mu"), tinitx = 0
)
```
```{r uss-fit.data.3, eval=TRUE, results='hide'}
kem.3 <- MARSS(dat, model = mod.nile.3)
```
Here are the estimates, log-likelihood and AICc:
```{r uss-fit.data.3.coef, eval=TRUE}
c(coef(kem.3, type = "vector"), LL = kem.3$logLik, AICc = kem.3$AICc)
```
Figure \@ref(fig:uss-plotfit) shows all the models along with their AICc values.
## The StructTS function {#sec-uss-the-structts-function}
The ``StructTS`` function in the **stats** package in R will also fit the stochastic level model:
```{r uss-fit.data.2.structTS, eval=TRUE}
fit.sts <- StructTS(dat, type = "level")
fit.sts
```
The estimates from ``StructTS()`` will be different (though similar) from ``MARSS()`` because ``StructTS()`` uses $x_1 = y_1$, that is the hidden state at $t=1$ is fixed to be the data at $t=1$. That is fine if you have a long data set, but would be disastrous for the short data sets typical in fisheries and ecology.
``StructTS()`` is much, much faster for long time series. The example in ``?StructTS`` is pretty much instantaneous with ``StructTS()`` but takes minutes with the EM algorithm that is the default in ``MARSS()``. With the BFGS algorithm, it is much closer to ``StructTS()``:
```{r uss-fit.data.2.comp, eval=FALSE}
trees <- window(treering, start = 0)
fitts <- StructTS(trees, type = "level")
fitem <- MARSS(trees, mod.nile.2)
fitbf <- MARSS(trees, mod.nile.2, method = "BFGS")
```
Note that ``mod.nile.2`` specifies a univariate stochastic level model so we can use it just fine with other univariate data sets.
In addition, ``fitted(fit.sts)`` where ``fit.sts`` is a fit from ``StructTS()`` is very different than ``fit.marss$states`` from ``MARSS()``.
```{r uss-fit.data.2.fitted.structTS, eval=TRUE}
t <- 10
fitted(fit.sts)[t]
```
is the expected value of $y_{t+1}$ (in this case $y_{11}$ since we set $t=10$) given the data up to $y_t$ (in this case, up to $y_{10}$). It is called the one-step ahead prediction.
We are not going to use the one-step ahead predictions unless we are forecasting or doing cross-validation.
Typically, when we analyze fisheries and ecological data, we want to know the estimate of the state, the $x_t$, given ALL the data (sometimes we might want the estimate of the $y_t$ process given all the data). For example, we might need an estimate of the population size in year 1990 given a time series of counts from 1930 to 2015. We don't want to use only the data up to 1989; we want to use all the information. ``fit.marss$states`` from ``MARSS()`` is the expected value of $x_t$ given all the data. In the MARSS package, this is denoted "xtT".
```{r uss-fit-data-2-xtT, eval=TRUE, results='hide'}
fitted(kem.2, type="xtT") %>% subset(t==11)
```
If you needed the one-step predictions from ``MARSS()``, you can get that using "xtt1".
```{r uss-fit-data-2-xtt1, eval=TRUE, results='hide'}
fitted(kem.2, type="xtt1") %>% subset(t==11)
```
This is the expected value of $x_t$ conditioned on $y_1$ to $y_{t-1}$.
(ref:uss-plotfit) The Nile River flow volume with the model estimated flow rates (solid lines). The bottom model is a stochastic level model, meaning there isn't one level line. Rather the level line is a distribution that has a mean and standard deviation. The solid state line in the bottom plots is the mean of the stochastic level and the 2 standard deviations are shown. The other two models are deterministic level models so the state is not stochastic and does not have a standard deviation.
```{r uss-plotfit, eval=TRUE, echo=FALSE, fig=TRUE, fig.width=5, fig.height=6, fig.cap='(ref:uss-plotfit)'}
library(Hmisc)
par(mfrow = c(4, 1), mar = c(4, 4, 0.5, 0.5), oma = c(1, 1, 1, 1))
x <- seq(tsp(Nile)[1], tsp(Nile)[2], tsp(Nile)[3])
# model 0
plot(Nile, ylab = "Flow volume", xlab = "", xaxp = c(1870, 1970, 10), bty = "L")
minor.tick(nx = 10, ny = 0, tick.ratio = .3)
kem <- kem.0 # model 0 results
lines(x, kem$states[1, ], col = "red", lwd = 2)
legend("topright", paste("model 0, AICc=", format(kem.0$AICc, digits = 1)), bty = "n")
# model 1
plot(Nile, ylab = "Flow volume", xlab = "", xaxp = c(1870, 1970, 10), bty = "n")
minor.tick(nx = 10, ny = 0, tick.ratio = .3)
kem <- kem.1 # model 1 results
lines(x, kem$states[1, ], col = "red", lwd = 2)
legend("topright", paste("model 1, AICc=", format(kem.1$AICc, digits = 1)), bty = "n")
# model 2
plot(Nile, ylab = "Flow volume", xlab = "", xaxp = c(1870, 1970, 10), bty = "L")
minor.tick(nx = 10, ny = 0, tick.ratio = .3)
kem <- kem.2 # model 2 results
lines(x, kem$states[1, ], col = "red", lwd = 2)
lines(1871:1970, kem$states[1, ] - 1.96 * kem$states.se[1, ], col = "red", lty = 2)
lines(1871:1970, kem$states[1, ] + 1.96 * kem$states.se[1, ], col = "red", lty = 2)
legend("topright", paste("model 2, AICc=", format(kem$AICc, digits = 1)), bty = "n")
# model 3
plot(Nile, ylab = "Flow volume", xlab = "", xaxp = c(1870, 1970, 10), bty = "L")
minor.tick(nx = 10, ny = 0, tick.ratio = .3)
kem <- kem.3 # model 2 results
lines(x, kem$states[1, ], col = "red", lwd = 2)
lines(1871:1970, kem$states[1, ] - 1.96 * kem$states.se[1, ], col = "red", lty = 2)
lines(1871:1970, kem$states[1, ] + 1.96 * kem$states.se[1, ], col = "red", lty = 2)
legend("topright", paste("model 3, AICc=", format(kem$AICc, digits = 1)), bty = "n")
```
## Comparing models with AIC and model weights {#sec-uss-comparing-models-with-aic-and-model-weights}
To get the AIC or AICc values for a model fit from a MARSS fit, use ``fit$AIC`` or ``fit$AICc``. The log-likelihood is in ``fit$logLik`` and the number of estimated parameters in ``fit$num.params``. For fits from other functions, try ``AIC(fit)`` or look at the function documentation.
Let's put the AICc values 3 Nile models together:
```{r uss-nile-aics}
nile.aic <- c(kem.0$AICc, kem.1$AICc, kem.2$AICc, kem.3$AICc)
```
Then we calculate the AICc minus the minus AICc in our model set and compute the model weights. $\Delta\text{AIC}$ is the AIC values minus the minimum AIC value in your model set.
```{r uss-nile-delaic}
delAIC <- nile.aic - min(nile.aic)
relLik <- exp(-0.5 * delAIC)
aicweight <- relLik / sum(relLik)
```
And this leads to our model weights table:
```{r uss-aic-table}
aic.table <- data.frame(
AICc = nile.aic,
delAIC = delAIC,
relLik = relLik,
weight = aicweight
)
rownames(aic.table) <- c("flat level", "linear trend", "stoc level", "stoc level w drift")
```
Here the table is printed using ``round()`` to limit the number of digits shown.
```{r uss-aic-table-round}
round(aic.table, digits = 3)
```
One thing to keep in mind when comparing models within a set of models is that the model set needs to include at least one model that can fit the data reasonably well. `Reasonably well' means the model can put a fitted line through the data. Can't all models do that? Definitely, not. For example, the flat-level model cannot put a fitted line through the Nile River data. It is simply impossible. The straight trend model also cannot put a fitted line through the flow data. So if our model set only included flat-level and straight trend, then we might have said that the straight trend model is `best' even though it is just the better of two bad models.
## Basic diagnostics {#sec-uss-basic-diagnostics}
The first diagnostic that you do with any statistical analysis is check that your residuals correspond to your assumed error structure. For a basic residuals diagnostic check for a state-space model, we want to use the 'innovations residuals'. This is the observed data at time $t$ minus the value predicted using the model plus the data up to time $t-1$. Innovations residuals should be Gaussian and temporally independent (no autocorrelation).
`residuals(fit)` will return the innovations residuals as a data frame.
```{r uss-innov-resids0}
head(residuals(kem.0))
```
The innovations residuals should also not be autocorrelated in time. We can check the autocorrelation with the function ``acf()``. The autocorrelation plots are shown in Figure \@ref(fig:uss-acfs). The stochastic level model looks the best in that its innovations residuals are fine.
```{r uss-acf0, fig.show='hide'}
par(mfrow = c(2, 2), mar=c(2,2,4,2))
resids <- residuals(kem.0)
acf(resids$.resids, main = "flat level v(t)", na.action = na.pass)
resids <- residuals(kem.1)
acf(resids$.resids, main = "linear trend v(t)", na.action = na.pass)
resids <- residuals(kem.2)
acf(resids$.resids, main = "stoc level v(t)", na.action = na.pass)
```
(ref:uss-acfs) The model innovations residual acfs for the 3 models.
```{r uss-acfs, echo=FALSE, fig=TRUE, fig.cap='(ref:uss-acfs)'}
par(mfrow = c(2, 2), mar=c(2,2,4,2))
resids <- residuals(kem.0)
acf(resids$.resids, main = "flat level v(t)", na.action = na.pass)
resids <- residuals(kem.1)
acf(resids$.resids, main = "linear trend v(t)", na.action = na.pass)
resids <- residuals(kem.2)
acf(resids$.resids, main = "stoc level v(t)", na.action = na.pass)
```
### Outlier diagnostics {#sec-uss-more-diagnostics}
Another type of residual used in state-space models is smoothation residuals. This residual at time $t$ is conditioned on all the data. Smoothation residuals are used for outlier detection and can help detect anomalous shocks in the data. Smoothation residuals can be autocorrelated but should fluctuate around 0. The should not have a trend. Looking at your smoothation residuals can help you determine if there are fundamental problems with the structure of your model.
We can get the smoothation residuals by passing in `type="tT"` to the `residuals()` call. Figure \@ref(fig:uss-resids) shows the model and state smoothation residuals. The flat level and linear trend models do not have a stochastic state so their state residuals are all 0.
(ref:uss-resids) The model and state smoothations residuals for the first 3 models.
```{r uss-resids, echo=FALSE, fig=TRUE, fig.cap='(ref:uss-resids)'}
par(mfrow = c(3, 2))
resids <- MARSSresiduals(kem.0)
plot(resids$model.residuals[1, ],
ylab = "model residual", xlab = "", main = "flat level"
)
abline(h = 0)
plot(resids$state.residuals[1, ],
ylab = "state residual", xlab = "", main = "flat level"
)
abline(h = 0)
resids <- MARSSresiduals(kem.1)
plot(resids$model.residuals[1, ], ylab = "model residual", xlab = "", main = "linear trend")
abline(h = 0)
plot(resids$state.residuals[1, ], ylab = "state residual", xlab = "", main = "linear trend", ylim = c(-1, 1))
abline(h = 0)
resids <- MARSSresiduals(kem.2)
plot(resids$model.residuals[1, ], ylab = "model residual", xlab = "", main = "stoc level")
abline(h = 0)
plot(resids$state.residuals[1, ], ylab = "state residual", xlab = "", main = "stoc level")
abline(h = 0)
```
The flat and linear trend models show problems. The model smoothation residuals are positive early and then are slightly negative. They should fluctuate around 0 the whole time series. The stochastic level model looks fine. The residuals fluctuate around 0.
The smoothation residuals can also help us look for outliers in the data or outlier shifts in the level (sudden anolmalous changes). If we standardize by the variance of the residuals (divide by the square root of the variance), then the standardized residuals should have an approximate standard normal distribution.
We will look just at the stochastic level model since the other models do not have a stochastic state ($x$). Figure \@ref(fig:uss-resids2) (right panel) shows us that there was a sudden level change around 1902. The Aswan Low Dam was completed in 1902 and changed the mean flow. The Aswan High Dam was completed in 1970 and also affected the flow though not as much. You can see these perturbations in Figure \@ref(fig:uss-plotdata).
(ref:uss-resids2) Standardized smoothation residuals.
```{r uss-resids2, echo=FALSE, fig=TRUE, fig.cap='(ref:uss-resids)'}
par(mfrow = c(1, 2))
resids <- residuals(kem.2, type="tT", standardization="marginal")
mresids <- subset(resids, name == "model")
plot(mresids$t, mresids$.std.resids,
ylab = "model std smoothationl", xlab = "", main = "data outliers"
)
abline(h = 0)
abline(h = c(2,-2), lty=2)
sresids <- subset(resids, name == "state")
plot(sresids$t, sresids$.std.resids, type="l",
ylab = "state std smoothation", xlab = "", main = "sudden level changes"
)
abline(h = 0)
abline(h = c(2,-2), lty=2)
```
\clearpage
## Fitting with JAGS {#sec-uss-fitting-with-jags}
Here we show how to fit the stochastic level model, model 3 Equation \@ref(eq:uss-random-walk-w-noise-w-drift), with JAGS. This is a model where the level is a random walk with drift and the Nile River flow is that level plus error.
```{r uss-jags-data}
library(datasets)
y <- Nile
```
This section requires that you have JAGS installed and the **R2jags**, **rjags** and **coda** R packages loaded.
```{r uss-loadpackages-jags, results='hide', message=FALSE, warning=FALSE}
library(R2jags)
library(rjags)
library(coda)
```
The first step is to write the model for JAGS to a file (filename in ``model.loc``):
```{r uss-jags-model}
model.loc <- "ss_model.txt"
jagsscript <- cat("
model {
# priors on parameters
mu ~ dnorm(Y1, 1/(Y1*100)); # normal mean = 0, sd = 1/sqrt(0.01)
tau.q ~ dgamma(0.001,0.001); # This is inverse gamma
sd.q <- 1/sqrt(tau.q); # sd is treated as derived parameter
tau.r ~ dgamma(0.001,0.001); # This is inverse gamma
sd.r <- 1/sqrt(tau.r); # sd is treated as derived parameter
u ~ dnorm(0, 0.01);
# Because init X is specified at t=0
X0 <- mu
X[1] ~ dnorm(X0+u,tau.q);
Y[1] ~ dnorm(X[1], tau.r);
for(i in 2:TT) {
predX[i] <- X[i-1]+u;
X[i] ~ dnorm(predX[i],tau.q); # Process variation
Y[i] ~ dnorm(X[i], tau.r); # Observation variation
}
}
", file = model.loc)
```
Next we specify the data (and any other input) that the JAGS code needs. In this case, we need to pass in ``dat`` and the number of time steps since that is used in the for loop. We also specify the parameters that we want to monitor. We need to specify at least one, but we will monitor all of them so we can plot them after fitting. Note, that the hidden state is a parameter in the Bayesian context (but not in the maximum likelihood context).
```{r uss-jags-set}
jags.data <- list("Y" = y, "TT" = length(y), Y1 = y[1])
jags.params <- c("sd.q", "sd.r", "X", "mu", "u")
```
Now we can fit the model:
```{r uss-jags-fit, results='hide', message=FALSE, cache=TRUE}
mod_ss <- jags(jags.data,
parameters.to.save = jags.params,
model.file = model.loc, n.chains = 3,
n.burnin = 5000, n.thin = 1, n.iter = 10000, DIC = TRUE
)
```
We can then show the posteriors along with the MLEs from MARSS on top (Figure \@ref(fig:uss-fig-posteriors) ) using the code below.
(ref:uss-fig-posteriors) The posteriors for model 3 with MLE estimates from ``MARSS()`` shown in red.
```{r uss-fig-posteriors, fig=TRUE, fig.cap='(ref:uss-fig-posteriors)', message=FALSE, cache=TRUE}
attach.jags(mod_ss)
par(mfrow = c(2, 2))
hist(mu)
abline(v = coef(kem.3)$x0, col = "red")
hist(u)
abline(v = coef(kem.3)$U, col = "red")
hist(log(sd.q^2))
abline(v = log(coef(kem.3)$Q), col = "red")
hist(log(sd.r^2))
abline(v = log(coef(kem.3)$R), col = "red")
detach.jags()
```
To plot the estimated states ( Figure \@ref(fig:uss-fig-bayesian-states) ), we write a helper function:
```{r uss-jags-plot-states-fun, message=FALSE, warning=FALSE}
plotModelOutput <- function(jagsmodel, Y) {
attach.jags(jagsmodel)
x <- seq(1, length(Y))
XPred <- cbind(apply(X, 2, quantile, 0.025), apply(X, 2, mean), apply(X, 2, quantile, 0.975))
ylims <- c(min(c(Y, XPred), na.rm = TRUE), max(c(Y, XPred), na.rm = TRUE))
plot(Y, col = "white", ylim = ylims, xlab = "", ylab = "State predictions")
polygon(c(x, rev(x)), c(XPred[, 1], rev(XPred[, 3])), col = "grey70", border = NA)
lines(XPred[, 2])
points(Y)
}
```
(ref:uss-fig-bayesian-states) The estimated states from the Bayesian fit along with 95\% credible intervals (black and grey) with the MLE states and 95\% condidence intervals in red.
```{r uss-fig-bayesian-states, echo=TRUE, fig=TRUE, fig.cap='(ref:uss-fig-bayesian-states)', cache=TRUE}
plotModelOutput(mod_ss, y)
lines(kem.3$states[1, ], col = "red")
lines(1.96 * kem.3$states.se[1, ] + kem.3$states[1, ], col = "red", lty = 2)
lines(-1.96 * kem.3$states.se[1, ] + kem.3$states[1, ], col = "red", lty = 2)
title("State estimate and data from\nJAGS (black) versus MARSS (red)")
```
```{r uss-jags-reset, include=FALSE, purl=FALSE}
file.remove("ss_model.txt")
```
## Fitting with Stan {#sec-uss-fitting-with-stan}
Let's fit the same model with Stan using the **rstan** package. If you have not already, you will need to install the **rstan** package. This package depends on a number of other packages which should install automatically when you install **rstan**.
```{r uss-stan-setup, message=FALSE}
library(datasets)
library(rstan)
y <- as.vector(Nile)
```
First we write the model. We could write this to a file (recommended), but for this example, we write as a character object. Though the syntax is different from the JAGS code, it has many similarities. Note, unlike the JAGS, the Stan does **not allow** any NAs in your data. Thus we have to specify the location of the NAs in our data. The Nile data does not have NAs, but we want to write the code so it would work even if there were NAs.
```{r uss-stan-ar-model}
scode <- "
data {
int TT;
int n_pos; // number of non-NA values
int indx_pos[n_pos]; // index of the non-NA values
vector[n_pos] y;
}
parameters {
real x0;
real u;
vector[TT] pro_dev;
real sd_q;
real sd_r;
}
transformed parameters {
vector[TT] x;
x[1] = x0 + u + pro_dev[1];
for(i in 2:TT) {
x[i] = x[i-1] + u + pro_dev[i];
}
}
model {
x0 ~ normal(y[1],10);
u ~ normal(0,2);
sd_q ~ cauchy(0,5);
sd_r ~ cauchy(0,5);
pro_dev ~ normal(0, sd_q);
for(i in 1:n_pos){
y[i] ~ normal(x[indx_pos[i]], sd_r);
}
}
generated quantities {
vector[n_pos] log_lik;
for (i in 1:n_pos) log_lik[i] = normal_lpdf(y[i] | x[indx_pos[i]], sd_r);
}
"
```
Then we call `stan()` and pass in the data, names of parameter we wish to have returned, and information on number of chains, samples (iter), and thinning. The output is verbose (hidden here) and may have some warnings.
```{r uss-stan-fit-model, message=FALSE, warning=FALSE, results='hide', cache=TRUE}
# We pass in the non-NA ys as vector
ypos <- y[!is.na(y)]
n_pos <- sum(!is.na(y)) # number on non-NA ys
indx_pos <- which(!is.na(y)) # index on the non-NAs
mod <- rstan::stan(
model_code = scode,
data = list("y" = ypos, "TT" = length(y), "n_pos" = n_pos, "indx_pos" = indx_pos),
pars = c("sd_q", "x", "sd_r", "u", "x0"),
chains = 3, iter = 1000, thin = 1
)
```
We use `extract()` to extract the parameters from the fitted model and we can plot. The estimated level is `x` and we will plot that with the 95\% credible intervals.
```{r uss-stan-ar-level, fig.cap="Estimated level and 95 percent credible intervals. Blue dots are the actual Nile River levels."}
pars <- rstan::extract(mod)
pred_mean <- apply(pars$x, 2, mean)
pred_lo <- apply(pars$x, 2, quantile, 0.025)
pred_hi <- apply(pars$x, 2, quantile, 0.975)
plot(pred_mean,
type = "l", lwd = 3,
ylim = range(c(pred_mean, pred_lo, pred_hi)), ylab = "Nile River Level"
)
lines(pred_lo)
lines(pred_hi)
points(y, col = "blue")
```
Here is a `ggplot()` version of the plot.
```{r uss-stan-ar-level-ggplot, fig.cap="Estimated level and 95 percent credible intervals"}
library(ggplot2)
nile <- data.frame(y = y, year = 1871:1970)
h <- ggplot(nile, aes(year))
h + geom_ribbon(aes(ymin = pred_lo, ymax = pred_hi), fill = "grey70") +
geom_line(aes(y = pred_mean), size = 1) +
geom_point(aes(y = y), color = "blue") +
labs(y = "Nile River level")
```
We can plot the histogram of the samples against the values estimated via maximum likelihood.
```{r uss-stan-hist-pars, fig.cap="Histogram of the parameter samples versus the estimate (red line) from maximum likelihood."}
par(mfrow = c(2, 2))
hist(pars$x0)
abline(v = coef(kem.3)$x0, col = "red")
hist(pars$u)
abline(v = coef(kem.3)$U, col = "red")
hist(log(pars$sd_q^2))
abline(v = log(coef(kem.3)$Q), col = "red")
hist(log(pars$sd_r^2))
abline(v = log(coef(kem.3)$R), col = "red")
```
## A random walk model of animal movement {#sec-uss-a-simple-random-walk-model-of-animal-movement}
A simple random walk model of movement with drift (directional movement) but no correlation is
\begin{gather}
x_{1,t} = x_{1,t-1} + u_1 + w_{1,t}, \;\; w_{1,t} \sim \,\text{N}(0,\sigma^2_1)\\
x_{2,t} = x_{2,t-1} + u_2 + w_{2,t}, \;\; w_{2,t} \sim \,\text{N}(0,\sigma^2_2)
(\#eq:uss-movement)
\end{gather}
where $x_{1,t}$ is the location at time $t$ along one axis (here, longitude) and $x_{2,t}$ is for another, generally orthogonal, axis (in here, latitude). The parameter $u_1$ is the rate of longitudinal movement and $u_2$ is the rate of latitudinal movement. We add errors to our observations of location:
\begin{gather}
y_{1,t} = x_{1,t} + v_{1,t}, \;\; v_{1,t} \sim \,\text{N}(0,\eta^2_1)\\
y_{2,t} = x_{2,t} + v_{2,t}, \;\; v_{2,t} \sim \,\text{N}(0,\eta^2_2),
(\#eq:uss-observe)
\end{gather}
This model is comprised of two separate univariate state-space models. Note that $y_1$ depends only on $x_1$ and $y_2$ depends only on $x_2$. There are no actual interactions between these two univariate models. However, we can write the model down in the form of a multivariate model using diagonal variance-covariance matrices and a diagonal design ($\mathbf{Z}$) matrix. Because the variance-covariance matrices and $\mathbf{Z}$ are diagonal, the $x_1$:$y_1$ and $x_2$:$y_2$ processes will be independent as intended. Here are Equations \@ref(eq:uss-movement) and \@ref(eq:uss-observe) written as a MARSS model (in matrix form):
\begin{gather}
\begin{bmatrix}x_{1,t}\\x_{2,t}\end{bmatrix}
= \begin{bmatrix}x_{1,t-1}\\x_{2,t-1}\end{bmatrix}
+ \begin{bmatrix}u_1\\u_2\end{bmatrix}
+ \begin{bmatrix}w_{1,t}\\w_{2,t}\end{bmatrix},
\textrm{ } \mathbf{w}_t \sim \,\text{MVN}\begin{pmatrix}0,\begin{bmatrix}\sigma^2_1&0\\0&\sigma^2_2\end{bmatrix} \end{pmatrix} (\#eq:uss-mssmlong-a) \\
\nonumber \\
\begin{bmatrix}y_{1,t}\\y_{2,t}\end{bmatrix}
= \begin{bmatrix}1&0\\0&1\end{bmatrix}
\begin{bmatrix}x_{1,t}\\x_{2,t}\end{bmatrix}
+ \begin{bmatrix}v_{1,t}\\ v_{2,t}\end{bmatrix},
\textrm{ } \mathbf{v}_t \sim \,\text{MVN}\begin{pmatrix}0,\begin{bmatrix}\eta^2_1&0\\0&\eta^2_2\end{bmatrix} \end{pmatrix} (\#eq:uss-mssmlong-b)
\end{gather}
The variance-covariance matrix for $\mathbf{w}_t$ is a diagonal matrix with unequal variances, $\sigma^2_1$ and $\sigma^2_2$. The variance-covariance matrix for $\mathbf{v}_t$ is a diagonal matrix with unequal variances, $\eta^2_1$ and $\eta^2_2$. We can write this succinctly as
\begin{gather}
\mathbf{x}_t = \mathbf{x}_{t-1} + \mathbf{u} + \mathbf{w}_t, \;\; \mathbf{w}_t \sim \,\text{MVN}(0,\mathbf{Q}) \\
\mathbf{y}_t = \mathbf{x}_{t} + \mathbf{v}_t, \;\; \mathbf{v}_t \sim \,\text{MVN}(0,\mathbf{R}).
(\#eq:uss-mssm)
\end{gather}
\clearpage
## Problems {#sec-uss-problems}
1. Write the equations for each of these models: ARIMA(0,0,0), ARIMA(0,1,0), ARIMA(1,0,0), ARIMA(0,0,1), ARIMA(1,0,1). Read the help file for the ``Arima()`` function (in the **forecast** package) if you are fuzzy on the arima notation.
2. The **MARSS** package includes a data set of sharp-tailed grouse in Washington. Load the data to use as follows:
```{r uss-hw1, results='hide'}
library(MARSS)
dat <- log(grouse[, 2])
```
Consider these two models for the data:
* Model 1 random walk with no drift observed with no error
* Model 2 random walk with drift observed with no error
Written as a univariate state-space model, model 1 is
\begin{equation}
\begin{gathered}
x_t = x_{t-1}+w_t \text{ where } w_t \sim \,\text{N}(0,q)\\
x_0 = a \\
y_t = x_t
\end{gathered}
(\#eq:uss-hw2-1)
\end{equation}
Model 2 is almost identical except with $u$ added
\begin{equation}
\begin{gathered}
x_t = x_{t-1}+u+w_t \text{ where } w_t \sim \,\text{N}(0,q)\\
x_0 = a \\
y_t = x_t
\end{gathered}
(\#eq:uss-hw2-2)
\end{equation}
$y$ is the log grouse count in year $t$.
a. Plot the data. The year is in column 1 of ``grouse``.
a. Fit each model using ``MARSS()``.
a. Which one appears better supported given AICc?
a. Load the **forecast** package. Use ``?auto.arima`` to learn what it does. Then use ``auto.arima(dat)`` to fit the data. Next run ``auto.arima(dat, trace=TRUE)`` to see all the ARIMA models that the function compared. Note, ARIMA(0,1,0) is a random walk with b=1. ARIMA(0,1,0) with drift would be a random walk (b=1) with drift (with $u$).
a. Is the difference in the AICc values between a random walk with and without drift comparable between MARSS() and auto.arima()?
Note when using ``auto.arima()``, an AR(1) model of the following form will be fit (notice the $b$): $x_t = b x_{t-1}+w_t$. ``auto.arima()`` refers to this model $x_t = x_{t-1}+w_t$, which is also AR(1) but with $b=1$, as ARIMA(0,1,0). This says that the first difference of the data (that's the 1 in the middle) is a ARMA(0,0) process (the 0s in the 1st and 3rd spots). So ARIMA(0,1,0) means this: $x_t - x_{t-1} = w_t$.
3. Create a random walk with drift time series using ``cumsum()`` and ``rnorm()``. Look at the ``rnorm()`` help file (``?rnorm``) to make sure you know what the arguments to the ``rnorm()`` are.
```{r uss-hw3data}
dat <- cumsum(rnorm(100, 0.1, 1))
```
a. What is the order of this random walk written as ARIMA(p, d, q)? "what is the order" means "what is $p$, $d$, and $q$. Model "order" is how ``arima()`` and ``Arima()`` specify arima models.
a. Fit that model using ``Arima()`` in the **forecast** package. You'll need to specify the arguments ``order`` and ``include.drift``. Use ``?Arima`` to review what that function does if needed.
a. Write out the equation for this random walk as a univariate state-space model. Notice that there is no observation error, but still write this as a state-space model.
a. Fit that model with ``MARSS()``.
a. How are the two estimates from ``Arima()`` and ``MARSS()`` different?
4. The first-difference of ``dat`` used in the previous problem is:
```{r uss-hw3data.diff}
diff.dat <- diff(dat)
```
Use ``?diff`` to check what the ``diff()`` function does.
a. If $x_t$ denotes a time series. What is the first difference of $x$? What is the second difference?
b. What is the $\mathbf{x}$ model for ``diff.dat``? Look at your answer to part (a) and the answer to part (e).
c. Fit ``diff.dat`` using ``Arima()``. You'll need to change the arguments ``order`` and ``include.mean``.
d. Fit with ``MARSS()``. You will need to write the model for ``diff.dat`` as a state-space model. If you've done this right, the estimated parameters using ``Arima()`` and ``MARSS()`` will now be the same.
This question should clue you into the fact that ``Arima()`` is not exactly fitting Equation \@ref(eq:uss-marss). It's very similar, but not quite written that way. By the way, Equation \@ref(eq:uss-marss) is how structural time series observed with error are written (state-space models). To recover the estimates that a function like ``arima()`` or ``Arima()`` returns, you need to write your state-space model in a specific way (as seen above).
5. ``Arima()`` will also fit what it calls an "AR(1) with drift". An AR(1) with drift is NOT this model:
\begin{equation}
x_t = b x_{t-1}+u+w_t \text{ where } w_t \sim \,\text{N}(0,q)
(\#eq:uss-gompertz)
\end{equation}
In the population dynamics literature, this equation is called the Gompertz model and is a type of density-dependent population model.
a. Write R code to simulate Equation \@ref(eq:uss-gompertz). Make $b$ less than 1 and greater than 0. Set $u$ and $x_0$ to whatever you want. You can use a for loop.
a. Plot the trajectories and show that this model does not "drift" upward or downward. It fluctuates about a mean value.
a. Hold $b$ constant and change $u$. How do the trajectories change?
a. Hold $u$ constant and change $b$. Make sure to use a $b$ close to 1 and another close to 0. How do the trajectories change?
a. Do 2 simulations each with the same $w_t$. In one simulation, set $u=1$ and in the other $u=2$. For both simulations, set $x_1 = u/(1-b)$. You can set $b$ to whatever you want as long as $0