- Quick overview of Stan
- Manipulating and plotting Stan output
- Examples of time series models
7 Feb 2019
Inference: what’s the probability that the data are less than some threshold?
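Once we have posterior samples, this kind of question is a one-line computation. A minimal sketch, where draws (a vector of posterior samples) and thresh are placeholders, not objects from this lab:
mean(draws < thresh)  # proportion of posterior draws below the threshold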
Stan is a powerful, cross-platform and cross-language tool (with interfaces for R, Julia, Matlab, etc.) that allows users to write custom model code that can be called directly from R
Estimation can be full Bayesian inference (MCMC), approximate Bayesian inference (variational methods), or maximum likelihood optimization (BFGS)
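As a rough, hedged sketch of how these three modes look in rstan (the file name model.stan and the data list stan_data are placeholders, not part of this lab):
library(rstan)
m = stan_model("model.stan")               # compile a Stan program (placeholder file)
fit_mcmc = sampling(m, data = stan_data)   # full Bayesian inference via MCMC (NUTS)
fit_vb = vb(m, data = stan_data)           # approximate Bayesian inference (variational)
fit_mle = optimizing(m, data = stan_data)  # maximum likelihood / penalized optimization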
Write your own code (based on examples in the manual, etc)
Use an existing package
Use our bundled code to get started with simple models (we’ll start here)
rstanarm::stan_lm(), rstanarm::stan_glm(), rstanarm::stan_glmer()
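For instance, a minimal hedged sketch of the rstanarm interface (the data frame d and its columns y and x are placeholders):
library(rstanarm)
fit = stan_glm(y ~ x, data = d, family = gaussian())  # Bayesian linear regression with default priors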
The brms package includes autocorrelated errors, non-normal data, non-linear smooths, etc., and offers formula notation that should be very familiar for fitting many classes of models, e.g.
brms::brm(y ~ x * z + (1|group), data = d)
brms::brm(y01 ~ x * z + (1|group), data = d, family = binomial("logit"))
brms::brm(bf(y ~ s(x)), data = d)
install.packages("rstan", repos = "https://cloud.r-project.org")
install.packages("devtools", repos = "https://cloud.r-project.org")
devtools::install_github(repo = "nwfsc-timeseries/atsar")
install.packages("atsar")
Fitting a regression model to these data (the built-in Nile River flow time series) doesn't make a lot of sense, but it will introduce us to the basic functionality of our wrapper functions. We'll start with a function named fit_stan(),
library(atsar)
lm_intercept = fit_stan(y = as.numeric(Nile), x = rep(1, length(Nile)), model_name = "regression")
The output of the fitted model can be examined a number of ways. Starting with the simple summaries,
lm_intercept
This model ran 3 MCMC chains (the default), each with a warmup of 500 iterations followed by 500 sampling iterations. Only the post-warmup samples are stored, so we end up with 3 x 500 = 1500 parameter samples in total.
But we're probably more interested in the values for individual parameters. We can pull these out in a few different ways, and plot them with either base graphics or ggplot. As a first option for getting values we can manipulate, we can use the extract() function,
pars = rstan::extract(lm_intercept)
Then we can do all kinds of things with this output, like making a histogram
hist(pars$beta, 40, col="grey", xlab="Intercept", main="")
or calculate summary statistics
quantile(pars$beta, c(0.025,0.5,0.975))
The object lm_intercept is a stanfit object, which means there's a lot of other plotting functionality from the rstan package we can use:
rstan::traceplot(lm_intercept, pars = c("beta[1]","sigma"), nrow=2, ncol=1)
rstan::stan_scat(lm_intercept, pars = c("beta[1]","sigma"))
rstan::stan_dens(lm_intercept, pars = c("beta[1]","sigma"))
rstan::stan_hist(lm_intercept, pars = c("beta[1]","sigma"))
library(bayesplot)
mcmc_areas(as.matrix(lm_intercept), pars = c("beta[1]","sigma"), prob = 0.8)
rstan::stan_plot(lm_intercept, pars=c("beta[1]","sigma"))
rstan::stan_ac(lm_intercept, pars=c("beta[1]","sigma"))
These plots are only the tip of the iceberg. For more great examples of the kinds of plots available, see the rstan and bayesplot package vignettes.
Using the broom package, we can also extract some tidy summaries of the output.
coef = broom::tidy(lm_intercept)
head(coef)
## # A tibble: 6 x 3
##   term    estimate std.error
##   <chr>      <dbl>     <dbl>
## 1 beta[1]    0.481      1.86
## 2 sigma    938.        68.9
## 3 pred[1]    0.481      1.86
## 4 pred[2]    0.481      1.86
## 5 pred[3]    0.481      1.86
## 6 pred[4]    0.481      1.86
These tidy summaries can then be fed into ggplot, for example:
coef = broom::tidy(lm_intercept)
ggplot(coef[grep("pred", coef$term), ], aes(x = 1:100, y = estimate)) +
  geom_point() +
  ylab("Estimate +/- SE") +
  xlab("") +
  geom_errorbar(aes(ymin = estimate - std.error, ymax = estimate + std.error))
For models with multiple chains, we might want to preserve the chain ID to look at individual chain diagnostics. Remember that extract() defaults to merging samples from all chains together, e.g. extract(object, pars, permuted = TRUE). To keep the chains separate, set permuted = FALSE, e.g. extract(object, pars, permuted = FALSE).
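As a minimal sketch of what this gives us, using the lm_intercept fit from above (the per-chain means are just one diagnostic we might compute):
samples = rstan::extract(lm_intercept, pars = c("beta[1]", "sigma"), permuted = FALSE)
dim(samples)                   # iterations x chains x parameters
apply(samples, c(2, 3), mean)  # per-chain means, to eyeball agreement between chains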
This model should be familiar,
\[Y_{t} = Y_{t-1} + e_{t-1}\] where \(e_{t-1} \sim Normal(0, \sigma)\)
We fit it with fit_stan() by setting the model_name and est_drift arguments. By not estimating drift, we assume the process is stationary with respect to the mean.
data(airquality)
Temp = airquality$Temp # air temperature
rw = fit_stan(y = Temp, est_drift = FALSE, model_name = "rw")
Did the model converge?
rw_summary <- summary(rw, pars = c("sigma"), probs = c(0.1, 0.9))$summary
print(rw_summary)
##           mean    se_mean        sd      10%      90%    n_eff      Rhat
## sigma 5.750734 0.01553552 0.3374042 5.338531 6.213481 471.6821 0.9998814
State equation: \[x_{t} = \phi x_{t-1} + \varepsilon_{t-1}\] where \(\varepsilon_{t-1} \sim Normal(0, q)\)
Observation equation: \[Y_{t} \sim Normal(x_{t}, r)\]
We can first run the model with \(\phi\) estimated,
ss_ar = fit_stan(y = Temp, est_drift=FALSE, model_name = "ss_ar")
then without,
ss_rw = fit_stan(y = Temp, est_drift=FALSE, model_name = "ss_rw")
Estimates from the AR(1) state space model
Estimates from the RW state space model
Estimates from both models (note the difference in credible interval widths)
We might be also interested in looking at posterior estimates for these models.
rstan::stan_dens(ss_ar, c("phi"))
We might also be interested in derived quantities.
pars = extract(ss_ar)
p = length(which(pars$pred > 90)) / length(pars$pred)
print(round(p, 3))
## [1] 0.078
As an example of Bayesian DFA, we'll load the Lake Washington plankton dataset. There are 3 versions of the data, and we'll use the one that's been transformed. As a reminder:
- 0s replaced with NAs
- data have been z-scored
- we only use data from 1980-1989 for simplicity
dat = lakeWAplanktonTrans
plankdat = dat[dat[,"Year"] >= 1980 & dat[,"Year"] < 1990, ]
phytoplankton = c("Cryptomonas", "Diatoms", "Greens", "Unicells", "Other.algae")
dat.spp.1980 = plankdat[, phytoplankton]
For starters, we’ll try a 3-trend model,
mod_3 = fit_dfa(y = t(dat.spp.1980), num_trends=3)
The estimated trends and loadings then need to be rotated. Why? Identifiability: in DFA the loadings and trends are only defined up to a rotation. Use the function we've written, rotate_trends(), which returns (a usage sketch follows this list):
- Z_rot, rotated Z matrix for each MCMC draw
- trends, rotated trends for each MCMC draw
- Z_rot_mean, mean Z across draws
- trends_mean, mean trends across draws
- trends_lower, lower 2.5% on trends
- trends_upper, upper 97.5% on trends
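A minimal usage sketch for the 3-trend model above, assuming rotate_trends() is called directly on the fitted object and that trends_mean is stored as a trends x time matrix:
rot = rotate_trends(mod_3)
names(rot)  # Z_rot, trends, Z_rot_mean, trends_mean, trends_lower, trends_upper
matplot(t(rot$trends_mean), type = "l", xlab = "Time", ylab = "Rotated trend")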
By default, the observation variance structure is diagonal and equal.
Diagonal and unequal or shared variances can also be specified using the varIndx argument. The diagonal and unequal structure would be called with:
mod_3 = fit_dfa(y = t(dat.spp.1980), num_trends=3, varIndx = rep(1,5))
We often just present the trend estimates for DFA – but not uncertainty
Let's look at the effect of missing data on DFA using the harbor seal dataset.
data("harborSealWA")
Assume single trend for the population
We’ll extract predictions from the best model,
pars = extract(fit_dfa(y = t(harborSealWA[,-1]), num_trends = 1))
df = data.frame("time" = 1:nrow(harborSealWA),
  "pred" = apply(pars$x[,1,], 2, mean),
  "low" = apply(pars$x[,1,], 2, quantile, 0.025),
  "hi" = apply(pars$x[,1,], 2, quantile, 0.975))
ggplot(df, aes(time, pred)) +
  geom_ribbon(aes(ymin = low, ymax = hi), fill = "grey30", alpha = 0.5) +
  geom_line()
For comparison to MARSS, we'll use Mark's example of logit-transformed survival from the Columbia River. We can set up the DLM with either a time-varying slope or a time-varying intercept. For this first example, we'll do the latter (a time-varying intercept).
mod = fit_stan(y = SalmonSurvCUI$logit.s, model_name="dlm-intercept")
df = data.frame("year" = 1:42,
  "pred" = apply(extract(mod, "intercept", permuted = FALSE), 3, mean),
  "low" = apply(extract(mod, "intercept", permuted = FALSE), 3, quantile, 0.025),
  "hi" = apply(extract(mod, "intercept", permuted = FALSE), 3, quantile, 0.975))
ggplot(df, aes(year, pred)) +
  geom_ribbon(aes(ymin = low, ymax = hi), fill = "grey30", alpha = 0.5) +
  geom_line() +
  ylab("Intercept") + xlab("Time")
mod = fit_stan(y = SalmonSurvCUI$logit.s, x = SalmonSurvCUI$CUI.apr, model_name="dlm-slope")
lmmod = lm(SalmonSurvCUI$logit.s ~ SalmonSurvCUI$CUI.apr)
x = model.matrix(lmmod)
lmmod = lm(SalmonSurvCUI$logit.s ~ SalmonSurvCUI$CUI.apr)
mod = fit_stan(y = SalmonSurvCUI$logit.s, x = model.matrix(lmmod), model_name = "dlm")
Bayesian implementation of time series models in Stan can do everything that MARSS can do and more!
Very flexible language, great developer community
Widely used by students in SAFS / UW / QERM / etc
Please come to us with questions or modeling issues!