2 May 2023

Overview of today’s material

  • Bayesian estimation
  • Overview of Stan
  • Manipulating and plotting Stan output
  • Examples of time series models

Review of models we’ve used so far

Models

  • Regression
  • ARMA models
  • State Space Models
  • Dynamic Linear Models
  • Dynamic Factor Analysis
  • Multivariate time series models)

Why Bayesian?

  • Complex hierarchical models

    • Non-linear models
    • Hierarchical or shared parameters
    • Non-normal data
    • Prior information
  • Inference: what’s the probability that the data are less than some threshold?

  • No bootstrapping!

    • We get credible intervals for parameters and states simultaneously

Bayesian logic

  • Conditional probability

\[P({\theta}|\textbf{y})P(\textbf{y})=P(\theta)P(\textbf{y}|\theta)\]

\[P({\theta}|\textbf{y})=\frac{P(\theta)P(\textbf{y}|\theta)}{P(\textbf{y})}\]

  • \({P(\textbf{y})}\) is a normalizing constant that we often don’t have to worry about

Bayesian logic

  • Parameters are random, data are fixed

  • \[P({\theta}|\textbf{y})=P(\theta)P(\textbf{y}|\theta)\]

  • \(P({\theta}|\textbf{y})\) is the posterior

  • \(P(\textbf{y}|\theta)\) is the likelihood

  • \(P(\theta)\) is the prior

Bayesian logic

Bayesian logic

  • Difference between posterior and prior represents how much we learn by collecting data
  • Experiment {H, H, T, H, H, T, H, H}

Bayesian mechanics

  • MLE seeks to find the combination of parameters that maximize the likelihood (e.g. find absolute best point)

  • Bayesian estimation uses integration to find the combination of parameters that are best on average

Bayesian mechanics in practice

Estimation

  • Goal of Bayesian estimation in drawing samples from the posterior \(P({\theta}|\textbf{y})\)

  • For very simple models, we can write the analytical solution for the posterior

  • But for 99% of the problems we work on, need to generate samples via simulation

  • Markov chain Monte Carlo

Estimation

Estimation

Estimation

Estimation

Estimation

  • Thousands of proposals later, we have a MCMC chain

Estimation: best practices

  • Run 3-4 MCMC chains in parallel

  • Discard first ~ 10-50% of each MCMC chain as a ‘burn-in’

  • Optionally apply MCMC thinning to reduce autocorrelation

Lots of algorithms for sampling

What is Stan?

  • Powerful, cross-platform and cross-language (R, Julia, Matlab, etc) that allows users to write custom code that can be called directly from R

  • Estimation can be fully or approximate Bayesian inference, or maximum a posteriori optimization (BFGS)

  • Useful links:

Options for using Stan in this class

  • Write your own code (based on examples in the manual, etc)

  • Use an existing package

  • Use our bundled code to get started with simple models (we’ll start here)

Existing packages: rstanarm and brms

  • Both packages very flexible, and allow same syntax as basic lm/glm or lmer models, e.g.
rstan::stan_lm
rstan::stan_glm
rstan::stan_glmer

Existing packages: rstanarm and brms

  • Very flexible brms includes autocorrelated errors, non-normal data, non-linear smooths (GAMs), etc.

  • Limitations related to this class:

  • allows multivariate data, but not multivariate time series models brms example

Existing packages: rstanarm and brms

brms offers notation that should be very familiar to run many classes of models,

brms::brm(y ~ x * z + (1|group), data=d)
brms::brm(y01 ~ x * z + (1|group), data=d, 
          family = binomial("logit"))
brms::brm(bf(y ~ s(x)), data=d)
  • smooths can also be of 2-d models (e.g. spatial models)

Existing packages: rstanarm and brms

brms allows ARMA correlation structures that we’re familiar with,

data("LakeHuron")
LakeHuron <- as.data.frame(LakeHuron)
fit <- brm(x ~ arma(p = 2, q = 1), 
           data = LakeHuron)
  • also includes spatial models (car, sar)
  • does not include these in the context of state space models

Example: linear regression in brms

Example: linear regression and AR(1) models in brms

  • Regression
lm_fit = brms::brm(log(airmiles) ~ year, data=df)
  • Question: how would we change the code to be an AR(1) model?
lm_ar= brms::brm(log(airmiles) ~ arma(p = 1, q = 0), 
                 data=df)
  • Defaults to 4 MCMC chains, 2000 iterations, 1000 burn-in

Example: linear regression and AR(1) models in brms

  • lm_ar is a “brmsfit” object and has a bunch of convenient plotting functions
plot(lm_ar)

Example: linear regression and AR(1) models in brms

  • Pairs plots
pairs(lm_ar)

Example: linear regression and AR(1) models in brms

  • Posterior predictive checks
pp_check(lm_ar)

Example: linear regression and AR(1) models in brms

  • Shinystan
shinystan::launch_shinystan(lm_ar)

Example: linear regression and AR(1) models in brms

  • Additional functionality / diagnostics in bayesplot
mcmc_areas(lm_ar,c("sigma","b_Intercept","ar[1]"))

Plotting with Stan output

Customized models and code for this class

  • We’ll need to install these packages to run Stan,
install.packages("rstan", 
                 repos = "https://cloud.r-project.org")
install.packages("devtools", 
                 repos = "https://cloud.r-project.org")
  • And then we can install our custom package for the class with bundled Stan time series models
devtools::install_github(repo="atsa-es/atsar")
library("atsar")

Models included

  • atsar package includes:
  • RW, AR and MA models (with and without drift)
  • DLMs (intercept, slope, both)
  • State space RW and AR models
  • Flexible families for each model

More time series models: application to NEON EFI Aquatics challenge

  • Daily temperature and oxygen data available from Barco Lake in Florida

‘atsar’ package: random walk and AR(1) models

This model should be familiar,

\[E\left[ { Y }_{ t } \right] =E\left[ { Y }_{ t-1 } \right] +{ e }_{ t-1 }\] * Note that the use of the argument model_name and est_drift. By not estimating drift, we assume the process is stationary with respect to the mean

rw = fit_stan(y = neon$oxygen, 
              est_drift = FALSE, model_name = "rw") 

‘atsar’ package: univariate state space models

  • Specify the MCMC parameters
rw = fit_stan(y = neon$oxygen, 
              est_drift = FALSE, 
              model_name = "rw",
              mcmc_list = list(n_mcmc = 2000, 
              n_burn = 500, 
              n_chain = 3, 
              n_thin = 1)) 

‘atsar’ package: univariate state space models

State equation: \[{ x }_{ t }={ \phi x }_{ t-1 }+{ \varepsilon }_{ t-1 }\] where \({ \varepsilon }_{ t-1 } \sim Normal(0, q)\)

Observation equation: \[{ Y }_{ t } \sim Normal(x_{t}, r)\]

  • Let’s compare models with and without the AR parameter \(\phi\) in the process model

‘atsar’ package: univariate state space models

We can first run the model with \(\phi\),

ss_ar = fit_stan(y = neon$oxygen, 
        est_drift=FALSE, model_name = "ss_ar",
        mcmc_list = list(n_mcmc = 2000, n_chain = 1, 
                         n_thin = 1,n_burn=1000))

then without,

ss_rw = fit_stan(y = neon$oxygen, 
        est_drift=FALSE, model_name = "ss_rw",
        mcmc_list = list(n_mcmc = 2000, n_chain = 1, 
                         n_thin = 1,n_burn=1000))

‘atsar’ package: univariate state space models

Did the models converge?

  • One quick check is to look at the value of R-hat for each parameter (generally should be small, < 1.1 or < 1.05)
rw_summary <- summary(ss_rw, pars = c("sigma_process","sigma_obs"), 
                      probs = c(0.1, 0.9))$summary
print(rw_summary)
##                     mean     se_mean          sd        10%        90%
## sigma_process 0.25050137 0.001921004 0.008951147 0.23891867 0.26150841
## sigma_obs     0.05251124 0.005040720 0.014999464 0.03513455 0.07474426
##                   n_eff     Rhat
## sigma_process 21.712052 1.128660
## sigma_obs      8.854545 1.392187

‘atsar’ package: univariate state space models

  • Calculate maximum Rhat across all parameters,
rhats <- summary(ss_rw)$summary[,"Rhat"]
print(max(rhats))
## [1] 1.401858
  • Reminder: we only ran one chain / 2000 iterations, so overall not bad!

‘atsar’ package: univariate state space models

  • Tidy summaries from Stan output: Using the broom.mixed package, we can also extract some tidy summaries of the output
coef = broom.mixed::tidy(ss_ar)
head(coef)
## # A tibble: 6 × 3
##   term          estimate std.error
##   <chr>            <dbl>     <dbl>
## 1 sigma_process    0.250   0.00845
## 2 pred[1]          8.22    0.0575 
## 3 pred[2]          8.14    0.0600 
## 4 pred[3]          9.02    0.0578 
## 5 pred[4]          8.99    0.0546 
## 6 pred[5]          8.84    0.0536

‘atsar’ package: univariate state space models

  • We can use this to look at predictions versus our data

‘atsar’ package: univariate state space models

  • We can use this to look at predictions versus our data

‘atsar’ package: univariate state space models

  • We can use this to look at predictions versus our data

##‘atsar’ package: raw samples

  • tidy() functions great at summarizing
  • fit_stan() returns ‘stanfit’ object that we can use rstan::extract() on to get raw posterior draws, by chain
pars = extract(ss_ar)
  • returns list of parameters we can access directly, e.g.
summary(pars$sigma_process)

##‘atsar’ package: model selection

  • Best practice is to use Leave One Out Information Criterion (LOOIC) in loo package
  • We can compare the LOOIC from the 2 models (AR vs RW)
loo_ar = (loo::loo(ss_ar))
loo_rw = (loo::loo(ss_rw))

‘atsar’ package: DLMs

  • For comparison to MARSS, we’ll use Mark’s example of logit-transformed survival from the Columbia River. We can think about setting the DLM up in the slope or the intercept. For this first example, we’ll do the latter.

‘atsar’ package: DLMs

  • Fit DLM with random walk in intercept
mod = fit_stan(y = SalmonSurvCUI$logit.s, 
               model_name="dlm-intercept",
               mcmc_list = list(n_mcmc = 2000, 
                                n_chain = 1, n_thin = 1,n_burn=1000))
  • Fit DLM with random walk in slope
mod_slope = fit_stan(y = SalmonSurvCUI$logit.s, 
        x = SalmonSurvCUI$CUI.apr, 
        model_name="dlm-slope",
        mcmc_list = list(n_mcmc = 2000, 
                                n_chain = 1, n_thin = 1,n_burn=1000))

‘atsar’ package: DLMs

Let’s look at predictions using the rstan::extract() function

‘atsar’ package: DLMs

Let’s look at predictions using the rstan::extract() function

Extra extensions

  • family argument in fit_stan allows to have flexible families
  • e.g., fit a Poisson or binomial DLM with
mod = fit_stan(y = SalmonSurvCUI$logit.s, 
               model_name="dlm-intercept",
               family="binomial")
mod = fit_stan(y = SalmonSurvCUI$logit.s, 
               model_name="dlm-intercept",
               family="poisson")

Summary

  • Bayesian implementation of time series models in Stan can do everything that MARSS can do and more!

  • Very flexible language, great developer community

  • Widely used by students in SAFS / UW / QERM / etc

  • Please come to us with questions, modeling issues, or add to code in our packages to make them better!