Fitting a Bayesian DFA model

We are going to work with the bayesdfa package (https://fate-ewi.github.io/bayesdfa/) and practice fitting some Bayesian models. We’ll work with the stoplight chart data that we used in lab.

Run this code to install the {bayesdfa} package:

devtools::install_github("fate-ewi/bayesdfa")

Then load the package.

library(bayesdfa)
  1. Practice simulating data
set.seed(1)
sim_dat <- sim_dfa(
  num_trends = 1,
  num_years = 20,
  num_ts = 3
)

The resulting object has the states in x, predicted values in pred, loadings in Z and simulated data with observation error in y_sim.

There’s a number of arguments you can play with in ?sim_dfa, but one we can manipulate is the loadings matrix (loadings_matrix), which is dimensioned as number of time series by number of trends.

loadings <- matrix(c(1.1, 0.0, 0.2, 0.03, -0.5, 0.1), nrow = 3, ncol = 2)
set.seed(1)
sim_dat <- sim_dfa(
  num_trends = 2,
  num_years = 20,
  num_ts = 3,
  loadings_matrix = loadings
)

Plot the states and predicted time series – do these results make sense?

  1. We’ll return to using the stoplight dataset from our lab this week. Please pick a subset of time series to use, and fit a DFA model using the fit_dfa function. You’re certainly welcome to use all time series – but for simplicity, it might be better to restrict your choice to a smaller subset (3-5, maybe from the same group?). Initially let’s fit a model with just a single trend
fit <- fit_dfa(...,
               num_trends = 1,
               iter=1000,
               chains=1,
               ...)

Some things to consider are: - are your data being passed in in wide or long format? (this is the data_shape argument) - did the model appear to converge? [Hint: you can look at the Rhat statistics, or use shinystan::launch_shinystan(fit)]

Next, we’ll extract the Leave One Out Information Criterion (LOOIC). Like AIC, lower is better - but unlike AIC, we can get a standard error of the estimate.

loo(fit)$estimates
  1. Let’s consider adding extreme events. We can do this by letting the trends be modeled as a random walk with Student-t, rather than Gaussian deviations. We turn on the estimation of nu from the Student-t with
fit <- fit_dfa(...,
               num_trends = 1,
               iter=1000,
               chains=1,
               estimate_nu = TRUE,
               ...)

Did the model converge? Are you able to estimate the nu parameter well? If not, one trick is to manually fix nu and still fit the model with Student-t extremes, e.g. 

fit <- fit_dfa(...,
               num_trends = 1,
               iter=1000,
               chains=1,
               estimate_nu = FALSE,
               nu_fixed = 5,
               ...)

Extract the LOOIC values from these models, if they converged. Comparing these estimates to the first model with no extremes, is there support for including the heavy tails?

  1. Next, we’ll explore fitting smooth models that don’t model trends as random walks. Let’s try changing the trend_model argument to ps for penalized regression spline. This is also available in mgcv::gam() or brms. Other options are bs (B-splines) or gp (Gaussian process) models – the ps and bs options tend to be quite a bit faster than the Gaussian process models.
fit <- fit_dfa(...,
               num_trends = 1,
               iter=1000,
               chains=1,
               trend_model = "ps",
               ...)

Extract the LOOIC values from these models, if they converged. Comparing these estimates to the models above, is there support for modeling trends with smooth splines, instead of a random walk?

  1. Increase the number of trends, and try to fit models with 1-3 trends. Do any of these appear to have convergence problems?

Which model is most supported by the data (lowest LOOIC)?

Fitting a DFA model with marssTMB

Install the {marssTMB} package with

install.packages('marssTMB', repos = c('https://atsa-es.r-universe.dev', 'https://cloud.r-project.org'))

Using your stoplight dataset above, fit a DFA model with marssTMB, https://atsa-es.github.io/marssTMB/reference/MARSS_tmb.html

The basic function is just like MARSS,

library(marssTMB)
MARSS_tmb(...)

If you need help, you can look at the documentation

?MARSS_tmb

Question: compare a model with 1 - 3 trends. Instead of using AIC, let’s compare the models in their abilities to forecast the last year of data. Create code to do this – and use RMSE as a validation statistic.