We are going to work with the bayesdfa
package (https://fate-ewi.github.io/bayesdfa/) and practice
fitting some Bayesian models. We’ll work with the stoplight chart data
that we used in lab.
Run this code to install the {bayesdfa} package:
devtools::install_github("fate-ewi/bayesdfa")
Then load the package.
library(bayesdfa)
set.seed(1)
sim_dat <- sim_dfa(
num_trends = 1,
num_years = 20,
num_ts = 3
)
The resulting object has the states in x
, predicted
values in pred
, loadings in Z
and simulated
data with observation error in y_sim
.
There’s a number of arguments you can play with in
?sim_dfa
, but one we can manipulate is the loadings matrix
(loadings_matrix
), which is dimensioned as number of time
series by number of trends.
loadings <- matrix(c(1.1, 0.0, 0.2, 0.03, -0.5, 0.1), nrow = 3, ncol = 2)
set.seed(1)
sim_dat <- sim_dfa(
num_trends = 2,
num_years = 20,
num_ts = 3,
loadings_matrix = loadings
)
Plot the states and predicted time series – do these results make sense?
stoplight
dataset from our
lab this week. Please pick a subset of time series to use, and fit a DFA
model using the fit_dfa
function. You’re certainly welcome
to use all time series – but for simplicity, it might be better to
restrict your choice to a smaller subset (3-5, maybe from the same
group?). Initially let’s fit a model with just a single trendfit <- fit_dfa(...,
num_trends = 1,
iter=1000,
chains=1,
...)
Some things to consider are: - are your data being passed in in
wide
or long
format? (this is the
data_shape
argument) - did the model appear to converge?
[Hint: you can look at the Rhat statistics, or use
shinystan::launch_shinystan(fit)]
Next, we’ll extract the Leave One Out Information Criterion (LOOIC). Like AIC, lower is better - but unlike AIC, we can get a standard error of the estimate.
loo(fit)$estimates
nu
from the
Student-t withfit <- fit_dfa(...,
num_trends = 1,
iter=1000,
chains=1,
estimate_nu = TRUE,
...)
Did the model converge? Are you able to estimate the nu
parameter well? If not, one trick is to manually fix nu
and
still fit the model with Student-t extremes, e.g.
fit <- fit_dfa(...,
num_trends = 1,
iter=1000,
chains=1,
estimate_nu = FALSE,
nu_fixed = 5,
...)
Extract the LOOIC values from these models, if they converged. Comparing these estimates to the first model with no extremes, is there support for including the heavy tails?
trend_model
argument
to ps
for penalized regression spline. This is also
available in mgcv::gam()
or brms
. Other
options are bs
(B-splines) or gp
(Gaussian
process) models – the ps
and bs
options tend
to be quite a bit faster than the Gaussian process models.fit <- fit_dfa(...,
num_trends = 1,
iter=1000,
chains=1,
trend_model = "ps",
...)
Extract the LOOIC values from these models, if they converged. Comparing these estimates to the models above, is there support for modeling trends with smooth splines, instead of a random walk?
Which model is most supported by the data (lowest LOOIC)?
Install the {marssTMB} package with
install.packages('marssTMB', repos = c('https://atsa-es.r-universe.dev', 'https://cloud.r-project.org'))
Using your stoplight
dataset above, fit a DFA model with
marssTMB, https://atsa-es.github.io/marssTMB/reference/MARSS_tmb.html
The basic function is just like MARSS,
library(marssTMB)
MARSS_tmb(...)
If you need help, you can look at the documentation
?MARSS_tmb
Question: compare a model with 1 - 3 trends. Instead of using AIC, let’s compare the models in their abilities to forecast the last year of data. Create code to do this – and use RMSE as a validation statistic.