vignettes/a01_overview.Rmd
a01_overview.Rmd
For examples, we’re going to use one of the same datasets that are widely documented in the MARSS manual. This data consists of three columns, * year, representing the year of the observation * logit.s, representing logit-transformed survival of Columbia River salmon * CUI.apr, representing coastal upwelling index values in April
## Warning: package 'MARSS' was built under R version 4.3.2
## Warning: package 'ggplot2' was built under R version 4.3.2
## Warning: package 'broom.mixed' was built under R version 4.3.2
## Warning: package 'loo' was built under R version 4.3.2
## This is loo version 2.7.0
## - Online documentation and vignettes at mc-stan.org/loo
## - As of v2.0.0 loo defaults to 1 core but we recommend using as many as possible. Use the 'cores' argument or set options(mc.cores = NUM_CORES) for an entire session.
## year logit.s CUI.apr
## 1 1964 -3.46 57
## 2 1965 -3.32 5
## 3 1966 -3.58 43
## 4 1967 -3.03 11
## 5 1968 -3.61 47
## 6 1969 -3.35 -21
g1 <- ggplot(SalmonSurvCUI, aes(year, logit.s)) +
geom_point() +
geom_line()
g1
g2 <- ggplot(SalmonSurvCUI, aes(year, CUI.apr)) +
geom_point() +
geom_line()
g2
For a first model, we’ll fit the same model used in the
MARSS
manual, with a time varying intercept and time
varying coefficient on CUI. We can specify these time varying effects
using the time_varying
argument. Note an intercept is not
included, like with lm
, glm
, and similar
packages this the intercept is implicitly included. Note that
alternatively you could specify this formula as
time_varying = logit.s ~ 1 + CUI.apr
, where we add the
extra 1
to signify the intercept.
Like the MARSS
example, we also standardize the
covariates,
SalmonSurvCUI$CUI.apr = scale(SalmonSurvCUI$CUI.apr)
fit <- fit_dlm(time_varying = logit.s ~ CUI.apr,
data = SalmonSurvCUI,
chains=1,
iter=1000)
##
## SAMPLING FOR MODEL 'dlm' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.003295 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 32.95 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 1000 [ 0%] (Warmup)
## Chain 1: Iteration: 100 / 1000 [ 10%] (Warmup)
## Chain 1: Iteration: 200 / 1000 [ 20%] (Warmup)
## Chain 1: Iteration: 300 / 1000 [ 30%] (Warmup)
## Chain 1: Iteration: 400 / 1000 [ 40%] (Warmup)
## Chain 1: Iteration: 500 / 1000 [ 50%] (Warmup)
## Chain 1: Iteration: 501 / 1000 [ 50%] (Sampling)
## Chain 1: Iteration: 600 / 1000 [ 60%] (Sampling)
## Chain 1: Iteration: 700 / 1000 [ 70%] (Sampling)
## Chain 1: Iteration: 800 / 1000 [ 80%] (Sampling)
## Chain 1: Iteration: 900 / 1000 [ 90%] (Sampling)
## Chain 1: Iteration: 1000 / 1000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 317.959 seconds (Warm-up)
## Chain 1: 230.002 seconds (Sampling)
## Chain 1: 547.961 seconds (Total)
## Chain 1:
## Warning: There were 14 divergent transitions after warmup. See
## https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## to find out why this is a problem and how to eliminate them.
## Warning: There were 1 chains where the estimated Bayesian Fraction of Missing Information was low. See
## https://mc-stan.org/misc/warnings.html#bfmi-low
## Warning: Examine the pairs() plot to diagnose sampling problems
## Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
## Running the chains for more iterations may help. See
## https://mc-stan.org/misc/warnings.html#bulk-ess
## Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
## Running the chains for more iterations may help. See
## https://mc-stan.org/misc/warnings.html#tail-ess
fit <- fit_dlm(time_varying = logit.s ~ CUI.apr,
data = SalmonSurvCUI[-36,],
chains=1,
iter=1000)
##
## SAMPLING FOR MODEL 'dlm' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.003826 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 38.26 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 1000 [ 0%] (Warmup)
## Chain 1: Iteration: 100 / 1000 [ 10%] (Warmup)
## Chain 1: Iteration: 200 / 1000 [ 20%] (Warmup)
## Chain 1: Iteration: 300 / 1000 [ 30%] (Warmup)
## Chain 1: Iteration: 400 / 1000 [ 40%] (Warmup)
## Chain 1: Iteration: 500 / 1000 [ 50%] (Warmup)
## Chain 1: Iteration: 501 / 1000 [ 50%] (Sampling)
## Chain 1: Iteration: 600 / 1000 [ 60%] (Sampling)
## Chain 1: Iteration: 700 / 1000 [ 70%] (Sampling)
## Chain 1: Iteration: 800 / 1000 [ 80%] (Sampling)
## Chain 1: Iteration: 900 / 1000 [ 90%] (Sampling)
## Chain 1: Iteration: 1000 / 1000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 263.794 seconds (Warm-up)
## Chain 1: 176.787 seconds (Sampling)
## Chain 1: 440.581 seconds (Total)
## Chain 1:
## Warning: There were 32 divergent transitions after warmup. See
## https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## to find out why this is a problem and how to eliminate them.
## Warning: There were 1 chains where the estimated Bayesian Fraction of Missing Information was low. See
## https://mc-stan.org/misc/warnings.html#bfmi-low
## Warning: Examine the pairs() plot to diagnose sampling problems
## Warning: The largest R-hat is NA, indicating chains have not mixed.
## Running the chains for more iterations may help. See
## https://mc-stan.org/misc/warnings.html#r-hat
## Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
## Running the chains for more iterations may help. See
## https://mc-stan.org/misc/warnings.html#bulk-ess
## Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
## Running the chains for more iterations may help. See
## https://mc-stan.org/misc/warnings.html#tail-ess
With only 1000 iterations and 1 chain, we might not expect the model to converge (see additional guidance via Stan developers here: https://mc-stan.org/misc/warnings.html). A couple things to consider:
adapt_delta
. (this will slow the sampling down a little).
If you’re still getting these warnings, the model may be mis-specified
or data not informative.
fit <- fit_dlm(time_varying = logit.s ~ CUI.apr,
data = SalmonSurvCUI,
chains=1,
iter=1000,
control = list(adapt_delta=0.99))
max_treedepth
) depth with the
following
fit <- fit_dlm(time_varying = logit.s ~ CUI.apr,
data = SalmonSurvCUI,
chains=1,
iter=1000,
control = list(max_treedepth=15))
We can extract tidied versions of any of the parameters with
broom.mixed::tidy(fit$fit)
## # A tibble: 181 × 3
## term estimate std.error
## <chr> <dbl> <dbl>
## 1 eta[1] -3.37 0.349
## 2 eta[2] -3.11 0.293
## 3 eta[3] -3.53 0.306
## 4 eta[4] -3.20 0.274
## 5 eta[5] -3.67 0.291
## 6 eta[6] -3.44 0.309
## 7 eta[7] -3.99 0.291
## 8 eta[8] -4.24 0.275
## 9 eta[9] -4.68 0.283
## 10 eta[10] -5.19 0.417
## # ℹ 171 more rows
We also have a helper function for extracting and visualizing trends for time varying parameters.
dlm_trends(fit)
## $plot
##
## $b_varying
## # A tibble: 84 × 5
## term estimate std.error par time
## <chr> <dbl> <dbl> <chr> <int>
## 1 b_varying[1,1] -2.64 0.484 (Intercept) 1
## 2 b_varying[2,1] -3.10 0.295 (Intercept) 2
## 3 b_varying[3,1] -3.11 0.351 (Intercept) 3
## 4 b_varying[4,1] -3.14 0.301 (Intercept) 4
## 5 b_varying[5,1] -3.37 0.350 (Intercept) 5
## 6 b_varying[6,1] -3.63 0.276 (Intercept) 6
## 7 b_varying[7,1] -3.88 0.284 (Intercept) 7
## 8 b_varying[8,1] -4.27 0.272 (Intercept) 8
## 9 b_varying[9,1] -4.71 0.281 (Intercept) 9
## 10 b_varying[10,1] -4.94 0.352 (Intercept) 10
## # ℹ 74 more rows
This function returns a plot and the values used to make the plot
(b_varying
)
As a second example, we could fit a model with a constant effect of
CUI, but a time varying slope. The fit_dlm
function has two
formula parameters, formula
for static effects, and
time_varying
for time varying ones. This model is specified
as
fit <- fit_dlm(time_varying = logit.s ~ 1,
formula = logit.s ~ CUI.apr,
data = SalmonSurvCUI,
chains=1,
iter=1000)
We could do the same with a model that had a time varying effect of CUI, but a time constant slope.
fit <- fit_dlm(time_varying = logit.s ~ CUI.apr,
formula = logit.s ~ 1,
data = SalmonSurvCUI,
chains=1,
iter=1000)