Multivariate Generalized Additive Models (GAMs)

Overview of today’s material

Multivariate models in mgcv (Pederson et al. 2019)[https://doi.org/10.7717/peerj.6876]
Flexible dynamic models with mvgam (Clark et al. 2022)[https://doi.org/10.1111/2041-210X.13974]

Multivariate models in `mgcv`

Often we have multiple related response variables
Examples: different locations, species, or strata
Goal: model shared and individual trends across responses while accounting for correlation and partial pooling

Multivariate models in `mgcv`

Lots of ways we can think to model variation by time series
Different intercepts / trends (harbor seal example in MARSS)
Site:year random effects (GLMMs)
Factor smooths (specific to GAMs)

Introducing factor smooths

What is a factor smooth?
A factor smooth is a smooth term in a model where a categorical variable interacts with a continuous predictor
This type of smooth allows modeling of varying relationships between continuous predictors across different levels of a factor

How do we implement these in `mgcv`?

Use the s() function to specify smooth terms.
To specify a factor smooth, we include the factor variable in the formula:

gam(response ~ s(predictor, by = factor_variable), 
    data = dataset)

When by = factor, a separate smooth is estimated for each level

Comparison with continuous case

What are the differences in these models?
s(time, by = x, bs = “cr”)
s(time, by = location)
s(time, by = location, bs = “fs”)

Comparison with continuous case

s(time, by = x, bs = “cr”)

\[x \cdot f(time)\] - te(time, x) or ti(time, x) become non-linear

Comparison with continuous case

What about when by is a factor? [Hint: don’t do this!]
s(time, by = location)
Unique smooths and penalties per location, location parsed to numeric?

Comparison with continuous case

We need to explicitly tell mgcv that we want to use a factor smooth
s(time, by = location, bs = “fs”)
Unique smooths per location
penalty is shared across all levels of the factor (same wiggliness)

Illustrating different models with the harbor seal dataset

data("harborSeal")
long_seal <- as.data.frame(harborSeal) |>
  pivot_longer(cols = starts_with("CoastalEstuaries"):
                 starts_with("Georgia.Strait"), 
               names_to = "location", 
               values_to = "value")

long_seal$location <- as.factor(long_seal$location)

Three models for comparison

Model G: global smooth
Model GS: global smooth, time series - specific smooths
Model S: time series - specific smooths

Model G

Global smooth (shared trend)

fit_G <- gam(value ~ s(Year, k=8, bs="cr"), #<<
           data=long_seal, 
           family="gaussian")

Model G fits

Model G

Global smooth (shared trend)
Time series specific intercepts (n = 12)

fit_G <- gam(value ~ s(Year, k=8, bs="cr") + 
             s(location, k=12, bs="re"), #<<
           data=long_seal, 
           family="gaussian")

Model G fits

Model GS

Global smooth (shared trend)
Time series specific smooths

fit_GS <- gam(value ~ s(Year, k=8, bs="cr") + 
             s(Year, by = location, k = 8, bs="fs") + #<<
               s(location, k=12, bs="re"), 
           data=long_seal, 
           family="gaussian")

Model GS fits

Model S

Time series specific smooths
No global smooth

fit_S <- gam(value ~  
             s(Year, by = location, k = 8, bs="fs") + 
               s(location, k=12, bs="re"), 
           data=long_seal, 
           family="gaussian")

Model S fits

Comparing fits

print(AIC(fit_G, fit_GS, fit_S))

Factor Smooths

Add flexibility for time - series specific smooths
Similar to MARSS model with either shared (single trend) or MARSS model where each time series is a different trend
Wiggliness of both global and factor smooths can be specified

Summarizing these approaches in `mgcv`

Lots of flexibility
We could have used gamm for additional flexibility in random effects
Implements mixed models (year:site random effects, etc)

Transitioning to DGAMs in `mvgam`

DGAMs are an extension of GAMs designed to handle time-varying and dynamic data
Particularly useful for modeling temporal dependencies / nonlinear relationships
Can be used with multivariate time series data
DGAMs incorporate dynamic smooth terms that evolve over time

Why use DGAMs?

GAMs are great, but uncertainty blows up as we get further from data (forecast)
Random walk / AR1 processes also have uncertainty balloon
DGAMs represent a hybrid approach that can work particularly well for forecasting

GAM forecast uncertainty

y ~ s(t, k = 12)
low uncertainty, poor fit

GAM forecast uncertainty

y ~ s(t, k = 50)
good fit, high uncertainty

Mathematically

Remember that GAMs are typically represented as \[ E[Y_t] = \beta_0 + \sum_{i=1}^{I} s(x_{i,t})\] where \(Y_t\) is the response variable at time \(t\), \(\beta_0\) is the intercept, and \(s(x_{i,t})\) are smooth functions of the predictors

Mathematically

Because the response can be non-Gaussian we use a link function \(g^{=1}()\) to relate the mean of the response to the linear predictor

\[ E[Y_t] = g^{=1} \left( \beta_0 + \sum_{i=1}^{I} s(x_{i,t}) \right)\] where \(g\) is the link function (e.g., logit, log, identity)

Mathematically

DGAMs add a dynamic component to this equation, allowing for time-varying smooths

\[ E[Y_t] = g^{-1} \left( \beta_0 + \sum_{i=1}^{I} s(x_{i,t}) + z_{t} \right)\] where \(z_{t} = \psi_{0} + \psi_{1}z_{t-1} + \epsilon_t\) and \(\epsilon_t \sim N(0, \sigma_{\epsilon})\) and \(g^{-1}\) is the inverse link function

Complex latent processes

Random walks can be specified with any dynamic model that we’ve already discussed (AR, MA processes, etc)

\(z_{t} = \psi_{0} + \psi_{1}z_{t-1} + \epsilon_t\)

\(z_{t} = \psi_{0} + \psi_{1}z_{t-1} + \psi_{2}z_{t-2} + \epsilon_t\)

Dynamic factor processes

For multiple time series, there can be > 1 latent trend, and we can estimate factor loadings (DFA). For time series \(j\)

\[ E[Y_{j,t}] = g^{-1} \left( \beta_0 + \sum_{i=1}^{I} s(x_{i,j,t}) + \sum_{m=1}^{M} z_{m,t}\theta_{j} \right)\]

The mvgam Package

The mvgam package provides tools for fitting DGAMs in R
Bayesian estimation with Stan backend
Supports multivariate time series modeling with dynamic smooths
Provides an interface to easily define dynamic terms in the model

Simulated data example with `mvgam`

As expected, this is giving the same result as mgcv
fit <- mvgam(y ~ s(time, k = 50), data = df, family = "gaussian")

Harbor seal example with `mvgam`

Before, we were using factor smooths to model harbor seal trends
We can fit these exact same models (without trend) in mvgam

fit_GS <- mvgam(value ~ s(Year, k=8, bs="cr") +#<< 
             s(Year, by = location, k = 8, bs="fs") + 
               s(location, k=12, bs="re"), 
           data=long_seal, 
           family="gaussian", chains = 1)

Plotting the main trend

plot_mvgam_smooth(fit_GS)

Including dynamic components: coastal upwelling

Jacox MG, CA Edwards, EL Hazen, SJ Bograd (2018) Coastal upwelling revisited: Ekman, Bakun, and improved upwelling indices for the U.S. west coast, J. Geophysical Research, 123(10), 7332-7350.

CUTI estimates of vertical transport near the coast
BEUTI estimates of vertical nitrate flux

Including dynamic components: coastal upwelling

CUTI collected at monthly time steps
1 deg latitude resolution (31 - 47) – we’ll use every other

Including dynamic components: coastal upwelling

print(head(cuti_long))

Including dynamic components: coastal upwelling

for speed, we’ll filter the data to be 2023–2024 (train) and use 2025 for testing

Including dynamic components: coastal upwelling

initially no latent trend – what are all the pieces doing??
estimation is slooooow – so we use data 2023–2025

fit_trend <- mvgam(cuti ~ s(time, bs="cr") + 
             s(time, by = latitude, bs="fs") + 
               s(latitude, k=9, bs="re") + 
                s(month, k = 12, bs="cc"),#<<,
           data=cuti2324, 
           share_obs_params = TRUE,#<<,
           family="gaussian", chains = 1)

Including dynamic components: coastal upwelling

Smooths

plot_mvgam_smooth(fit_trend, smooth = "month")

Including dynamic components: coastal upwelling

Time

plot_mvgam_smooth(fit_trend, smooth = "time")

Including dynamic components: coastal upwelling

Predictions (last 3 points are forecasts)

Including dynamic components: coastal upwelling

Possibly some problematic residuals (autocorrelated)

plot(fit_trend)

Including dynamic components: coastal upwelling

Next we’ll add a trend as a random walk
By default this is creating a separate trend for each time series
May have too many competing smooths etc

fit_trend_rw <- mvgam(cuti ~ s(time, bs="cr") + 
             s(time, by = latitude, bs="fs") + 
               s(latitude, k=9, bs="re") + 
                s(month, k = 12, bs="cc"),
             trend_model = RW(),
           data=cuti_long[cuti_long$year>2022,], 
           share_obs_params = TRUE,#<<,
           family="gaussian", chains = 1)

Including dynamic components: coastal upwelling

ACF now looks a little better (and QQ plot too)

plot(fit_trend_rw)

Including dynamic components: coastal upwelling

plot_mvgam_smooth(fit_trend_rw)

Including dynamic components: coastal upwelling

Each of the trends (here for Latitude 31) is ~ flat

plot_mvgam_trend(fit_trend_rw)

Including dynamic components: coastal upwelling

Some fits to forecasts are better – but training data actually seems worse

Including dynamic components: coastal upwelling

Our smooths and the trend are competing to explain the process
Let’s drop out the main smooth now and let that be explained by the trend

fit_trend_ar1 <- mvgam(cuti ~ s(time, bs="cr") + 
               s(latitude, k=9, bs="re") + 
                s(month, k = 12, bs="cc"),
             trend_model = 'AR1',
           data=cuti_long[cuti_long$year>2022,], 
           share_obs_params = TRUE,
           family="gaussian", chains = 1)

Including dynamic components: coastal upwelling

Trends more flexible than the random walk appear to fit data a little better

Including dynamic components: coastal upwelling

But maybe we’re trying to estimate too many trends
Instead, model trends as dynamic factors (n = 2)

fit_trend_fac <- mvgam(cuti ~ s(time, bs="cr") + 
               s(latitude, k=9, bs="re") + 
                s(month, k = 12, bs="cc"),
             trend_model = 'AR1',
             use_lv = TRUE,#<<,
             n_lv = 2,#<<,
           data=cuti_long[cuti_long$year>2022,], 
           share_obs_params = TRUE,
           family="gaussian", chains = 1)

Including dynamic components: coastal upwelling

What do factors look like?
By default mvdlm models these as AR1
DFA model with AR1 trend + seasonal smooth

plot(fit_trend_fac, type = 'factors')

Including dynamic components: coastal upwelling

What do loadings look like?

pars <- rstan::extract(fit_trend_fac$model_output)
print(apply(pars$lv_coefs, c(2,3), mean))

Including dynamic components: coastal upwelling

Factor model seems to be struggling (maybe more so in middle latitudes)
Factors for this application probably not flexible enough

Summary

Multivariate models in mgcv allow for flexible modeling of multiple time series
Factor smooths provide a way to relationships that differ across different levels of a factor
mvgam extends this approach to include dynamic components, allowing for time-varying smooths and latent trends
For time series DGAMs can be used to model temporal dependencies and nonlinear relationships in multivariate time series data

Comparison with other approaches we’ve learned about

All approaches have tradeoffs between fit to the training data and forecast error
GAM fit and forecast error depends on the number of basis functions used (wiggliness)
MARSS / DFA fit and forecast error depends on the number of factors and how they are modeled

Overview of today’s material

Multivariate models in mgcv

Multivariate models in mgcv

Introducing factor smooths

How do we implement these in mgcv?

Comparison with continuous case

Comparison with continuous case

Comparison with continuous case

Comparison with continuous case

Illustrating different models with the harbor seal dataset

Three models for comparison

Model G

Model G fits

Model G

Model G fits

Model GS

Model GS fits

Model S

Model S fits

Comparing fits

Factor Smooths

Summarizing these approaches in mgcv

Transitioning to DGAMs in mvgam

Why use DGAMs?

GAM forecast uncertainty

GAM forecast uncertainty

Mathematically

Mathematically

Mathematically

Complex latent processes

Dynamic factor processes

The mvgam Package

Simulated data example with mvgam

Harbor seal example with mvgam

Plotting the main trend

Including dynamic components: coastal upwelling

Including dynamic components: coastal upwelling

Including dynamic components: coastal upwelling

Including dynamic components: coastal upwelling

Including dynamic components: coastal upwelling

Including dynamic components: coastal upwelling

Including dynamic components: coastal upwelling

Including dynamic components: coastal upwelling

Including dynamic components: coastal upwelling

Including dynamic components: coastal upwelling

Including dynamic components: coastal upwelling

Including dynamic components: coastal upwelling

Including dynamic components: coastal upwelling

Including dynamic components: coastal upwelling

Including dynamic components: coastal upwelling

Including dynamic components: coastal upwelling

Including dynamic components: coastal upwelling

Including dynamic components: coastal upwelling

Including dynamic components: coastal upwelling

Including dynamic components: coastal upwelling

Summary

Comparison with other approaches we’ve learned about

Multivariate models in `mgcv`

Multivariate models in `mgcv`

How do we implement these in `mgcv`?

Summarizing these approaches in `mgcv`

Transitioning to DGAMs in `mvgam`

Simulated data example with `mvgam`

Harbor seal example with `mvgam`