12.4 Non-Gaussian observation errors

12.4.1 Poisson observation errors

So far we have used the following observation model \(y_t \sim \,\text{N}(x_t, r)\). We can change this to a Poisson observation error model: \(Y_t \sim \text{Pois}(\lambda_t)\) where \(E[Y_t] = \lambda_t\). \(\text{log}(\lambda_t) = x_t\) where \(x_t\) is our process model.

All we need to change to allow Poisson errors is to change the Y[t] part to

log(EY[t]) <- X[i]
Y[t] ~ dpois(EY[t])

We also need to ensure that our data are integers and we remove the r part from our model code since the Poisson does not have that.

Our univariate state-space code with Poisson observation errors is the following:

# SS MODEL with Poisson errors

model.loc <- ("ss_model_pois.txt")
jagsscript <- cat("
model {  
   # priors on parameters
   u ~ dnorm(0, 0.01); 
   inv.q ~ dgamma(0.001,0.001); 
   q <- 1/inv.q;
   X0 ~ dnorm(0, 0.001);
   
   # likelihood
   X[1] ~ dnorm(X0 + u, inv.q);
   log(EY[1]) <- X[1]
   Y[1] ~ dpois(EY[1])
   for(t in 2:N) {
      X[t] ~ dnorm(X[t-1] + u, inv.q);
      log(EY[t]) <- X[t]
      Y[t] ~ dpois(EY[t]); 
   }
}  
", 
    file = model.loc)

We will fit this to the wild dogs data in the MARSS package.

data(wilddogs, package = "MARSS")
jags.data <- list(Y = wilddogs[, 2], N = nrow(wilddogs))
jags.params <- c("q", "EY", "u")
mod_ss <- jags(jags.data, parameters.to.save = jags.params, model.file = model.loc, 
    n.chains = 3, n.burnin = 5000, n.thin = 1, n.iter = 10000, 
    DIC = TRUE)

When we use this univariate state-space model with population data, like the wild dogs, we would log the data\(^\dagger\), and our \(y_t\) in our JAGS code is really \(log(y_t)\). In that case, \(E[log(Y_t)]) = f(x_t)\). So there is a log-link that we are not really explicit about when we pass in the log of our data. In the Poisson model, that log relationship is explicit, aka we specify \(log(E[Y_t]) = x_t\) and we pass in the raw count data not the log of the data.

\(\dagger\) Why would we typically log population data in this case? Because we would typically think of population processes as multiplicative. Population size at time \(t\) is growth times population size at time \(t-1\). By logging the data, we convert to an additive process. Log population size at time \(t\) is log growth plus log population size at time \(t-1\).

12.4.2 Negative binomial observation errors

In the Poisson distribution, the mean and variance are the same. Using the negative binomial distribution, we can relax that assumption and allow the mean and variance to be different. The negative binomial distribution has two parameters, \(r\) and \(p\). \(r\) is the dispersion parameter. As \(r \rightarrow \infty\), the distribution becomes the Poisson distribution and when \(r\) is small, the distribution is overdispersed (higher variance) relative to the Poisson. In practice, \(r > 30\) is going to be very close to the Poisson. \(p\) is the success parameter, \(p = r/(r+E[Y_t])\). As for the Poisson, \(log E[Y_{t}] = x_t\)—for the univariate state-space model in this example with one state, \(z=1\) and \(a=0\).

To allow negative binomial errors we change the Y[t] part to

log(EY[t]) <- X[t]
p[t] <- r/(r + EY[t])
Y[t] ~ dnegbin(p[t], r)

Now that we have \(r\) again in the model, we will need to put a prior on it. \(r\) is positive and 50 is close to infinity. The following is a sufficiently vague prior.

r ~ dunif(0,50)

Our univariate state-space code with negative binomial observation errors is the following:

# SS MODEL with negative binomial errors

model.loc <- ("ss_model_negbin.txt")
jagsscript <- cat("
model {  
   # priors on parameters
   u ~ dnorm(0, 0.01); 
   inv.q ~ dgamma(0.001,0.001); 
   q <- 1/inv.q;
   r ~ dunif(0,50);
   X0 ~ dnorm(0, 0.001);
   
   # likelihood
   X[1] ~ dnorm(X0 + u, inv.q);
   log(EY[1]) <- X[1]
   p[1] <- r/(r + EY[1])
   Y[1] ~ dnegbin(p[1], r)
   for(t in 2:N) {
      X[t] ~ dnorm(X[t-1] + u, inv.q);
      log(EY[t]) <- X[t]
      p[t] <- r/(r + EY[t])
      Y[t] ~ dnegbin(p[t], r)
   }
}  
", 
    file = model.loc)

We will fit this to the wild dogs data in the MARSS package.

data(wilddogs, package = "MARSS")
jags.data <- list(Y = wilddogs[, 2], N = nrow(wilddogs))
jags.params <- c("q", "EY", "u", "r")
mod_ss <- jags(jags.data, parameters.to.save = jags.params, model.file = model.loc, 
    n.chains = 3, n.burnin = 5000, n.thin = 1, n.iter = 10000, 
    DIC = TRUE)