Solutions Chapter 7

Here are the answers for the homework problems based on the material in Multivariate state-space models.

For these questions, use the harborSealWA data set in MARSS. The data are already logged, but you will need to remove the year column and have time going across the columns not down the rows.

require(MARSS)
data(harborSealWA, package="MARSS")
dat <- t(harborSealWA[,2:6])
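
A quick check of the dimensions confirms that sites are now on the rows and years run across the columns:

dim(dat)  # rows = sites, columns = years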

The sites are the Strait of Juan de Fuca (SJF 3), San Juan Islands (SJI 4), Eastern Bays (EBays 5), Puget Sound (PSnd 6), and Hood Canal (HC 7).

Problem 1

Plot the harbor seal data.

library(ggplot2)
temp <- as.data.frame(MARSS::harborSealWA)
pdat <- reshape2::melt(temp, id.vars="Year", variable.name = "region")
p <- ggplot(pdat, aes(x=Year, y=value, col=region)) + geom_point() + geom_line() + ggtitle("Puget Sound Harbor Seal Surveys")
p + facet_wrap(~region)

Problem 2

Fit a panmictic population model that assumes that each of the 5 sites is observing one “Inland WA” harbor seal population with trend \(u\). Assume the observation errors are independent and identical, i.e. one variance on the diagonal and 0s on the off-diagonals. [Note: the problem incorrectly stated that this is the default assumption for MARSS(); the default R is actually “diagonal and unequal”.]

  1. Write the \(\mathbf{Z}\) for this model.

\[\mathbf{Z}=\begin{bmatrix}1\\1\\1\\1\\1\end{bmatrix}\]

  2. Write the \(\mathbf{Z}\) matrix in R using Z=matrix(...) and using the factor short-cut for specifying \(\mathbf{Z}\): Z=factor(c(...)).

    Z = matrix(1, nrow=5)
    Z = factor(rep("inland",5))
  3. Fit the model using MARSS(). What is the estimated trend (\(u\))? How fast was the population increasing (percent per year) based on this estimated \(u\)? Note, the problem incorrectly stated that the default R is diagonal and equal. The default is diagonal and unequal. If you used the default, you were not counted off.

    fit <- MARSS(dat, model=list(Z=Z, R="diagonal and equal"))
    ## Success! abstol and log-log tests passed at 28 iterations.
    ## Alert: conv.test.slope.tol is 0.5.
    ## Test with smaller values (<0.1) to ensure convergence.
    ## 
    ## MARSS fit is
    ## Estimation method: kem 
    ## Convergence test: conv.test.slope.tol = 0.5, abstol = 0.001
    ## Estimation converged in 28 iterations. 
    ## Log-likelihood: 3.593276 
    ## AIC: 8.813447   AICc: 11.13603   
    ##  
    ##         Estimate
    ## A.SJI    0.80153
    ## A.EBays  0.28245
    ## A.PSnd  -0.54802
    ## A.HC    -0.62665
    ## R.diag   0.04523
    ## U.U      0.04759
    ## Q.Q      0.00429
    ## x0.x0    6.39199
    ## Initial states (x0) defined at t=0
    ## 
    ## Standard errors have not been calculated. 
    ## Use MARSSparamCIs to compute CIs and bias estimates.

    The estimate of \(u\) is

    coef(fit)$U
    ##         [,1]
    ## U 0.04759193

    which translates to an approximately 4.8% increase per year.
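
    Since the data are logged, \(u\) is a growth rate on the log scale; the exact percent change per year on the natural scale is \(100(e^u - 1)\), which for small \(u\) is close to \(100u\). A quick sketch of the conversion:

    u <- coef(fit)$U
    100 * (exp(u) - 1)  # exact percent change per year on the natural scale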

  4. Compute the confidence intervals for the parameter estimates. Compare the intervals using the Hessian approximation and using a parametric bootstrap. What differences do you see between the two approaches?

    Confidence intervals using the Hessian approximation:

    library(broom)
    cis <- tidy(fit)
    knitr::kable(cis)
    | term    |   estimate | std.error |   conf.low |  conf.high |
    |:--------|-----------:|----------:|-----------:|-----------:|
    | A.SJI   |  0.8015251 | 0.0708946 |  0.6625742 |  0.9404759 |
    | A.EBays |  0.2824513 | 0.0720063 |  0.1413216 |  0.4235810 |
    | A.PSnd  | -0.5480205 | 0.0855807 | -0.7157556 | -0.3802854 |
    | A.HC    | -0.6266450 | 0.0930656 | -0.8090502 | -0.4442398 |
    | R.diag  |  0.0452344 | 0.0083136 |  0.0289401 |  0.0615286 |
    | U.U     |  0.0475919 | 0.0154184 |  0.0173723 |  0.0778115 |
    | Q.Q     |  0.0042858 | 0.0038868 | -0.0033321 |  0.0119038 |
    | x0.x0   |  6.3919925 | 0.1211866 |  6.1544712 |  6.6295139 |

    versus the parametric bootstrap:

    # set nboot low so it doesn't take forever
    cis <- tidy(fit, method="parametric",nboot=100)
    knitr::kable(cis)
    | term    |   estimate | std.error |   conf.low |  conf.high |
    |:--------|-----------:|----------:|-----------:|-----------:|
    | A.SJI   |  0.8015251 | 0.0648010 |  0.6763245 |  0.9126757 |
    | A.EBays |  0.2824513 | 0.0788090 |  0.1375351 |  0.4472942 |
    | A.PSnd  | -0.5480205 | 0.0904659 | -0.7404060 | -0.3973756 |
    | A.HC    | -0.6266450 | 0.0857850 | -0.7878254 | -0.4693604 |
    | R.diag  |  0.0452344 | 0.0070010 |  0.0311017 |  0.0591216 |
    | U.U     |  0.0475919 | 0.0164993 |  0.0186280 |  0.0833294 |
    | Q.Q     |  0.0042858 | 0.0038973 |  0.0000000 |  0.0109481 |
    | x0.x0   |  6.3919925 | 0.2235052 |  5.9257591 |  6.7564686 |

    The parametric bootstrap uses just 100 simulations, so the CIs will be somewhat variable, but they are quite similar to the Hessian approximation CIs, except for the variance CIs, in particular the CI for \(\mathbf{Q}\). The Hessian method uses a Normal approximation, which is symmetric, but variances are strictly positive; we see that the lower CI for \(\mathbf{Q}\) is negative under the Hessian approximation. The parametric bootstrap uses estimates of \(\mathbf{Q}\) from simulated data, so the estimate can never be negative, though it can be 0.
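
    To compare the two sets of intervals directly, we could stack the two tidy data frames (a sketch; the method column is ours):

    cis.h <- tidy(fit)                                  # Hessian-based CIs
    cis.b <- tidy(fit, method="parametric", nboot=100)  # parametric bootstrap CIs
    comp <- rbind(cbind(method="hessian", cis.h),
                  cbind(method="bootstrap", cis.b))
    knitr::kable(comp[order(comp$term), ])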

  5. What does an estimate of \(\mathbf{Q}=0\) mean? What would the estimated state (\(x\)) look like when \(\mathbf{Q}=0\)?

    \(\mathbf{Q}=0\) means that there is no process variance, i.e., the state process is deterministic. The estimated state would look like a straight line with slope \(u\).
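
    To see this, we could fix \(\mathbf{Q}\) at 0 by passing a numeric matrix and plot the estimated state (a sketch, using the Z and dat defined above):

    # fix Q at 0 rather than estimating it
    fit0 <- MARSS(dat, model=list(Z=Z, R="diagonal and equal", Q=matrix(0)), silent=TRUE)
    plot(fit0$states[1,], type="l", xlab="t", ylab="x")  # a straight line with slope u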

Problem 3

Using the same panmictic population model, compare 3 assumptions about the observation error structure.

  • The observation errors are independent with different variances.
  • The observation errors are independent with the same variance.
  • The observation errors are correlated with the same variance and same correlation.
  1. Write the \(\mathbf{R}\) variance-covariance matrices for each assumption.

    The observation errors are independent with different variances.

\[\mathbf{R}=\begin{bmatrix} r_1&0&0&0&0\\ 0&r_2&0&0&0\\ 0&0&r_3&0&0\\ 0&0&0&r_4&0\\ 0&0&0&0&r_5\end{bmatrix}\]

The observation errors are independent with the same variance.

\[\mathbf{R}=\begin{bmatrix} r&0&0&0&0\\ 0&r&0&0&0\\ 0&0&r&0&0\\ 0&0&0&r&0\\ 0&0&0&0&r\end{bmatrix}\]

The observation errors are correlated with the same variance and same correlation.

\[\mathbf{R}=\begin{bmatrix} r&c&c&c&c\\ c&r&c&c&c\\ c&c&r&c&c\\ c&c&c&r&c\\ c&c&c&c&r\end{bmatrix}\]

  2. Create each \(\mathbf{R}\) matrix in R.

    The observation errors are independent with different variances.

    R1 <- matrix(list(0),5,5)
    diag(R1) <- paste0("r",1:5)
    R1
    ##      [,1] [,2] [,3] [,4] [,5]
    ## [1,] "r1" 0    0    0    0   
    ## [2,] 0    "r2" 0    0    0   
    ## [3,] 0    0    "r3" 0    0   
    ## [4,] 0    0    0    "r4" 0   
    ## [5,] 0    0    0    0    "r5"

    The observation errors are independent with the same variance.

    R2 <- matrix(list(0),5,5)
    diag(R2) <- "r"
    R2
    ##      [,1] [,2] [,3] [,4] [,5]
    ## [1,] "r"  0    0    0    0   
    ## [2,] 0    "r"  0    0    0   
    ## [3,] 0    0    "r"  0    0   
    ## [4,] 0    0    0    "r"  0   
    ## [5,] 0    0    0    0    "r"

    The observation errors are correlated with the same variance and same correlation.

    R3 <- matrix("c",5,5)
    diag(R3) <- "r"
    R3
    ##      [,1] [,2] [,3] [,4] [,5]
    ## [1,] "r"  "c"  "c"  "c"  "c" 
    ## [2,] "c"  "r"  "c"  "c"  "c" 
    ## [3,] "c"  "c"  "r"  "c"  "c" 
    ## [4,] "c"  "c"  "c"  "r"  "c" 
    ## [5,] "c"  "c"  "c"  "c"  "r"
  3. Fit each model using MARSS() and compare the estimated \(u\) (the population long-term trend). Does the assumption about the errors change the \(u\) estimate?

    fit1 <- MARSS(dat, model=list(Z=Z, R=R1), silent=TRUE)
    fit2 <- MARSS(dat, model=list(Z=Z, R=R2), silent=TRUE)
    fit3 <- MARSS(dat, model=list(Z=Z, R=R3), silent=TRUE)
    c(coef(fit1)$U, coef(fit2)$U, coef(fit3)$U)
    ## [1] 0.05269531 0.04759193 0.04633784

    Yes, the more structure we allow in the \(\mathbf{R}\) matrix, the lower the \(u\) estimate.
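
    Though the problem only asks about \(u\), a quick AICc comparison would show which error structure the data support (a sketch):

    c(R1=fit1$AICc, R2=fit2$AICc, R3=fit3$AICc)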

  4. Plot the state residuals, the ACF of the state residuals, and the histogram of the state residuals for each fit. Are there any issues that you see?

    The plots of the residuals show a trend and look autocorrelated.

    resids1 <- residuals(fit1)$state.residuals[1,]
    resids2 <- residuals(fit2)$state.residuals[1,]
    resids3 <- residuals(fit3)$state.residuals[1,]
    par(mfrow=c(3,1), mar=c(1,1,3,1))
    plot(resids1, main="resids1")
    plot(resids2, main="resids2")
    plot(resids3, main="resids3")

    The ACF plots also show the autocorrelation.

    par(mfrow=c(3,1), mar=c(1,1,3,1))
    acf(resids1, na.action=na.pass)
    acf(resids2, na.action=na.pass)
    acf(resids3, na.action=na.pass)

    However, you might notice that the first 5 state residuals have the same value. This is because 1979-1982 (the 2nd-5th observations) are all NAs. Maybe we should remove those, since they inflate the apparent autocorrelation. With these removed, the residuals do not show strong autocorrelation.

    resids1[2:5]=NA
    resids2[2:5]=NA
    resids3[2:5]=NA
    par(mfrow=c(3,1), mar=c(1,1,3,1))
    acf(resids1, na.action=na.pass)
    acf(resids2, na.action=na.pass)
    acf(resids3, na.action=na.pass)

    Histograms and Q-Q plots, using the residuals with points 2-5 set to NA: we don’t have very many points, but the histograms do not look wildly non-normal. For example, there aren’t 2 peaks or very obvious long tails on one side.

    par(mfrow=c(3,2), mar=c(1,1,3,1))
    hist(resids1)
    qqnorm(resids1); qqline(resids1)
    hist(resids2)
    qqnorm(resids2); qqline(resids2)
    hist(resids3)
    qqnorm(resids3); qqline(resids3)

Problem 4

Fit a model with 3 subpopulations. 1=SJF,SJI; 2=PS,EBays; 3=HC. The \(x\) part of the model is the population structure. Assume that the observation errors are identical and independent (R="diagonal and equal"). Assume that the process errors are unique and independent (Q="diagonal and unequal").

  1. Write the \(\mathbf{x}\) equation. Make sure each matrix in the equation has the right number of rows and columns.

    First look at dat to see which area is on which row.

    dat[,1:2]
    ##           [,1] [,2]
    ## SJF   6.033086   NA
    ## SJI   6.747587   NA
    ## EBays 6.626718   NA
    ## PSnd  5.820083   NA
    ## HC    6.595781   NA

    The problem doesn’t state what assumption to use for \(u\). Let’s assume that the \(u\) are different in each subpopulation.

\[\begin{bmatrix} x_1\\ x_2\\ x_3 \end{bmatrix}_t = \begin{bmatrix} 1&0&0\\ 0&1&0\\ 0&0&1 \end{bmatrix}\begin{bmatrix} x_1\\ x_2\\ x_3 \end{bmatrix}_{t-1} + \begin{bmatrix} u_1\\ u_2\\ u_3 \end{bmatrix} + \begin{bmatrix} w_1\\ w_2\\ w_3 \end{bmatrix}_t \\ \begin{bmatrix} w_1\\ w_2\\ w_3 \end{bmatrix}_t \sim \text{MVN}\left( 0, \begin{bmatrix} q_1&0&0\\ 0&q_2&0\\ 0&0&q_3 \end{bmatrix}\right) \]

  2. Write the \(\mathbf{Z}\) matrix.

\[ \mathbf{Z}= \begin{bmatrix} 1&0&0\\ 1&0&0\\ 0&1&0\\ 0&1&0\\ 0&0&1 \end{bmatrix} \]

  3. Write the \(\mathbf{Z}\) in R using Z=matrix(...) and using the factor shortcut Z=factor(c(...)).

    Z4 = matrix(c(1,1,0,0,0,0,0,1,1,0,0,0,0,0,1), 5,3)
    Z4 = factor(c("1","1","2","2","3"))
  4. Fit the model with MARSS().

    fit4 <- MARSS(dat, model=list(Z=Z4, R="diagonal and equal", Q="diagonal and unequal", U="unequal"))
    ## Success! abstol and log-log tests passed at 28 iterations.
    ## Alert: conv.test.slope.tol is 0.5.
    ## Test with smaller values (<0.1) to ensure convergence.
    ## 
    ## MARSS fit is
    ## Estimation method: kem 
    ## Convergence test: conv.test.slope.tol = 0.5, abstol = 0.001
    ## Estimation converged in 28 iterations. 
    ## Log-likelihood: 25.91343 
    ## AIC: -27.82687   AICc: -22.44756   
    ##  
    ##         Estimate
    ## A.SJI    0.79874
    ## A.PSnd  -0.78673
    ## R.diag   0.01280
    ## U.1      0.06968
    ## U.2      0.04698
    ## U.3     -0.00341
    ## Q.(1,1)  0.01347
    ## Q.(2,2)  0.00565
    ## Q.(3,3)  0.04423
    ## x0.1     5.95427
    ## x0.2     6.63400
    ## x0.3     6.60982
    ## Initial states (x0) defined at t=0
    ## 
    ## Standard errors have not been calculated. 
    ## Use MARSSparamCIs to compute CIs and bias estimates.
  5. What do the estimated \(u\) and \(\mathbf{Q}\) imply about the population dynamics in the 3 subpopulations?

    The population growth rate \(u\) is higher the closer the subpopulation is to the coast. The Hood Canal region is steady or declining, in contrast to the other areas, and its process variance in \(\mathbf{Q}\) is much higher in addition to its trend being much lower.
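
    Plotting the three estimated states makes the contrast visible (a sketch; fit4$states is the 3 x T matrix of state estimates):

    matplot(t(fit4$states), type="l", lty=1, xlab="t", ylab="log abundance")
    legend("bottomright", legend=c("SJF/SJI", "EBays/PSnd", "HC"), col=1:3, lty=1)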

Problem 5

Repeat the fit from Question 4 but assume that the 3 subpopulations covary. Use Q="unconstrained".

fit5 <- MARSS(dat, model=list(Z=Z4, R="diagonal and equal", Q="unconstrained", U="unequal"))
## Success! abstol and log-log tests passed at 175 iterations.
## Alert: conv.test.slope.tol is 0.5.
## Test with smaller values (<0.1) to ensure convergence.
## 
## MARSS fit is
## Estimation method: kem 
## Convergence test: conv.test.slope.tol = 0.5, abstol = 0.001
## Estimation converged in 175 iterations. 
## Log-likelihood: 34.21733 
## AIC: -38.43467   AICc: -29.70739   
##  
##          Estimate
## A.SJI    0.798477
## A.PSnd  -0.783120
## R.diag   0.010854
## U.1      0.070912
## U.2      0.045583
## U.3     -0.002933
## Q.(1,1)  0.014337
## Q.(2,1)  0.008976
## Q.(3,1)  0.000756
## Q.(2,2)  0.007298
## Q.(3,2)  0.009569
## Q.(3,3)  0.049706
## x0.1     5.924523
## x0.2     6.607151
## x0.3     6.606868
## Initial states (x0) defined at t=0
## 
## Standard errors have not been calculated. 
## Use MARSSparamCIs to compute CIs and bias estimates.
  1. What does the estimated \(\mathbf{Q}\) matrix tell you about how the 3 subpopulations covary?

    Qest <- coef(fit5, type="matrix")$Q
    Qest
    ##             [,1]        [,2]        [,3]
    ## [1,] 0.014336886 0.008976162 0.000756386
    ## [2,] 0.008976162 0.007297645 0.009568725
    ## [3,] 0.000756386 0.009568725 0.049705534

    We can compute the correlation matrix as follows.

    Dinv <- diag(1/sqrt(diag(Qest)))
    Dinv%*%Qest%*%Dinv
    ##            [,1]      [,2]       [,3]
    ## [1,] 1.00000000 0.8775501 0.02833438
    ## [2,] 0.87755011 1.0000000 0.50241237
    ## [3,] 0.02833438 0.5024124 1.00000000
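
    Equivalently, base R's cov2cor() converts a covariance matrix to a correlation matrix:

    cov2cor(Qest)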

    Hood Canal appears to be nearly independent of the subpopulation closest to the coast, while somewhat correlated with the subpopulation in the inner Puget Sound. The inner Puget Sound subpopulation and the subpopulation closest to the coast are highly temporally correlated.

  2. Compare the AICc from the model in Question 4 and the one with Q="unconstrained". Which is more supported?

    c(fit4=fit4$AICc, fit5=fit5$AICc)
    ##      fit4      fit5 
    ## -22.44756 -29.70739

    The model allowing for temporal correlation is much more supported; \(\Delta AICc = 7.26\).

  3. Fit the model with Q="equalvarcov". Is this more supported based on AICc?

    fit5c <- MARSS(dat, model=list(Z=Z4, R="diagonal and equal",  Q="equalvarcov", U="unequal"))
    ## Success! abstol and log-log tests passed at 256 iterations.
    ## Alert: conv.test.slope.tol is 0.5.
    ## Test with smaller values (<0.1) to ensure convergence.
    ## 
    ## MARSS fit is
    ## Estimation method: kem 
    ## Convergence test: conv.test.slope.tol = 0.5, abstol = 0.001
    ## Estimation converged in 256 iterations. 
    ## Log-likelihood: 31.94428 
    ## AIC: -41.88856   AICc: -37.41399   
    ##  
    ##           Estimate
    ## A.SJI      0.79849
    ## A.PSnd    -0.78746
    ## R.diag     0.01708
    ## U.1        0.07474
    ## U.2        0.04487
    ## U.3       -0.00924
    ## Q.diag     0.00790
    ## Q.offdiag  0.00789
    ## x0.1       5.93729
    ## x0.2       6.61216
    ## x0.3       6.62690
    ## Initial states (x0) defined at t=0
    ## 
    ## Standard errors have not been calculated. 
    ## Use MARSSparamCIs to compute CIs and bias estimates.

    The AICc is much lower (-37.41), indicating higher data support. This model has many fewer parameters but still fits the data well.
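
    We can count the estimated parameters to see the savings (a sketch; coef(..., type="vector") returns the vector of estimated parameters):

    c(unconstrained=length(coef(fit5, type="vector")),
      equalvarcov=length(coef(fit5c, type="vector")))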

Problem 6

Develop the following alternative models for the structure of the inland harbor seal population. For each model assume that the observation errors are identical and independent (R="diagonal and equal"). Assume that the process errors covary with equal variance and covariances (Q="equalvarcov").

  • model 1: 5 independent subpopulations with unique trends \(u\).
  • model 2: 5 independent subpopulations with the same trend \(u\).
  • model 3: 5 independent subpopulations with the same trend in 3 regions: SJF+SJI, PS+EBays, HC.
  • model 4: 1 panmictic population (the model from question 2).
  • model 5: 3 subpopulations. 1=SJF,SJI; 2=PS,EBays; 3=HC (the model from question 4).
  • model 6: 2 subpopulations. 1=SJF,SJI,PS,EBays; 2=HC.
  1. Fit each model using MARSS().

    For model 1, there are a number of ways you could specify Z: diag(1,5), "identity", or factor(1:5); or you could leave Z off since this is the default. Similarly for U there are a number of equivalent specifications: "unequal", matrix(paste0("u",1:5), ncol=1), or leave U off since this is the default.

    Q="equalvarcov"
    R="diagonal and equal"
    mod1 <- list(Z=diag(1,5), U="unequal", Q=Q, R=R)
    fit1 <- MARSS(dat, model=mod1, silent=TRUE)
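
    As a quick check, an equivalent specification should give an identical fit (a sketch using the "identity" shortcut for Z):

    fit1b <- MARSS(dat, model=list(Z="identity", U="unequal", Q=Q, R=R), silent=TRUE)
    all.equal(fit1$AICc, fit1b$AICc)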

    For model 2, Z is the same as in model 1. For U there are a number of equivalent specifications: "equal" or matrix("u",5, 1).

    mod2 <- list(Z=diag(1,5), U="equal", Q=Q, R=R)
    fit2 <- MARSS(dat, model=mod2, silent=TRUE)

    For model 3, Z is the same as in model 1. The \(u\) are shared across some of the subpopulations.

    u3 <- matrix(c("u1","u1","u2","u2","u3"),5,1)
    mod3 <- list(Z=diag(1,5), U=u3, Q=Q, R=R)
    fit3 <- MARSS(dat, model=mod3, silent=TRUE)

    Model 4 has a Z with one column. We do not need to specify U since it is \(1\times 1\).

    mod4 <- list(Z=matrix(1,5,1), Q=Q, R=R)
    fit4 <- MARSS(dat, model=mod4, silent=TRUE)

    Model 5 has three subpopulations (3 \(x\)’s) and it’s easiest to specify using factor. The question doesn’t specify what to assume for \(u\). Let’s assume \(u\) is different for each subpopulation.

    mod5 <- list(Z=factor(c(1,1,2,2,3)), Q=Q, R=R, U="unequal")
    fit5 <- MARSS(dat, model=mod5, silent=TRUE)

    Model 6 has two subpopulations. The question doesn’t specify what to assume for \(u\). Let’s assume \(u\) is different for each subpopulation.

    mod6 <- list(Z=factor(c(1,1,1,1,2)), Q=Q, R=R, U="unequal")
    fit6 <- MARSS(dat, model=mod6, silent=TRUE)
  2. Prepare a table of each model with a column for the AICc values and a column for \(\Delta AICc\) (AICc minus the lowest AICc in the group). What is the most supported model?

    models <- paste("model", 1:6)
    aicc <- c(fit1$AICc, fit2$AICc, fit3$AICc, fit4$AICc, fit5$AICc, fit6$AICc)
    restab <- data.frame(model=models, AICc=aicc, Delta.AICc=aicc-min(aicc))
    knitr::kable(restab)
    | model   |      AICc | Delta.AICc |
    |:--------|----------:|-----------:|
    | model 1 | -34.41694 |  2.9970495 |
    | model 2 | -12.02347 | 25.3905166 |
    | model 3 | -37.28112 |  0.1328697 |
    | model 4 |  11.13603 | 48.5500167 |
    | model 5 | -37.41399 |  0.0000000 |
    | model 6 | -18.29203 | 19.1219595 |

    Models 3 and 5 have essentially equivalent support (a sketch quantifying this follows the list below). These models are very similar: each allows a different trend in three areas, the 2 regions closest to the coast, the 2 regions more interior to Puget Sound, and Hood Canal. Note, if you assumed that U="equal" for model 5, you would get a different result.

    • model 3: 5 independent subpopulations with the same trend in 3 regions: SJF+SJI, PS+EBays, HC.
    • model 5: 3 subpopulations. 1=SJF,SJI, 2=PS,EBays, 3=HC
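
    One way to quantify "equivalent support" is with Akaike weights, computed from the restab data frame built above (a sketch):

    w <- exp(-0.5 * restab$Delta.AICc)
    restab$AICc.weight <- round(w / sum(w), 3)
    restab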

Problem 7

Do diagnostics on the model residuals for the 3 subpopulation model from question 4. Put NAs in the model residuals where there are missing data. Then do the tests on each row of resids.

This is model 5 from Problem 6, which we denoted mod5 and whose fit we denoted fit5. We compute the model residuals using:

resids <- residuals(fit5)$model.residuals
resids[is.na(dat)] <- NA
  1. Plot the model residuals.

    I’ll use ggplot and thus need to rearrange the residuals into a melted dataframe.

    rs <- t(resids)
    rs <- reshape2::melt(rs, varnames=c("t","region"), value.name = "resid")
    p <- ggplot(rs, aes(x=t, y=resid, col=region)) + 
      geom_point() + 
      ggtitle("Model Residuals")
    p + facet_wrap(~region)

  2. Plot the ACF of the model residuals. Use acf(..., na.action=na.pass).

    We could just do a series of acf calls for each row of resids:

    acf(resids[1,], na.action=na.pass)

    If we want to show all the ACF plots, using ggplot, we need to write some code:

    TT <- dim(resids)[2]
    CL <- 1.96/sqrt(TT) #approx critical level
    lacf <- apply(resids, 1, acf, plot=FALSE, na.action=na.pass)
    lacf <- lapply(lacf, function(x){data.frame(lag=x$lag, acf=x$acf)})
    pdat <- reshape2::melt(lacf, id="lag", value.name = "acf")
    p <- ggplot(data = pdat, mapping = aes(x = lag, y = acf)) +
       geom_hline(aes(yintercept = 0)) +
       geom_segment(mapping = aes(xend = lag, yend = 0)) + 
       ggtitle("Model Residuals ACFs") +
       geom_hline(yintercept = CL, color="blue") +
       geom_hline(yintercept = -1*CL, color="blue")
    p + facet_wrap(~L1)

    Note, the forecast package has a function that will plot the ACF and PACF of residuals. We can only plot one row of the residuals at a time.

    forecast::ggtsdisplay(resids[1,],lag.max=30)

  3. Plot the histogram of the model residuals.

    We could do a series of hist calls for each row of resids:

    hist(resids[1,])

    Or write code to make all the histograms with ggplot:

    rs <- t(resids)
    rs <- reshape2::melt(rs, varnames=c("t","region"), value.name = "resid")
    p <- ggplot(rs, aes(resid)) + 
      geom_histogram(bins=5) + 
      ggtitle("Model Residuals")
    p + facet_wrap(~region)

  4. Fit an ARIMA model to each row of your model residuals using forecast::auto.arima(). Are the best-fit models what you want? Note, we cannot use the Augmented Dickey-Fuller or KPSS tests when there are missing values in our residuals time series.

    for(i in 1:5){
      fit <- forecast::auto.arima(resids[i,])
      cat("\n","Residual",i,"\n")
      print(fit)
    }
    ## 
    ##  Residual 1 
    ## Series: resids[i, ] 
    ## ARIMA(0,0,0) with zero mean 
    ## 
    ## sigma^2 estimated as 0.01659:  log likelihood=9.55
    ## AIC=-17.09   AICc=-16.89   BIC=-16
    ## 
    ##  Residual 2 
    ## Series: resids[i, ] 
    ## ARIMA(0,0,0) with zero mean 
    ## 
    ## sigma^2 estimated as 0.008113:  log likelihood=15.98
    ## AIC=-29.96   AICc=-29.76   BIC=-28.87
    ## 
    ##  Residual 3 
    ## Series: resids[i, ] 
    ## ARIMA(0,0,0) with zero mean 
    ## 
    ## sigma^2 estimated as 0.003456:  log likelihood=21.86
    ## AIC=-41.72   AICc=-41.52   BIC=-40.63
    ## 
    ##  Residual 4 
    ## Series: resids[i, ] 
    ## ARIMA(0,0,0) with zero mean 
    ## 
    ## sigma^2 estimated as 0.005105:  log likelihood=8.26
    ## AIC=-14.51   AICc=-14.31   BIC=-13.42
    ## 
    ##  Residual 5 
    ## Series: resids[i, ] 
    ## ARIMA(0,0,0) with zero mean 
    ## 
    ## sigma^2 estimated as 0.01364:  log likelihood=1.78
    ## AIC=-1.56   AICc=-1.36   BIC=-0.47

    Yes, all the best-fit models are ARIMA(0,0,0) with zero mean, i.e., white noise, which is what we want for model residuals.