4.10 Problems

We have seen how to do a variety of introductory time series analyses with R. Now it is your turn to apply the information you learned here and in lecture to complete some analyses. You have been asked by a colleague to help analyze some time series data she collected as part of an experiment on the effects of light and nutrients on the population dynamics of phytoplankton. Specifically, after controlling for differences in light and temperature, she wants to know if the natural log of population density can be modeled with some form of ARMA(\(p,q\)) model.

The data are expressed as the number of cells per milliliter recorded every hour for one week beginning at 8:00 AM on December 1, 2014. You can load the data using

data(hourlyphyto, package = "atsalibrary")
phyto_dat <- hourlyphyto

Use the information above to do the following:

  1. Convert phyto_dat, which is a data.frame object, into a ts object. This bit of code might be useful to get you started:
## what day of 2014 is Dec 1st?
date_begin <- as.Date("2014-12-01")
## subtracting Dates returns a difftime, so convert it to a plain number
day_of_year <- as.numeric(date_begin - as.Date("2014-01-01") + 1)
  2. Plot the time series of phytoplankton density and provide a brief description of any notable features.

  3. Although you do not have the actual measurements for the specific temperature and light regimes used in the experiment, you have been informed that they follow a regular light/dark period with accompanying warm/cool temperatures. Thus, estimating a fixed seasonal effect is justifiable. Also, the instrumentation is precise enough to preclude any systematic change in measurements over time (i.e., you can assume \(m_t = 0\) for all \(t\)). Obtain the time series of the estimated log-density of phytoplankton absent any hourly effects caused by variation in temperature or light. (Hint: You will need to do some decomposition.)

  4. Use diagnostic tools to identify the possible order(s) of the ARMA model(s) that most likely describe the log of population density for this particular experiment. Note that at this point you should be focusing your analysis on the results obtained in Question 3.

  5. Use some form of search to identify what form of ARMA(\(p,q\)) model best describes the log of population density for this particular experiment. Use what you learned in Question 4 to inform possible orders of \(p\) and \(q\). (Hint: if you use auto.arima(), include the additional argument seasonal = FALSE.)

  6. Write out the best model in the form of Equation (4.24) using the underscore notation to refer to subscripts (e.g., write x_t for \(x_t\)). You can round any parameters/coefficients to the nearest hundredth. (Hint: if the mean of the time series is not zero, refer to Eqn 1.27 in the lab handout.)
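
If it helps to see the general shape of this workflow before starting, here is a minimal sketch on simulated data. It is not a solution to the problems above: the series is made up (a constant mean, a fixed 24-hour cycle, and AR(1) noise), all object names are hypothetical, and the start time passed to ts() is arbitrary. It assumes only base R, and it uses a small AIC grid search with arima() as one possible form of search.

```r
## Illustrative sketch on SIMULATED data; this is not the hourlyphyto data.
## All object names here (sim_ts, seas_eff, etc.) are hypothetical.
set.seed(123)

n_days <- 7
n_obs <- 24 * n_days

## simulate a log-density-like series: mean + fixed 24-h cycle + AR(1) noise
hour_of_day <- rep(0:23, times = n_days)
ar_noise <- as.numeric(arima.sim(model = list(ar = 0.6), n = n_obs))
sim_dat <- 5 + sin(2 * pi * hour_of_day / 24) + ar_noise

## hourly observations with a daily cycle imply frequency = 24;
## start = c(1, 1) is arbitrary here (day 1, first hour of the day)
sim_ts <- ts(sim_dat, start = c(1, 1), frequency = 24)

## with m_t = 0, estimate the fixed seasonal effect as the mean for each
## hour of the day, centered so the effects sum to zero
hr_idx <- as.integer(cycle(sim_ts))
hr_means <- tapply(as.numeric(sim_ts), hr_idx, mean)
seas_eff <- as.numeric(hr_means - mean(hr_means))

## subtract the seasonal effect; the overall mean is retained
sim_deseas <- sim_ts - seas_eff[hr_idx]

## diagnostics for candidate orders (set plot = TRUE to view the correlograms)
acf_est <- acf(sim_deseas, plot = FALSE)
pacf_est <- pacf(sim_deseas, plot = FALSE)

## small grid search over ARMA(p,q) orders, compared by AIC
orders <- expand.grid(p = 0:2, q = 0:2)
orders$aic <- NA_real_
for (i in seq_len(nrow(orders))) {
  fit <- try(arima(sim_deseas, order = c(orders$p[i], 0, orders$q[i])),
             silent = TRUE)
  if (!inherits(fit, "try-error")) orders$aic[i] <- fit$aic
}
best_order <- orders[which.min(orders$aic), ]
best_order
```

Because the deseasoned series keeps its overall mean, a fitted model will generally include a nonzero mean term. Note that arima() reports this estimated mean under the label "intercept", which is worth keeping in mind when writing out a model by hand.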