4.2 Decomposition of time series
Plotting time series data is an important first step in analyzing their various components. Beyond that, however, we need a more formal means for identifying and removing characteristics such as a trend or seasonal variation. As discussed in lecture, the decomposition model reduces a time series into 3 components: trend, seasonal effects, and random errors. In turn, we aim to model the random errors as some form of stationary process.
Let’s begin with a simple, additive decomposition model for a time series \(x_t\)
\[\begin{equation} \tag{4.1} x_t = m_t + s_t + e_t, \end{equation}\]
where, at time \(t\), \(m_t\) is the trend, \(s_t\) is the seasonal effect, and \(e_t\) is a random error that we generally assume to have zero-mean and to be correlated over time. Thus, by estimating and subtracting both \(\{m_t\}\) and \(\{s_t\}\) from \(\{x_t\}\), we hope to have a time series of stationary residuals \(\{e_t\}\).
4.2.1 Estimating trends
In lecture we discussed how linear filters are a common way to estimate trends in time series. One of the most common linear filters is the moving average, which for time lags from \(-a\) to \(a\) is defined as
\[\begin{equation} \tag{4.2} \hat{m}_t = \sum_{k=-a}^{a} \left(\frac{1}{1+2a}\right) x_{t+k}. \end{equation}\]
This model works well for moving windows of odd-numbered lengths, but should be adjusted for even-numbered lengths by adding only \(\frac{1}{2}\) of the 2 most extreme lags so that the filtered value at time \(t\) lines up with the original observation at time \(t\). So, for example, in a case with monthly data such as the atmospheric CO\(_2\) concentration where a 12-point moving average would be an obvious choice, the linear filter would be
\[\begin{equation} \tag{4.3} \hat{m}_t = \frac{\frac{1}{2}x_{t-6} + x_{t-5} + \dots + x_{t-1} + x_t + x_{t+1} + \dots + x_{t+5} + \frac{1}{2}x_{t+6}}{12} \end{equation}\]
It is important to note here that our time series of the estimated trend \(\{\hat{m}_t\}\) is actually shorter than the observed time series by \(2a\) units.
Conveniently, R has the built-in function filter()
in the stats package for estimating moving-average (and other) linear filters. In addition to specifying the time series to be filtered, we need to pass in the filter weights (and 2 other arguments we won’t worry about here–type ?filter
to get more information). The easiest way to create the filter is with the rep()
function:
## weights for moving avg
<- c(1/2, rep(1, times = 11), 1/2)/12 fltr
Now let’s get our estimate of the trend \(\{\hat{m}\}\) with filter()
} and plot it:
## estimate of trend
<- stats::filter(co2, filter = fltr, method = "convo",
co2_trend sides = 2)
## plot the trend
plot.ts(co2_trend, ylab = "Trend", cex = 1)
The trend is a more-or-less smoothly increasing function over time, the average slope of which does indeed appear to be increasing over time as well (Figure 4.3).
4.2.2 Estimating seasonal effects
Once we have an estimate of the trend for time \(t\) (\(\hat{m}_t\)) we can easily obtain an estimate of the seasonal effect at time \(t\) (\(\hat{s}_t\)) by subtraction
\[\begin{equation} \tag{4.4} \hat{s}_t = x_t - \hat{m}_t, \end{equation}\]
which is really easy to do in R:
## seasonal effect over time
<- co2 - co2_trend co2_seas
This estimate of the seasonal effect for each time \(t\) also contains the random error \(e_t\), however, which can be seen by plotting the time series and careful comparison of Equations (4.1) and (4.4).
## plot the monthly seasonal effects
plot.ts(co2_seas, ylab = "Seasonal effect", xlab = "Month", cex = 1)
We can obtain the overall seasonal effect by averaging the estimates of \(\{\hat{s}_t\}\) for each month and repeating this sequence over all years.
## length of ts
<- length(co2_seas)
ll ## frequency (ie, 12)
<- frequency(co2_seas)
ff ## number of periods (years); %/% is integer division
<- ll%/%ff
periods ## index of cumulative month
<- seq(1, ll, by = ff) - 1
index ## get mean by month
<- numeric(ff)
mm for (i in 1:ff) {
<- mean(co2_seas[index + i], na.rm = TRUE)
mm[i]
}## subtract mean to make overall mean = 0
<- mm - mean(mm) mm
Before we create the entire time series of seasonal effects, let’s plot them for each month to see what is happening within a year:
## plot the monthly seasonal effects
plot.ts(mm, ylab = "Seasonal effect", xlab = "Month", cex = 1)
It looks like, on average, that the CO\(_2\) concentration is highest in spring (March) and lowest in summer (August) (Figure 4.5). (Aside: Do you know why this is?)
Finally, let’s create the entire time series of seasonal effects \(\{\hat{s}_t\}\):
## create ts object for season
<- ts(rep(mm, periods + 1)[seq(ll)], start = start(co2_seas),
co2_seas_ts frequency = ff)
4.2.3 Completing the model
The last step in completing our full decomposition model is obtaining the random errors \(\{\hat{e}_t\}\), which we can get via simple subtraction
\[\begin{equation} \tag{4.5} \hat{e}_t = x_t - \hat{m}_t - \hat{s}_t. \end{equation}\]
Again, this is really easy in R:
## random errors over time
<- co2 - co2_trend - co2_seas_ts co2_err
Now that we have all 3 of our model components, let’s plot them together with the observed data \(\{x_t\}\). The results are shown in Figure 4.6.
## plot the obs ts, trend & seasonal effect
plot(cbind(co2, co2_trend, co2_seas_ts, co2_err), main = "",
yax.flip = TRUE)
4.2.4 Using decompose()
for decomposition
Now that we have seen how to estimate and plot the various components of a classical decomposition model in a piecewise manner, let’s see how to do this in one step in R with the function decompose()
, which accepts a ts object as input and returns an object of class decomposed.ts.
## decomposition of CO2 data
<- decompose(co2) co2_decomp
co2_decomp
is a list with the following elements, which should be familiar by now:
x
: the observed time series \(\{x_t\}\)seasonal
: time series of estimated seasonal component \(\{\hat{s}_t\}\)figure
: mean seasonal effect (length(figure) == frequency(x)
)trend
: time series of estimated trend \(\{\hat{m}_t\}\)random
: time series of random errors \(\{\hat{e}_t\}\)type
: type of error ("additive"
or"multiplicative"
)
We can easily make plots of the output and compare them to those in Figure 4.6:
## plot the obs ts, trend & seasonal effect
plot(co2_decomp, yax.flip = TRUE)
The results obtained with decompose()
(Figure 4.7) are identical to those we estimated previously.
Another nice feature of the decompose()
function is that it can be used for decomposition models with multiplicative (i.e., non-additive) errors (e.g., if the original time series had a seasonal amplitude that increased with time). To do, so pass in the argument type = "multiplicative"
, which is set to type = "additive"
by default.