Characteristics of time series (ts)
What is a ts?
Classifying ts
Trends
Seasonality (periodicity)
28 March 2023
\[ \{ x_1,x_2,x_3,\dots,x_n \} \]
\[ \{ 10,31,27,42,53,15 \} \]
Interval across real time; \(x(t)\)
Discrete time; \(x_t\)
Discrete (eg, total # of fish caught per trawl)
Continuous (eg, salinity, temperature)
Univariate/scalar (eg, total # of fish caught)
Multivariate/vector (eg, # of each spp of fish caught)
Integer (eg, # of fish in 5 min trawl = 2413)
Rational (eg, fraction of unclipped fish = 47/951)
Real (eg, fish mass = 10.2 g)
Complex (eg, \(\cos(2 \pi \cdot 2.43) + i \sin(2 \pi \cdot 2.43)\))
Univariate \((x_t)\)
Multivariate \(\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}_t\)
Time series objects have a special designation in R: ts
ts(data, start, end, frequency )
data should be a vector (univariate) or a data frame or matrix (multivariate)
start and end give the first and last time indices
For monthly series, specify them as c(year, month)
frequency is the number of observations per unit time
For annual series, frequency = 1
For monthly series, frequency = 12
Alternatively, the sampling rate can be given with deltat instead of frequency (specify only one of the two)
ts(data, start, end, deltat)
deltat is the fraction of the sampling period between successive observations
For annual series, deltat = 1
For monthly series, deltat = 1/12
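As a minimal sketch of how frequency and deltat relate, the example below builds the same monthly series both ways and compares the resulting time indices. The data are simulated and the object names (x_freq, x_delt) are hypothetical.

```r
## Hypothetical monthly series: 24 simulated values starting in January 1991
set.seed(1)
x <- rnorm(24)

## The same series specified two ways
x_freq <- ts(x, start = c(1991, 1), frequency = 12)
x_delt <- ts(x, start = c(1991, 1), deltat = 1/12)

## Observations per unit time (here, per year)
frequency(x_freq)
## Fraction of a year between successive observations
deltat(x_delt)
## Both objects carry the same time index
all.equal(time(x_freq), time(x_delt))
```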
set.seed(507)
## annual data
dat_1 <- rnorm(30)
dat_yr <- ts(dat_1, start = 1991, end = 2020, frequency = 1)
## monthly data
dat_2 <- rnorm(30*12)
dat_mo <- ts(dat_2, start = c(1991, 1), end = c(2020, 12), frequency = 12)
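The data argument can also be a matrix, which gives a multivariate series. The sketch below assumes simulated annual counts of two hypothetical species (sp_A, sp_B) and shows how to inspect and subset the resulting object with window(), start(), and end().

```r
## Hypothetical annual counts of two species over 10 years (simulated)
set.seed(42)
counts <- cbind(sp_A = rpois(10, 20), sp_B = rpois(10, 5))

## A matrix input yields a multivariate time series
dat_mv <- ts(counts, start = 2011, frequency = 1)
class(dat_mv)

## Extract the years 2015 through 2018
dat_sub <- window(dat_mv, start = 2015, end = 2018)
start(dat_sub)
end(dat_sub)
```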
There is a designated function for plotting ts objects: plot.ts()
plot.ts(ts_object)
We can specify some additional arguments to plot.ts
plot.ts(dat_yr, ylab = expression(italic(x[t])), las = 1, col = "blue", lwd = 2)
Most statistical analyses are concerned with estimating properties of a population from a sample
For example, we use fish caught in a seine to infer the mean size of fish in a lake
Time series analysis, however, presents a different situation: we typically have only a single observation at each point in time
For example, one can’t observe today’s closing price of Microsoft stock more than once
Thus, conventional statistical procedures, based on large sample estimates, are inappropriate
A time series model for \(\{x_t\}\) is a specification of the joint distributions of a sequence of random variables \(\{X_t\}\), of which \(\{x_t\}\) is thought to be a realization
White noise: \(x_t \sim N(0,1)\)
Random walk: \(x_t = x_{t-1} + w_t,~\text{with}~w_t \sim N(0,1)\)
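Both models are easy to simulate, which can help build intuition for what their realizations look like. The sketch below assumes a series length of 100 and a random walk that starts from \(x_0 = 0\); neither choice comes from the slides.

```r
set.seed(123)
n <- 100   # assumed series length

## White noise: independent draws from N(0, 1)
w <- rnorm(n)

## Random walk: x_t = x_{t-1} + w_t, assuming x_0 = 0,
## so x_t is the cumulative sum of the white noise
x <- cumsum(w)

## Differencing the random walk recovers the white noise
all.equal(diff(x), w[-1])
```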
\(x_t = m_t + s_t + e_t\), where \(m_t\) is the trend, \(s_t\) is the seasonal effect, and \(e_t\) is the remainder
We need a way to extract the so-called signal from the noise
One common method is via “linear filters”
Linear filters can be thought of as “smoothing” the data
Linear filters typically take the form
\[ \hat{m}_t = \sum_{i=-\infty}^{\infty} \lambda_i x_{t+i} \]
For example, a moving average
\[ \hat{m}_t = \sum_{i=-a}^{a} \frac{1}{2a + 1} x_{t+i} \]
If \(a = 1\), then
\[ \hat{m}_t = \frac{1}{3}(x_{t-1} + x_t + x_{t+1}) \]
As \(a\) increases, the estimated trend becomes smoother
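One way to compute such a moving average in R is stats::filter() with equal weights. The sketch below applies it to a simulated random walk (an assumption made for illustration) with \(a = 1\) and \(a = 4\); the helper name ma_trend is hypothetical.

```r
set.seed(2023)
x <- ts(cumsum(rnorm(50)))   # simulated series (assumption)

## Centered moving average with weights 1 / (2a + 1)
ma_trend <- function(x, a) {
  lambda <- rep(1 / (2 * a + 1), 2 * a + 1)
  stats::filter(x, filter = lambda, method = "convolution", sides = 2)
}

m_hat_1 <- ma_trend(x, a = 1)   # 3-point window
m_hat_4 <- ma_trend(x, a = 4)   # 9-point window, smoother

## With a = 1, the estimate at t = 2 is the mean of x_1, x_2, x_3
all.equal(as.numeric(m_hat_1[2]), mean(x[1:3]))

## The filter cannot be computed for the first and last a time points
sum(is.na(m_hat_1))
sum(is.na(m_hat_4))
```

The stats:: prefix guards against other packages that also export a function named filter().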
Once we have an estimate of the trend \(\hat{m}_t\), we can estimate \(\hat{s}_t\) simply by subtraction:
\[ \hat{s}_t = x_t - \hat{m}_t \]
Seasonal effect (\(\hat{s}_t\)), assuming \(\lambda = 1/9\)
But, \(\hat{s}_t\) really includes the remainder \(e_t\) as well
\[ \begin{align} \hat{s}_t &= x_t - \hat{m}_t \\ (s_t + e_t) &= x_t - m_t \end{align} \]
So we need to estimate the mean seasonal effect as
\[ \begin{align} \hat{s}_{Jan} &= \frac{1}{(N/12)} \sum \{s_1, s_{13}, s_{25}, \dots \} \\ \hat{s}_{Feb} &= \frac{1}{(N/12)} \sum \{s_2, s_{14}, s_{26}, \dots \} \\ &~~\vdots \\ \hat{s}_{Dec} &= \frac{1}{(N/12)} \sum \{s_{12}, s_{24}, s_{36}, \dots \} \end{align} \]
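The monthly averaging can be sketched as follows. The series is simulated (linear trend plus a sinusoidal season plus noise, all assumptions), the trend uses equal weights of \(\lambda = 1/9\), and the object names are hypothetical.

```r
set.seed(99)
n_yrs <- 10
tt <- 1:(12 * n_yrs)

## Simulated monthly series: trend + season + noise (assumption)
x <- ts(0.05 * tt + 2 * sin(2 * pi * tt / 12) + rnorm(length(tt), sd = 0.5),
        start = c(2001, 1), frequency = 12)

## Trend estimate from a 9-point moving average (lambda = 1/9)
m_hat <- stats::filter(x, filter = rep(1/9, 9), sides = 2)

## Seasonal effect plus remainder, by subtraction
s_plus_e <- x - m_hat

## Mean seasonal effect for each calendar month; na.rm = TRUE skips
## the end points where the moving average is undefined, so those
## months average over slightly fewer than N/12 values
s_hat_month <- tapply(as.numeric(s_plus_e), as.integer(cycle(x)),
                      mean, na.rm = TRUE)

## Repeat the 12 monthly means to cover the whole series
s_hat <- ts(rep(s_hat_month, n_yrs), start = start(x), frequency = 12)
```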
Now we can estimate \(e_t\) via subtraction:
\[ \hat{e}_t = x_t - \hat{m}_t - \hat{s}_t \]
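Base R's decompose() carries out an additive decomposition of this kind in one call. The sketch below applies it to a simulated monthly series (an assumption) and checks that the remainder equals the data minus the estimated trend and seasonal effect. Note that decompose() uses a centered moving average with half weights at the two ends of the window and centers the seasonal effects around zero, so its estimates will differ somewhat from those of a simple equal-weight filter.

```r
set.seed(7)
tt <- 1:96

## Simulated monthly series: trend + season + noise (assumption)
x <- ts(10 + 0.1 * tt + 3 * cos(2 * pi * tt / 12) + rnorm(96),
        start = c(2010, 1), frequency = 12)

## Additive decomposition into trend, seasonal, and random parts
dec <- decompose(x, type = "additive")

## Remainder via subtraction: e_t = x_t - m_t - s_t
e_hat <- x - dec$trend - dec$seasonal

## Matches the remainder returned by decompose()
all.equal(as.numeric(e_hat), as.numeric(dec$random))
```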
Log-transform data
Linear trend