2.6 Seasonal effect as a factor
Let’s imagine that the data were taken consecutively in time by quarter. We want to model the seasonal effect as an intercept change. We will drop all other effects for now. If the data were collected in quarter 1, the model is \[\begin{equation} \tag{2.19} stack.loss_i = \alpha_1 + e_i, \text{ where } e_i \sim \text{N}(0,\sigma^2) \end{equation}\] If collected in quarter 2, the model is \[\begin{equation} \tag{2.20} stack.loss_i = \alpha_2 + e_i, \text{ where } e_i \sim \text{N}(0,\sigma^2) \end{equation}\] etc.
We add a column to our dataframe to account for season:
= cbind(dat, qtr = paste(rep("qtr", 4), 1:4, sep = ""))
dat dat
Air.Flow Water.Temp Acid.Conc. stack.loss reg owner qtr
1 80 27 89 42 n s qtr1
2 80 27 88 37 s a qtr2
3 75 25 90 37 n s qtr3
4 62 24 87 28 s a qtr4
And we can easily fit this model with lm()
.
coef(lm(stack.loss ~ -1 + qtr, data = dat))
qtrqtr1 qtrqtr2 qtrqtr3 qtrqtr4
42 37 37 28
The -1 is added to the lm()
call to get rid of \(\alpha\). We just want the \(\alpha_1\), \(\alpha_2\), etc. intercepts coming from our quarters.
For comparison look at
coef(lm(stack.loss ~ qtr, data = dat))
(Intercept) qtrqtr2 qtrqtr3 qtrqtr4
42 -5 -5 -14
Why does it look like that when -1 is missing from the lm()
call? Where did the intercept for quarter 1 go and why are the other intercepts so much smaller?
2.6.1 Seasonal intercepts written in Form 1
Remembering that lm()
puts models in Form 1, look at the \(\mathbf{Z}\) matrix for Form 1:
= lm(stack.loss ~ -1 + qtr, data = dat)
fit4 = model.matrix(fit4)
Z 1:4, ] Z[
qtrqtr1 qtrqtr2 qtrqtr3 qtrqtr4
1 1 0 0 0
2 0 1 0 0
3 0 0 1 0
4 0 0 0 1
Written in Form 1, this model is \[\begin{equation} \tag{2.21} \begin{bmatrix}stack.loss_1\\ stack.loss_2\\ stack.loss_3\\ stack.loss_4\end{bmatrix} = \begin{bmatrix}1&0&0&0\\ 0&1&0&0 \\ 0&0&1&0\\ 0&0&0&1\end{bmatrix} \begin{bmatrix}\alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \alpha_4 \end{bmatrix} + \begin{bmatrix}e_1\\e_2\\e_3\\e_4\end{bmatrix}=\mathbf{Z}\mathbf{x}+\mathbf{e} \end{equation}\]
Compare to the model that lm()
is using when the intercept included. What does this model look like written in matrix form?
= lm(stack.loss ~ qtr, data = dat)
fit5 = model.matrix(fit5)
Z 1:4, ] Z[
(Intercept) qtrqtr2 qtrqtr3 qtrqtr4
1 1 0 0 0
2 1 1 0 0
3 1 0 1 0
4 1 0 0 1
2.6.2 Seasonal intercepts written in Form 2
We do not need to add 1s and 0s to our \(\mathbf{Z}\) matrix in Form 2; we just add subscripts to our intercepts matrix like we did when we had north-south intercepts. In this model, we do not have any explanatory variables so \(\mathbf{Z}\mathbf{x}\) does not appear. \[\begin{equation} \tag{2.22} \begin{bmatrix}stack.loss_1\\ stack.loss_2\\ stack.loss_3\\ stack.loss_4\end{bmatrix} = \begin{bmatrix} \alpha_1\\ \alpha_2\\ \alpha_3\\ \alpha_4 \end{bmatrix} + \begin{bmatrix}e_1\\e_2\\e_3\\e_4\end{bmatrix}=\mathbf{a}+\mathbf{e} \end{equation}\]