## 2.7 Seasonal effect plus other explanatory variables*

With our four data points, we are limited to estimating four parameters. Let’s use the full 21 data points so we can estimate some more complex models. We’ll add an owner variable and a quarter variable to the stackloss dataset.

data(stackloss, package = "datasets")
fulldat = stackloss
n = nrow(fulldat)
fulldat = cbind(fulldat, owner = rep(c("sue", "aneesh", "joe"),
n)[1:n], qtr = paste("qtr", rep(1:4, n)[1:n], sep = ""),
reg = rep(c("n", "s"), n)[1:n])

Let’s fit a model where there is only an effect of air flow, but that effect varies by owner and by quarter. We also want a different intercept for each quarter. So if datapoint $$i$$ is from quarter $$j$$ on a plant owned by owner $$k$$, the model is $$$\tag{2.23} stack.loss_i = \alpha_j + \beta_{j,k} air_i + e_i$$$ So there there are $$4 \times 3$$ $$\beta$$’s (4 quarters and 3 owners) and 4 $$\alpha$$’s (4 quarters).

With lm(), we fit the model as:

fit7 = lm(stack.loss ~ -1 + qtr + Air.Flow:qtr:owner, data = fulldat)

Take a look at $$\mathbf{Z}$$ for Form 1 using model.matrix(Z). It’s not shown since it is large:

model.matrix(fit7)

The $$\mathbf{x}$$ will be $$$\begin{bmatrix}\alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \alpha_4 \\ \beta_{1,a} \\ \beta_{2,a} \\ \beta_{3,a} \\ \dots \end{bmatrix}$$$

Take a look at the model matrix that lm() is using and make sure you understand how $$\mathbf{Z}\mathbf{x}$$ produces Equation (2.23).

Z = model.matrix(fit7)