5.6 Regression and Matrix Notation

Now that we have reviewed the assumptions of OLS, let’s return to the linear regression model and translate it into matrix form.

5.6.1 An Intercept-Only Model

First, let’s take a simpler form of the model: an intercept-only model, where \[ y_i = \beta_0 1_{i} + \epsilon_i.\] Note that we have made the “silent” 1 explicit. This will become important later (e.g., when fitting growth models). It is worthwhile to look at a regression model without predictors to understand what it can tell us about the nature of the intercept (or constant).

With no predictors in the model, what is \(\beta_0\)?

Here, \(\beta_0\) is the mean of the response variable, as we can show with a little algebra:

\[\mathbb{E}(y_i)=\mathbb{E}(\beta_0 1_{i} + \epsilon_i)=\beta_0 1_{i} +\mathbb{E}( \epsilon_i)=\beta_0,\] where \(\mathbb{E}( \epsilon_i)=0\) (Assumption 1) and \(1_i = 1\) for every individual.
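As a quick numerical check (using simulated data for illustration), fitting an intercept-only model in R recovers the sample mean of the response:

```r
# Intercept-only model: the fitted intercept equals the sample mean of y.
set.seed(1)
y <- rnorm(100, mean = 5, sd = 2)  # simulated response
fit <- lm(y ~ 1)                   # "~ 1" requests only the constant
coef(fit)[["(Intercept)"]]         # same value as mean(y)
mean(y)
```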

5.6.2 Intercept-Only Model in Matrix Form

Translating into matrix form, the \(y_i\) can be collected into an \(N \times 1\) matrix (a column vector). More specifically, for \(i = 1\) to \(N\) individuals, \[ y_i = \left[ \begin{array}{c} y_1 \\ y_2 \\ \vdots \\ y_N \end{array} \right] = \boldsymbol{Y}.\]

(Remember, matrices are often designated by bold capital letters.)

Doing the same for all the other parts of the model, we get

\[ \left[ \begin{array}{c} y_1 \\ y_2 \\ \vdots \\ y_N \end{array} \right] = \left[ \begin{array}{c} 1 \\ 1 \\ \vdots \\ 1 \end{array} \right] [\beta_0] + \left[ \begin{array}{c} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_N \end{array} \right]\]

Note that we have taken care that each matrix is of an order that allows for matrix multiplication; that is, the matrices are conformable.
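These pieces can be built directly in R. A small sketch (with arbitrary illustrative values) confirms that the product of the \(N \times 1\) column of 1s and the \(1 \times 1\) coefficient matrix is conformable and yields an \(N \times 1\) result:

```r
# Intercept-only model assembled piece by piece as matrices.
N <- 5
ones    <- matrix(1, nrow = N, ncol = 1)      # the N x 1 column of "silent" 1s
beta0   <- matrix(3, nrow = 1, ncol = 1)      # 1 x 1 coefficient (arbitrary value)
epsilon <- matrix(rnorm(N), nrow = N)         # N x 1 errors
Y <- ones %*% beta0 + epsilon                 # (N x 1)(1 x 1) + (N x 1) = N x 1
dim(Y)
```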

5.6.3 Simple Regression in Matrix Form

Now, let’s expand our regression model by adding a predictor \(x_{1i}\). Our model becomes

\[ y_i = \beta_0 + \beta_1x_{1i} + \epsilon_i \]

Written out explicitly in matrix form, the model is
\[ \left[ \begin{array}{c} y_1 \\ y_2 \\ \vdots \\ y_N \end{array} \right] = \left[ \begin{array}{cc} 1 & x_{11}\\ 1 & x_{12}\\ \vdots & \vdots \\ 1 & x_{1N}\end{array} \right] \left[ \begin{array}{c}\beta_0\\ \beta_1\end{array}\right] + \left[ \begin{array}{c} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_N \end{array} \right]\]
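In R, `model.matrix()` constructs exactly this design matrix, including the leading column of 1s. A brief sketch with simulated data:

```r
# The design matrix X for simple regression, built by model.matrix().
set.seed(2)
x1 <- runif(10)
y  <- 2 + 3 * x1 + rnorm(10, sd = 0.1)  # simulated outcome
X  <- model.matrix(~ x1)                # N x 2: a column of 1s and the predictor
head(X, 3)
dim(X)
```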

5.6.4 Multiple Regression in Matrix Form

Finally, extending the model to the general case with \(q\) predictor variables, we have \[ y_i = \beta_0 + \beta_1x_{1i} + \beta_2x_{2i} + \ldots + \beta_qx_{qi} + \epsilon_i \]

which is written out in matrix form as

\[ \underbrace{\left[ \begin{array}{c} y_1 \\ y_2 \\ \vdots \\ y_N \end{array} \right]}_{N \times 1} = \underbrace{\left[ \begin{array}{cccc} 1 & x_{11} & \ldots & x_{q1}\\ 1 & x_{12} & \ldots & x_{q2}\\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{1N} & \ldots & x_{qN}\end{array} \right]}_{N \times (q + 1)} \underbrace{\left[ \begin{array}{c}\beta_0\\ \beta_1\\ \vdots\\ \beta_q\end{array}\right]}_{(q+1) \times 1} + \underbrace{\left[ \begin{array}{c} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_N \end{array} \right]}_{N \times 1}\]

Where we have the following elements:

\[ \boldsymbol{Y} = \left[ \begin{array}{c} y_1 \\ y_2 \\ \vdots \\ y_N \end{array} \right] ;\boldsymbol{X} = \left[ \begin{array}{cccc} 1 & x_{11} & \ldots & x_{q1}\\ 1 & x_{12} & \ldots & x_{q2}\\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{1N} & \ldots & x_{qN}\end{array} \right]; \boldsymbol{\beta} = \left[ \begin{array}{c}\beta_0\\ \beta_1\\ \vdots\\ \beta_q\end{array}\right]; \boldsymbol{\epsilon} = \left[ \begin{array}{c} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_N \end{array} \right] \]

Observe the order of the matrices/vectors. On the right-hand side, we matrix-multiply an \(N \times (q+1)\) matrix by a \((q+1) \times 1\) vector. This yields an \(N \times 1\) vector, to which another \(N \times 1\) vector, \(\boldsymbol{\epsilon}\), is added; the result equals our outcome vector \(\boldsymbol{Y}\), which is also \(N \times 1\).
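A small sketch in R (with hypothetical sizes \(N = 4\) and \(q = 2\)) confirms this dimension bookkeeping:

```r
# Dimension check: X is N x (q + 1), beta is (q + 1) x 1, so X beta is N x 1.
N <- 4; q <- 2
X    <- cbind(1, matrix(rnorm(N * q), N, q))  # N x (q + 1), leading column of 1s
beta <- matrix(rnorm(q + 1), ncol = 1)        # (q + 1) x 1 (arbitrary values)
dim(X %*% beta)                               # N x 1, matching Y and epsilon
```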

When we implement this model in R, it will be important to know which portions of the model live in our data frame, \(y_i\) and \(x_{1}, \ldots, x_{q}\), and to have them structured properly. This will become clear in the examples below.

Now that we have the model written out explicitly as matrices, we can easily simplify the notation.

In compact matrix notation, the regression model then can be written as

\[ \boldsymbol{Y} = \boldsymbol{X}\boldsymbol{\beta} + \boldsymbol{\epsilon} \]
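To tie the compact notation back to computation, here is a sketch in R with simulated data and \(q = 2\): we build \(\boldsymbol{X}\) and \(\boldsymbol{\beta}\), generate \(\boldsymbol{Y}\), and then recover the coefficients with the OLS solution \((\boldsymbol{X}^{\top}\boldsymbol{X})^{-1}\boldsymbol{X}^{\top}\boldsymbol{Y}\) (stated here without derivation), checking it against `lm()`:

```r
# Compact form Y = X beta + epsilon, with q = 2 simulated predictors.
set.seed(3)
N <- 50
X <- cbind(1, rnorm(N), rnorm(N))               # N x (q + 1) design matrix
beta <- c(1, 0.5, -0.25)                        # (q + 1) "true" coefficients
Y <- X %*% beta + rnorm(N, sd = 0.2)            # N x 1 outcome vector
beta_hat <- solve(t(X) %*% X) %*% (t(X) %*% Y)  # OLS: (X'X)^{-1} X'Y
cbind(drop(beta_hat), coef(lm(drop(Y) ~ 0 + X)))  # matrix result matches lm()
```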