5.7 Solving the Regression Equation
In practice, we would like to solve for the unknown coefficients in \(\boldsymbol{\beta}\).
Assuming the model is correct, the expected value of \(\boldsymbol{\epsilon}\) is \(\mathbf{0}\); in expectation, the model therefore reduces to \[ \boldsymbol{Y} = \boldsymbol{X}\boldsymbol{\beta}\] We then just need to solve for \(\boldsymbol{\beta}\), drawing on some of the matrix operations we discussed earlier.
5.7.1 Matrix Multiplication and Transpose
Our goal is to isolate \(\boldsymbol{\beta}\). One initial idea might be to multiply each side of the equation by \(\mathbf{X}^{-1}\) in an attempt to remove \(\mathbf{X}\) from the right-hand side and isolate \(\boldsymbol{\beta}\). Why won’t this work? \(\mathbf{X}\) is an \(n \times p\) data matrix; it is generally not square, so \(\mathbf{X}^{-1}\) does not exist.
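We can see this failure directly. Below is a minimal sketch in Python with numpy (the matrix `X` is an illustrative toy design matrix, not from the text): asking for the inverse of a non-square matrix raises an error.

```python
import numpy as np

# Hypothetical design matrix: n = 5 observations, p = 2 columns
# (an intercept column plus one predictor), so X is 5 x 2 -- not square.
X = np.column_stack([np.ones(5), np.arange(5, dtype=float)])

try:
    np.linalg.inv(X)  # the matrix inverse is only defined for square matrices
except np.linalg.LinAlgError as err:
    print("inverse undefined:", err)
```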
Instead, let’s pre-multiply each side of the equation by \(\boldsymbol{X'}\). This would give us
\[ \boldsymbol{X'}\boldsymbol{Y} = \boldsymbol{X'}\boldsymbol{X}\boldsymbol{\beta} \]
This gets us a quantity, \(\left(\boldsymbol{X'}\boldsymbol{X}\right)\): a square matrix whose entries are the sums of squares and cross-products of the columns of \(\boldsymbol{X}\), and which therefore contains information about the relations among the \(\mathbf{x}\)s.
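A quick numerical sketch (again with an illustrative toy `X`) shows that \(\boldsymbol{X'}\boldsymbol{X}\) is always \(p \times p\), no matter how many rows \(\boldsymbol{X}\) has:

```python
import numpy as np

# Toy design matrix: n = 4 rows, p = 2 columns (intercept + one predictor).
X = np.column_stack([np.ones(4), np.array([1.0, 2.0, 3.0, 4.0])])

XtX = X.T @ X          # X'X: sums of squares and cross-products of the columns
print(XtX)
print(XtX.shape)       # (2, 2) -- square, even though X is 4 x 2
```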
5.7.2 Matrix Inverse
Now, since \(\boldsymbol{X'}\boldsymbol{X}\) is a square matrix and presumably has an inverse (i.e., there is no perfect collinearity among the \(\mathbf{x}\)s), we can pre-multiply both sides by \(\left(\boldsymbol{X'}\boldsymbol{X}\right)^{-1}\) to obtain
\[ \left(\boldsymbol{X'}\boldsymbol{X}\right)^{-1} \left( \boldsymbol{X'}\boldsymbol{Y}\right) = \left(\boldsymbol{X'}\boldsymbol{X}\right)^{-1} (\boldsymbol{X'}\boldsymbol{X}) \boldsymbol{\beta} \] Recalling that a matrix multiplied by its inverse equals the identity matrix, \((\boldsymbol{X'}\boldsymbol{X})^{-1} (\boldsymbol{X'}\boldsymbol{X})=\mathbf{I}\), the equation simplifies to
\[ \left(\boldsymbol{X'}\boldsymbol{X}\right)^{-1}\left(\boldsymbol{X'}\boldsymbol{Y}\right) = \boldsymbol{I}\boldsymbol{\beta} \]
or more succinctly
\[ \left(\boldsymbol{X'}\boldsymbol{X}\right)^{-1}\left(\boldsymbol{X'}\boldsymbol{Y}\right) = \boldsymbol{\beta} \] We’ve now isolated the unknowns, \(\boldsymbol{\beta}\), on one side of the equation and shown how matrix algebra yields the regression coefficients. Quite literally, this algebra is what allows the parameters to be estimated when fitting a regression model to data.
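The whole derivation can be sketched numerically. Under the assumption made above that \(\boldsymbol{\epsilon} = \mathbf{0}\), computing \(\left(\boldsymbol{X'}\boldsymbol{X}\right)^{-1}\boldsymbol{X'}\boldsymbol{Y}\) recovers the coefficients exactly (the data here are simulated for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20

# Simulated design matrix: intercept column plus one random predictor.
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([2.0, -0.5])   # "true" coefficients, chosen for illustration
Y = X @ beta                   # Y = X beta, with epsilon = 0 as in the derivation

# beta = (X'X)^{-1} (X'Y)
beta_hat = np.linalg.inv(X.T @ X) @ (X.T @ Y)
print(beta_hat)                # recovers [2.0, -0.5] up to floating-point error
```

In practice, software solves the linear system (e.g., `np.linalg.solve(X.T @ X, X.T @ Y)` or a QR decomposition) rather than forming the inverse explicitly, which is more numerically stable; the algebra, however, is exactly the one derived above.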
We will now work through some practical examples, staying aware that this kind of matrix algebra is being done in the background.