Lecture 33: The General Linear Model
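Suppose that $\vec{\epsilon}$ is a random vector with $n$ mean zero
components, that $B$ is a rank $k$ matrix of real numbers with $n$ rows
and $k$ columns, and $\vec{A} \in R^k$. We wish to consider the random
vector $\vec{Y}$ defined by
$$\vec{Y} = B\vec{A} + \vec{\epsilon}. \qquad (1)$$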
The model described by (1) is called the General Linear Model.
We regard $B$ as known, and we would like to both estimate $\vec{A}$ and test
hypotheses about $\vec{A}$ in this model.
This model arises in many contexts, many of which are linked by the common
thread that the matrix $B$ is input to a system whose output under optimal
conditions would be $B\vec{A}$. However, our measurements of the output,
$Y_i$, are contaminated by random errors, $\epsilon_i$.
For example, the time $t$ it takes for an object to fall $d$ units is assumed to
follow a law of the form
$$t = A\sqrt{d}.$$
Objects are dropped from various heights, $d_1, \dots, d_n$, and the time
measurements are contaminated by random errors
$\epsilon_1, \dots, \epsilon_n$, so that the
time measurement, $T_i$, follows the law
$$T_i = A\sqrt{d_i} + \epsilon_i.$$
Another example. The temperature at the location $(x,y)$ in the plane is
assumed to satisfy the quadratic relation
$$T = Ax^2 + 2Bxy + Cy^2 + Dx + Ey + F.$$
The measurement at $(x_i,y_i)$ is contaminated by the random error
$\epsilon_i$, so we have
$$T_i = Ax_i^2 + 2Bx_iy_i + Cy_i^2 + Dx_i + Ey_i + F + \epsilon_i.$$
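We predicate our search for a good estimator $\hat{A}$ of $\vec{A}$ on the
requirement that the estimator should minimize the sum of squares
$$\sum_{i=1}^{n} (Y_i - (B\hat{A})_i)^2 = \|\vec{Y} - B\hat{A}\|^2.$$
However, this just says that we should choose $\hat{A}$ so that $B\hat{A}$ is
the projection of $\vec{Y}$ onto the span of the columns of $B$. Therefore,
we want the columns of $B$ to be perpendicular to $\vec{Y} - B\hat{A}$, that is
$$B^t(\vec{Y} - B\hat{A}) = \vec{0}.$$
This is equivalent to $B^tB\hat{A} = B^t\vec{Y}$. Since we assume that the
rank of $B$ is $k$, $B^tB$ is invertible, and we see that
$$\hat{A} = (B^tB)^{-1}B^t\vec{Y} \qquad (2)$$
minimizes the sum of squares. Let us now look at how the statistical
properties of $\vec{\epsilon}$ determine the statistical properties of
$\hat{A}$. First note that
$$\hat{A} = (B^tB)^{-1}B^t(B\vec{A} + \vec{\epsilon}) = \vec{A} + (B^tB)^{-1}B^t\vec{\epsilon}. \qquad (3)$$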
Since we assume that the components of $\vec{\epsilon}$ are mean zero, we see
Proposition 584
$\hat{A}$ is an unbiased estimate of $\vec{A}$ in (1).
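The least squares estimator and the identity behind its unbiasedness can be checked numerically. The sketch below assumes NumPy and uses hypothetical values for the matrix, the parameter vector and the errors. It solves the normal equations, confirms that the residual is perpendicular to the columns of the matrix, and computes the estimation error directly from the errors.

```python
import numpy as np

rng = np.random.default_rng(0)       # fixed seed so the sketch is repeatable

n, k = 8, 3
B = rng.normal(size=(n, k))          # hypothetical known matrix, rank k almost surely
A = np.array([2.0, -1.0, 0.5])       # hypothetical true parameter vector
eps = rng.normal(size=n)             # mean zero errors
Y = B @ A + eps                      # the model (1)

# Solve the normal equations B^t B A_hat = B^t Y rather than forming the inverse.
A_hat = np.linalg.solve(B.T @ B, B.T @ Y)

# The residual Y - B A_hat should be perpendicular to every column of B.
residual = Y - B @ A_hat
orthogonality = B.T @ residual

# The estimation error A_hat - A should equal (B^t B)^{-1} B^t eps.
estimation_error = np.linalg.solve(B.T @ B, B.T @ eps)
```

Using `np.linalg.solve` instead of an explicit inverse is a design choice for numerical stability. `np.linalg.lstsq(B, Y, rcond=None)` should return the same estimate.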
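Next, suppose that the components of $\vec{\epsilon}$ have finite variances.
Then we can define the covariance matrix of the errors, $\Sigma$, by
$$[\Sigma]_{i,j} = {\rm E}[\epsilon_i\epsilon_j]$$
and compute the covariance of $\hat{A}$ in terms of $\Sigma$ and $B$. By (3),
$$(\hat{A} - \vec{A})(\hat{A} - \vec{A})^t = (B^tB)^{-1}B^t\vec{\epsilon}\vec{\epsilon}^tB(B^tB)^{-1}$$
since $B^tB$ is symmetric! Therefore,
$${\rm Cov}(\hat{A}) = (B^tB)^{-1}B^t\Sigma B(B^tB)^{-1}.$$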
If, for example, the errors are all independent with the same variance,
then we achieve a further simplification:
Proposition 599
Suppose that the $\epsilon_i$ are uncorrelated with common
variance $\sigma^2$. Then $\Sigma = \sigma^2 I$ and
$${\rm Cov}(\hat{A}) = \sigma^2(B^tB)^{-1}.$$
A natural thing to do to improve an estimate is to take more observations.
Since the observations we take are reflected in the rows of $B$, let us now
assume that $S$ is a $p \times k$ matrix of rank $k$ and that $S$ determines $B$
in the sense that $B$ is composed of $n$ blocks, all equal to $S$
(so that $B$ now has $np$ rows).
Thus
$$B^t = \left[S^t,S^t,\dots,S^t\right],$$
and so
$$B^tB = nS^tS.$$
Suppose we regard $\vec{\epsilon}$ also as having $n$ blocks, each of length
$p$, so that
$$\vec{\epsilon}^t = \left[\vec{v_1}^t,\dots,\vec{v_n}^t\right].$$
We have that $S^tS$ is invertible and so, by (3),
$$\hat{A} = \vec{A} + (S^tS)^{-1}S^t\left(\frac{1}{n}\sum_{j=1}^{n}\vec{v_j}\right), \qquad (4)$$
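so that if the vectors $\vec{v_1}, \vec{v_2}, \dots$ are independent and
identically distributed then with probability 1,
$$\lim_{n\to\infty}\frac{1}{n}\sum_{j=1}^{n}\vec{v_j} = \vec{0},$$
so with probability 1, $\hat{A}$ converges to $\vec{A}$, just like the sample
mean $\bar{X}_n$ converges to the mean with probability 1 for a random sample.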
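The convergence can be illustrated by simulation. The sketch below is an illustration under assumptions, not a proof: it takes a hypothetical 3 x 2 block S and standard normal error blocks, then computes the norm of the estimation error from the average of the error blocks for increasing numbers of blocks.

```python
import numpy as np

rng = np.random.default_rng(3)           # fixed seed so the sketch is repeatable

S = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])               # hypothetical p x k block, p = 3, k = 2

# (S^t S)^{-1} S^t, computed by solving rather than inverting.
left_inverse = np.linalg.solve(S.T @ S, S.T)

def estimation_error(n):
    """Norm of A_hat - A when B consists of n stacked copies of S."""
    v = rng.normal(size=(n, S.shape[0]))  # rows are the i.i.d. error blocks
    return np.linalg.norm(left_inverse @ v.mean(axis=0))

# One simulated error per sample size; the values should tend to shrink as n grows.
errors = {n: estimation_error(n) for n in (10, 1000, 100000)}
```

Any single run is random, so the errors need not decrease monotonically. With standard normal errors their typical size should shrink roughly like one over the square root of n.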
If we know that the components of $\vec{\epsilon}$ are uncorrelated and have
common variance $\sigma^2$ then this block form of $B$ reduces the formula for
the covariance of $\hat{A}$ to
$${\rm Cov}(\hat{A}) = \frac{\sigma^2}{n}(S^tS)^{-1}$$
and we see that
$${\rm E}[((\hat{A})_1 - (\vec{A})_1)^2 + \cdots + ((\hat{A})_k - (\vec{A})_k)^2]
= \frac{\sigma^2}{n}{\rm tr}((S^tS)^{-1}).$$
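The block identities can be verified directly. This sketch stacks n copies of a hypothetical block S, checks that the Gram matrix of the stacked matrix is n times that of S, and compares the trace of the covariance with the closed form. The block, the number of copies and the variance are illustrative assumptions.

```python
import numpy as np

S = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])               # hypothetical p x k block, p = 3, k = 2
n = 5                                    # number of copies of S stacked into B
sigma2 = 2.0                             # hypothetical common error variance

B = np.vstack([S] * n)                   # B^t = [S^t, ..., S^t]

# B^t B should equal n S^t S.
gram_identity_holds = np.allclose(B.T @ B, n * (S.T @ S))

# Expected squared error: the trace of sigma^2 (B^t B)^{-1}.
expected_squared_error = np.trace(sigma2 * np.linalg.inv(B.T @ B))

# Closed form: (sigma^2 / n) tr((S^t S)^{-1}).
trace_formula = sigma2 / n * np.trace(np.linalg.inv(S.T @ S))
```

Here the Gram matrix of S is [[3, 3], [3, 5]], whose inverse has trace 4/3, so both quantities should equal 8/15.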
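Of particular interest is the error in our fit, that is, the quantity
$$\|\vec{Y} - B\hat{A}\|^2,$$
which is commonly called the Residual Sum of Squares. If we assume that
the components of $\vec{\epsilon}$ are a random sample of size $n$ from the
standard normal distribution, we can compute the probability distribution of
the residual sum of squares, RSS. First, observe that by (1) and (3),
$$\vec{Y} - B\hat{A} = B\vec{A} + \vec{\epsilon} - B\vec{A} - B(B^tB)^{-1}B^t\vec{\epsilon}
= (I - B(B^tB)^{-1}B^t)\vec{\epsilon}.$$
Despite its formidable appearance, the matrix $M = I - B(B^tB)^{-1}B^t$ has
a very simple structure.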
1. Since $B^tB$ is symmetric, so is $M$. Therefore there are matrices
${\cal O}$ and $D$ where $D$ is diagonal, ${\cal O}{\cal O}^t = I$
and $M = {\cal O}^tD{\cal O}$;
2. $M^2 = M$ by direct calculation.
3. $MB = [0]$, so each column of $B$ is an eigenvector of $M$ for the
eigenvalue 0. Since $B$ has rank $k$, there are at least $k$ 0's on the
diagonal of $D$.
4. There are $n-k$ linearly independent vectors
$\vec{w_1}, \dots, \vec{w_{n-k}}$ in $R^n$ which are
perpendicular to each of the columns of $B$. For each $j$,
$B^t\vec{w_j} = \vec{0}$ and hence $M\vec{w_j} = \vec{w_j}$, so each of
these $n-k$ vectors is an eigenvector of $M$ for the
eigenvalue 1. Hence $D$ has at least $n-k$ 1's on the diagonal. Since $D$
is $n \times n$, this means it has $n-k$ 1's and $k$ 0's on the diagonal. We
will assume that the 1's come first, that is $[D]_{j,j} = 1$ if
$1 \le j \le n-k$ and $[D]_{j,j} = 0$ if $n-k < j \le n$.
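The four structural facts above can be checked numerically. The sketch below assumes NumPy and a hypothetical random 7 x 3 matrix. It forms M and tests symmetry, idempotence, the product MB, and the eigenvalue pattern.

```python
import numpy as np

rng = np.random.default_rng(1)           # fixed seed so the sketch is repeatable
n, k = 7, 3
B = rng.normal(size=(n, k))              # hypothetical matrix, rank k almost surely

# M = I - B (B^t B)^{-1} B^t, computed with a linear solve instead of an inverse.
M = np.eye(n) - B @ np.linalg.solve(B.T @ B, B.T)

is_symmetric = np.allclose(M, M.T)
is_idempotent = np.allclose(M @ M, M)
annihilates_B = np.allclose(M @ B, 0.0)

# eigvalsh is meant for symmetric matrices and returns eigenvalues in ascending
# order, so the k zeros should come first, followed by the n - k ones.
# Symmetrizing removes rounding asymmetry before the call.
eigenvalues = np.linalg.eigvalsh((M + M.T) / 2)
```

Since the trace is the sum of the eigenvalues, `np.trace(M)` should come out close to n - k, which is 4 here.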
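This tells us that
$${\rm RSS} = \|M\vec{\epsilon}\|^2 = \vec{\epsilon}^tM^tM\vec{\epsilon}
= \vec{\epsilon}^tM\vec{\epsilon}
= ({\cal O}\vec{\epsilon})^tD({\cal O}\vec{\epsilon}).$$
Since ${\cal O}$ is invertible, ${\cal O}\vec{\epsilon}$ has a multivariate
normal distribution. It is clear that its mean vector is $\vec{0}$. As for the
covariance matrix, since ${\rm E}[\vec{\epsilon}\vec{\epsilon}^t] = I$,
$${\rm Cov}({\cal O}\vec{\epsilon})
= {\rm E}[{\cal O}\vec{\epsilon}\vec{\epsilon}^t{\cal O}^t]
= {\cal O}{\cal O}^t = I,$$
so ${\cal O}\vec{\epsilon}$ is also a random sample of size $n$
from the standard normal distribution. This shows us that
$${\rm RSS} = \sum_{j=1}^{n-k}(({\cal O}\vec{\epsilon})_j)^2,$$
which shows us that RSS has a chi-square distribution with $n-k$ degrees of
freedom!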
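A simulation gives an empirical check on this conclusion. The sketch below is a Monte Carlo illustration under assumed values (n = 10, k = 3, a hypothetical random matrix, 20000 trials), not part of the derivation. It draws standard normal errors, computes the RSS for each draw, and records the sample mean and variance. A chi-square distribution with n - k degrees of freedom has mean n - k and variance 2(n - k).

```python
import numpy as np

rng = np.random.default_rng(2)           # fixed seed so the sketch is repeatable
n, k, trials = 10, 3, 20000
B = rng.normal(size=(n, k))              # hypothetical matrix, rank k almost surely

# The residual Y - B A_hat equals M eps, so the parameter vector is not needed.
M = np.eye(n) - B @ np.linalg.solve(B.T @ B, B.T)

eps = rng.normal(size=(trials, n))       # each row is one draw of the error vector
residuals = eps @ M                      # M is symmetric, so row i is (M eps_i)^t
rss = np.sum(residuals**2, axis=1)       # one RSS value per trial

sample_mean = rss.mean()                 # should be near n - k = 7
sample_var = rss.var()                   # should be near 2 (n - k) = 14
```

Matching the first two moments does not establish the full distribution. A histogram of `rss` against the chi-square density would be a natural further check.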
Eric S Key
4/3/1999