
Lecture 31: Criteria for Estimation
We have seen several ad hoc methods for estimating parameters: unbiased estimation, the method of moments, and maximum likelihood. There are others, but the more important question for now is how to evaluate the performance of an estimator. There are several criteria, and we shall look at each of them briefly. First, a few ground rules and definitions.

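We shall let $\Theta \subset R^d$ be a set containing the parameters of interest. In almost all cases, we will have $d = 1$ or $d = 2$, but we have seen, for example in the case of bivariate normal distributions, that $d$ can be much larger. We will also assume from now on that in any model $\{P_\theta, \theta \in \Theta\}$ we consider, either every $P_\theta$ is continuous with a density function $p(\vec{x},\theta)$, or every $P_\theta$ is discrete with a mass function $p(\vec{x},\theta)$.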

We shall call such models Regular Models.

Recall that if the random vector $\vec{X}$ represents our observations, and T is a function of $\vec{X}$, we call T or $T(\vec{X})$ a statistic.

If $\vec{X}$ comes from the regular model $\{P_\theta, \theta \in \Theta\}$, we say that $T$ is sufficient for $\theta$ if the conditional distribution of $\vec{X}$ given $T(\vec{X})$ does not depend on $\theta$.
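
As a concrete check of this definition, the following minimal sketch (not part of the original lecture) takes $X_1,\dots,X_N$ to be independent Bernoulli($\theta$) observations with $T(\vec{X}) = \sum_j X_j$, and computes the conditional distribution of $\vec{X}$ given $T(\vec{X}) = t$ by brute-force enumeration. The Bernoulli model, the function name `conditional_given_sum`, and the particular values of $N$, $t$ and $\theta$ are all assumptions chosen for illustration.

```python
from itertools import product
from math import comb, isclose


def conditional_given_sum(n, t, theta):
    """Conditional pmf of a Bernoulli(theta) sample of size n given sum == t.

    Returns a dict mapping each 0/1 tuple x with sum(x) == t to
    P(X = x | T(X) = t), computed directly from the definition of
    conditional probability. Assumes 0 < theta < 1 and 0 <= t <= n.
    """
    # Joint pmf of every outcome whose coordinates sum to t.
    joint = {
        x: theta ** sum(x) * (1 - theta) ** (n - sum(x))
        for x in product((0, 1), repeat=n)
        if sum(x) == t
    }
    # P(T = t) is the total mass of those outcomes.
    p_t = sum(joint.values())
    return {x: p / p_t for x, p in joint.items()}


if __name__ == "__main__":
    n, t = 4, 2
    for theta in (0.2, 0.5, 0.9):
        cond = conditional_given_sum(n, t, theta)
        # Each arrangement with t ones should get probability 1 / C(n, t),
        # whatever the value of theta.
        assert all(isclose(p, 1 / comb(n, t)) for p in cond.values())
        print(theta, len(cond), "outcomes, each with probability", 1 / comb(n, t))
```

In this example the conditional probabilities agree for every value of $\theta$ tried, which is what sufficiency of the sum requires; the enumeration is only practical for small $N$.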

Exponential families provide an instant source of sufficient statistics, since the formula

\begin{displaymath}
p(\vec{x},\theta) = \exp\left(\sum_{k=1}^d C_k(\theta)\sum_{j=1}^NT_k(x_j) + \sum_{j=1}^NS(x_j)
+ ND(\theta)\right) \end{displaymath}

for the joint density/mass function of the random sample $\vec{X} = (X_1,\dots,
X_N)$ suggests that the vector-valued function

\begin{displaymath}
T(\vec{x}) = \left(\sum_{j=1}^N T_1(x_j),\dots,\sum_{j=1}^N T_d(x_j)\right)\end{displaymath}

is sufficient for $\theta\in\Theta\subset
R^d$. In the case of a regular model which is discrete, this is easy to prove by direct computations with the definition of conditional probability.
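
As an illustration of that direct computation (a worked example added here, with the Poisson model chosen as an assumption for concreteness), let $X_1,\dots,X_N$ be a random sample from the Poisson($\theta$) distribution, $\theta > 0$. The mass function of a single observation is $\exp(x\log\theta - \log x! - \theta)$, so in the notation above $d = 1$, $C_1(\theta) = \log\theta$, $T_1(x) = x$, $S(x) = -\log x!$ and $D(\theta) = -\theta$, and the candidate sufficient statistic is $T(\vec{x}) = \sum_{j=1}^N x_j$. Since $T(\vec{X})$ has the Poisson($N\theta$) distribution, for any $\vec{x}$ with $\sum_{j=1}^N x_j = t$,

\begin{displaymath}
P_\theta(\vec{X} = \vec{x} \mid T(\vec{X}) = t) = \frac{e^{-N\theta}\theta^{t}/\prod_{j=1}^N x_j!}{e^{-N\theta}(N\theta)^{t}/t!} = \frac{t!}{N^{t}\prod_{j=1}^N x_j!},
\end{displaymath}

which does not involve $\theta$.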

In fact, there is the following theorem:

Theorem 208

In a regular model, a statistic $T(\vec{X})$ with range $\cal I$ is sufficient for $\theta$ if and only if there exist a function $g(t,\theta)$ defined for each $t\in {\cal I}$ and each $\theta\in\Theta$ and a function $h$ defined on $R^N$ such that

\begin{displaymath}
p(\vec{x},\theta) = g(T(\vec{x}),\theta)h(\vec{x})\end{displaymath}
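
As an illustration of how the factorization is used (an example added here, not taken from the lecture), suppose $X_1,\dots,X_N$ is a random sample from the uniform distribution on $[0,\theta]$, $\theta > 0$. This is a continuous model whose support depends on $\theta$, so it is not of the exponential-family form above. Writing $\mathbf{1}\{\cdot\}$ for an indicator function, the joint density is

\begin{displaymath}
p(\vec{x},\theta) = \theta^{-N}\,\mathbf{1}\{\max_j x_j \le \theta\}\cdot\mathbf{1}\{\min_j x_j \ge 0\},
\end{displaymath}

which has the required form with $T(\vec{x}) = \max_j x_j$, $g(t,\theta) = \theta^{-N}\mathbf{1}\{t\le\theta\}$ and $h(\vec{x}) = \mathbf{1}\{\min_j x_j\ge 0\}$, so the theorem gives that the sample maximum is sufficient for $\theta$.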

In the discrete case the proof is straightforward. For the general case, see Testing Statistical Hypotheses by E. L. Lehmann.
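
A sketch of the discrete-case argument, added here as a reading aid rather than as the proof the lecture has in mind, runs as follows. Suppose first that $p(\vec{x},\theta) = g(T(\vec{x}),\theta)h(\vec{x})$. For $t$ with $P_\theta(T(\vec{X}) = t) > 0$ and $\vec{x}$ with $T(\vec{x}) = t$,

\begin{displaymath}
P_\theta(\vec{X} = \vec{x} \mid T(\vec{X}) = t) = \frac{g(t,\theta)h(\vec{x})}{\sum_{\vec{y}:\, T(\vec{y}) = t} g(t,\theta)h(\vec{y})} = \frac{h(\vec{x})}{\sum_{\vec{y}:\, T(\vec{y}) = t} h(\vec{y})},
\end{displaymath}

which is free of $\theta$, so $T$ is sufficient. Conversely, if $T$ is sufficient, take $g(t,\theta) = P_\theta(T(\vec{X}) = t)$ and $h(\vec{x}) = P(\vec{X} = \vec{x} \mid T(\vec{X}) = T(\vec{x}))$, which by sufficiency does not depend on $\theta$ (set $h(\vec{x}) = 0$ if this conditional probability is undefined for every $\theta$). Then

\begin{displaymath}
p(\vec{x},\theta) = P_\theta(T(\vec{X}) = T(\vec{x}))\,P(\vec{X} = \vec{x} \mid T(\vec{X}) = T(\vec{x})) = g(T(\vec{x}),\theta)h(\vec{x}).
\end{displaymath}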


 
Eric S Key
3/11/1999