
Lecture 17: Limiting Distributions
We have seen that the distribution of a statistic based on a random sample of size k depends on k. In some cases we have been fortunate enough to compute the distribution of the statistic in terms of k. As we have noted, these circumstances are rare, and they frequently depended on very delicate calculations. In most cases such calculations are not feasible. However, we can often approximate the exact distributions. These approximations frequently take the following form: Suppose that $F_k$ is the distribution function of the statistic based on a random sample of size k. Then there is a distribution function F so that for every x,

\begin{displaymath}
\lim_{k\rightarrow\infty}F_k(x) = F(x)\end{displaymath}

You should be somewhat concerned that, as we have put it, there is no description of how large k must be to guarantee that $F_k(x)$ is ``close enough'' to $F(x)$. On the other hand, we cannot do everything at once. As an analogy, remember that in calculus we were content for quite a while to say that the fact that

\begin{displaymath}
\lim_{\theta\rightarrow 0}\frac{\sin(\theta)}{\theta} = 1\end{displaymath}

was good enough to say that $\sin(\theta) \approx \theta$ for small $\theta$, before deriving Taylor's theorem to quantify ``$\approx$'' and ``small''.

The most famous of these theorems is the Central Limit Theorem. It says:
Central Limit Theorem: Suppose that F is a distribution with finite mean $\mu$ and finite variance $\sigma^2 > 0$. Let $V^{(k)} = (V_1^{(k)}, \dots, V_k^{(k)})$ be a random sample of size k from the distribution F. Let x be any real number. Then  
 \begin{displaymath}
\lim_{k\rightarrow\infty}
\Pr\left(\frac{(V_1^{(k)} + \cdots + V_k^{(k)}) - k\mu}{\sigma\sqrt{k}} \leq x\right)
=
\int_{-\infty}^x \frac{\exp(-u^2/2)}{\sqrt{2\pi}}\;du\end{displaymath} (1)

Since the sum of expectations is the expectation of the sum, and for random samples, the variance of the sum is the sum of the variances, we can rewrite (1) in a couple of ways:  
 \begin{displaymath}
\lim_{k\rightarrow\infty}
\Pr\left(\frac{(V_1^{(k)} + \cdots + V_k^{(k)})/k - \mu}{\sigma/\sqrt{k}} \leq x\right)
=
\int_{-\infty}^x \frac{\exp(-u^2/2)}{\sqrt{2\pi}}\;du\end{displaymath} (2)
and  
 \begin{displaymath}
\lim_{k\rightarrow\infty}
\Pr\left(
\frac{\sqrt{k}}{\sigma}
\left(\frac{V_1^{(k)} + \cdots + V_k^{(k)}}{k} - \mu\right) \leq x
\right) = \int_{-\infty}^x \frac{\exp(-u^2/2)}{\sqrt{2\pi}}\;du.\end{displaymath} (3)
It is important to realize that (1), (2) and (3) say exactly the same thing, and they are derivable from one another by algebra alone.
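
As a sketch of that algebra, write $S_k$ for the sum $V_1^{(k)} + \cdots + V_k^{(k)}$ (a shorthand introduced only here, not used elsewhere in these notes). Assuming the three displays are, in order, the standardized sum, the standardized sample mean, and the sample mean scaled by $\sqrt{k}/\sigma$, dividing numerator and denominator by k and then pulling the factor $\sqrt{k}/\sigma$ out front gives

\begin{displaymath}
\frac{S_k - k\mu}{\sigma\sqrt{k}} = \frac{S_k/k - \mu}{\sigma/\sqrt{k}} = \frac{\sqrt{k}}{\sigma}\left(\frac{S_k}{k} - \mu\right),\end{displaymath}

so the three events of the form ``$\cdots \leq x$'' are the same event and have the same probability for every k.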

We have seen various examples in class and on the homework which suggest that the Central Limit Theorem is true. We will give some further evidence a little while later, and give a proof in the very special case where the random samples are from binomial distributions with p = 1/2 and N=1. This turns out to be a very important case both for theoretical and practical reasons.
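
For the special case just mentioned, the convergence can be checked numerically without any simulation. The following is a minimal sketch, not part of the original notes: it assumes the sum of k draws from the binomial distribution with p = 1/2 and N = 1 is binomial with parameters k and 1/2, computes the exact distribution function of the standardized sum, and compares it with the standard normal distribution function on a grid. The function names and the grid are illustrative choices.

```python
import math

def normal_cdf(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def standardized_binomial_cdf(k, x):
    """Exact Pr((S_k - k/2) / (sqrt(k)/2) <= x) for S_k ~ Binomial(k, 1/2)."""
    # The event is the same as S_k <= k/2 + x * sqrt(k) / 2.
    cutoff = k / 2.0 + x * math.sqrt(k) / 2.0
    if cutoff < 0:
        return 0.0
    top = min(k, math.floor(cutoff))
    # Each outcome of k fair coin flips has probability 2^(-k).
    return sum(math.comb(k, j) for j in range(top + 1)) / 2.0 ** k

def max_gap(k, xs):
    """Largest gap between the exact and the limiting distribution function on xs."""
    return max(abs(standardized_binomial_cdf(k, x) - normal_cdf(x)) for x in xs)

# Grid of x values from -3 to 3 in steps of 0.1 (an arbitrary choice).
xs = [i / 10.0 for i in range(-30, 31)]
for k in (10, 100, 1000):
    print(k, max_gap(k, xs))
```

The printed gaps shrink as k grows. At x = 0 and k = 100 the gap is about 0.04, which is half of the probability that the sum equals exactly 50; this reflects the jumps of a discrete distribution rather than a failure of the theorem.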

At this juncture it would be good to point out that we do have an error approximation theorem, akin to the error term in Taylor's theorem:
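The Berry-Esseen Theorem: Suppose that $V_1^{(k)}, \dots, V_k^{(k)}$ are independent identically distributed random variables with mean $\mu$, variance $\sigma^2 > 0$, and finite third absolute central moment $\rho = E\left(\vert V_1^{(k)} - \mu\vert^3\right)$.

Let x be any real number. Then  
 \begin{displaymath}
\left\vert\Pr\left(\sqrt{k}\frac{(V_1^{(k)} + \cdots + V_k^{(k)})/k - \mu}{\sigma} \leq x\right)
- \int_{-\infty}^x \frac{\exp(-u^2/2)}{\sqrt{2\pi}}\;du\right\vert \leq
\frac{3\rho}{\sigma^3\sqrt{k}}\end{displaymath} (4)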
Proofs of the Central Limit Theorem and the Berry-Esseen Theorem may be found in advanced texts on probability theory, such as An Introduction to Probability Theory and Its Applications, Volume II by William Feller.

An example
Suppose that we are sampling from the continuous uniform distribution on [-1, 1]. Then

\begin{displaymath}
\mu = \int_{-1}^1 u (1/2)\;du = 0,\end{displaymath}

\begin{displaymath}
\sigma^2 = \int_{-1}^1 (u-0)^2 (1/2)\;du = 1/3;\end{displaymath}

and

\begin{displaymath}
\rho = \int_{-1}^1 \vert u-0\vert^3 (1/2)\;du = \int_0^1 u^3\;du = 1/4;\end{displaymath}

and the Berry-Esseen Theorem says  
 \begin{displaymath}
\left\vert\Pr\left(\sqrt{k}\frac{(V_1^{(k)} + \cdots + V_k^{(k)})/k}{\sqrt{1/3}} \leq x\right)
- \int_{-\infty}^x \frac{\exp(-u^2/2)}{\sqrt{2\pi}}\;du\right\vert \leq
\frac{3(1/4)}{(1/3)^{3/2}\sqrt{k}}.\end{displaymath} (5)
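
To get a feel for the size of the right-hand side of (5), here is a minimal sketch that evaluates it for a few sample sizes. It only restates the bound with the constant 3 and the values $\sigma^2 = 1/3$ and $\rho = 1/4$ computed above; the function name and the chosen values of k are illustrative assumptions.

```python
import math

def berry_esseen_bound(rho, sigma, k, constant=3.0):
    """Right-hand side of (4): constant * rho / (sigma**3 * sqrt(k))."""
    return constant * rho / (sigma ** 3 * math.sqrt(k))

# Uniform distribution on [-1, 1]: sigma^2 = 1/3 and rho = 1/4, as computed above.
sigma = math.sqrt(1.0 / 3.0)
rho = 0.25

for k in (100, 10_000, 1_000_000):
    print(k, berry_esseen_bound(rho, sigma, k))
```

For k = 100 the bound is about 0.39, so as a guarantee it is quite loose. Forcing the bound below 0.01 requires k of at least 151,875. The theorem gives a worst-case guarantee, and the actual error is often much smaller.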

Suppose now, for the sake of convenience, that k = 100. Since

\begin{displaymath}
\Pr\left(\frac{V_1 + \cdots + V_{100}}{100} \in [0,1/2]\right)
= \Pr\left(\frac{\sqrt{100}}{\sqrt{1/3}}\,\frac{V_1 + \cdots + V_{100}}{100} \in \left[0,\frac{\sqrt{100}}{\sqrt{1/3}}\frac{1}{2}\right]\right),\end{displaymath}

it follows from the central limit theorem, in the form of (3), that we have the following approximation:

\begin{displaymath}
\Pr\left(\frac{V_1 + \cdots + V_{100}}{100} \in [0,1/2]\right)
\approx
\int_0^{5\sqrt{3}}\frac{1}{\sqrt{2\pi}}\exp(-u^2/2)\;du.\end{displaymath}
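
The integral on the right can be evaluated with the error function. The sketch below assumes only the standard identity expressing the normal distribution function through `math.erf`; the variable names are illustrative.

```python
import math

def normal_cdf(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Upper limit of integration: (sqrt(100) / sqrt(1/3)) * (1/2) = 5 * sqrt(3).
upper = 5.0 * math.sqrt(3.0)

# Integral of the standard normal density from 0 to 5 * sqrt(3).
approx = normal_cdf(upper) - normal_cdf(0.0)
print(upper, approx)
```

The upper limit is about 8.66 standard deviations, so the integral equals 1/2 to machine precision. This agrees with what symmetry suggests: the sample mean of 100 draws is nonnegative about half the time and essentially never exceeds 1/2.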

There are many other limiting distribution theorems. For example, we have observed that as the number of degrees of freedom goes to infinity, the t-distribution also converges to the standard normal distribution.
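
The t-distribution statement can be illustrated numerically. The sketch below compares densities rather than distribution functions, which is a simplification chosen here for convenience. It assumes the usual formula for the t density with $\nu$ degrees of freedom, evaluated through `math.lgamma` to avoid overflow. The function names and the test point are illustrative.

```python
import math

def t_pdf(x, nu):
    """Density of the t-distribution with nu degrees of freedom."""
    # log of Gamma((nu+1)/2) / (sqrt(nu*pi) * Gamma(nu/2)), computed stably.
    log_c = (math.lgamma((nu + 1) / 2.0) - math.lgamma(nu / 2.0)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2.0 * math.log1p(x * x / nu))

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

x = 1.0  # an arbitrary test point
for nu in (1, 10, 100, 1000):
    print(nu, t_pdf(x, nu), normal_pdf(x))
```

With one degree of freedom the t density is the Cauchy density, which is visibly different from the normal density; by 1000 degrees of freedom the two agree to about three decimal places at this point.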

Here is one more example. Suppose that $V^{(k)}$ is a random sample of size k from a distribution which is concentrated on the positive real numbers. Now let the statistic be $T(V^{(k)}) = \max(V_1^{(k)},\dots,V_k^{(k)})$. We have already seen that if the common distribution function for these random variables is F, then

\begin{displaymath}
\Pr(T(V^{(k)}) \leq x) = F(x)^k.\end{displaymath}

We immediately see that if F(x) < 1 then

\begin{displaymath}
\lim_{k\rightarrow\infty}\Pr(T(V^{(k)}) \leq x) =
\lim_{k\rightarrow\infty}F(x)^k = 0.\end{displaymath}

The problem is to find a sequence $a_k$ and a distribution function G so that

\begin{displaymath}
\lim_{k\rightarrow\infty}\Pr\left(\frac{T(V^{(k)})}{a_k} \leq x\right) =
\lim_{k\rightarrow\infty}F(a_kx)^k = G(x).\end{displaymath}

It is difficult to answer this question in general, so we will content ourselves with two examples.

First suppose that $F(x) = 1 - (1/x)$ for $x\geq 1$ and 0 otherwise. Then if $x > 1/a_k$, we would have

\begin{displaymath}
F(a_kx)^k = \left(1 - \frac{1}{a_kx}\right)^k\end{displaymath}

which suggests that we should take $a_k = k$ to get, for all x > 0,

\begin{displaymath}
\lim_{k\rightarrow\infty}F(kx)^k = \lim_{k\rightarrow\infty}\left(1 - \frac{1}{kx}\right)^k =
\exp(-1/x). \end{displaymath}

It is straightforward to check that if we define $G(x) = \exp(-1/x)$ for x > 0 and 0 otherwise that G is a distribution function.
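
The limit $F(kx)^k \rightarrow \exp(-1/x)$ can be checked numerically. The following is a minimal sketch under the assumptions of this example, namely F(t) = 1 - 1/t for t at least 1 and the scaling sequence equal to k; the function names are illustrative.

```python
import math

def max_cdf_rescaled(k, x):
    """Pr(T / k <= x) = F(k*x)^k for F(t) = 1 - 1/t when t >= 1, and 0 otherwise."""
    t = k * x
    if t < 1.0:
        return 0.0
    return (1.0 - 1.0 / t) ** k

def limit_cdf(x):
    """Candidate limit G(x) = exp(-1/x) for x > 0, and 0 otherwise."""
    return math.exp(-1.0 / x) if x > 0 else 0.0

x = 2.0  # an arbitrary test point
for k in (10, 1000, 10**6):
    print(k, max_cdf_rescaled(k, x), limit_cdf(x))
```

The two printed columns approach each other as k grows, and the same happens at any other fixed x > 0.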

In the case of a sample from an exponential distribution, rescaling is not sufficient; we shift the maximum instead. Suppose, for instance, that $F(x) = 1-\exp(-x)$ for x > 0 and 0 otherwise. Then we can easily check that for $x > -\log(k)$ we have

\begin{displaymath}
\Pr(T(V^{(k)}) \leq x + \log(k)) = F(x + \log(k))^k = \left(1 - \frac{\exp(-x)}{k}\right)^k\end{displaymath}

so for any $x\in (-\infty,\infty)$,

\begin{displaymath}
\lim_{k\rightarrow\infty}\Pr(T(V^{(k)}) \leq x + \log(k)) = \exp(-\exp(-x))\end{displaymath}

and $G(x) = \exp(-\exp(-x))$ is a distribution function.
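
The same kind of numerical check works for the shifted maximum. This sketch assumes $F(t) = 1 - \exp(-t)$ for t > 0 and uses the identity $\Pr(T(V^{(k)}) - \log(k) \leq x) = F(x + \log(k))^k$; the function names are illustrative.

```python
import math

def max_cdf_shifted(k, x):
    """Pr(T - log(k) <= x) = F(x + log(k))^k for F(t) = 1 - exp(-t) when t > 0."""
    t = x + math.log(k)
    if t <= 0.0:
        return 0.0
    return (1.0 - math.exp(-t)) ** k

def limit_cdf(x):
    """Candidate limit G(x) = exp(-exp(-x))."""
    return math.exp(-math.exp(-x))

for x in (-1.0, 0.0, 2.0):  # arbitrary test points
    for k in (10, 1000, 10**6):
        print(x, k, max_cdf_shifted(k, x), limit_cdf(x))
```

At x = 0 the exact value is $(1 - 1/k)^k$, which is the familiar sequence converging to $\exp(-1)$.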



 
Eric S Key
11/10/1998