Lecture 13: The Markov Inequality and the Chebychev Inequality
Our calculations have shown that if V is a random sample of size N from
a distribution with mean $\mu$ and variance $\sigma^2$, then
1. The statistic
\begin{displaymath}
T(V) = \frac{1}{N}\sum_{i=1}^{N} V_i
\end{displaymath}
is an unbiased statistic for $\mu$;
2. The variance of T(V) is given by
![\begin{displaymath}
{\rm Var}[T(V)] = \frac{\sigma^2}{N}\end{displaymath}](img4.gif)
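As a numerical sanity check of the variance formula, here is a minimal Python sketch. It takes T(V) to be the sample mean, and the choice of the Uniform(0, 1) distribution, the function name `sample_mean_variance`, the trial count, and the seed are all illustrative assumptions rather than part of the lecture.

```python
import random
import statistics

SIGMA2 = 1.0 / 12.0  # variance of Uniform(0, 1), the assumed parent distribution


def sample_mean_variance(n, trials=5000, seed=0):
    """Estimate Var[T(V)] by simulation, where T(V) is the mean of n Uniform(0, 1) draws."""
    rng = random.Random(seed)
    # One sample mean per simulated trial.
    means = [statistics.fmean(rng.random() for _ in range(n)) for _ in range(trials)]
    # Spread of the simulated sample means around their own average.
    return statistics.pvariance(means)


for n in (1, 4, 16, 64):
    # Columns: sample size, simulated variance of T(V), predicted sigma^2 / N.
    print(n, sample_mean_variance(n), SIGMA2 / n)
```

The simulated and predicted columns should agree to within sampling error, and both shrink by a factor of four each time N is multiplied by four.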
This leads us to believe that as the sample size N grows, the observations of
T(V) should cluster near the mean in the sense that the variance of T(V) is
shrinking and small variance means "the probability is not too spread out".
We can try to make this more precise by estimating
$\Pr(\vert T(V) - \mu\vert > x)$ for any positive number x. We start with something
simpler.
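Suppose that S is a non-negative random variable. Let x be a positive number
and let $I_x$ be the random variable given by the rule
\begin{displaymath}
I_x = \left\{ \begin{array}{ll} 1 & \mbox{if } S \geq x \\ 0 & \mbox{if } S < x \end{array} \right.
\end{displaymath}
Then it is easy to check that
\begin{displaymath}
E[I_x] = \Pr(S \geq x).
\end{displaymath}
With a little more work we see that
\begin{displaymath}
S \geq xI_x
\end{displaymath}
so that
$S - xI_x$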
is a non-negative random variable. Since non-negative random variables have
non-negative expected values (if they have expected values at all), we see that
if the expected value of S exists, then
![\begin{displaymath}
0 \leq E[S-xI_x] = E[S] - xE[I_x] = E[S] - x\Pr(S \geq x)\end{displaymath}](img9.gif)
so
![\begin{displaymath}
\Pr(S \geq x) \leq \frac{E[S]}{x}.\end{displaymath}](img10.gif)
This last inequality, valid when
1. S is a non-negative random variable;
2. E[S] is defined;
3. x is a positive real number;

is called Markov's inequality.
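To see the inequality at work on data, the following Python sketch compares the observed tail frequency with the Markov bound. The Exponential(1) distribution, the function name `markov_check`, the trial count, and the seed are assumptions chosen only for illustration.

```python
import random


def markov_check(x, trials=100000, seed=1):
    """Return (empirical Pr(S >= x), empirical E[S] / x) for simulated S ~ Exponential(1)."""
    rng = random.Random(seed)
    draws = [rng.expovariate(1.0) for _ in range(trials)]  # non-negative by construction
    tail = sum(1 for s in draws if s >= x) / trials         # fraction of draws with S >= x
    bound = (sum(draws) / trials) / x                       # sample mean divided by x
    return tail, bound


for x in (0.5, 1.0, 2.0, 4.0):
    tail, bound = markov_check(x)
    print(x, tail, bound)
```

Because the simulated draws themselves form a non-negative distribution, the tail frequency cannot exceed the bound for any positive x. For small x the bound exceeds 1 and says nothing, which shows how crude the estimate can be.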
Notice that it can also be used to estimate $\Pr(S > x)$,
since $\Pr(S > x) \leq \Pr(S \geq x)$.
Markov's inequality can be used to solve the problem of estimating
$\Pr(\vert R - E[R]\vert \geq x)$, since $\vert R - E[R]\vert$ is a non-negative random variable
with a finite expected value, giving the estimate
![\begin{displaymath}
\Pr(\vert R - E[R]\vert \geq x) \leq \frac{E[\vert R-E[R]\vert]}{x}.\end{displaymath}](img14.gif)
Since the numerator of the right-hand side is not commonly known, we observe
that
![\begin{displaymath}
\{\vert R - E[R]\vert \geq x\} = \{(R - E[R])^2 \geq x^2\}\end{displaymath}](img15.gif)
when x is a non-negative number. So if R has a variance and x is positive, applying Markov's inequality to $(R - E[R])^2$ gives
![\begin{displaymath}
\Pr(\vert R - E[R]\vert \geq x) = \Pr((R - E[R])^2 \geq x^2) \leq
\frac{E[(R-E[R])^2]}{x^2} = \frac{{\rm Var}[R]}{x^2} \end{displaymath}](img16.gif)
This inequality is called Chebychev's inequality and can be used to give
some idea of how likely it is that a random quantity is at least y standard
deviations from its mean value (for any positive number y), since
![\begin{displaymath}
\Pr(\vert R - E[R]\vert \geq y\sqrt{{\rm Var}[R]}) \leq \frac{{\rm Var}[R]}{y^2{\rm
Var}[R]} = \frac{1}{y^2}\end{displaymath}](img17.gif)
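The bound 1/y^2 can be compared with observed frequencies in a short Python sketch. The Uniform(0, 1) distribution for R, the function name `chebychev_check`, the trial count, and the seed are illustrative assumptions; the mean and standard deviation used are those of the simulated draws.

```python
import random
import statistics


def chebychev_check(y, trials=100000, seed=2):
    """Return (fraction of draws at least y standard deviations from the mean, 1 / y^2)."""
    rng = random.Random(seed)
    draws = [rng.random() for _ in range(trials)]  # R ~ Uniform(0, 1), an assumed example
    mean = statistics.fmean(draws)
    sd = statistics.pstdev(draws, mu=mean)         # population standard deviation of the draws
    tail = sum(1 for r in draws if abs(r - mean) >= y * sd) / trials
    return tail, 1.0 / (y * y)


for y in (1.0, 1.5, 2.0, 3.0):
    tail, bound = chebychev_check(y)
    print(y, tail, bound)
```

For this distribution the observed frequencies sit well below the bound, a reminder that Chebychev's inequality holds for every distribution with a variance and is therefore rarely sharp for any particular one.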
For example, in the standard scheme of curving grades, each grade range has
width one standard deviation, and the C range is centered at the expected
score, so an A or an F corresponds to a score at least 3/2 standard deviations
from the expected score. Thus if R represents a grade and S represents a score,
![\begin{displaymath}
\Pr(R = {\rm A\; or\; F}) = \Pr( \vert S - E[S]\vert \geq 3\sqrt{{\rm Var}[S]}/2) \leq
4/9 \end{displaymath}](img18.gif)
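However, if the scores are in fact normally distributed, we can calculate that
(with $\sigma$ the standard deviation and $\mu$ the mean),
\begin{displaymath}
\Pr(\vert S - \mu\vert \geq 3\sigma/2) = \frac{2}{\sigma\sqrt{2\pi}} \int_{\mu + 3\sigma/2}^{\infty} e^{-(s-\mu)^2/(2\sigma^2)}\,ds.
\end{displaymath}
By using the change of variables $u = (s - \mu)/\sigma$ we get
\begin{displaymath}
\Pr(\vert S - \mu\vert \geq 3\sigma/2) = \frac{2}{\sqrt{2\pi}} \int_{3/2}^{\infty} e^{-u^2/2}\,du \approx 0.134,
\end{displaymath}
which does not depend on $\mu$ or $\sigma$.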
Since a normal S is symmetrically distributed about its mean, you would
expect that under such a system not more than about 7 percent of the grades
would be A's, and about the same percentage F's, no matter how well
everyone does!
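The normal-distribution figure can be reproduced with Python's standard library. This sketch relies on the identity Pr(|Z| >= y) = erfc(y / sqrt(2)) for a standard normal Z; the function name `normal_two_sided_tail` is an illustrative choice.

```python
import math


def normal_two_sided_tail(y):
    """Pr(|Z| >= y) for a standard normal Z, via the complementary error function."""
    return math.erfc(y / math.sqrt(2.0))


y = 1.5  # A or F: at least 3/2 standard deviations from the mean
print(normal_two_sided_tail(y))      # two-sided normal probability, about 0.134
print(normal_two_sided_tail(y) / 2)  # one tail (A's alone or F's alone), about 0.067
print(1.0 / y**2)                    # Chebychev bound 4/9, about 0.444
```

The Chebychev bound of 4/9 is more than three times the normal value, which is the price of an estimate that uses only the mean and variance.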
The implications of Chebychev's inequality for our statistic T(V) are that
![\begin{displaymath}
\Pr(\vert T(V) - \mu\vert > x) \leq \frac{{\rm Var}[T(V)]}{x^2} = \frac{\sigma^2}{Nx^2}\end{displaymath}](img23.gif)
so for each fixed positive x the probability that T(V) is more than x units from
$\mu$ tends to zero
at least as fast as 1/N as the sample size N grows.
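One practical reading of this bound is as a sample-size rule: to guarantee that the probability is at most some tolerance, it suffices that sigma^2 / (N x^2) not exceed that tolerance. The Python sketch below illustrates this; the function names `chebychev_bound` and `sample_size_for`, the tolerance name `delta`, and the example numbers are assumptions introduced here.

```python
import math


def chebychev_bound(sigma2, n, x):
    """Upper bound sigma^2 / (N x^2) on Pr(|T(V) - mu| > x), capped at 1."""
    return min(1.0, sigma2 / (n * x * x))


def sample_size_for(sigma2, x, delta):
    """A sample size N with sigma^2 / (N x^2) <= delta.

    Up to floating-point rounding this is the smallest such N.
    """
    return math.ceil(sigma2 / (delta * x * x))


sigma2 = 1.0  # assumed population variance
for n in (10, 100, 1000, 10000):
    # Multiplying N by ten divides the bound by ten.
    print(n, chebychev_bound(sigma2, n, 0.1))

print(sample_size_for(sigma2, 0.5, 0.125))
```

Since the bound is distribution-free, the sample sizes it suggests are conservative; for a specific distribution a much smaller N may be enough.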
Eric S Key
10/9/1998