
Lecture 21: Conditional Distributions

Suppose that X and Y are random variables on the same probability space, $(\Omega, {\cal F}, \Pr)$, and that Y is a discrete random variable. If F is any event and y is in the range of Y with $\Pr(Y = y) > 0$, we have defined $\Pr(F \vert Y = y) = \Pr(F \cap \{Y = y\})/\Pr(\{Y = y\})$. For the moment, let $Q_y(F) = \Pr(F \vert Y = y)$. It is easy to check that $Q_y$ is another probability measure on $\cal F$. Therefore, the function $F_{X\vert Y=y}:(-\infty,\infty)\rightarrow [0,1]$ given by

\begin{displaymath}
F_{X\vert Y=y}(t) := Q_y(\{X \leq t\}) = \Pr(X \leq t \vert Y = y)\end{displaymath}

is a distribution function. We call $F_{X\vert Y=y}$ the conditional distribution function of X given Y = y. Since expected values of random variables may be computed from distribution functions, we may define the conditional expected value of X given Y = y to be the expected value of X as computed with $F_{X\vert Y=y}$ (as opposed to the natural distribution function of X). We denote this conditional expectation by E[X | Y = y].
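To make the definition concrete, here is a minimal Python sketch that computes $F_{X\vert Y=y}(t)$ and E[X | Y = y] from a joint mass function. The table `joint` and the helper names `prob_y`, `cond_cdf` and `cond_expectation` are hypothetical, chosen only for illustration, and the sketch assumes that X is discrete as well (the definition above does not require this).

```python
from fractions import Fraction

# Hypothetical joint mass function Pr(X = x, Y = y), keyed by (x, y).
# The values are illustrative only; they sum to 1.
joint = {
    (0, 1): Fraction(1, 8), (1, 1): Fraction(1, 8), (2, 1): Fraction(1, 4),
    (0, 2): Fraction(1, 4), (1, 2): Fraction(1, 8), (2, 2): Fraction(1, 8),
}

def prob_y(y):
    """Pr(Y = y), obtained by summing the joint mass function over x."""
    return sum(p for (_, yy), p in joint.items() if yy == y)

def cond_cdf(t, y):
    """F_{X|Y=y}(t) = Pr(X <= t, Y = y) / Pr(Y = y); needs Pr(Y = y) > 0."""
    numerator = sum(p for (x, yy), p in joint.items() if yy == y and x <= t)
    return numerator / prob_y(y)

def cond_expectation(y):
    """E[X | Y = y], the mean of X under the conditional distribution."""
    return sum(x * p for (x, yy), p in joint.items() if yy == y) / prob_y(y)

print(cond_cdf(1, 1))       # F_{X|Y=1}(1)
print(cond_expectation(1))  # E[X | Y = 1]
```

Exact rational arithmetic (`Fraction`) is used so that the conditional probabilities can be compared without rounding error.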

Observe that if X and Y are independent then

\begin{displaymath}
F_{X\vert Y=y}(t) = \Pr(X \leq t \vert Y = y) = \Pr(X \leq t)\end{displaymath}

so if X and Y are independent and X has an expected value, then E[X | Y = y] = E[X].

Also, if F is an event, and $X = I_F$ is the indicator function of F, then $F_{X\vert Y=y}$ is a discrete distribution, and

\begin{displaymath}
E[X \vert Y = y] = E[I_F \vert Y = y] = 1\times\Pr(I_F = 1 \vert Y = y) = \Pr(F \vert Y= y),\end{displaymath}

showing that conditional expectation is an extension of the idea of conditional probability.

Next, suppose that X also has a discrete distribution. Then the conditional distribution function $F_{X\vert Y=y}$ is also discrete with the obvious mass function, $\Pr(X = x \vert Y = y)$. We denote this conditional mass function by $p_{X\vert Y=y}$, and we have the following theorem:

Theorem: Suppose that X and Y are discrete random variables defined on the same probability space, H is a function such that E[H(X)] is defined, and $\Pr(Y = y) > 0$. Then

\begin{displaymath}
E[ H(X) \vert Y = y] = \sum_{x} H(x) p_{X\vert Y=y}(x) = \sum_{x}H(x)
\frac{\Pr(X = x, Y = y)}{\Pr(Y = y)},\end{displaymath}

and

\begin{displaymath}
E[H(X)] = \sum_{y}E[H(X) \vert Y = y]\Pr(Y = y).\end{displaymath}

If we think of E[H(X) | Y = y] as a function of y, call it $\Phi(y)$, then the second part of the theorem may be written as

\begin{displaymath}
E[H(X)] = E[\Phi(Y)].\end{displaymath}

Typically, the expression E[H(X) | Y] is used instead of $\Phi(Y)$, and we would write

\begin{displaymath}
E[H(X)] = E[E[H(X) \vert Y]].\end{displaymath}
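As a sanity check, the identity E[H(X)] = E[E[H(X) | Y]] can be verified numerically on a small example. The sketch below is only an illustration under stated assumptions: the joint mass function `joint`, the choice $H(x) = x^2$, and the helper names are all hypothetical.

```python
from fractions import Fraction

# Hypothetical joint mass function Pr(X = x, Y = y), keyed by (x, y).
joint = {
    (0, 1): Fraction(1, 8), (1, 1): Fraction(1, 8), (2, 1): Fraction(1, 4),
    (0, 2): Fraction(1, 4), (1, 2): Fraction(1, 8), (2, 2): Fraction(1, 8),
}

def H(x):
    """Illustrative choice of H; any function with E[H(X)] defined would do."""
    return x * x

def prob_y(y):
    """Pr(Y = y), obtained by summing the joint mass function over x."""
    return sum(p for (_, yy), p in joint.items() if yy == y)

def cond_exp_H(y):
    """E[H(X) | Y = y] = sum_x H(x) Pr(X = x, Y = y) / Pr(Y = y)."""
    return sum(H(x) * p for (x, yy), p in joint.items() if yy == y) / prob_y(y)

# Left side: E[H(X)] computed directly from the joint mass function.
lhs = sum(H(x) * p for (x, _), p in joint.items())

# Right side: sum over y of E[H(X) | Y = y] Pr(Y = y), that is E[Phi(Y)].
ys = {yy for (_, yy) in joint}
rhs = sum(cond_exp_H(y) * prob_y(y) for y in ys)

print(lhs, rhs)
```

Both sides are computed with exact fractions, so they can be compared with `==` rather than a tolerance.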

This formula is usually referred to as the Law of Iterated Expectations. The whole of the modern theory of conditional expectation revolves around trying to define the quantity E[X | Y] for arbitrary random variables, rather than only in the case described above, where X and Y are assumed to be discrete.

It is quite typical to be given not the joint distribution of X and Y, but instead the distribution of Y and the conditional distribution of X given Y = y.

For example: Suppose that Y has a geometric distribution on $\{1, 2, \dots\}$, that is, $\Pr(Y = y) = (1-p)p^{y-1}$, $y = 1, 2, \dots$, where $0 < p < 1$, and the distribution of X given Y = y is Poisson with mean y. Find the expected value of X and the probability that X = 7.

Solution. Since the distribution of X given Y = y is Poisson with mean y, we have E[X | Y = y] = y. Therefore

\begin{displaymath}
E[X] = \sum_{y=1}^\infty y \Pr(Y = y) = \sum_{y=1}^\infty y (1-p)p^{y-1} =
\frac{1}{1-p}\end{displaymath}

and

\begin{displaymath}
\Pr(X = 7) = \sum_{y=1}^\infty \frac{y^7\exp(-y)}{7!}\Pr(Y = y)
=
\sum_{y=1}^\infty \frac{y^7\exp(-y)}{7!}(1-p)p^{y-1}\end{displaymath}
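The closed form for E[X] follows from differentiating the geometric series: for $0 < p < 1$, $\sum_{y=1}^\infty y p^{y-1} = 1/(1-p)^2$, and multiplying by $(1-p)$ gives $1/(1-p)$. The series for $\Pr(X = 7)$ has no comparably simple closed form, but it is easy to approximate by truncation. The following Python sketch does both. The function names, the value $p = 0.5$ and the truncation point `terms=200` are assumptions made for illustration, and the truncation would need more terms when p is close to 1.

```python
import math

def geometric_pmf(y, p):
    """Pr(Y = y) = (1 - p) * p**(y - 1) for y = 1, 2, ..."""
    return (1 - p) * p ** (y - 1)

def poisson_pmf(k, mean):
    """Pr(X = k | Y = y) when X given Y = y is Poisson with the given mean."""
    return mean ** k * math.exp(-mean) / math.factorial(k)

def mixture_expectation(p, terms=200):
    """Truncated sum for E[X] = sum_y E[X | Y = y] Pr(Y = y) = sum_y y Pr(Y = y)."""
    return sum(y * geometric_pmf(y, p) for y in range(1, terms + 1))

def mixture_prob(k, p, terms=200):
    """Truncated sum for Pr(X = k) = sum_y Pr(X = k | Y = y) Pr(Y = y)."""
    return sum(poisson_pmf(k, y) * geometric_pmf(y, p)
               for y in range(1, terms + 1))

p = 0.5  # hypothetical parameter value, chosen only for illustration
print(mixture_expectation(p), 1 / (1 - p))  # truncated sum vs. closed form
print(mixture_prob(7, p))                   # numerical value of Pr(X = 7)
```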



 
Eric S Key
12/4/1998