Lecture 6: Random variables
Having performed an experiment, be it tossing a coin, drawing a card from a deck,
or whatever, it is frequently convenient to quantify the outcome, that is,
associate to the outcome a number. One might want to count the number of heads
obtained, or the number of queens. In other words, one wants to assign a
number (sometimes a complex number or a vector) to each outcome. All this means
is that one wants a function whose domain is the sample space and whose
range is a convenient set, like the real numbers, the complex numbers, or
perhaps $\mathbf{R}^n$.
For now, let us consider functions which take values in the real numbers. (The
other cases are so similar that we will take for granted that their behavior is
similar.) We use functions because they can be used to define
events. In the coin tossing example, our function might count the number of
heads. Call this function $R$. We can look at the set
\begin{displaymath}
\{\omega \in \Omega : R(\omega) = 6\},
\end{displaymath}
where $\Omega$ is the sample space. If we have chosen the set of events to contain all subsets of
$\Omega$, then this set is an event, and we can ask for the probability of
$\{\omega \in \Omega : R(\omega) = 6\}$. Usually
we get really lazy with the notation, and ask for the probability of $R=6$.
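The important thing is that there is a relation between $R$ and the sigma
algebra. The precise relation is that if the model is $(\Omega, \mathcal{F}, \Pr)$
and $R : \Omega \to \mathbf{R}$, then for every interval $I \subseteq \mathbf{R}$,
\begin{displaymath}
R^{-1}(I) = \{\omega \in \Omega : R(\omega) \in I\} \in \mathcal{F}, \qquad (1)
\end{displaymath}
so that it makes sense to speak of $\Pr(R \in I)$.
Functions which satisfy (1) are called (real valued) random
variables.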
Whether or not a function is a random variable depends on the sigma
algebra! You should be aware of this point. Here are two examples to drive
home the point. The only difference is in the sigma algebra.
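- Example 1: The sample space $\Omega$ is a set of three real numbers, say
$\Omega = \{1,2,3\}$, the sigma algebra $\mathcal{F}$ is all subsets of $\Omega$,
and $\Pr(A) =$ the number of elements of $A$ divided by 3. $R(x) = x$.
- Example 2: $\Omega$ is the same set, but $\mathcal{F}$ is a sigma algebra that
omits some subsets of $\Omega$, say $\mathcal{F} = \{\emptyset, \Omega\}$, and
$\Pr(A) =$ the number of elements of $A$ divided by 3. $R(x) = x$.
In example 2 $R$ is not a random variable since, for instance, the set
$\{x \in \Omega : R(x) \in [1,1]\} = \{1\}$ is not in $\mathcal{F}$. In example 1
$R$ is a random variable since for every interval $I$ the set
$\{x \in \Omega : R(x) \in I\}$ is a subset of $\Omega$, and all subsets of
$\Omega$ are in $\mathcal{F}$.
Unless otherwise mentioned, you may assume that all functions we discuss
are random variables!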
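On a finite sample space, condition (1) can be checked mechanically. The following is a minimal Python sketch, assuming a three-point sample space {1, 2, 3} and, for the second case, the trivial sigma algebra consisting of the empty set and the whole space. Both are illustrative choices rather than data taken from the examples, and the helper names `preimage`, `is_random_variable` and `power_set` are hypothetical.

```python
from itertools import chain, combinations


def preimage(R, omega, value):
    """Return the set of outcomes that R maps to `value`."""
    return frozenset(w for w in omega if R(w) == value)


def is_random_variable(R, omega, sigma_algebra):
    """Check condition (1) on a finite sample space.

    A sigma algebra is closed under unions, so it is enough to check that
    the preimage of each single value of R belongs to it.
    """
    values = {R(w) for w in omega}
    return all(preimage(R, omega, v) in sigma_algebra for v in values)


def power_set(omega):
    """All subsets of omega, as frozensets."""
    items = list(omega)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}


# Assumed concrete data: a three-point sample space and R(x) = x.
omega = frozenset({1, 2, 3})
identity = lambda x: x

all_subsets = power_set(omega)    # every subset is an event
trivial = {frozenset(), omega}    # only the empty set and omega are events

print(is_random_variable(identity, omega, all_subsets))  # True
print(is_random_variable(identity, omega, trivial))      # False
```

The same function R passes or fails depending only on which sigma algebra is supplied.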
Random variables are classified into subtypes. Some important ones to mention
at the outset are:
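- Indicator random variables:
- These are random variables which take on
only the values 0 and 1, so they ``indicate'' whether or not something has
happened. For example if $A$ is an event, we can define a random variable,
$I_A$, called the indicator of $A$ by the rule
\begin{displaymath}
I_A(\omega) = \left\{ \begin{array}{ll} 1 & \mbox{if } \omega \in A \\ 0 & \mbox{if } \omega \notin A. \end{array} \right.
\end{displaymath}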
- Simple random variables:
- These are random variables whose range contains only a finite number of
elements. For example, if we model the result of tossing a coin 10 times,
and R counts the number of heads, then R is a simple random variable.
- Lattice random variables:
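- A lattice (in the real numbers) is a set of
the form
\begin{displaymath}
\{a + nd : n = 0, \pm 1, \pm 2, \ldots\}
\end{displaymath}
for some real numbers $a$ and $d > 0$.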
For example, the integers are a lattice, as are all the multiples of 1/3. The
set of reciprocals of all integers is not a lattice. A lattice random
variable is a random variable whose range lies in a lattice. The lattice
random variables we usually will be concerned with are random variables whose
range is a subset of the integers.
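Note: Not all simple random variables are lattice random variables.
Consider, for example, a random variable $R$ whose range is
$\{0, 1, \sqrt{2}\}$. The differences between points of a lattice have rational
ratios, but here the ratio of $\sqrt{2} - 0$ to $1 - 0$ is irrational.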
- Discrete random variables:
- A discrete random variable is one whose
range is a set which is countable. The name is unfortunate, for it does not
imply that the range is a discrete set. Indicator random variables, simple
random variables and lattice random variables are all discrete random
variables, but so is a random variable whose range is all of the rational
numbers.
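There is one important link between general discrete random variables,
indicator random variables, and partitions of the sample space: Suppose that
$R$ is a discrete random variable whose range is the (countable) set $Y$. For
each $y \in Y$ let
\begin{displaymath}
B_y = \{\omega \in \Omega : R(\omega) = y\}.
\end{displaymath}
The fact that $R$ is a function whose domain is the sample space $\Omega$
guarantees that the sets $B_y$, $y \in Y$, are pairwise disjoint and that their
union is $\Omega$, that is, they form a partition of $\Omega$.
The fact that $R$ is a random variable means each of the sets $B_y$ is in the
sigma algebra.
Now, if we let $I_y$ be the indicator of the event $B_y$ we have
\begin{displaymath}
R = \sum_{y \in Y} y I_y.
\end{displaymath}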
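The link between a discrete random variable, its level sets and their indicators can be checked on a small model. This is a sketch only, assuming a hypothetical experiment of two coin tosses with R counting heads; the names `omega`, `B`, `I` and `indicator` are introduced here for illustration.

```python
from itertools import product

# Assumed model: two coin tosses, R counts the heads.
omega = list(product("HT", repeat=2))
R = lambda w: w.count("H")

# Range of R and the level sets B_y = {w : R(w) = y}.
Y = sorted({R(w) for w in omega})
B = {y: {w for w in omega if R(w) == y} for y in Y}


def indicator(event):
    """Indicator function of an event: 1 on the event, 0 off it."""
    return lambda w: 1 if w in event else 0


I = {y: indicator(B[y]) for y in Y}

# The level sets cover omega, and their sizes add up to the size of omega,
# so they are pairwise disjoint: they form a partition.
is_partition = (
    set().union(*B.values()) == set(omega)
    and sum(len(b) for b in B.values()) == len(omega)
)

# R agrees with the sum of y * I_y at every outcome.
decomposition_holds = all(
    R(w) == sum(y * I[y](w) for y in Y) for w in omega
)

print(is_partition, decomposition_holds)  # True True
```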
Probability Mass Functions
To each random variable $R$ we may assign a function, $p_R$, from the real
numbers to the interval [0,1] by the rule
\begin{displaymath}
p_R(x) = \Pr(R = x).
\end{displaymath}
If $x$ is not in the range of $R$ then $p_R(x) = 0$. If $x$ is in the range
of $R$ then $p_R(x)$ may or may not be 0. $p_R$ is called the
probability mass function of $R$. Probability mass functions are
important for discrete random variables. In fact, many lattice random
variables are named by the form of their probability mass functions.
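As a small illustration, the probability mass function of the head count in 10 coin tosses can be tabulated by brute force. This sketch assumes a fair coin, so that all 1024 outcomes are equally likely; that assumption, and the names `outcomes`, `counts` and `p_R`, are introduced here and are not part of the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb

# Assumed model: 10 tosses of a fair coin, every sequence equally likely.
outcomes = list(product("HT", repeat=10))
counts = Counter(w.count("H") for w in outcomes)
total = len(outcomes)


def p_R(x):
    """Probability that the number of heads equals x (0 off the range)."""
    return Fraction(counts.get(x, 0), total)


print(p_R(5))     # 63/256
print(p_R(2.5))   # 0, since 2.5 is not in the range of R
print(sum(p_R(k) for k in range(11)))  # 1

# The tabulated values agree with the binomial coefficients.
print(all(p_R(k) == Fraction(comb(10, k), 2**10) for k in range(11)))  # True
```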
Expected value
To each discrete random variable, R, with range Y we may attempt to assign a
real number, called the expected value of R, denoted E[R], by the rule
\begin{displaymath}
E[R] = \sum_{y\in Y} y\Pr(R = y) = \sum_{y\in Y}yp_R(y).
\end{displaymath}
The problematic part of this is that the above mentioned sum might not be
convergent. When it is not, we say that R does not have an expected value.
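A short sketch of both situations, under assumptions that are not in the text: a fair six-sided die for a sum that converges, and a random variable with Pr(R = 2^k) = 2^(-k) for k = 1, 2, ... for one that does not. The function names are hypothetical.

```python
from fractions import Fraction


def expected_value(pmf):
    """Sum of y * p_R(y) over a finite range; pmf maps values to probabilities."""
    return sum(y * p for y, p in pmf.items())


# Assumed example 1: a fair die has a finite range, so E[R] always exists.
die = {face: Fraction(1, 6) for face in range(1, 7)}
print(expected_value(die))  # 7/2


def partial_sum(n):
    """First n terms of the sum of y * Pr(R = y) when Pr(R = 2**k) = 2**-k."""
    return sum(2**k * Fraction(1, 2**k) for k in range(1, n + 1))


# Assumed example 2: every term equals 1, so the partial sums grow without
# bound and R has no expected value.
print([int(partial_sum(n)) for n in (1, 10, 100)])  # [1, 10, 100]
```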
The following theorem is quite useful. It is sometimes called the Law of
the Unconscious Statistician.
Suppose that $H : \mathbf{R} \to \mathbf{R}$ is a function and that $R$
is a discrete random variable with probability mass function $p_R(y)$ and range
$Y$. Then $H(R)$ is a discrete random variable.
Furthermore, if
\begin{displaymath}
\sum_{y\in Y} |H(y)|\, p_R(y) < \infty
\end{displaymath}
then $E[H(R)]$ is a real number and
\begin{displaymath}
E[H(R)] = \sum_{y\in Y} H(y)\cdot p_R(y).
\end{displaymath}
We shall be most interested in applying this theorem in the case where
$E[R]$ is a real number and $H(y) = (y - E[R])^2$, that is, to compute the quantity
$E[(R-E[R])^2]$, which is called the variance of $R$.
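The Law of the Unconscious Statistician can be compared with the definition of expected value applied to H(R) itself. The sketch below assumes R is uniform on {-2, -1, 0, 1, 2} and H(y) = y^2, chosen here because H is not one-to-one; the helper names `expect` and `pmf_of` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def expect(H, pmf):
    """Law of the Unconscious Statistician: sum of H(y) * p_R(y) over the range."""
    return sum(H(y) * p for y, p in pmf.items())


def pmf_of(H, pmf):
    """Probability mass function of H(R), pooling the mass of equal H(y)."""
    out = defaultdict(Fraction)
    for y, p in pmf.items():
        out[H(y)] += p
    return dict(out)


# Assumed example: R uniform on {-2, -1, 0, 1, 2}.
p_R = {y: Fraction(1, 5) for y in range(-2, 3)}
square = lambda y: y * y

via_law = expect(square, p_R)                               # sum over the range of R
via_definition = expect(lambda z: z, pmf_of(square, p_R))   # sum over the range of H(R)
print(via_law, via_definition)  # 2 2

# Variance: apply the Law with H(y) = (y - E[R])**2.
mean = expect(lambda y: y, p_R)
variance = expect(lambda y: (y - mean) ** 2, p_R)
print(mean, variance)  # 0 2
```

The point of the Law is that the first computation never needs the probability mass function of H(R).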
Independent random variables
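Two (real valued) random variables $R$ and $S$ are called independent if for
any intervals $I$ and $J$
\begin{displaymath}
\Pr(R \in I \mbox{ and } S \in J) = \Pr(R \in I)\Pr(S \in J).
\end{displaymath}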
Mutually independent sets of random variables are defined analogously to
mutually independent collections of sets. An important result is that if $R$
and $S$ are independent discrete random variables and $E[R]$ and $E[S]$ are
defined, then so is $E[RS]$, and $E[RS] = E[R]E[S]$. This follows from the Law of
the Unconscious Statistician along with the observation that if $R$ and $S$ are
independent then
\begin{displaymath}
\Pr(R = r \mbox{ and } S = s) = p_R(r)\,p_S(s).
\end{displaymath}
We simply apply the Law to the vector random variable $(R,S)$ and the function
$H((R,S)) = RS$.
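The product rule can be checked numerically by applying the Law to a joint probability mass function. This sketch assumes two independent fair dice, and contrasts them with the fully dependent pair S = R; both models and all names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product


def expect(H, pmf):
    """Sum of H(v) * p(v) over the range of a (vector valued) random variable."""
    return sum(H(v) * p for v, p in pmf.items())


faces = range(1, 7)
p = {y: Fraction(1, 6) for y in faces}

# Assumed independent model: the joint mass is the product of the marginals.
joint = {(r, s): p[r] * p[s] for r, s in product(faces, faces)}

E_R = expect(lambda v: v[0], joint)
E_S = expect(lambda v: v[1], joint)
E_RS = expect(lambda v: v[0] * v[1], joint)
print(E_RS, E_R * E_S)  # 49/4 49/4

# Assumed dependent model: S = R, so all the mass sits on the diagonal.
diagonal = {(r, r): p[r] for r in faces}
E_RS_dependent = expect(lambda v: v[0] * v[1], diagonal)
print(E_RS_dependent)  # 91/6, which differs from 49/4
```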
Mean and variance of a sum
Notice that if $V$ is any vector valued random variable with $V = (V_1,V_2)$,
the components of $V$ are also random variables, and $V$ is discrete if and
only if its components are discrete. If we denote the range of the $j$'th
component by $Y_j$, then it follows from the law of total probability that
\begin{displaymath}
p_{V_1}(y_1) = \sum_{y_2 \in Y_2} p_V(y_1,y_2) \quad\mbox{and}\quad p_{V_2}(y_2) = \sum_{y_1 \in Y_1} p_V(y_1,y_2).
\end{displaymath}
It then follows from the Law of the Unconscious Statistician that if $E[V_1]$
and $E[V_2]$ are real numbers and $c$ is any constant, then
\begin{displaymath}
E[V_1 + cV_2] = E[V_1] + cE[V_2].
\end{displaymath}
Here we take $H(V) = V_1 + cV_2$.
Since any pair of random variables may be regarded as the components of a
vector valued random variable, the preceding holds for all pairs of discrete
random variables.
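Linearity needs no independence, which a small dependent example makes concrete. The joint probability mass function below is an assumed toy table, not taken from the text, and `marginal` and `expect` are hypothetical helper names.

```python
from collections import defaultdict
from fractions import Fraction


def expect(H, pmf):
    """Sum of H(v) * p(v) over the range."""
    return sum(H(v) * p for v, p in pmf.items())


def marginal(joint, j):
    """Mass function of the j'th component, by the law of total probability."""
    out = defaultdict(Fraction)
    for v, p in joint.items():
        out[v[j]] += p
    return dict(out)


# Assumed joint mass function of V = (V1, V2); the components are dependent,
# since Pr(V1 = 0 and V2 = 1) = 0 while both marginal probabilities are positive.
joint = {
    (0, 0): Fraction(1, 2),
    (1, 0): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
}

p_V1 = marginal(joint, 0)   # {0: 1/2, 1: 1/2}
p_V2 = marginal(joint, 1)   # {0: 3/4, 1: 1/4}

c = 3
lhs = expect(lambda v: v[0] + c * v[1], joint)   # H(V) = V1 + c*V2
rhs = expect(lambda y: y, p_V1) + c * expect(lambda y: y, p_V2)
print(lhs, rhs)  # 5/4 5/4
```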
If we know that the components are independent with real variances then we get
![\begin{displaymath}
\begin{array}
{rcll}
Var[V_1 + V_2]
& = &
E[((V_1 + V_2) - ...
...{\rm Add'n\;prop.\;of\;E}\ &&&{\rm for\;ind.\;rv's}\end{array}\end{displaymath}](img34.gif)
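The display above is truncated in this copy. As a hedged sketch of the standard argument, consistent with its surviving annotations but not necessarily the author's exact lines, write $\mu_j = E[V_j]$ (a shorthand introduced here). The step marked for independent random variables uses the product rule $E[RS] = E[R]E[S]$ applied to $V_1 - \mu_1$ and $V_2 - \mu_2$, together with $E[V_j - \mu_j] = 0$.

```latex
\begin{displaymath}
\begin{array}{rcll}
Var[V_1 + V_2]
& = & E[((V_1 + V_2) - E[V_1 + V_2])^2] & \\
& = & E[((V_1 - \mu_1) + (V_2 - \mu_2))^2] & {\rm Add'n\;prop.\;of\;E} \\
& = & E[(V_1 - \mu_1)^2] + 2E[(V_1 - \mu_1)(V_2 - \mu_2)] + E[(V_2 - \mu_2)^2] & {\rm Add'n\;prop.\;of\;E} \\
& = & Var[V_1] + 2E[V_1 - \mu_1]E[V_2 - \mu_2] + Var[V_2] & {\rm for\;ind.\;rv's} \\
& = & Var[V_1] + Var[V_2] &
\end{array}
\end{displaymath}
```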
Eric S Key
9/25/1998