The general theory of hypothesis testing consists of extensions of the following basic example.
I have in my hand a coin. I believe that the coin is fair, that is, that each toss of the coin is as likely to be heads as tails. I will toss the coin 100 times and record on each toss whether I see a head (H) or a tail (T). I will then use the resulting 100-tuple of H's and T's to decide if the coin is fair. In the end, I may have erred in one of two ways: I may decide that the coin is not fair when in fact it is, or I may decide that the coin is fair when in fact it is not.
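Some terminology: the hypothesis being tested (here, that the coin is fair) is called the null hypothesis. Rejecting the null hypothesis when it is true is called a Type I error, and failing to reject it when it is false is called a Type II error.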
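Let us try one obvious solution. Let X denote the 100-tuple of tosses and D(X) the number of heads in X. We know from the law of large numbers that if the tosses of the coin are independent, then D(X)/100 should be close to the probability of heads, so for a fair coin

$$ \frac{D(X)}{100} \approx \frac{1}{2}. $$

This suggests rejecting the null hypothesis when D(X) is too far from 50, that is, when

$$ |D(X) - 50| > u $$

for some threshold u still to be chosen.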
One thing we could do is limit the probability of a Type I error. For argument's sake, suppose we wish to make a Type I error no more than 5% of the time. The probability of a Type I error is the probability of rejecting the null hypothesis when it is true. When it is true, the coin is fair, and D(X) has a binomial distribution with N = 100 and p = 1/2, so we want to pick u so that

$$ P\left(|D(X) - 50| > u\right) \le 0.05. $$

One possibility is to compute this probability exactly from the binomial distribution for each candidate u.
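The exact binomial calculation can be sketched in a few lines of Python. This is only an illustration: it assumes that D(X) is the number of heads and that the test rejects when |D(X) - 50| > u, and the function names `type_one_error` and `smallest_threshold` are made up for this example.

```python
from math import comb

N = 100  # number of tosses, as in the text


def type_one_error(u, n=N):
    """Exact P(|D - n/2| > u) when D ~ Binomial(n, 1/2)."""
    # For a fair coin all 2**n sequences are equally likely, so the
    # probability is a count of sequences divided by 2**n.
    favourable = sum(comb(n, k) for k in range(n + 1) if abs(k - n / 2) > u)
    return favourable / 2**n


def smallest_threshold(alpha=0.05, n=N):
    """Smallest integer u with P(|D - n/2| > u) <= alpha."""
    for u in range(n + 1):
        if type_one_error(u, n) <= alpha:
            return u
    raise ValueError("alpha must be non-negative")


u = smallest_threshold()
print(u, type_one_error(u))
```

Under these assumptions the search stops at u = 10: rejecting when D(X) is at most 39 or at least 61 has Type I error probability of roughly 3.5%, while u = 9 would already exceed 5%.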

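The other possibility is to use the Central Limit Theorem to get an approximate value of u, and then check it. Under the null hypothesis D(X) has mean 50 and standard deviation 5, so the Central Limit Theorem tells us that

$$ \frac{D(X) - 50}{5} $$

is approximately standard normal. Writing $\Phi$ for the standard normal distribution function,

$$ P\left(|D(X) - 50| > u\right) \approx 2\left(1 - \Phi\left(\frac{u}{5}\right)\right), $$

and setting the right-hand side equal to 0.05 gives

$$ \frac{u}{5} \approx 1.96, \qquad u \approx 9.8. $$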
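The "approximate, then check" step can be illustrated with the Python standard library. The sketch below assumes the test rejects when |D(X) - 50| > u; the names `u_approx` and `exact_type_one_error` are illustrative only.

```python
from math import comb
from statistics import NormalDist

N = 100
SD = (N * 0.5 * 0.5) ** 0.5  # standard deviation of D under p = 1/2, i.e. 5

# Normal approximation: choose u so that 2 * (1 - Phi(u / SD)) = 0.05.
z = NormalDist().inv_cdf(1 - 0.05 / 2)  # about 1.96
u_approx = z * SD                       # about 9.8


def exact_type_one_error(u, n=N):
    """Exact P(|D - n/2| > u) when D ~ Binomial(n, 1/2)."""
    favourable = sum(comb(n, k) for k in range(n + 1) if abs(k - n / 2) > u)
    return favourable / 2**n


print(u_approx, exact_type_one_error(u_approx))
```

Because D(X) takes only integer values, the exact probability at the approximate cutoff comes out slightly above 5% (about 5.7%), which is why the check matters: the threshold has to be moved up to u = 10 to honour the 5% limit exactly.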
What then about the Type II error? We have been very non-specific about what the alternative to our null hypothesis is. The alternative to the null hypothesis is called the alternative hypothesis. A hypothesis (null or alternative) is called simple if it contains one element, and it is called composite if it contains more than one element.
If we have no idea at all, the alternative hypothesis would be that the probability of heads is not 1/2. Maybe we have some additional information, such as that if the probability of heads is not 1/2 then it is less than 1/2. Maybe we know that if it is not 1/2, it is 1/4. In every case the probability of making a Type II error is a function of the elements of the alternative hypothesis.
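For example, if the true probability of heads is 1/4, then D(X) is binomial with N = 100 and p = 1/4. The probability of making a Type II error if this is the case is the probability, computed under p = 1/4, that the test fails to reject the null hypothesis:

$$ P_{p = 1/4}\left(|D(X) - 50| \le u\right). $$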
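To see how the Type II error depends on the true probability of heads, here is a small Python sketch. It assumes a test that rejects when |D(X) - 50| > u with u = 10 (an assumed threshold chosen for illustration), and the helper names are hypothetical.

```python
from math import comb

N = 100
U = 10  # assumed threshold: reject the null hypothesis when |D - 50| > U


def binom_pmf(k, n, p):
    """P(D = k) when D ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def type_two_error(p, u=U, n=N):
    """P(|D - n/2| <= u) when D ~ Binomial(n, p): the test fails to reject."""
    return sum(binom_pmf(k, n, p) for k in range(n + 1) if abs(k - n / 2) <= u)


for p in (0.25, 0.40, 0.45):
    print(p, type_two_error(p))
```

With p = 1/4 the Type II error is well under 1%, but it grows quickly as p approaches 1/2: an alternative close to the null hypothesis is hard to distinguish from it with only 100 tosses.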


Let p be the probability of heads. It may have occurred to you that if the null hypothesis H0 is p = 1/2 and the alternative hypothesis H1 is p < 1/2 or p = 1/4, then we should only reject the null hypothesis if D(X) is too small, as getting 75 heads, say, is more indicative of H0 than of H1. So as you can see, the problem of designing a test can be quite complicated. One of our goals will be to develop systematic methods for finding good tests. Our first problem will be to decide what makes a test "good".
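As a rough illustration of the one-sided idea, the following Python sketch looks for the largest cutoff c such that rejecting when D(X) <= c keeps the Type I error at or below 5%. The form of the test and the names `lower_tail` and `one_sided_cutoff` are assumptions made for this example, not a method prescribed by the text.

```python
from math import comb

N = 100


def lower_tail(c, p, n=N):
    """P(D <= c) when D ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))


def one_sided_cutoff(alpha=0.05, n=N):
    """Largest c with P(D <= c) <= alpha under p = 1/2, or -1 if none exists."""
    c = -1
    while c + 1 <= n and lower_tail(c + 1, 0.5, n) <= alpha:
        c += 1
    return c


c = one_sided_cutoff()
type_one = lower_tail(c, 0.5)             # probability of rejecting a fair coin
type_two_at_quarter = 1 - lower_tail(c, 0.25)  # failing to reject when p = 1/4
print(c, type_one, type_two_at_quarter)
```

Under these assumptions the cutoff is c = 41, so the one-sided test rejects for 41 or fewer heads, whereas a two-sided test at the same level needs 39 or fewer. Spending the whole 5% on the lower tail is what makes the one-sided test more sensitive to alternatives with p < 1/2.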