Hypothesis Testing
The four steps of hypothesis testing are (1) state the hypotheses; (2)
compute the test statistic; (3) determine the P-value; and (4)
draw the appropriate conclusions.
Hypothesis testing uses data to decide whether a parameter equals the
value stated in a null hypothesis. If the data are too unusual, assuming
the null hypothesis is true, then we reject the null hypothesis.
The null hypothesis (H0) is a specific claim about a parameter.
The null hypothesis is the default hypothesis, the one assumed to be
true unless the data lead us to reject it. A good null hypothesis would
be interesting if rejected.
The alternative hypothesis (HA) usually includes all values for
the parameter other than that stated in the null hypothesis.
The test statistic is a quantity calculated from data, used to evaluate
how compatible the data are with the null hypothesis.
The null distribution is the sampling distribution of the test statistic
under the assumption that the null hypothesis is true.
The P-value is the probability of obtaining a difference from
the null expectation as great as or greater than that observed in the
data if the null hypothesis were true. If P is less than or equal to α,
then H0 is rejected.
The threshold α is called the significance level of a test. Typically, α
is set to 0.05.
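The four steps can be sketched for a single proportion. This is a minimal illustration, not a prescribed method: the data (18 successes in 25 trials), the null value p0 = 0.5, and the helper name binomial_two_sided_p are all hypothetical. The P-value is computed directly from the definition above, as the total null probability of every outcome at least as far from the null expectation as the observed one.

```python
from math import comb

def binomial_two_sided_p(successes, n, p0):
    """Total null probability of outcomes at least as far from n*p0 as observed."""
    expected = n * p0
    observed_gap = abs(successes - expected)
    total = 0.0
    for k in range(n + 1):
        # Small tolerance guards against floating-point ties.
        if abs(k - expected) >= observed_gap - 1e-12:
            total += comb(n, k) * p0**k * (1 - p0) ** (n - k)
    return total

# Step 1: state the hypotheses. H0: p = 0.5; HA: p != 0.5 (two-sided).
p0 = 0.5
alpha = 0.05  # significance level

# Step 2: compute the test statistic (hypothetical data, for illustration only).
n_trials = 25
observed_successes = 18

# Step 3: determine the P-value from the null (binomial) distribution.
p_value = binomial_two_sided_p(observed_successes, n_trials, p0)

# Step 4: draw the conclusion.
reject_h0 = p_value <= alpha
print(f"P = {p_value:.4f}, reject H0: {reject_h0}")
```

With these made-up numbers, P is about 0.043, so H0 would be rejected at α = 0.05.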
The P-value is not the probability that the null hypothesis is
true or false.
The P-value reflects the weight of evidence against the null
hypothesis, but P does not measure the size of the effect. Use confidence
intervals to put bounds on the magnitude of the effect.
A Type I error is rejecting a true null hypothesis. A Type II error is
failing to reject a false null hypothesis.
The probability of making a Type I error is set by the significance
level, α. If α = 0.05, then the probability of making a Type I
error is 0.05.
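A simulation can make the link between α and the Type I error rate concrete. The sketch below assumes a setting chosen only for illustration: a two-sided z-test on normally distributed data with known standard deviation, with H0 true in every simulated experiment. The function name z_test_p_value, the sample size, and the seed are assumptions, not part of the text.

```python
import random
from statistics import NormalDist

def z_test_p_value(sample, mu0, sigma):
    """Two-sided P-value for H0: mean = mu0, assuming a known sigma."""
    n = len(sample)
    mean = sum(sample) / n
    z = (mean - mu0) / (sigma / n**0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

rng = random.Random(1)   # fixed seed so the run is repeatable
alpha = 0.05
n_experiments = 20000
sample_size = 20

rejections = 0
for _ in range(n_experiments):
    # Data are generated with H0 true (true mean equals mu0 = 0).
    sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
    if z_test_p_value(sample, mu0=0.0, sigma=1.0) <= alpha:
        rejections += 1  # rejecting a true H0 is a Type I error

type_i_rate = rejections / n_experiments
print(f"Observed Type I error rate: {type_i_rate:.3f}")
```

The observed rate should land close to 0.05. For tests with discrete null distributions, the actual rate can be somewhat below α.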
The power of a test is the probability that a random sample, when
analyzed, leads to rejection of a false null hypothesis.
Increasing sample size increases the power of a test.
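As a rough illustration of how power grows with sample size, the sketch below computes the power of a two-sided z-test with known standard deviation. The test, the true shift of 0.5 standard deviations, and the name z_test_power are assumptions made for the example.

```python
from statistics import NormalDist

def z_test_power(true_shift, n, sigma=1.0, alpha=0.05):
    """Probability that a two-sided z-test rejects H0 when the true mean
    differs from the null value by true_shift (sigma assumed known)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift_in_se = true_shift * n**0.5 / sigma  # shift in standard-error units
    upper_tail = 1 - std_normal.cdf(z_crit - shift_in_se)
    lower_tail = std_normal.cdf(-z_crit - shift_in_se)
    return upper_tail + lower_tail

sample_sizes = [10, 40, 160]
powers = [z_test_power(0.5, n) for n in sample_sizes]
for n, power in zip(sample_sizes, powers):
    print(f"n = {n:3d}: power = {power:.3f}")
```

Under these assumptions, power rises from roughly 0.35 at n = 10 to nearly 1 at n = 160. When the true shift is zero, the same formula returns α, the Type I error rate.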
Failure to reject the null hypothesis is usually inconclusive about
whether the hypothesis is true or false.
In a two-sided test, the alternative hypothesis includes parameter
values on both sides of the parameter value stated by the null
hypothesis. In a one-sided test, the alternative hypothesis includes
parameter values on only one side of the parameter value stated by the
null hypothesis.
Most hypothesis tests are two-sided. One-sided tests should be
restricted to rare instances in which a parameter value on one side of
the null value is inconceivable.
When the results are given for a hypothesis test, they should be
accompanied by a confidence interval for the relevant parameter,
whenever possible.
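One way to report an interval alongside a test of a proportion is sketched below. It uses the Wilson score interval, which is a choice made here for illustration and not a method named in the text. The data (18 successes in 25 trials) and the function name wilson_interval are hypothetical.

```python
from statistics import NormalDist

def wilson_interval(successes, n, confidence=0.95):
    """Wilson score confidence interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * (p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) ** 0.5 / denom
    return center - half_width, center + half_width

# Hypothetical data: 18 successes in 25 trials.
ci_low, ci_high = wilson_interval(18, 25)
print(f"Estimate = {18 / 25:.2f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

For these numbers the interval runs from about 0.52 to 0.86. It excludes a null value of 0.5 and, unlike a P-value, also shows the range of plausible effect sizes.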