Hypothesis Testing

The four steps of hypothesis testing are (1) state the hypotheses; (2)
compute the test statistic; (3) determine the P-value; and (4)
draw the appropriate conclusions.

Hypothesis testing uses data to decide whether a parameter equals the
value stated in a null hypothesis. If the data are too unusual, assuming
the null hypothesis is true, then we reject the null hypothesis.

The null hypothesis (H0) is a specific claim about a parameter.

The null hypothesis is the default hypothesis, the one assumed to be
true unless the data lead us to reject it. A good null hypothesis would
be interesting if rejected.

The alternative hypothesis (HA) usually includes all values for
the parameter other than that stated in the null hypothesis.

The test statistic is a quantity calculated from data, used to evaluate
how compatible the data are with the null hypothesis.

The null distribution is the sampling distribution of the test statistic
under the assumption that the null hypothesis is true.

The P-value is the probability of obtaining a difference from
the null expectation as great as or greater than that observed in the
data if the null hypothesis were true. If P is less than or equal to α,
then H0 is rejected.
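
As a rough illustration of these definitions, the sketch below builds a null distribution and computes an exact two-sided P-value for a binomial test, using only the Python standard library. The data (14 successes in 18 trials), the null proportion of 0.5, and the function names are assumptions chosen for illustration. The two-sided P-value is obtained here by doubling the smaller tail probability, which is one common convention among several.

```python
from math import comb

def binomial_null_distribution(n, p0):
    """Return Pr[X = k] for k = 0..n, assuming the null proportion p0 is true."""
    return [comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(n + 1)]

def two_sided_p_value(successes, n, p0=0.5):
    """Two-sided P-value: twice the smaller tail probability, capped at 1."""
    null = binomial_null_distribution(n, p0)
    upper_tail = sum(null[successes:])        # Pr[X >= observed]
    lower_tail = sum(null[:successes + 1])    # Pr[X <= observed]
    return min(1.0, 2 * min(upper_tail, lower_tail))

alpha = 0.05                                  # significance level
p_value = two_sided_p_value(14, 18)           # hypothetical data
reject_h0 = p_value <= alpha
print(f"P = {p_value:.4f}, reject H0: {reject_h0}")
# prints: P = 0.0309, reject H0: True
```

The test statistic in this sketch is the number of successes, and the list returned by binomial_null_distribution is its null distribution.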

The threshold α is called the significance level of a test. Typically, α
is set to 0.05.

The P-value is not the probability that the null hypothesis is
true or false.

The P-value reflects the weight of evidence against the null
hypothesis, but P does not measure the size of the effect. Use
confidence intervals to put bounds on the magnitude of effect.

A Type I error is rejecting a true null hypothesis. A Type II error is
failing to reject a false null hypothesis.

The probability of making a Type I error is set by the significance
level, α. If α = 0.05, then the probability of rejecting the null
hypothesis when it is true is 0.05.
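
One way to see this is by simulation. The sketch below repeatedly draws samples from a population in which the null hypothesis is true and counts how often a test rejects it. The two-sided z-test with known standard deviation, the sample size, the number of trials, and all function names are assumptions made for illustration. The estimated rejection rate should land close to α.

```python
import random
from statistics import NormalDist, mean

def z_test_p_value(sample, mu0, sigma):
    """Two-sided P-value for H0: mean = mu0, with known standard deviation sigma."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def type_i_error_rate(alpha=0.05, n=20, trials=20000, seed=1):
    """Fraction of simulated samples, drawn with H0 true, in which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Every sample comes from a population whose mean equals the null value 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if z_test_p_value(sample, mu0=0.0, sigma=1.0) <= alpha:
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```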

The power of a test is the probability that a random sample, when
analyzed, leads to rejection of a false null hypothesis.

Increasing sample size increases the power of a test.
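
The relationship between sample size and power can be sketched for a simple case. The example below assumes a two-sided z-test with known standard deviation and a true mean that differs from the null value by a fixed amount. The effect size, standard deviation, sample sizes, and function name are hypothetical choices for illustration only.

```python
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-sided z-test when the true mean differs from the null by `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # critical value, about 1.96 for alpha = 0.05
    shift = effect * n ** 0.5 / sigma            # true effect in standard-error units
    # Probability that the test statistic lands in either rejection region.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

sample_sizes = [10, 20, 40, 80]
powers = [z_test_power(effect=0.5, sigma=1.0, n=n) for n in sample_sizes]
for n, power in zip(sample_sizes, powers):
    print(f"n = {n:3d}: power = {power:.3f}")
```

With the effect set to zero the null hypothesis is true, and the same formula returns α, the Type I error rate.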

Failure to reject the null hypothesis is usually inconclusive about
whether the hypothesis is true or false.

In a two-sided test, the alternative hypothesis includes parameter
values on both sides of the parameter value stated by the null
hypothesis. In a one-sided test, the alternative hypothesis includes
parameter values on only one side of the parameter value stated by the
null hypothesis.
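
The difference between the two kinds of test shows up in how the P-value is computed. The sketch below assumes a test statistic that follows a standard normal distribution under the null hypothesis. The value z = 1.8 and the function name are hypothetical.

```python
from statistics import NormalDist

def p_values_from_z(z):
    """Return two-sided, upper one-sided, and lower one-sided P-values for a z statistic."""
    cdf = NormalDist().cdf
    return {
        "two_sided": 2 * (1 - cdf(abs(z))),   # HA: parameter differs from the null value
        "greater": 1 - cdf(z),                # HA: parameter is greater than the null value
        "less": cdf(z),                       # HA: parameter is less than the null value
    }

result = p_values_from_z(1.8)   # hypothetical test statistic
for alternative, p in result.items():
    print(f"{alternative:>9}: P = {p:.4f}")
```

With this statistic the upper one-sided P-value falls below 0.05 while the two-sided P-value does not, so the choice between the two kinds of test needs to be made before looking at the data.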

Most hypothesis tests are two-sided. One-sided tests should be
restricted to rare instances in which a parameter value on one side of
the null value is inconceivable.

When the results are given for a hypothesis test, they should be
accompanied by a confidence interval for the relevant parameter,
whenever possible.
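
As one possible illustration, the sketch below computes an approximate confidence interval for a proportion that could be reported alongside a test of that proportion. The Agresti-Coull method is an assumption made here, not a method named in the text, and the data (14 successes in 18 trials) and the function name are hypothetical.

```python
from statistics import NormalDist

def agresti_coull_interval(successes, n, confidence=0.95):
    """Approximate confidence interval for a proportion (Agresti-Coull method)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_adj = n + z**2                            # adjusted sample size
    p_adj = (successes + z**2 / 2) / n_adj      # adjusted proportion estimate
    half_width = z * (p_adj * (1 - p_adj) / n_adj) ** 0.5
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)

lower, upper = agresti_coull_interval(14, 18)   # hypothetical data
print(f"95% CI for the proportion: {lower:.3f} to {upper:.3f}")
```

Unlike a P-value alone, the interval shows the range of parameter values that remain plausible given the data.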