Purpose of a test of significance
To decide whether an observed sample effect could plausibly be due to chance alone or is good evidence of a real effect in the population.
Playground logic of significance tests
If an outcome would be very unlikely when a claim is true, observing that outcome is good evidence the claim is not true.
Population vs sample in tests
Statistical tests use sample data to make inferences about claims concerning population parameters (like population proportions or means).
Null hypothesis (H0) - definition
The claim being tested; usually a statement of "no effect" or "no difference," stated in terms of a population parameter.
Alternative hypothesis (Ha) - definition
The statement we hope or suspect is true instead of H0; it describes the presence of an effect or a difference in the population.
Null hypothesis - notation and example
Null hypothesis is written H0; for the coffee study, H0: p = 0.5, where p is the population proportion preferring fresh coffee.
Alternative hypothesis - notation and example
Alternative hypothesis is written Ha; for the coffee study, Ha: p > 0.5, meaning a majority of coffee drinkers prefer fresh coffee.
One-sided alternative - definition
An alternative hypothesis that specifies a direction, such as Ha: p > p0 or Ha: p < p0.
Two-sided alternative - definition
An alternative hypothesis that states only that the parameter differs from the null value, such as Ha: p ≠ p0.
Example - one-sided alternative
In the coffee preference example, Ha: p > 0.5 is one‑sided because it looks only for a proportion greater than 0.5.
Example - two-sided alternative
In the working‑through‑college example, Ha: p ≠ 0.7 is two‑sided because it allows for proportions either higher or lower than 0.7.
P-value - definition (words)
The probability, assuming H0 is true, of getting a sample result as extreme or more extreme (in the direction specified by Ha) than the one actually observed.
Interpreting P-value size
The smaller the P-value, the stronger the evidence against H0 provided by the data.
P-value example - coffee study
If H0: p = 0.5 is true, the probability of getting 36 or more of 50 people preferring fresh coffee is about 0.001, a very small P-value.
Conclusion from small P-value - coffee
A P-value around 0.001 is strong evidence that a majority of the population prefers fresh coffee (evidence against H0 in favor of Ha).
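The coffee-study P-value can be checked numerically. The sketch below is not taken from the cards: it assumes the 50 responses are independent, and it computes the upper-tail probability two ways, exactly from the binomial distribution and with a large-sample Normal approximation. The function names are illustrative.

```python
from math import comb, erfc, sqrt

def binomial_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def normal_upper_tail(z):
    """P(Z >= z) for a standard Normal Z."""
    return 0.5 * erfc(z / sqrt(2))

# Coffee study: 36 of 50 prefer fresh coffee; H0: p = 0.5, Ha: p > 0.5
n, successes, p0 = 50, 36, 0.5
p_hat = successes / n

# Standard score of the sample proportion, using the null value p0
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

p_exact = binomial_upper_tail(successes, n, p0)
p_normal = normal_upper_tail(z)
print(f"z = {z:.2f}, exact P = {p_exact:.4f}, Normal approx P = {p_normal:.4f}")
```

Both calculations land near 0.001, in line with the value on the card.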
P-value example - working through college
With H0: p = 0.7 and a sample of 325 students (238 working), the P-value is about 0.19.
Conclusion from moderate P-value - working
Since P = 0.19 is not small, we cannot reject the claim that 70% of college students work; data are consistent with H0.
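A two-sided P-value for the working-through-college example can be sketched the same way. This assumes a large-sample Normal approximation, which the cards do not spell out for this example. It gives a value near 0.2; the small gap from the card's 0.19 presumably comes from rounding in the original calculation, which is an assumption here.

```python
from math import erfc, sqrt

# Working-through-college study: 238 of 325 students work; H0: p = 0.7, Ha: p != 0.7
n, working, p0 = 325, 238, 0.7
p_hat = working / n

# Standard error of the sample proportion when H0 is true
se = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se

# Two-sided P-value: area in both tails beyond |z|, i.e. 2 * P(Z >= |z|)
p_two_sided = erfc(abs(z) / sqrt(2))
print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, two-sided P = {p_two_sided:.2f}")
```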
Significance level (α) - definition
A threshold chosen in advance that specifies how small the P-value must be for us to consider the result statistically significant.
Common significance levels
Typical choices are α = 0.05 (5%) and α = 0.01 (1%); α = 0.05 requires evidence so strong it would occur by chance at most 5% of the time if H0 is true.
Statistically significant - definition
A result is statistically significant at level α if the P-value is less than or equal to α.
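The decision rule is a single comparison. A minimal sketch, with a hypothetical helper name, applied to the P-values quoted on the coffee and working-through-college cards:

```python
def is_significant(p_value, alpha=0.05):
    """Return True when the result is statistically significant at level alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    return p_value <= alpha

# P-values as quoted on the cards (approximate)
examples = {"coffee": 0.001, "working through college": 0.19}

for name, p in examples.items():
    for alpha in (0.05, 0.01):
        verdict = "significant" if is_significant(p, alpha) else "not significant"
        print(f"{name}: P = {p} is {verdict} at alpha = {alpha}")
```

Note that a P-value exactly equal to α counts as significant under this definition.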
"Significant" in statistics vs everyday use
In statistics, "significant" means "unlikely to occur just by chance under H0," not "important" or "practically large."
Interpreting P-value cutoffs
Rough guidelines: P < 0.10 shows some evidence; P < 0.05 is moderate evidence; P < 0.01 is strong evidence against H0.
Role of software in tests
In practice, software computes P-values for us; reports often present the P-value so readers can judge significance at different α levels.
Test statistic - definition
A standardized value computed from sample data (often a standard score) used to measure how far the sample result is from the null hypothesis value.
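The cards do not give a formula, but for a test about a population proportion a common large-sample form of the test statistic is the standard score of the sample proportion. Here $\hat{p}$ (the sample proportion) and $n$ (the sample size) are notation introduced for illustration, and the numbers are those of the coffee study:

```latex
z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 (1 - p_0)}{n}}},
\qquad
z_{\text{coffee}} = \frac{0.72 - 0.5}{\sqrt{\dfrac{0.5 \times 0.5}{50}}} \approx 3.11
```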
Test statistic and Normal distribution
For large samples, many tests use a Normal distribution for the test statistic and compute P-values as areas under the Normal curve.
Example - pregnancy length test
A test found a test statistic of about −4.85, giving a P-value less than 0.0003, strong evidence that mean pregnancy length is less than 280 days.
Conclusion from very small P-value - pregnancies
A P-value below 0.0003 strongly suggests that the true mean length of a healthy human pregnancy is less than 280 days.
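The bound quoted for the pregnancy example can be checked against the Normal curve. A minimal sketch, assuming the test statistic of −4.85 is a standard Normal score and the alternative is one-sided (mean less than 280 days):

```python
from math import erfc, sqrt

def normal_lower_tail(z):
    """P(Z <= z) for a standard Normal Z."""
    return 0.5 * erfc(-z / sqrt(2))

z = -4.85  # test statistic reported on the card
p_value = normal_lower_tail(z)

print(f"P(Z <= {z}) = {p_value:.2e}")
print("Below 0.0003:", p_value < 0.0003)
```

The computed area is far smaller than 0.0003, so it is consistent with the card's "less than 0.0003".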
One-sided vs two-sided P-values
For one-sided Ha, "extreme" refers to one tail of the sampling distribution; for two-sided Ha, "extreme" includes both tails.
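To see how the choice of alternative changes which tail counts as "extreme", the sketch below evaluates one test statistic under all three alternatives. It assumes a standard Normal test statistic; the function name, the `alternative` labels, and the value z = 1.27 are illustrative choices rather than anything from the cards.

```python
from math import erfc, sqrt

def normal_p_value(z, alternative):
    """P-value for a standard Normal test statistic z.

    alternative: "greater" (upper tail), "less" (lower tail),
    or "two-sided" (both tails beyond |z|).
    """
    upper = 0.5 * erfc(z / sqrt(2))  # P(Z >= z)
    if alternative == "greater":
        return upper
    if alternative == "less":
        return 1.0 - upper
    if alternative == "two-sided":
        return erfc(abs(z) / sqrt(2))  # 2 * P(Z >= |z|)
    raise ValueError(f"unknown alternative: {alternative!r}")

z = 1.27  # hypothetical test statistic
p_values = {alt: normal_p_value(z, alt) for alt in ("greater", "less", "two-sided")}
for alt, p in p_values.items():
    print(f"Ha {alt}: P = {p:.3f}")
```

For a positive z, the two-sided P-value is twice the upper-tail value, which is why the same data give weaker evidence against H0 under a two-sided alternative.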
Link between confidence intervals and tests
Confidence intervals estimate parameters; significance tests assess evidence for or against specific parameter values (often the center of such intervals).
Key question answered by tests
Could the sample effect be a chance accident, or is it good evidence of a real effect in the population?
Statistics in summary - P-values
The P-value is the probability, assuming H0 is true, of seeing a sample outcome as extreme as or more extreme than what we observed, in the direction specified by Ha.
Statistics in summary - significance at 5% level
A sample result is statistically significant at the 0.05 level if it would occur by chance no more than 5% of the time when H0 is true.