Study Notes on Null Hypothesis Significance Testing (NHST)
Introduction to NHST
NHST is a method of statistical inference used to test an alternative hypothesis against a null hypothesis, which assumes no effect.
The test is based on the data observed in the study and on a decision rule specified before the data are examined.
Preview of NHST Discussion
Logic of Null Hypothesis Significance Testing
Steps in Null Hypothesis Significance Testing
Understanding NHST
The concept is framed around the idea that we never “prove” a hypothesis.
Instead, we either:
Find enough evidence against the null hypothesis (H0) to reject it, or
Fail to reject the null hypothesis (H0)
We select a conventional probability threshold (the significance level, α) that reflects how comfortable we are with the chance of incorrectly rejecting H0.
Commonly set at 5%.
A test statistic that falls beyond the critical cutoff is termed statistically significant.
Types of Tests in NHST
One-tailed vs. Two-tailed Tests
Two-tailed test: Evaluates both ends of the distribution.
Significance level: α is split evenly between the two tails (e.g., α = 0.05 puts 0.025 in each tail).
Example critical points (z) for a two-tailed test at the 95% confidence level (α = 0.05):
-1.96 (lower tail)
1.96 (upper tail)
One-tailed test: Focuses only on one tail of the distribution.
Example significance level: 0.05
Critical point: z = 1.645 for a directional test of an increase (-1.645 for a decrease).
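The cutoffs above can be reproduced from the standard normal distribution. The following is a minimal sketch using only Python's standard library; the helper name z_critical is an illustrative choice, not something from the notes.

```python
from statistics import NormalDist

def z_critical(alpha: float, two_tailed: bool) -> float:
    """Return the positive critical z value for a given significance level."""
    # A two-tailed test splits alpha evenly between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    # inv_cdf gives the z value with the requested area to its left.
    return NormalDist().inv_cdf(1 - tail_area)

two_tailed_cut = z_critical(0.05, two_tailed=True)   # about 1.96
one_tailed_cut = z_critical(0.05, two_tailed=False)  # about 1.645
print(f"two-tailed: ±{two_tailed_cut:.3f}, one-tailed: {one_tailed_cut:.3f}")
```

The one-tailed cutoff is smaller because the whole 5% sits in a single tail, so a directional test needs a less extreme statistic to reach significance.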
Example: Single-Sample t-test
For the GRE Verbal test:
Population mean (μ) = 465 (based on 1.2 million test takers)
Sample mean (M) for COM Grad student applicants in Fall 2019:
M = 551, SD = 92, N = 26 applicants
Research Question: Are COM students applying to our graduate program representative of the larger population of GRE test takers?
Steps in Hypothesis Testing
Formulate your research hypothesis: A tentative statement about a relationship between two or more variables.
Research hypothesis example (H1): Students applying to the COM graduate program have higher GRE scores than the general population.
Formulation: H1: M > μ, indicating COM applicants are not representative of GRE test takers.
Formulate the null hypothesis (H0): Represents the hypothesis of no difference or no association.
H0 example: M = μ or μ - M = 0.
Implication: Any observed difference is simply sampling error.
Assess the probability of observing a sample with N = 26 at M = 551 from a population with μ = 465.
Select the appropriate inferential statistic: Select a test based on your hypothesis (in this case, a one-sample t-test).
Calculate the inferential statistic: Use the formula to compute the t-statistic.
Example calculation:
Standard Error of the Mean (SEM): SEM = \frac{SD}{\sqrt{N-1}} = \frac{92}{\sqrt{26-1}} = \frac{92}{5} = 18.40
t statistic: t = \frac{M - μ}{SEM} = \frac{551 - 465}{18.40} = \frac{86}{18.40} = 4.67
Note: A positive t value means the sample mean is above the population mean (M > μ); a negative t value means it is below (M < μ).
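The arithmetic for this example can be checked with a short script. This is a sketch that follows the SD / sqrt(N - 1) form of the standard error used in these notes, which assumes SD was computed with N in the denominator. If SD was computed with N - 1, the standard error would instead be SD / sqrt(N), giving roughly SEM = 18.04 and t = 4.77. The function name one_sample_t is hypothetical.

```python
import math

def one_sample_t(sample_mean: float, pop_mean: float, sd: float, n: int):
    """Return (sem, t) using the SD / sqrt(N - 1) form of the standard error."""
    # Assumption: sd was computed with N in the denominator, so the
    # standard error divides by sqrt(N - 1), as in the notes.
    sem = sd / math.sqrt(n - 1)
    t = (sample_mean - pop_mean) / sem
    return sem, t

sem, t_stat = one_sample_t(sample_mean=551, pop_mean=465, sd=92, n=26)
print(f"SEM = {sem:.2f}, t = {t_stat:.2f}")  # SEM = 18.40, t = 4.67
```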
Determine degrees of freedom (df): For a one-sample t-test, df = N - 1.
Example: 26 - 1 = 25.
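Select a level of statistical significance (α): This is the probability we are willing to accept of rejecting the null hypothesis when it is actually true (a Type I error). It is not the probability that the null hypothesis is true given the data.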
Conventional thresholds include:
p < 0.05: Observed difference is significant at this level.
Interpretation: A difference this size is unlikely to occur due to chance alone.
p < 0.05 implies that a sample mean at least as far from 465 as the observed 551 would be unlikely if the sample came from a population with μ = 465.
Determine Critical Value: This is the cut-off point in the sampling distribution. It marks the threshold for statistical significance.
For instance, at df = 25 and α = 0.05, look up the critical t-value in a t-table.
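A t-table lookup can also be approximated numerically. The sketch below uses only the standard library: it integrates the Student's t density with Simpson's rule and finds the cutoff by bisection. The function names, step count, and search bounds are illustrative assumptions. In practice a statistics library would normally be used instead (for example, scipy.stats.t.ppf, if SciPy is available).

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """P(T <= x) for x >= 0, via Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The distribution is symmetric, so half the area lies below zero.
    return 0.5 + total * h / 3

def t_critical(alpha: float, df: int, two_tailed: bool) -> float:
    """Positive critical t value, found by bisection on the CDF."""
    target = 1 - (alpha / 2 if two_tailed else alpha)
    lo, hi = 0.0, 50.0  # assumed search bounds, wide enough for common cases
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

cv_two = t_critical(0.05, df=25, two_tailed=True)    # about 2.06
cv_one = t_critical(0.05, df=25, two_tailed=False)   # about 1.71
print(f"two-tailed cv = {cv_two:.3f}, one-tailed cv = {cv_one:.3f}")
```

Both values are larger than the corresponding z cutoffs (1.96 and 1.645) because the t distribution has heavier tails at small df.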
Compare test statistics to critical value: Make decisions about the hypotheses.
If |t| > cv: the difference is statistically significant (for a one-tailed test, t must also fall in the predicted direction).
If |t| ≤ cv: the difference is not statistically significant.
Example comparison:
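t = 4.67, critical value (cv) = 2.06 (the two-tailed value at α = 0.05 and df = 25; the one-tailed value is 1.708).
Because 4.67 > 2.06, H0 is rejected.
p < 0.05 means a difference this large would occur less than 5% of the time if H0 were true, which supports H1. It is not the probability that H0 is true.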
Ensure the expected direction aligns with the hypothesis.
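The comparison step can be written as a small decision rule. This is a sketch only; the function name nhst_decision and the returned strings are illustrative, and the one-tailed branch assumes the hypothesis predicts an increase (M > μ).

```python
def nhst_decision(t_stat: float, critical_value: float, two_tailed: bool = True) -> str:
    """Compare a t statistic with a critical value and report the decision."""
    # Two-tailed: either direction counts, so compare the absolute value.
    # One-tailed (upper): only a positive t beyond the cutoff counts.
    observed = abs(t_stat) if two_tailed else t_stat
    if observed > critical_value:
        return "reject H0"
    return "fail to reject H0"

print(nhst_decision(4.67, 2.06))                      # reject H0
print(nhst_decision(1.50, 2.06))                      # fail to reject H0
print(nhst_decision(-4.67, 1.708, two_tailed=False))  # fail to reject H0
```

The last call shows why the direction check matters: a large t in the wrong direction does not support a one-tailed hypothesis of an increase.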
Conceptual Understanding of Statistical Significance
Reflect on what it means for the difference between M and μ to be statistically significant.
Consider whether we can be certain that UM grad students are representative of GRE test takers.
Discuss the comparison between a one-sample t-test and a z-score, focusing on similarities and differences in application and interpretation of results.