Notes on Type I/II Errors, Null Hypothesis, and the Normal Distribution

Type I and Type II Errors

  • Key concepts: Hypotheses, decision rules, and error types in hypothesis testing.
  • Null hypothesis (H0) and alternative hypothesis (Ha).
  • Type I error: rejecting H0 when H0 is true (false positive).
    • Denoted by \alpha = P(\text{reject } H0 \mid H0 \text{ true}).
  • Type II error: failing to reject H0 when Ha is true (false negative).
    • Denoted by \beta = P(\text{fail to reject } H0 \mid Ha \text{ true}).
  • Power of the test: probability of correctly rejecting H0 when Ha is true.
    • \text{Power} = 1 - \beta = P(\text{reject } H0 \mid Ha \text{ true}).
  • Example (from transcript):
    • H0: The treatment does not reduce blood pressure.
    • Ha: The treatment does reduce blood pressure.
    • A Type II error occurs if we fail to reject H0 even though the treatment really does reduce BP.
  • “Acceptance” rate and Type II error:
    • The probability of accepting H0 when Ha is true equals the Type II error rate, i.e., \beta.
  • Trade-offs and design considerations:
    • Lowering the significance level \alpha (making it harder to reject H0) increases \beta for a fixed sample size.
    • Increasing the sample size can reduce \beta (and thus increase power) at a fixed \alpha, or allow both error rates to be kept low.
  • Quick practical notes:
    • The transcript mentions finding the percentage of “acceptances” to relate to Type II error; conceptually, this is the acceptance rate under the alternative, i.e., (\beta).
    • Resource: online PDFs with definitions and examples of the null hypothesis are worth typing up and studying for the null and its interpretations.
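The error rates above can be checked by simulation. The sketch below is a hypothetical one-sided z-test with an assumed known \sigma, an assumed true effect of 0.5 under Ha, and n = 25 per experiment; the empirical rejection rate under H0 approximates \alpha, and the rejection rate under Ha approximates power (so 1 minus it approximates \beta).

```python
import random
import statistics

random.seed(42)

ALPHA = 0.05      # significance level (Type I error rate we allow)
Z_CRIT = 1.645    # one-sided critical value for alpha = 0.05
N = 25            # sample size per simulated experiment (assumed)
TRIALS = 10_000   # number of simulated experiments
SIGMA = 1.0       # known population standard deviation (assumed)

def one_sided_z_test(sample, mu0=0.0):
    """Return True if we reject H0: mu = mu0 in favor of Ha: mu > mu0."""
    z = (statistics.mean(sample) - mu0) / (SIGMA / N ** 0.5)
    return z > Z_CRIT

# Under H0 (true mean = 0): any rejection is a Type I error.
type1 = sum(one_sided_z_test([random.gauss(0.0, SIGMA) for _ in range(N)])
            for _ in range(TRIALS)) / TRIALS

# Under Ha (true mean = 0.5): failing to reject is a Type II error.
rejections = sum(one_sided_z_test([random.gauss(0.5, SIGMA) for _ in range(N)])
                 for _ in range(TRIALS)) / TRIALS
type2 = 1 - rejections   # beta
power = rejections       # 1 - beta

print(f"empirical alpha ~ {type1:.3f}")   # near 0.05
print(f"empirical beta  ~ {type2:.3f}")
print(f"empirical power ~ {power:.3f}")
```

Rerunning with a larger N shrinks the empirical \beta while \alpha stays near 0.05, which is the trade-off the bullets describe.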

Null Hypothesis and Decision Making

  • Null hypothesis (H0) is often the statement of no effect or no difference; it is the default assumption.
    • The transcript notes: null is usually the negative.
  • Alternative hypothesis (Ha) expresses the presence of an effect or a difference.
  • Decision framework:
    • Choose a significance level (\alpha) (probability of a Type I error).
    • Collect data and compute a test statistic.
    • Decide to reject H0 or fail to reject H0 based on the p-value or critical region.
  • Language of decision:
    • It is more precise to say we "fail to reject H0" rather than "accept H0"; some sources loosely say "accept H0" when there is not enough evidence against it.
  • In relation to the transcript:
    • The null hypothesis is linked to the idea of a negative statement (e.g., no effect), and statistical tests provide a framework for deciding whether the evidence is strong enough to reject H0.
  • Practical tips:
    • Understanding H0 and Ha helps in designing experiments (sample size, power) and in interpreting results (significance vs. practical importance).
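The decision framework above can be walked through numerically. The numbers here are hypothetical (H0: \mu = 120, a sample of n = 36 with mean 117 and assumed known \sigma = 9); the test statistic is standardized and its one-sided p-value compared to \alpha.

```python
import math

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Hypothetical numbers: H0 says mu = 120 (no reduction), and a sample
# of n = 36 patients has mean 117 with known sigma = 9.
mu0, xbar, sigma, n = 120.0, 117.0, 9.0, 36
alpha = 0.05

z = (xbar - mu0) / (sigma / math.sqrt(n))   # test statistic
p_value = phi(z)                            # one-sided test, Ha: mu < mu0

decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"z = {z:.2f}, p = {p_value:.4f} -> {decision}")
```

Here z = -2, p ≈ 0.023 < 0.05, so we reject H0; with a smaller sample the same mean difference might not clear the threshold.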

The Normal Distribution (StatQuest overview)

  • What is the normal distribution?

    • A normal (Gaussian) distribution is the bell-shaped, symmetric distribution often used to model many natural phenomena.
    • It is centered around its mean; the mean is the location parameter that places the distribution on the axis.
    • The y-axis represents the probability density (relative likelihood) of observing a value at x; areas under the curve correspond to probabilities.
  • Visual intuition using height data (as in the transcript):

    • Heights are spread around a central average with most people near the mean and fewer people very short or very tall.
    • Two visual examples given:
      • Babies: mean roughly \mu = 20 inches.
      • Adults: mean roughly \mu = 70 inches.
    • The baby height distribution is overall narrower (smaller spread) than the adult height distribution.
  • Center and width:

    • Centered on the mean: the peak is at (x = \mu).
    • Width is defined by the standard deviation (\sigma): larger (\sigma) = wider spread; smaller (\sigma) = tighter cluster.
    • Transcript values:
      • Babies: mean \mu = 20 inches, standard deviation \sigma = 0.6 inches.
      • Adults: mean \mu = 70 inches, standard deviation \sigma = 4 inches.
  • 95% rule (as stated in the transcript):

    • About 95% of measurements fall between (\mu - 2\sigma) and (\mu + 2\sigma).
    • In formula: P(|X - \mu| \le 2\sigma) \approx 0.95 when (X\sim N(\mu, \sigma^2)).
    • Exact (standard normal) reference: if (Z = \frac{X - \mu}{\sigma} \sim N(0,1)), then P(-2 \le Z \le 2) \approx 0.9545.
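The exact value quoted above is easy to verify with the standard library alone, writing the standard normal CDF \Phi in terms of the error function (\Phi(x) = \tfrac12(1 + \mathrm{erf}(x/\sqrt{2}))):

```python
import math

def phi(x):
    """Standard normal CDF, Phi(x) = P(Z <= x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# The "95% rule": P(-2 <= Z <= 2) for Z ~ N(0, 1).
p_within_2sd = phi(2.0) - phi(-2.0)
print(f"P(-2 <= Z <= 2) = {p_within_2sd:.4f}")   # ~ 0.9545
```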
  • Common Normal distribution formulas:

    • Probability density function: f(x|\mu,\sigma^2) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)
    • Standardization to the standard normal: Z = \frac{X - \mu}{\sigma}, \quad Z \sim N(0,1)
    • Probability for an interval via CDF: P(a \le X \le b) = \Phi\left(\frac{b - \mu}{\sigma}\right) - \Phi\left(\frac{a - \mu}{\sigma}\right) where (\Phi) is the standard normal CDF.
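The standardization and interval formulas above can be combined into one helper. Using the transcript's adult parameters (\mu = 70, \sigma = 4), the interval 66 to 74 inches is \mu \pm 1\sigma, so the result should land near the familiar 68% figure:

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_interval(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ N(mu, sigma^2) via standardization."""
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

# Adult heights from the transcript: mu = 70 in, sigma = 4 in.
p = normal_interval(66, 74, mu=70, sigma=4)
print(f"P(66 <= X <= 74) = {p:.4f}")   # ~ 0.6827
```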
  • Key concepts and implications:

    • Normal distributions are a foundational model in statistics because of the central limit theorem and real-world approximations.
    • The narrower the distribution (smaller (\sigma)), the more precise the value is around the mean; wider distributions indicate greater variability.
    • The narration and intuition from the transcript emphasize that the distribution's width reflects the variability of the population (e.g., babies vs. adults).
  • Connections to broader concepts:

    • Z-scores and standardization allow comparing different normal distributions on a common scale.
    • In hypothesis testing, test statistics are often assumed to be approximately normal under certain conditions, enabling p-values and decision rules.
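The z-score point above can be made concrete with the transcript's height parameters. The specific measurements (a 21.2-inch baby, a 78-inch adult) are hypothetical, chosen so that both land the same number of standard deviations above their respective means:

```python
# Z-scores put different normal distributions on a common scale.
# Transcript parameters: babies mu = 20, sigma = 0.6; adults mu = 70, sigma = 4.
def z_score(x, mu, sigma):
    """Number of standard deviations x lies from the mean."""
    return (x - mu) / sigma

z_baby = z_score(21.2, 20.0, 0.6)   # hypothetical 21.2-inch baby
z_adult = z_score(78.0, 70.0, 4.0)  # hypothetical 78-inch adult

print(f"baby z = {z_baby:.2f}, adult z = {z_adult:.2f}")
```

Both z-scores come out to about 2, so these two very different heights are equally unusual within their own populations.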
  • Quick references (summary of key formulas):

    • Normal distribution: X \sim \mathcal{N}(\mu, \sigma^2)
    • PDF: f(x|\mu,\sigma^2) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
    • Standardization: Z = \frac{X-\mu}{\sigma}, \quad Z \sim \mathcal{N}(0,1)
    • Interval probability: P(a \le X \le b) = \Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)
    • 95% rule (approximate): P(|X-\mu| \le 2\sigma) \approx 0.95
  • Practical takeaways:

    • When you see a bell-shaped curve, expect most data near the mean and gradually fewer data as you move away, with variability captured by (\sigma).
    • In experiments, reporting and interpreting results often rely on the mean and standard deviation to summarize central tendency and spread, and on probabilities derived from the normal model to make inferences.