Statistics Notes: Chapters 7–10 — Point Estimation, Interval Estimation, and Hypothesis Testing
Chapter 7: Point Estimation and Sampling Distributions
Distributions and parameters
Normal distribution: shape controlled by mean \mu and standard deviation \sigma
Binomial distribution: shape influenced by population proportion p
Parameters determine likelihood of observing sample results; often population parameters are unknown
Key idea
Use samples to learn about population parameters when population parameters are unknown
Example illustrating sampling from a binomial viewpoint
Pollster expects agree/disagree responses to follow a binomial distribution with unknown population proportion p
The sample proportion \hat{p} provides information about the true p
Sampling plans and experimental designs (7.1)
Sampling plan / design determines how a sample is selected and affects inference reliability
Randomness presence distinguishes plans:
Probabilistic sampling plans: simple random, stratified random, cluster, systematic random sampling
Non-probabilistic plans: convenience, judgement (purposive), quota sampling
Simple Random Sampling (SRS)
Every sample of size n has equal chance of being selected
Example: from N=4 objects {a,b,c,d}, sample size n=2 yields S = {ab, ac, ad, bc, bd, cd} with equal probability 1/6 each
Random numbers (e.g., RAND() in MS Excel) can be used to implement SRS; numbers can be kept static via Paste Special > Values
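As an alternative to spreadsheet random numbers, an SRS can be drawn with Python's standard library; a minimal sketch (the population of labelled objects is illustrative, not from the notes):

```python
import random

# Hypothetical population of N = 10 labelled objects
population = ["obj%d" % i for i in range(1, 11)]

random.seed(42)  # fix the seed so the draw is reproducible
srs = random.sample(population, k=4)  # every size-4 subset is equally likely
print(srs)
```

`random.sample` draws without replacement, which is exactly the SRS requirement that every sample of size n be equally likely.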
Stratified Sampling
Divide population into homogeneous strata, take simple random samples from each stratum, then combine
Example: opinions about a school built using provincial municipalities as strata
Cluster Sampling
Population contains heterogeneous clusters; randomly select clusters and survey all elements within selected clusters
Example: check shirts in randomly chosen boxes (clusters) within a shipment
Systematic Sampling
Randomly pick the first element, then select every k-th element after that (k = N/n)
Example: select every 10th person from a population list
Convenience Sampling
Sample chosen for ease of access; non-probabilistic and not suitable for inference to a population
Example: surveying coworkers around a coffee machine
Judgement (Purposive) Sampling
Investigator selects elements based on judgment
Example: purposive sampling in qualitative studies (e.g., assessing an educational program)
Quota Sampling
Convenience sample constrained to reflect population composition on preselected characteristics (e.g., age groups)
Introduction to sampling distributions (7.2)
Sampling distribution of a statistic: the probability distribution of a statistic (e.g., X̄) obtained from repeated samples of size n drawn from the population
Purpose: infer population parameters when population parameters are unknown
How to obtain sampling distributions
Use probability theory (derive analytically)
Use simulation (e.g., resampling)
Use theorems to derive exact or approximate forms
Example: sampling distribution for the sample mean X̄ (finite population sample without replacement)
Population: N=5 values {2,4,8,12,16}; sample size n=3; without replacement
There are C(5,3) = 10 possible samples; each is equally likely
Computed sample means (X̄) for each sample (e.g., 4.67, 6.00, 7.33, etc.)
The sampling distribution of X̄ is obtained by listing each possible X̄ and its probability (equal to 1/10 in this example)
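The enumeration above can be reproduced programmatically; a sketch using the example's population {2, 4, 8, 12, 16} with n = 3:

```python
from itertools import combinations
from statistics import mean

population = [2, 4, 8, 12, 16]               # N = 5, as in the example
samples = list(combinations(population, 3))  # all C(5,3) = 10 equally likely samples

xbars = [mean(s) for s in samples]
# Sampling distribution: each distinct X-bar value with probability count/10
dist = {x: xbars.count(x) / len(xbars) for x in sorted(set(xbars))}

# The mean of the sampling distribution equals the population mean mu = 8.4
print(dist)
print(mean(xbars))
```

This also verifies the unbiasedness property E(X̄) = μ for this finite population.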
Sampling distribution of the sample mean: practical notes (7.4)
The sampling distribution of X̄ has mean E(X̄) = μ and standard error SE(X̄) = σ/√n
Finite population correction (FPC) when sampling without replacement from a finite population of size N:
If population finite (N finite): SE(X̄) = σ/√n × sqrt((N − n)/(N − 1))
If population is effectively infinite (N large) or sampling fraction small: SE(X̄) ≈ σ/√n
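The two SE formulas above can be wrapped in a small helper; the numbers in the usage lines are illustrative, not from the notes:

```python
import math

def se_mean(sigma, n, N=None):
    """Standard error of X-bar; apply the FPC when a finite population size N is given."""
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))  # finite population correction
    return se

# Illustrative numbers: sigma = 10, n = 25
print(se_mean(10, 25))         # infinite-population SE = 2.0
print(se_mean(10, 25, N=100))  # FPC shrinks the SE
```

Note that the FPC factor is always below 1 for n > 1, so it can only reduce the standard error.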
Central Limit Theorem (7.3)
For non-normal populations, as n becomes large (n ≥ 30 is a common rule of thumb), the sampling distribution of X̄ is approximately normal with mean μ and standard error σ/√n
The approximation is tighter as n increases; the spread of the sampling distribution of X̄ (the standard error σ/√n) is smaller than the population standard deviation σ, since σ/√n < σ for n > 1
Illustrations: uniform distribution and skewed distributions show that for n ≥ 30, the sampling distribution of X̄ tends toward normal form
Special cases:
If the population is normal, X̄ is exactly normal for any n
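The CLT can be checked by simulation; a sketch drawing repeated samples from a (non-normal) uniform population — the population bounds and sample size follow Example 7.5, the repetition count is arbitrary:

```python
import random
from statistics import mean, stdev

random.seed(0)  # reproducible
a, b, n, reps = 20, 40, 36, 5000  # Uniform(20, 40) population; n as in Example 7.5

# Draw `reps` samples of size n and record each sample mean
xbars = [mean(random.uniform(a, b) for _ in range(n)) for _ in range(reps)]

mu = (a + b) / 2                     # population mean = 30
se = ((b - a) ** 2 / 12 / n) ** 0.5  # sigma / sqrt(n), about 0.962

# The empirical mean and spread of the sample means should approximate mu and se
print(mean(xbars), stdev(xbars))
```

A histogram of `xbars` would look approximately normal even though the population is flat, which is the content of the CLT.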
Calculating probabilities for the sample mean (7.4)
If X̄ is normal or approximately normal, probabilities are computed via Z = (X̄ − μ) / SE(X̄)
Steps:
1) Identify μ and SE(X̄)
2) Determine the region of interest under the normal curve
3) Standardize X̄ to Z
4) Use standard normal tables or software (NORM.DIST/NORM.S.DIST) to obtain the probability
Example: salaries of managers
μ = 51,800; σ = 4,000; n = 100; X̄ within $500 of μ: compute SE(X̄) = σ/√n = 4000/10 = 400
P(51300 < X̄ < 52300) = P(-1.25 < Z < 1.25) ≈ 0.7888
Excel references: NORM.S.DIST(1.25, TRUE) − NORM.S.DIST(-1.25, TRUE) ≈ 0.7887
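The same calculation can be done with Python's `statistics.NormalDist`, an alternative to the Excel functions named above:

```python
from statistics import NormalDist

mu, sigma, n = 51_800, 4_000, 100
se = sigma / n ** 0.5  # SE(X-bar) = 4000/10 = 400

# P(51,300 < X-bar < 52,300) = P(-1.25 < Z < 1.25)
z = NormalDist()  # standard normal
p = z.cdf(1.25) - z.cdf(-1.25)
print(round(p, 4))
```

The small difference from the table answer (0.7888 vs. 0.7887) is just rounding in the printed z-table.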
Example: uniform distribution 20–40 (Example 7.5, 7.6, 7.7)
X ~ Uniform(a,b) with a=20, b=40; E(X) = (a+b)/2 = 30; Var(X) = (b−a)^2/12 = 400/12 ≈ 33.333; SD ≈ 5.7735
For X̄ of n = 36, SE(X̄) = sqrt(Var(X)/n) = sqrt(33.333/36) ≈ 0.9623
Probability 28 < X̄ < 32:
Z bounds: (28 − 30)/0.9623 ≈ -2.08 and (32 − 30)/0.9623 ≈ 2.08
Using tabulated Z values (±2.08): P ≈ 0.9624
Using unrounded Z values (Excel or other software): P ≈ 0.9623
If σ known or for n large, the Z-values align with the standard normal distribution
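The uniform example can likewise be checked with `NormalDist`, using the CLT approximation X̄ ≈ N(μ, SE):

```python
from statistics import NormalDist

a, b, n = 20, 40, 36
mu = (a + b) / 2                     # 30
se = ((b - a) ** 2 / 12 / n) ** 0.5  # about 0.9623

# P(28 < X-bar < 32) under the CLT normal approximation
nd = NormalDist(mu, se)
p = nd.cdf(32) - nd.cdf(28)
print(round(p, 4))
```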
Absolute-value notes
|x| ≥ a implies x ≥ a or x ≤ −a; |x| ≤ a implies −a ≤ x ≤ a
Example 7.7: sampling error probability for X̄ when SE(X̄) = 25
P(|X̄ − μ| > 10) = P(|Z| > 10/25) = P(|Z| > 0.4) = 2 × (1 − Φ(0.4)) ≈ 0.6892
Connection to binomial distribution (link to Chapter 5)
Chapter 7 ties in the binomial context with sampling distributions for proportions when using p̂
Self-evaluation and classroom exercises (Chapter 7 highlights)
Class exercises connecting population duration (Alzheimer’s) to sampling distributions and probabilities
Probability queries using normal approximations and inverse distributions (NORM.INV, NORM.DIST, NORM.S.DIST)
Chapter 8: Interval Estimation
8.1: Point Estimation
Point estimators: statistics used to estimate population parameters
Desirable properties of estimators:
Unbiased: the mean of the estimator’s sampling distribution equals the true parameter
Minimum variance: smallest possible spread of the estimator’s sampling distribution
Sampling error: the distance between the estimator and the true parameter
Common population parameters and their point estimators (tabular summary):
Mean: parameter = μ; estimator = \bar{X}; sampling error = |\bar{X} − μ|
Variance: parameter = σ^2; estimator = s^2 (also written \hat{σ}^2); sampling error = |s^2 − σ^2|
Standard deviation: parameter = σ; estimator = s; sampling error = |s − σ|
Proportion: parameter = p; estimator = \hat{p} = X/n; sampling error = |\hat{p} − p|
Notation: estimators viewed as random variables are written in uppercase (e.g., \bar{X}); their observed values (estimates) in lowercase (e.g., \bar{x})
8.2: Estimation of the Population Mean (μ)
Two cases:
8.2.1: σ known
Confidence interval form: \bar{X} \pm z_{\alpha/2} \Big(\frac{\sigma}{\sqrt{n}}\Big)
If population is normal, the sampling distribution of X̄ is exactly normal; otherwise, CLT applies for large n (n ≥ 30)
General formula for CI when σ known and normal population: use z-quantiles
Key definitions: point estimate = \bar{X}; margin of error = z_{\alpha/2} (σ/√n)
Critical values: z_{α/2} from Standard Normal
Example 8.1 (n=16, x̄=58, σ=10): 95% CI: 58 ± 1.96*(10/√16) = (53.1, 62.9); Margin of Error = 4.9
90% CI and 99% CI examples show similar calculations with z_{α/2} values 1.645 and 2.576
Table of useful values: z_{α/2} for common confidence levels
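A sketch of the σ-known CI computation, checked against Example 8.1 (the function name is my own):

```python
from statistics import NormalDist

def ci_mean_known_sigma(xbar, sigma, n, conf=0.95):
    """CI for mu when sigma is known: xbar +/- z_{alpha/2} * sigma/sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # e.g., 1.96 for conf = 0.95
    moe = z * sigma / n ** 0.5                    # margin of error
    return xbar - moe, xbar + moe

# Example 8.1: n = 16, xbar = 58, sigma = 10
lo, hi = ci_mean_known_sigma(58, 10, 16)
print(round(lo, 1), round(hi, 1))
```

`inv_cdf` plays the role of Excel's NORM.S.INV for obtaining the critical value.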
8.2.2: σ unknown
If σ is unknown and the underlying population is normal, the t-distribution is used; this matters most for small samples (n < 30)
Degrees of freedom (df) = n − 1
Confidence intervals use the t-distribution quantiles: \bar{X} \pm t_{\alpha/2, df} \Big(\frac{s}{\sqrt{n}}\Big)
Relationship between t and z: as df → ∞, t_{\alpha/2, df} → z_{\alpha/2}
Example 8.2: n=15, x̄=53.87, s=6.82 => 95% CI: 53.87 ± 2.145(6.82/√15) ≈ (50.09, 57.65) with df=14; 90% CI: 53.87 ± 1.761(6.82/√15) ≈ (50.77, 56.97)
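A sketch of Example 8.2; the critical value t_{0.025,14} = 2.145 is read from a t-table, since Python's standard library has no t-distribution quantile function:

```python
# Example 8.2: n = 15, xbar = 53.87, s = 6.82, df = 14
n, xbar, s = 15, 53.87, 6.82
t_crit = 2.145  # t_{0.025, 14} from a t-table (not computed here)

moe = t_crit * s / n ** 0.5  # margin of error
lo, hi = xbar - moe, xbar + moe
print(round(lo, 2), round(hi, 2))
```

With scipy available, `t_crit` could instead be computed as `scipy.stats.t.ppf(0.975, 14)`, but the table value suffices here.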
8.3: Estimation of the Population Proportion, p
For large samples (n p̂ ≥ 5 and n(1 − p̂) ≥ 5), p̂ is approximately normal with:
Mean: E(p̂) = p
Standard error: SE(p̂) ≈ sqrt{ p̂ (1 − p̂) / n } (finite population correction can be applied if sampling without replacement)
Finite population correction (FPC) for SE when sampling without replacement from a finite population (N):
SE(p̂) = sqrt{ p̂ (1 − p̂) / n } × sqrt{ (N − n) / (N − 1) }
Infinite population (N large) approximation: SE(p̂) ≈ sqrt{ p̂ (1 − p̂) / n }
Normal approximation criteria (as above): n p̂ ≥ 5 and n (1 − p̂) ≥ 5
Confidence interval for p: \hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 − \hat{p})}{n}}
Examples:
Example 8.3: 397/902 → p̂ = 0.4401; 95% CI: 0.4401 ± 1.96 × sqrt(0.4401 × 0.5599/902) ≈ (0.4077, 0.4725)
Example 8.4: 38/200 → p̂ = 0.19; 95% CI: 0.19 ± 1.96 × sqrt(0.19 × 0.81/200) ≈ (0.135, 0.245)
Important general note: When n is not large, alternatives based on the binomial distribution may be used instead of normal approximation
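The large-sample proportion CI as a small helper, checked against Example 8.3 (the function name is my own):

```python
from statistics import NormalDist

def ci_proportion(x, n, conf=0.95):
    """Large-sample CI for p: p-hat +/- z_{alpha/2} * sqrt(p-hat(1-p-hat)/n)."""
    phat = x / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    moe = z * (phat * (1 - phat) / n) ** 0.5
    return phat - moe, phat + moe

# Example 8.3: 397 "agree" responses out of 902
lo, hi = ci_proportion(397, 902)
print(round(lo, 4), round(hi, 4))
```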
Chapter 9: Hypothesis Tests (One Sample)
Conceptual foundation
Hypothesis: a claim about a population parameter
Null hypothesis H0: a tentative statement to test; typically a statement of equality or a boundary value
Alternative hypothesis Ha: the claim you want to test against H0
Types of hypotheses for population mean μ (one-sample tests):
One-sided tests: Ha: μ < μ0 or Ha: μ > μ0
Two-sided tests: Ha: μ ≠ μ0
Level of significance α: probability of rejecting H0 when H0 is true (Type I error)
Power: probability of correctly rejecting H0 when Ha is true (1 − β)
Type I error: reject H0 when H0 is true
Type II error: fail to reject H0 when Ha is true
Hypothesis tests on the population mean: σ known (9.2)
Test statistic (z-test): Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}
Conditions for use:
σ known, population normal or n large, or use CLT for large n
Decision rules
Critical value approach: reject H0 if Z falls in the rejection region determined by z_{α/2} (two-tailed) or z_{α} (one-tailed)
P-value approach: reject H0 if p-value ≤ α
Relationship to confidence intervals: for two-sided tests, H0 is rejected if μ0 falls outside the (1 − α) CI for μ
Example 9.2/9.3: (illustrative) e.g., given μ0, σ known, compute Z and compare to critical values or p-value
Hypothesis tests on the population mean: σ unknown (9.3)
Use t-statistic when σ is unknown: T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} with df = n − 1 (for small samples; normal approximation for large n)
Decision rules mirror z-tests but with t-distribution quantiles (t_{α, df})
Hypothesis tests on the population proportion (9.4)
Test statistic for a single proportion (large-sample normal approximation):
Z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}
Use p_0 in the standard error (under H0), not p̂
Conditions: n p_0 ≥ 5 and n (1 − p_0) ≥ 5 for validity of the normal approximation
Decision rules follow critical-value and p-value approaches as above
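A sketch of the one-sample proportion z-test; note that the standard error uses p_0 under H0, as stated above. The counts in the usage line are hypothetical:

```python
from statistics import NormalDist

def z_test_proportion(x, n, p0):
    """Z statistic and two-sided p-value for H0: p = p0 (large-sample test)."""
    phat = x / n
    se = (p0 * (1 - p0) / n) ** 0.5  # SE uses p0, not p-hat, under H0
    z = (phat - p0) / se
    pval = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value
    return z, pval

# Hypothetical data: 120 successes in 200 trials, testing H0: p = 0.5
z, pval = z_test_proportion(120, 200, 0.5)
print(round(z, 2), round(pval, 4))
```

Here both conditions hold (n p_0 = n(1 − p_0) = 100 ≥ 5), so the normal approximation is appropriate.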
Summary of test statistics and decision rules
For Z-tests (mean, σ known or large n): Z-statistic, critical values from N(0,1)
For T-tests (mean, σ unknown): T-statistic, critical values from t_{df}
For proportions (p̂): Z-statistic with SE based on p0 under H0
All three approaches (critical value, p-value, and CI relation) give consistent decisions
Relationship between interval estimation and hypothesis testing (9.2.4)
The confidence interval for μ (with known σ) corresponds to a two-tailed test at level α: if μ0 lies inside the CI, fail to reject H0; if outside, reject H0
Chapter 10: Hypothesis Testing (Two Samples)
Overview
Compare two population means or two population proportions
Distinguish between independent samples and matched pairs (dependent samples)
10.1 Inferences about the difference between two means (independent populations; σ1 and σ2 known)
Assumptions: two independent samples; known population standard deviations; normal populations or large samples
Point estimate of the difference μ_1 − μ_2: \bar{X}_1 − \bar{X}_2
Standard error of the difference: \sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}
Test statistic (z): Z = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} where D_0 is the hypothesised difference (often 0)
Decision rules follow z-critical values or p-values; example provided with two sample means
10.2 Inferences about the difference between two means (independent populations; σ1 and σ2 unknown)
Two subcases: unequal variances vs. equal variances (Welch vs. pooled t-test)
10.2.1 Unequal variances (Welch-Satterthwaite approach)
Test statistic: T = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}
Degrees of freedom approximated by the Welch-Satterthwaite formula: df ≈ \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}} (round down in practice)
10.2.2 Equal variances (pooled t-test)
Pooled variance: s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}
Test statistic: T = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} with df = n_1 + n_2 − 2
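The pooled t-test computation as a sketch; the summary statistics in the usage line are hypothetical, not taken from the notes:

```python
def pooled_t(x1bar, s1, n1, x2bar, s2, n2, d0=0.0):
    """Pooled two-sample t statistic (equal population variances assumed)."""
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    t = (x1bar - x2bar - d0) / (sp2 * (1 / n1 + 1 / n2)) ** 0.5
    df = n1 + n2 - 2
    return t, df

# Hypothetical summary statistics for two independent samples
t, df = pooled_t(25.0, 4.0, 12, 22.0, 3.5, 15)
print(round(t, 2), df)
```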
10.3 Inferences about the difference between two population proportions (independent samples)
Pooled proportion under H0: \hat{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}
Standard error under H0: \sqrt{\hat{p} (1 - \hat{p}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}
Test statistic (Z): Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p} (1 - \hat{p}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}
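A sketch of the two-proportion z statistic with the pooled SE; the counts are hypothetical:

```python
def two_prop_z(x1, n1, x2, n2):
    """Z statistic for H0: p1 = p2, using the pooled proportion in the SE."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)  # pooled proportion under H0
    se = (p_pool * (1 - p_pool) * (1 / n1 + 1 / n2)) ** 0.5
    return (p1 - p2) / se

# Hypothetical counts: 60/200 in group 1 vs 45/200 in group 2
z = two_prop_z(60, 200, 45, 200)
print(round(z, 2))
```

Pooling is appropriate only for the test (where H0 asserts p1 = p2); a CI for p1 − p2 would use the unpooled SE instead.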
10.4: Summary of two-sample tests
Distinguish between independent samples and matched (paired) samples
For matched samples, work with the differences d = X_1 − X_2 and test hypotheses about μ_d
Matched-pair test statistic for small samples: T = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} with df = n − 1
For large samples, CLT allows z-approximation to the distribution of d̄
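The matched-pairs procedure as a sketch; the before/after measurements are hypothetical:

```python
from statistics import mean, stdev

def paired_t(x1, x2, mu_d=0.0):
    """Matched-pairs t statistic on the differences d_i = x1_i - x2_i."""
    d = [a - b for a, b in zip(x1, x2)]
    n = len(d)
    t = (mean(d) - mu_d) / (stdev(d) / n ** 0.5)
    return t, n - 1  # df = n - 1

# Hypothetical before/after measurements on the same six units
before = [12.1, 11.8, 13.0, 12.5, 12.9, 11.5]
after  = [11.6, 11.7, 12.1, 12.2, 12.4, 11.4]
t, df = paired_t(before, after)
print(round(t, 2), df)
```

Reducing the pairs to a single column of differences is what turns the two-sample problem into the one-sample t-test of Chapter 9.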
Relationships and examples (Chapter 10)
Example scenarios cover: testing differences in means between two centers, evaluating two processing methods, and comparing proportions across groups
A standard workflow is followed in each case: state H0 and Ha, choose α, compute test statistic, decide via critical values or p-value, and interpret in context
Quick reference: common formulas
Sampling distribution of the sample mean
For infinite population: X̄ \sim N( μ, \frac{σ^2}{n} ) approximately when n large or population normal
Standard error: SE(X̄) = \frac{σ}{\sqrt{n}}
Finite population correction (without replacement): SE(X̄) = \frac{σ}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}
Sampling distribution of the sample proportion
For large samples: p̂ \approx N(p, \frac{p(1-p)}{n})
Standard error (under H0 for proportion tests): SE = \sqrt{\frac{p_0 (1-p_0)}{n}}
Confidence interval for the mean (known \sigma)
\bar{X} \pm z_{\alpha/2} \frac{σ}{\sqrt{n}}
Confidence interval for the mean (unknown \sigma; t-distribution)
\bar{X} \pm t_{\alpha/2, df} \frac{S}{\sqrt{n}}
df = n − 1 (for a single mean)
Confidence interval for the population proportion
p̂ \pm z_{\alpha/2} \sqrt{\frac{p̂(1-p̂)}{n}}
Hypothesis test for a single mean (z-test, σ known)
Z = \frac{\bar{X} - μ_0}{σ/\sqrt{n}}
Hypothesis test for a single mean (t-test, σ unknown)
T = \frac{\bar{X} - μ_0}{S/\sqrt{n}}
Hypothesis test for a single proportion
Z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}
Hypothesis test for two independent means (known sigmas)
Z = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}
Hypothesis test for two independent means (unknown sigmas, unequal)
T = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}
df approximated via Welch formula (not shown in full here)
Hypothesis test for two independent means (unknown sigmas, equal)
Pooled variance: s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}
T = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sqrt{ s_p^2 (1/n_1 + 1/n_2) }}
Matched pairs (two related samples)
Differences: d_i = X_{1i} - X_{2i}
\bar{d} = \frac{\sum d_i}{n}, \; s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n-1}}
Small samples: T = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} ; df = n − 1
Difference in two population proportions
Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p}) (1/n_1 + 1/n_2)}} ; where \hat{p} is the pooled proportion
Pooled proportion: \hat{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}
Practice and table references
Standard normal z-table (N(0,1)) and t-table (t_{df}) used throughout for critical values and p-values
t-table values are listed for various df (e.g., df = 14, 21, ∞) in standard references
For common confidence levels:
90%: z_{0.05} = 1.645; t_{0.05, df} → 1.645 as df → ∞
95%: z_{0.025} = 1.96; t_{0.025, df} → 1.96 as df → ∞
99%: z_{0.005} = 2.576; t_{0.005, df} → 2.576 as df → ∞
Connections and practical implications
Confidence intervals provide a range of plausible values for population parameters; they are inherently linked to hypothesis tests, especially for two-sided tests
When σ is known and population normal, z-based methods are appropriate; when σ is unknown, use t-based methods with df = n − 1 (or pooled df for equal variances)
For proportions, normal approximation relies on adequate sample size; when not satisfied, exact methods or alternative approaches should be used
Finite population correction is important when sampling a large fraction of a small population; it reduces the standard error and tightens the CI/ test
Examples highlighted from the transcript (selected key results)
Example: Probability X̄ within $500 of μ when μ = 51,800; σ = 4,000; n = 100
SE(X̄) = σ/√n = 4000/10 = 400
P(51,300 < X̄ < 52,300) = P(-1.25 < Z < 1.25) ≈ 0.7888
Example: Uniform(20,40); E(X)=30; Var(X)=400/12 ≈ 33.33; SD ≈ 5.773; For n=36, SE(X̄) ≈ 0.9623; P(28 < X̄ < 32) ≈ P(-2.08 < Z < 2.08) ≈ 0.9624
Example: 95% CI for μ with σ known; n=16; x̄=58; σ=10
CI: 58 ± 1.96 × (10/√16) = (53.1, 62.9); Margin of error = 4.9
Example: 95% CI for μ with σ unknown; n=15; x̄=53.87; s=6.82
df=14; 95% CI: 50.09 to 57.65; 90% CI: 50.77 to 56.97
Example: Proportion confidence interval (p̂≈0.4401; n=902)
95% CI: (0.4077, 0.4725); Margin of error ≈ 0.0324
Example: Hypothesis test (two-sample, unequal variances) with t ≈ 2.27; df ≈ 21; p-value ≈ 0.0169; reject H0 at α=0.05
Example: Two-sample proportion test with pooled p̂ ≈ 0.1127; n1=250, n2=300; Z ≈ 1.85; fail to reject at α=0.10