Statistics Notes: Chapters 7–10 — Point Estimation, Interval Estimation, and Hypothesis Testing
Chapter 7: Point Estimation and Sampling Distributions
Distributions and parameters
Normal distribution: shape controlled by mean \mu and standard deviation \sigma
Binomial distribution: shape influenced by population proportion p
Parameters determine likelihood of observing sample results; often population parameters are unknown
Key idea
Use samples to learn about population parameters when population parameters are unknown
Example illustrating sampling from a binomial viewpoint
Pollster expects agree/disagree responses to follow a binomial distribution with unknown population proportion p
The sample proportion \hat{p} provides information about the true p
Sampling plans and experimental designs (7.1)
Sampling plan / design determines how a sample is selected and affects inference reliability
Randomness presence distinguishes plans:
Probabilistic sampling plans: simple random, stratified random, cluster, systematic random sampling
Non-probabilistic plans: convenience, judgement (purposive), quota sampling
Simple Random Sampling (SRS)
Every sample of size n has equal chance of being selected
Example: from N=4 objects {a,b,c,d}, sample size n=2 yields S = {ab, ac, ad, bc, bd, cd} with equal probability 1/6 each
Random numbers (e.g., RAND() in MS Excel) can be used to implement SRS; numbers can be kept static via Paste Special > Values
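As an alternative to spreadsheet random numbers, an SRS can be drawn with Python's standard library; a minimal sketch (the population of labelled objects is illustrative, not from the notes):

```python
import random

# Hypothetical population of N = 10 labelled objects
population = ["obj%d" % i for i in range(1, 11)]

random.seed(42)  # fix the seed so the draw is reproducible
srs = random.sample(population, k=4)  # every size-4 subset is equally likely
print(srs)
```

`random.sample` draws without replacement, which is exactly the SRS requirement that every sample of size n be equally likely.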
Stratified Sampling
Divide population into homogeneous strata, take simple random samples from each stratum, then combine
Example: opinions about a school built using provincial municipalities as strata
Cluster Sampling
Population contains heterogeneous clusters; randomly select clusters and survey all elements within selected clusters
Example: check shirts in randomly chosen boxes (clusters) within a shipment
Systematic Sampling
Randomly pick the first element, then select every k-th element after that (k = N/n)
Example: select every 10th person from a population list
Convenience Sampling
Sample chosen for ease of access; non-probabilistic and not suitable for inference to a population
Example: surveying coworkers around a coffee machine
Judgement (Purposive) Sampling
Investigator selects elements based on judgment
Example: purposive sampling in qualitative studies (e.g., assessing an educational program)
Quota Sampling
Convenience sample constrained to reflect population composition on preselected characteristics (e.g., age groups)
Introduction to sampling distributions (7.2)
Sampling distribution of a statistic: the probability distribution of a statistic (e.g., X̄) obtained from repeated samples of size n drawn from the population
Purpose: infer population parameters when population parameters are unknown
How to obtain sampling distributions
Use probability theory (derive analytically)
Use simulation (e.g., resampling)
Use theorems to derive exact or approximate forms
Example: sampling distribution for the sample mean X̄ (finite population sample without replacement)
Population: N=5 values {2,4,8,12,16}; sample size n=3; without replacement
There are C(5,3) = 10 possible samples; each is equally likely
Computed sample means (X̄) for each sample (e.g., 4.67, 6.00, 7.33, etc.)
The sampling distribution of X̄ is obtained by listing each possible X̄ and its probability (equal to 1/10 in this example)
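The enumeration above can be reproduced programmatically; a sketch using the example's population {2, 4, 8, 12, 16} with n = 3:

```python
from itertools import combinations
from statistics import mean

population = [2, 4, 8, 12, 16]               # N = 5, as in the example
samples = list(combinations(population, 3))  # all C(5,3) = 10 equally likely samples

xbars = [mean(s) for s in samples]
# Sampling distribution: each distinct X-bar value with probability count/10
dist = {x: xbars.count(x) / len(xbars) for x in sorted(set(xbars))}

# The mean of the sampling distribution equals the population mean mu = 8.4
print(dist)
print(mean(xbars))
```

This also verifies the unbiasedness property E(X̄) = μ for this finite population.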
Sampling distribution of the sample mean: practical notes (7.4)
The sampling distribution of X̄ has mean E(X̄) = μ and standard error SE(X̄) = σ/√n
Finite population correction (FPC) when sampling without replacement from a finite population of size N:
If population finite (N finite): SE(X̄) = σ/√n × sqrt((N − n)/(N − 1))
If population is effectively infinite (N large) or sampling fraction small: SE(X̄) ≈ σ/√n
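The two SE formulas above can be wrapped in a small helper; the numbers in the usage lines are illustrative, not from the notes:

```python
import math

def se_mean(sigma, n, N=None):
    """Standard error of X-bar; apply the FPC when a finite population size N is given."""
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))  # finite population correction
    return se

# Illustrative numbers: sigma = 10, n = 25
print(se_mean(10, 25))         # infinite-population SE = 2.0
print(se_mean(10, 25, N=100))  # FPC shrinks the SE
```

Note that the FPC factor is always below 1 for n > 1, so it can only reduce the standard error.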
Central Limit Theorem (7.3)
For non-normal populations, as n becomes large (n ≥ 30 is a common rule of thumb), the sampling distribution of X̄ is approximately normal with mean μ and standard error σ/√n
The approximation is tighter as n increases; the spread of the sampling distribution of X̄ (the standard error σ/√n) is smaller than the population standard deviation σ, since σ/√n < σ for n > 1
Illustrations: uniform distribution and skewed distributions show that for n ≥ 30, the sampling distribution of X̄ tends toward normal form
Special cases:
If the population is normal, X̄ is exactly normal for any n
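The CLT can be checked by simulation; a sketch drawing repeated samples from a (non-normal) uniform population — the population bounds and sample size follow Example 7.5, the repetition count is arbitrary:

```python
import random
from statistics import mean, stdev

random.seed(0)  # reproducible
a, b, n, reps = 20, 40, 36, 5000  # Uniform(20, 40) population; n as in Example 7.5

# Draw `reps` samples of size n and record each sample mean
xbars = [mean(random.uniform(a, b) for _ in range(n)) for _ in range(reps)]

mu = (a + b) / 2                     # population mean = 30
se = ((b - a) ** 2 / 12 / n) ** 0.5  # sigma / sqrt(n), about 0.962

# The empirical mean and spread of the sample means should approximate mu and se
print(mean(xbars), stdev(xbars))
```

A histogram of `xbars` would look approximately normal even though the population is flat, which is the content of the CLT.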
Calculating probabilities for the sample mean (7.4)
If X̄ is normal or approximately normal, probabilities are computed via Z = (X̄ − μ) / SE(X̄)
Steps:
1) Identify μ and SE(X̄)
2) Determine the region of interest under the normal curve
3) Standardize X̄ to Z
4) Use standard normal tables or software (NORM.DIST/NORM.S.DIST) to obtain the probability
Example: salaries of managers
μ = 51,800; σ = 4,000; n = 100; X̄ within $500 of μ: compute SE(X̄) = σ/√n = 4000/10 = 400
P(51300 < X̄ < 52300) = P(-1.25 < Z < 1.25) ≈ 0.7888
Excel references: NORM.S.DIST(1.25, TRUE) − NORM.S.DIST(-1.25, TRUE) ≈ 0.7887
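The same calculation can be done with Python's `statistics.NormalDist`, an alternative to the Excel functions named above:

```python
from statistics import NormalDist

mu, sigma, n = 51_800, 4_000, 100
se = sigma / n ** 0.5  # SE(X-bar) = 4000/10 = 400

# P(51,300 < X-bar < 52,300) = P(-1.25 < Z < 1.25)
z = NormalDist()  # standard normal
p = z.cdf(1.25) - z.cdf(-1.25)
print(round(p, 4))
```

The small difference from the table answer (0.7888 vs. 0.7887) is just rounding in the printed z-table.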
Example: uniform distribution 20–40 (Example 7.5, 7.6, 7.7)
X ~ Uniform(a,b) with a=20, b=40; E(X) = (a+b)/2 = 30; Var(X) = (b−a)^2/12 = 400/12 ≈ 33.333; SD ≈ 5.7735
For X̄ of n = 36, SE(X̄) = sqrt(Var(X)/n) = sqrt(33.333/36) ≈ 0.9623
Probability 28 < X̄ < 32:
Z bounds: (28 − 30)/0.9623 ≈ -2.08 and (32 − 30)/0.9623 ≈ 2.08
Using tabulated Z values (±2.08): P ≈ 0.9624
Using unrounded Z values (Excel or other software): P ≈ 0.9623
If σ known or for n large, the Z-values align with the standard normal distribution
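The uniform example can likewise be checked with `NormalDist`, using the CLT approximation X̄ ≈ N(μ, SE):

```python
from statistics import NormalDist

a, b, n = 20, 40, 36
mu = (a + b) / 2                     # 30
se = ((b - a) ** 2 / 12 / n) ** 0.5  # about 0.9623

# P(28 < X-bar < 32) under the CLT normal approximation
nd = NormalDist(mu, se)
p = nd.cdf(32) - nd.cdf(28)
print(round(p, 4))
```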
Absolute-value notes
|x| ≥ a implies x ≥ a or x ≤ −a; |x| ≤ a implies −a ≤ x ≤ a
Example 7.7: sampling error probability for X̄ when SE(X̄) = 25
P(|X̄ − μ| > 10) = P(|Z| > 10/25) = P(|Z| > 0.4) = 2 × (1 − Φ(0.4)) ≈ 0.6892
Connection to binomial distribution (link to Chapter 5)
Chapter 7 ties in the binomial context with sampling distributions for proportions when using p̂
Self-evaluation and classroom exercises (Chapter 7 highlights)
Class exercises connecting population duration (Alzheimer’s) to sampling distributions and probabilities
Probability queries using normal approximations and inverse distributions (NORM.INV, NORM.DIST, NORM.S.DIST)
Chapter 8: Interval Estimation
8.1: Point Estimation
Point estimators: statistics used to estimate population parameters
Desirable properties of estimators:
Unbiased: the mean of the estimator’s sampling distribution equals the true parameter
Minimum variance: smallest possible spread of the estimator’s sampling distribution
Sampling error: the distance between the estimator and the true parameter
Common population parameters and their point estimators (tabular summary):
Mean: parameter = μ; estimator = \bar{X}; sampling error = |\bar{X} − μ|
Variance: parameter = σ^2; estimator = s^2 (also written \hat{σ}^2); sampling error = |s^2 − σ^2|
Standard deviation: parameter = σ; estimator = s; sampling error = |s − σ|
Proportion: parameter = p; estimator = \hat{p} = X/n; sampling error = |\hat{p} − p|
Notation: estimators viewed as random variables are written in uppercase (e.g., \bar{X}); their observed values (estimates) in lowercase (e.g., \bar{x})
8.2: Estimation of the Population Mean (μ)
Two cases:
8.2.1: σ known
Confidence interval form: \bar{X} \pm z_{\alpha/2} \Big(\frac{\sigma}{\sqrt{n}}\Big)
If population is normal, the sampling distribution of X̄ is exactly normal; otherwise, CLT applies for large n (n ≥ 30)
General formula for CI when σ known and normal population: use z-quantiles
Key definitions: point estimate = \bar{X}; margin of error = z_{\alpha/2} (σ/√n)
Critical values: z_{α/2} from Standard Normal
Example 8.1 (n=16, x̄=58, σ=10): 95% CI: 58 ± 1.96*(10/√16) = (53.1, 62.9); Margin of Error = 4.9
90% CI and 99% CI examples show similar calculations with z_{α/2} values 1.645 and 2.576
Table of useful values: z_{α/2} for common confidence levels
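A sketch of the σ-known CI computation, checked against Example 8.1 (the function name is my own):

```python
from statistics import NormalDist

def ci_mean_known_sigma(xbar, sigma, n, conf=0.95):
    """CI for mu when sigma is known: xbar +/- z_{alpha/2} * sigma/sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # e.g., 1.96 for conf = 0.95
    moe = z * sigma / n ** 0.5                    # margin of error
    return xbar - moe, xbar + moe

# Example 8.1: n = 16, xbar = 58, sigma = 10
lo, hi = ci_mean_known_sigma(58, 10, 16)
print(round(lo, 1), round(hi, 1))
```

`inv_cdf` plays the role of Excel's NORM.S.INV for obtaining the critical value.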
8.2.2: σ unknown
If σ is unknown and the underlying population is normal, the t-distribution is used; this matters most for small samples (n < 30)
Degrees of freedom (df) = n − 1
Confidence intervals use the t-distribution quantiles: \bar{X} \pm t_{\alpha/2, df} \Big(\frac{s}{\sqrt{n}}\Big)
Relationship between t and z: as df → ∞, t_{\alpha/2, df} → z_{\alpha/2}
Example 8.2: n=15, x̄=53.87, s=6.82 => 95% CI: 53.87 ± 2.145(6.82/√15) ≈ (50.09, 57.65) with df=14; 90% CI: 53.87 ± 1.761(6.82/√15) ≈ (50.77, 56.97)
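A sketch of Example 8.2; the critical value t_{0.025,14} = 2.145 is read from a t-table, since Python's standard library has no t-distribution quantile function:

```python
# Example 8.2: n = 15, xbar = 53.87, s = 6.82, df = 14
n, xbar, s = 15, 53.87, 6.82
t_crit = 2.145  # t_{0.025, 14} from a t-table (not computed here)

moe = t_crit * s / n ** 0.5  # margin of error
lo, hi = xbar - moe, xbar + moe
print(round(lo, 2), round(hi, 2))
```

With scipy available, `t_crit` could instead be computed as `scipy.stats.t.ppf(0.975, 14)`, but the table value suffices here.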
8.3: Estimation of the Population Proportion, p
For large samples (n p̂ ≥ 5 and n(1 − p̂) ≥ 5), p̂ is approximately normal with:
Mean: E(p̂) = p
Standard error: SE(p̂) ≈ sqrt{ p̂ (1 − p̂) / n } (finite population correction can be applied if sampling without replacement)
Finite population correction (FPC) for SE when sampling without replacement from a finite population (N):
SE(p̂) = sqrt{ p̂ (1 − p̂) / n } × sqrt{ (N − n) / (N − 1) }
Infinite population (N large) approximation: SE(p̂) ≈ sqrt{ p̂ (1 − p̂) / n }
Normal approximation criteria (as above): n p̂ ≥ 5 and n (1 − p̂) ≥ 5
Confidence interval for p: \hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 − \hat{p})}{n}}
Examples:
Example 8.3: 397/902 → p̂ = 0.4401; 95% CI: 0.4401 ± 1.96 × sqrt(0.4401 × 0.5599/902) ≈ (0.4077, 0.4725)
Example 8.4: 38/200 → p̂ = 0.19; 95% CI: 0.19 ± 1.96 × sqrt(0.19 × 0.81/200) ≈ (0.135, 0.245)
Important general note: When n is not large, alternatives based on the binomial distribution may be used instead of normal approximation
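The large-sample proportion CI as a small helper, checked against Example 8.3 (the function name is my own):

```python
from statistics import NormalDist

def ci_proportion(x, n, conf=0.95):
    """Large-sample CI for p: p-hat +/- z_{alpha/2} * sqrt(p-hat(1-p-hat)/n)."""
    phat = x / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    moe = z * (phat * (1 - phat) / n) ** 0.5
    return phat - moe, phat + moe

# Example 8.3: 397 "agree" responses out of 902
lo, hi = ci_proportion(397, 902)
print(round(lo, 4), round(hi, 4))
```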
Chapter 9: Hypothesis Tests (One Sample)
Conceptual foundation
Hypothesis: a claim about a population parameter
Null hypothesis H0: a tentative statement to test; typically a statement of equality or a boundary value
Alternative hypothesis Ha: the claim you want to test against H0
Types of hypotheses for population mean μ (one-sample tests):
One-sided tests: Ha: μ < μ0 or Ha: μ > μ0
Two-sided tests: Ha: μ ≠ μ0
Level of significance α: probability of rejecting H0 when H0 is true (Type I error)
Power: probability of correctly rejecting H0 when Ha is true (1 − β)
Type I error: reject H0 when H0 is true
Type II error: fail to reject H0 when Ha is true
Hypothesis tests on the population mean: σ known (9.2)
Test statistic (z-test): Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}
Conditions for use:
σ known, population normal or n large, or use CLT for large n
Decision rules
Critical value approach: reject H0 if Z falls in the rejection region determined by z_{α/2} (two-tailed) or z_{α} (one-tailed)
P-value approach: reject H0 if p-value ≤ α
Relationship to confidence intervals: for two-sided tests, H0 is rejected if μ0 falls outside the (1 − α) CI for μ
Example 9.2/9.3: (illustrative) e.g., given μ0, σ known, compute Z and compare to critical values or p-value
Hypothesis tests on the population mean: σ unknown (9.3)
Use t-statistic when σ is unknown: T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} with df = n − 1 (for small samples; normal approximation for large n)
Decision rules mirror z-tests but with t-distribution quantiles (t_{α, df})
Hypothesis tests on the population proportion (9.4)
Test statistic for a single proportion (large-sample normal approximation):
Z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}
Use p_0 in the standard error (under H0), not p̂
Conditions: n p_0 ≥ 5 and n (1 − p_0) ≥ 5 for validity of the normal approximation
Decision rules follow critical-value and p-value approaches as above
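A sketch of the one-sample proportion z-test; note that the standard error uses p_0 under H0, as stated above. The counts in the usage line are hypothetical:

```python
from statistics import NormalDist

def z_test_proportion(x, n, p0):
    """Z statistic and two-sided p-value for H0: p = p0 (large-sample test)."""
    phat = x / n
    se = (p0 * (1 - p0) / n) ** 0.5  # SE uses p0, not p-hat, under H0
    z = (phat - p0) / se
    pval = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value
    return z, pval

# Hypothetical data: 120 successes in 200 trials, testing H0: p = 0.5
z, pval = z_test_proportion(120, 200, 0.5)
print(round(z, 2), round(pval, 4))
```

Here both conditions hold (n p_0 = n(1 − p_0) = 100 ≥ 5), so the normal approximation is appropriate.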
Summary of test statistics and decision rules
For Z-tests (mean, σ known or large n): Z-statistic, critical values from N(0,1)
For T-tests (mean, σ unknown): T-statistic, critical values from t_{df}
For proportions (p̂): Z-statistic with SE based on p0 under H0
All three approaches (critical value, p-value, and CI relation) give consistent decisions
Relationship between interval estimation and hypothesis testing (9.2.4)
The confidence interval for μ (with known σ) corresponds to a two-tailed test at level α: if μ0 lies inside the CI, fail to reject H0; if outside, reject H0
Chapter 10: Hypothesis Testing (Two Samples)
Overview
Compare two population means or two population proportions
Distinguish between independent samples and matched pairs (dependent samples)
10.1 Inferences about the difference between two means (independent populations; σ1 and σ2 known)
Assumptions: two independent samples; known population standard deviations; normal populations or large samples
Point estimate of the difference μ_1 − μ_2: \bar{X}_1 − \bar{X}_2
Standard error of the difference: \sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}
Test statistic (z): Z = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} where D_0 is the hypothesised difference (often 0)
Decision rules follow z-critical values or p-values; example provided with two sample means
10.2 Inferences about the difference between two means (independent populations; σ1 and σ2 unknown)
Two subcases: unequal variances vs. equal variances (Welch vs. pooled t-test)
10.2.1 Unequal variances (Welch-Satterthwaite approach)
Test statistic: T = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}
Degrees of freedom approximated by the Welch-Satterthwaite formula: df ≈ \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}} (round down in practice)
10.2.2 Equal variances (pooled t-test)
Pooled variance: s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}
Test statistic: T = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} with df = n_1 + n_2 − 2
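The pooled t-test computation as a sketch; the summary statistics in the usage line are hypothetical, not taken from the notes:

```python
def pooled_t(x1bar, s1, n1, x2bar, s2, n2, d0=0.0):
    """Pooled two-sample t statistic (equal population variances assumed)."""
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    t = (x1bar - x2bar - d0) / (sp2 * (1 / n1 + 1 / n2)) ** 0.5
    df = n1 + n2 - 2
    return t, df

# Hypothetical summary statistics for two independent samples
t, df = pooled_t(25.0, 4.0, 12, 22.0, 3.5, 15)
print(round(t, 2), df)
```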
10.3 Inferences about the difference between two population proportions (independent samples)
Pooled proportion under H0: \hat{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}
Standard error under H0: \sqrt{\hat{p} (1 - \hat{p}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}
Test statistic (Z): Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p} (1 - \hat{p}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}
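A sketch of the two-proportion z statistic with the pooled SE; the counts are hypothetical:

```python
def two_prop_z(x1, n1, x2, n2):
    """Z statistic for H0: p1 = p2, using the pooled proportion in the SE."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)  # pooled proportion under H0
    se = (p_pool * (1 - p_pool) * (1 / n1 + 1 / n2)) ** 0.5
    return (p1 - p2) / se

# Hypothetical counts: 60/200 in group 1 vs 45/200 in group 2
z = two_prop_z(60, 200, 45, 200)
print(round(z, 2))
```

Pooling is appropriate only for the test (where H0 asserts p1 = p2); a CI for p1 − p2 would use the unpooled SE instead.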
10.4: Summary of two-sample tests
Distinguish between independent samples and matched (paired) samples
For matched samples, work with the differences d = X_1 − X_2 and test hypotheses about μ_d
Matched-pair test statistic for small samples: T = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} with df = n − 1
For large samples, CLT allows z-approximation to the distribution of d̄
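The matched-pairs procedure as a sketch; the before/after measurements are hypothetical:

```python
from statistics import mean, stdev

def paired_t(x1, x2, mu_d=0.0):
    """Matched-pairs t statistic on the differences d_i = x1_i - x2_i."""
    d = [a - b for a, b in zip(x1, x2)]
    n = len(d)
    t = (mean(d) - mu_d) / (stdev(d) / n ** 0.5)
    return t, n - 1  # df = n - 1

# Hypothetical before/after measurements on the same six units
before = [12.1, 11.8, 13.0, 12.5, 12.9, 11.5]
after  = [11.6, 11.7, 12.1, 12.2, 12.4, 11.4]
t, df = paired_t(before, after)
print(round(t, 2), df)
```

Reducing the pairs to a single column of differences is what turns the two-sample problem into the one-sample t-test of Chapter 9.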
Relationships and examples (Chapter 10)
Example scenarios cover: testing differences in means between two centers, evaluating two processing methods, and comparing proportions across groups
A standard workflow is followed in each case: state H0 and Ha, choose α, compute test statistic, decide via critical values or p-value, and interpret in context
Quick reference: common formulas
Sampling distribution of the sample mean
For infinite population: X̄ \sim N( μ, \frac{σ^2}{n} ) approximately when n large or population normal
Standard error: SE(X̄) = \frac{σ}{\sqrt{n}}
Finite population correction (without replacement): SE(X̄) = \frac{σ}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}
Sampling distribution of the sample proportion
For large samples: p̂ \approx N(p, \frac{p(1-p)}{n})
Standard error (under H0 for proportion tests): SE = \sqrt{\frac{p_0 (1-p_0)}{n}}
Confidence interval for the mean (known \sigma)
\bar{X} \pm z_{\alpha/2} \frac{σ}{\sqrt{n}}
Confidence interval for the mean (unknown \sigma; t-distribution)
\bar{X} \pm t_{\alpha/2, df} \frac{S}{\sqrt{n}}
df = n − 1 (for a single mean)
Confidence interval for the population proportion
p̂ \pm z_{\alpha/2} \sqrt{\frac{p̂(1-p̂)}{n}}
Hypothesis test for a single mean (z-test, σ known)
Z = \frac{\bar{X} - μ_0}{σ/\sqrt{n}}
Hypothesis test for a single mean (t-test, σ unknown)
T = \frac{\bar{X} - μ_0}{S/\sqrt{n}}
Hypothesis test for a single proportion
Z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}
Hypothesis test for two independent means (known sigmas)
Z = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}
Hypothesis test for two independent means (unknown sigmas, unequal)
T = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}
df approximated via Welch formula (not shown in full here)
Hypothesis test for two independent means (unknown sigmas, equal)
Pooled variance: s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}
T = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sqrt{ s_p^2 (1/n_1 + 1/n_2) }}
Matched pairs (two related samples)
Differences: d_i = X_{1i} - X_{2i}
\bar{d} = \frac{\sum d_i}{n}, \; s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n-1}}
Small samples: T = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} ; df = n − 1
Difference in two population proportions
Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p}) (1/n_1 + 1/n_2)}} ; where \hat{p} is the pooled proportion
Pooled proportion: \hat{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}
Practice and table references
Standard normal z-table (N(0,1)) and t-table (t_{df}) used throughout for critical values and p-values
t-table values are listed for various df (e.g., df = 14, 21, ∞) in standard references
For common confidence levels:
90%: z_{0.05} = 1.645; t_{0.05, df} → 1.645 as df → ∞
95%: z_{0.025} = 1.96; t_{0.025, df} → 1.96 as df → ∞
99%: z_{0.005} = 2.576; t_{0.005, df} → 2.576 as df → ∞
Connections and practical implications
Confidence intervals provide a range of plausible values for population parameters; they are inherently linked to hypothesis tests, especially for two-sided tests
When σ is known and population normal, z-based methods are appropriate; when σ is unknown, use t-based methods with df = n − 1 (or pooled df for equal variances)
For proportions, normal approximation relies on adequate sample size; when not satisfied, exact methods or alternative approaches should be used
Finite population correction is important when sampling a large fraction of a small population; it reduces the standard error and tightens the CI/ test
Examples highlighted from the transcript (selected key results)
Example: Probability X̄ within $500 of μ when μ = 51,800; σ = 4,000; n = 100
SE(X̄) = σ/√n = 4000/10 = 400
P(51,300 < X̄ < 52,300) = P(-1.25 < Z < 1.25) ≈ 0.7888
Example: Uniform(20,40); E(X)=30; Var(X)=400/12 ≈ 33.33; SD ≈ 5.773; For n=36, SE(X̄) ≈ 0.9623; P(28 < X̄ < 32) ≈ P(-2.08 < Z < 2.08) ≈ 0.9624
Example: 95% CI for μ with σ known; n=16; x̄=58; σ=10
CI: 58 ± 1.96 × (10/√16) = (53.1, 62.9); Margin of error = 4.9
Example: 95% CI for μ with σ unknown; n=15; x̄=53.87; s=6.82
df=14; 95% CI: 50.09 to 57.65; 90% CI: 50.77 to 56.97
Example: Proportion confidence interval (p̂≈0.4401; n=902)
95% CI: (0.4077, 0.4725); Margin of error ≈ 0.0324
Example: Hypothesis test (two-sample, unequal variances) with t ≈ 2.27; df ≈ 21; p-value ≈ 0.0169; reject H0 at α=0.05
Example: Two-sample proportion test with pooled p̂ ≈ 0.1127; n1=250, n2=300; Z ≈ 1.85; fail to reject at α=0.10