Lecture Notes on Population Estimation and Hypothesis Testing

Statistical inferences about a population are often made using sample data.
Two key types of inferences:
- Estimation of population means and proportions
- Hypothesis testing

Population Parameters: Constant, unknown values that describe population attributes.
- Mean: μ (mu)
- Proportion: Π (pi)
- Standard Deviation: σ (sigma)
- Variance: σ^2
Sample Statistics: Variable values that depend on the selected sample.
- Mean: x̄ (x-bar)
- Proportion: p
- Standard Deviation: s

Estimation Process:
- Involves assessing unknown parameters using sample data.
- An unbiased estimator: expected value equals true population parameter.
- Point Estimate: a single number derived from sample data.
Confidence Interval Construction:
1. Determine point estimate.
2. Calculate margin of error (dependent on sample size, standard deviation, confidence level).
3. Determine interval bounds (point estimate ± margin of error).
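The three steps above can be sketched in a few lines of Python. This is a minimal illustration, assuming the parameter is a mean and the population standard deviation is treated as known (the z-based case); the function name and the numbers in the usage line are hypothetical and not taken from the notes.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(point_estimate, sigma, n, confidence=0.95):
    """Return (lower, upper) for a mean, treating sigma as known."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Step 2: margin of error = critical value * standard error.
    margin = z * sigma / sqrt(n)
    # Step 3: bounds = point estimate +/- margin of error.
    return point_estimate - margin, point_estimate + margin


# Hypothetical inputs: sample mean 50, sigma 10, n = 100.
lower, upper = z_confidence_interval(point_estimate=50.0, sigma=10.0, n=100)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

When the population standard deviation is not known, the same structure applies but the critical value would come from the t-distribution instead.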

Smaller margins of error give narrower, more precise interval estimates.
Three main factors affecting margin of error:
- Sample Size: Larger samples reduce the margin of error (it shrinks in proportion to 1/√n).
- Variability in Data: Higher variation increases the margin of error, giving less precision.
- Level of Confidence: Higher confidence levels increase the margin of error. Common confidence levels are 90%, 95%, and 99%.
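The effect of each factor can be checked numerically. The sketch below assumes the z-based margin of error for a mean (critical value times standard deviation over the square root of n); the helper name and all input values are hypothetical, chosen only to change one factor at a time.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sigma, n, confidence):
    """z-based margin of error for a mean (sigma treated as known)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sigma / sqrt(n)


# Baseline, then change one factor at a time.
baseline = margin_of_error(sigma=10, n=25, confidence=0.95)
larger_sample = margin_of_error(sigma=10, n=100, confidence=0.95)
more_variable = margin_of_error(sigma=20, n=25, confidence=0.95)
higher_confidence = margin_of_error(sigma=10, n=25, confidence=0.99)

print(f"baseline:           {baseline:.2f}")
print(f"n = 100:            {larger_sample:.2f}")   # smaller than baseline
print(f"sigma = 20:         {more_variable:.2f}")   # larger than baseline
print(f"confidence = 99%:   {higher_confidence:.2f}")  # larger than baseline
```

Note that quadrupling the sample size only halves the margin of error, because n enters through its square root.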

Example: QUT wants to estimate average IQ based on a sample of 25 students with a sample mean of 115.
Determine:
- Is this an estimate of a mean or a proportion?
- What is the point estimate?
- Is the population standard deviation known?
- What confidence level is chosen?
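To see how the answers to these questions feed into a calculation, here is a rough worked sketch. The notes give only n = 25 and a sample mean of 115; the population standard deviation of 15 and the 95% confidence level used below are assumptions made purely for illustration, and a different answer to either question would change the result (and possibly call for the t-distribution instead).

```python
from math import sqrt
from statistics import NormalDist

n = 25               # from the example
sample_mean = 115    # from the example: this is the point estimate
assumed_sigma = 15   # ASSUMPTION: not given in the notes
confidence = 0.95    # ASSUMPTION: not given in the notes

# Critical value for a two-sided 95% interval (about 1.96).
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
standard_error = assumed_sigma / sqrt(n)   # 15 / 5 = 3
margin = z * standard_error
ci = (sample_mean - margin, sample_mean + margin)

print(f"standard error: {standard_error:.2f}")
print(f"margin of error: {margin:.2f}")
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```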

For a confidence level of 95%, the expectation is that if 20 samples are drawn and an interval is constructed from each, about 19 of the 20 intervals will contain the true population mean.
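This long-run reading of the confidence level can be checked with a small simulation. The sketch below draws many samples from a normal population whose mean is known to the simulation, builds a 95% z-interval from each, and counts how often the interval contains the true mean. The population values, sample size, number of trials, and seed are all hypothetical choices.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population and sampling settings.
true_mean, sigma, n, trials = 100.0, 15.0, 25, 2000

z = NormalDist().inv_cdf(0.975)   # two-sided 95% critical value
margin = z * sigma / sqrt(n)      # same margin for every sample (sigma known)

covered = 0
for _ in range(trials):
    sample = [random.gauss(true_mean, sigma) for _ in range(n)]
    x_bar = mean(sample)
    # Does this sample's interval contain the true mean?
    if x_bar - margin <= true_mean <= x_bar + margin:
        covered += 1

coverage = covered / trials
print(f"Proportion of intervals containing the true mean: {coverage:.3f}")
```

The printed proportion should land close to 0.95, though any single run will differ slightly from it.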

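90% CI for a mean height estimate: (155.51, 184.49) indicates 90% confidence that the true population mean height lies within that range. The point estimate is the midpoint, 170, and the margin of error is 14.49.
Misinterpretation example: It is incorrect to say there is a 90% probability that the true mean lies in this particular interval. The true mean is fixed, so a given interval either contains it or does not; the 90% describes how often the procedure produces intervals that contain it.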

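z-distribution: Used when the population standard deviation is known; also a reasonable approximation for large samples.
t-distribution: Used when the population standard deviation is unknown and must be estimated by the sample standard deviation s, which matters most for small samples; has thicker tails than the z-distribution, reflecting the extra uncertainty in that estimate.
Degrees of freedom (df) determine the shape of the t-distribution; for a single-sample mean, df = n - 1, and the t-distribution approaches the z-distribution as df grows.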

Example calculations should state the standard error and use the appropriate form (z or t), depending on whether the population standard deviation is known.
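For reference, the two forms can be written out as follows. This is the standard textbook notation rather than anything specific to these notes: \alpha is one minus the confidence level, z_{\alpha/2} and t_{\alpha/2,\,n-1} are the two-sided critical values, and the fraction in each line is the standard error.

```latex
% Population standard deviation known (z form):
\bar{x} \;\pm\; z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}

% Population standard deviation unknown, estimated by s (t form, df = n - 1):
\bar{x} \;\pm\; t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}
```

For 95% confidence, z_{\alpha/2} is about 1.96; the corresponding t value is larger for small n and moves toward 1.96 as n grows.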

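Prediction Interval: Gives a range for a single future observation from the population rather than for the population mean; typically wider than a confidence interval because it includes the variability of individual observations as well as the uncertainty in the estimated mean.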
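One common textbook form, assuming a normally distributed population with unknown standard deviation, makes the extra width visible. The notation (sample mean \bar{x}, sample standard deviation s, df = n - 1) is the standard one and is an assumption here, since the notes do not give a formula.

```latex
% Confidence interval for the population mean:
\bar{x} \;\pm\; t_{\alpha/2,\,n-1}\; s\,\sqrt{\frac{1}{n}}

% Prediction interval for one future observation:
\bar{x} \;\pm\; t_{\alpha/2,\,n-1}\; s\,\sqrt{1 + \frac{1}{n}}
```

The extra 1 under the square root accounts for the spread of an individual observation, so the prediction interval does not shrink toward zero width as n increases.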

Definition: A hypothesis is a proposed explanation testable by scientific methods.
Categories of hypothesis testing in business:
- Service quality assessments (waiting times)
- Defective rates in production
- Operational cost evaluations

Hypothesis testing can determine:
- Presence of specific conditions (e.g., health conditions).
- Effects (e.g., treatments, marketing strategies).
- Differences or relationships (e.g., demographic impacts on purchasing behaviors).