Point Estimate
A single value used to estimate a population parameter. For example, the sample mean (x̄) is an unbiased point estimate of the population mean (μ). However, a point estimate almost never equals the true parameter exactly, so on its own it is almost always "wrong".
Population parameter
Numerical quantities that describe a characteristic of an entire population, such as its mean or variance. They represent the true (usually unknown) values that samples are used to estimate.
Interval Estimate (Confidence Interval, CI)
Because a point estimate is unlikely to equal the parameter exactly, we calculate an interval (from a to b) that is highly likely to contain the true population parameter (μ or π). Confidence intervals are preferred because they carry more information than a point estimate: they also show how uncertain the estimate is.
Confidence Interval
- estimates a range where the true value (population parameter) might lie
Margin of error
The margin of error is the amount added to and subtracted from a point estimate to form a confidence interval, i.e. half the interval's width.
- It indicates how much the sample result may differ from the true value in the total population at the chosen level of confidence.
A Confidence Interval is defined by two key elements:
The estimate and the level of confidence.
The level of confidence
is the probability (c = 1 − α) that the interval-building procedure produces an interval containing the population parameter
Interpretation of the level of confidence
If we take an infinite number of samples and calculate a CI for each (e.g., a 95% CI), the true population mean (μ) would be contained within those calculated intervals in 95% of the cases
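The long-run interpretation can be checked with a small simulation. This is only a sketch under stated assumptions: the population is normal with a hypothetical mean of 50 and a known standard deviation of 10, and 2,000 repeated samples stand in for "an infinite number". The function name `coverage_rate` and all parameter values are illustrative.

```python
import random
from statistics import NormalDist, mean

def coverage_rate(mu=50.0, sigma=10.0, n=30, level=0.95, trials=2000, seed=0):
    """Fraction of simulated z-intervals (sigma known) that contain mu."""
    rng = random.Random(seed)                       # fixed seed for repeatability
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # two-sided critical value
    margin = z * sigma / n ** 0.5                   # same margin for every sample
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        if x_bar - margin <= mu <= x_bar + margin:  # did this interval cover mu?
            hits += 1
    return hits / trials

rate = coverage_rate()
print(f"Share of intervals containing mu: {rate:.3f}")
```

The printed share should land near 0.95 but not exactly on it, because only a finite number of samples is drawn.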
Significance Level
The probability (α = 1 − c) that the confidence interval will not cover the population parameter.
Standard Critical Values (Z-scores):
Common confidence levels correspond to specific Z-scores:
◦ 90% CI (c=0.9,α=0.1): ±1.645.
◦ 95% CI (c=0.95,α=0.05): ±1.96.
◦ 99% CI (c=0.99,α=0.01): ±2.576
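These critical values can be reproduced from the standard normal distribution. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `z_critical` is an illustrative choice.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value z_{alpha/2} for a confidence level c."""
    alpha = 1 - confidence
    # alpha is split evenly between the two tails, so look up 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)

for c in (0.90, 0.95, 0.99):
    print(f"{c:.0%} CI: ±{z_critical(c):.3f}")
```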
There are two primary ways to make a confidence interval narrower (more precise):
1. Increase the Sample Size (n): A larger sample size reduces the width of the interval when the coverage rate is fixed. This is generally the better strategy.
2. Decrease the Coverage Rate (c): A lower coverage rate (e.g., 90% instead of 99%) results in a narrower interval when the sample size is fixed.
Highlights
- Higher coverage rate (e.g., 99%) → wider interval → You're more confident, so you need a bigger range to be sure.
- Larger sample size → narrower interval → More data gives a clearer picture, so you don't need as wide a range.
- Lowering coverage rate to shrink the interval is risky → You might get a smaller range, but you're less sure it's correct.
Best practice:
If you want a narrower interval (more precision), increase your sample size rather than lowering your confidence level.
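Both effects can be seen by computing the interval width directly. The sketch below assumes the known-σ case, where the full width is 2 · z · σ/√n. The value σ = 10 and the sample sizes are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(sigma, n, confidence):
    """Full width of a z-interval: 2 * z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sigma / sqrt(n)

sigma = 10.0  # hypothetical population standard deviation

# Fixed coverage rate, growing sample size: the interval narrows.
for n in (25, 100, 400):
    print(f"c=0.95, n={n:>3}: width = {ci_width(sigma, n, 0.95):.2f}")

# Fixed sample size, lower coverage rate: the interval also narrows,
# but at the cost of being less sure it contains the parameter.
for c in (0.99, 0.95, 0.90):
    print(f"n=100, c={c:.2f}: width = {ci_width(sigma, 100, c):.2f}")
```

Because the width is proportional to 1/√n, quadrupling the sample size halves the interval.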
Formulas for the Population Mean
The method used to calculate the CI for the population mean (μ) depends on whether the population standard deviation (σ) is known and whether the sample size (n) is large enough.
Situation 1: σ is known
x̄ ± z_{α/2} · √(σ²/n)
Requires the population to be normal. In practice σ is almost never known, so this situation rarely applies.
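As a worked illustration of the known-σ formula, the sketch below computes a 95% interval. The data values and σ = 3 are made up for the example, and `z_interval` is an illustrative helper name.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_interval(sample, sigma, confidence=0.95):
    """CI for mu when the population standard deviation sigma is known."""
    n = len(sample)
    x_bar = mean(sample)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(sigma ** 2 / n)   # z_{alpha/2} * sqrt(sigma^2 / n)
    return x_bar - margin, x_bar + margin

# Hypothetical data: 9 measurements, with sigma assumed known to be 3.
data = [48, 52, 50, 47, 53, 51, 49, 50, 50]
low, high = z_interval(data, sigma=3.0)
print(f"95% CI for mu: ({low:.2f}, {high:.2f})")
```

Here x̄ = 50 and √(σ²/n) = 1, so the margin of error is simply the critical value, about 1.96.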
Situation 2: σ is unknown, population is Normal
x̄ ± t_{α/2} · √(s²/n)
Uses the t-distribution with df = n − 1. Real populations are rarely exactly normal, so this situation is uncommon in practice.
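Situation 3: σ is unknown, population is Non-Normal
x̄ ± t_{α/2} · √(s²/n)
Used when the population distribution is unknown or non-normal, provided the sample size is large (n ≥ 30), so that the Central Limit Theorem (CLT) makes the sampling distribution of x̄ approximately normal. This situation is often true in reality.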
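The t-based formula can be sketched in Python as well. The standard library has no t-distribution, so this example finds the critical value numerically (Simpson's rule for the CDF, then bisection). That is a choice made for self-containment, not a prescribed method; a statistics package or a t-table would normally supply the value. The data and all helper names are hypothetical.

```python
from math import exp, lgamma, log, pi, sqrt
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0: 0.5 plus Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(confidence, df):
    """Two-sided critical value t_{alpha/2}, found by bisection on the CDF."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 1000.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_interval(sample, confidence=0.95):
    """CI for mu when sigma is unknown: x_bar +/- t * sqrt(s^2 / n)."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)                  # sample standard deviation (n - 1 denominator)
    margin = t_critical(confidence, n - 1) * sqrt(s ** 2 / n)
    return x_bar - margin, x_bar + margin

# Hypothetical data: 10 measurements, so df = 9.
data = [48, 52, 50, 47, 53, 51, 49, 50, 50, 50]
low, high = t_interval(data)
print(f"95% CI for mu: ({low:.2f}, {high:.2f})")
```

With df = 9 the critical value is larger than the z-value of 1.96, so the interval is wider than a z-interval built from the same standard error would be.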
Why use t-Distribution
- Estimate population mean µ
- Population standard deviation is unknown.
- Sample size is small (n<30)
T-distribution
- Bell-shaped and symmetric, similar to the standard normal distribution, but with thicker tails.
It is a family of curves defined by the degrees of freedom (df), where df = n − 1. As the degrees of freedom increase (around df = 30), the t-distribution closely approaches the standard normal distribution (Z).
- Total area below t-curve = 1 = 100%
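The thicker tails and the convergence toward Z can be illustrated by comparing densities at a point far from the center. The sketch below evaluates the t density at x = 3 for several degrees of freedom and compares it with the standard normal density. The choice of x = 3 and of the df values is arbitrary.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))

z_pdf = NormalDist().pdf  # standard normal density

# Tail density at x = 3: larger for small df, approaching the normal value.
for df in (2, 5, 30, 1000):
    print(f"t, df={df:>4}: density at x=3 is {t_pdf(3, df):.5f}")
print(f"standard normal: density at x=3 is {z_pdf(3):.5f}")
```

The density at x = 3 shrinks as df grows and gets very close to the normal value for large df, which is the "thicker tails" property in numerical form.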
Formula to calculate the confidence interval for the Population Proportion
Formula: p ± z_{α/2} · √(p(1−p)/n)
CLT Requirement: This formula relies on the Central Limit Theorem applying to proportions, which requires that nπ > 5 and n(1−π) > 5. Since π is unknown, the sample proportion p is used instead to check the criteria.
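A minimal sketch of the proportion interval, including the CLT check with p standing in for π, might look like this. The counts (120 successes out of 400) are hypothetical, and `proportion_interval` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """CI for a population proportion: p +/- z * sqrt(p(1-p)/n)."""
    p = successes / n
    # CLT requirement from the notes, checked with p in place of the unknown pi.
    if not (n * p > 5 and n * (1 - p) > 5):
        raise ValueError("n*p and n*(1-p) must both exceed 5")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Hypothetical survey: 120 of 400 respondents answered "yes" (p = 0.30).
low, high = proportion_interval(successes=120, n=400)
print(f"95% CI for pi: ({low:.4f}, {high:.4f})")
```

Raising an error when the check fails is one possible design. It makes the normal-approximation assumption explicit instead of silently returning an unreliable interval.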