Focus on inference for quantitative data, specifically on means.
Primary goal is to estimate the population mean using sample means.
Emphasis on sampling variability: sample means differ from the population mean and from one another.
Sampling variability is the natural sample-to-sample variation in a statistic such as the sample mean.
It's important to understand that a sample mean (e.g., 7.1) does not necessarily equal the population mean.
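A small simulation can make sampling variability concrete. The sketch below assumes a hypothetical population with a mean near 7; the population values, the sample size of 25, and the number of repetitions are all illustrative assumptions, not figures from these notes. It shows repeated sample means scattering around the population mean.

```python
import random
import statistics

def sample_means(population, n, reps, seed=0):
    """Draw `reps` simple random samples of size n and return their means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.sample(population, n)) for _ in range(reps)]

# Hypothetical population, assumed for illustration only.
pop_rng = random.Random(1)
population = [pop_rng.gauss(7.0, 2.0) for _ in range(10_000)]
mu = statistics.fmean(population)

means = sample_means(population, n=25, reps=1000)
print(f"population mean: {mu:.2f}")
print(f"first five sample means: {[round(m, 2) for m in means[:5]]}")
print(f"standard deviation of the sample means: {statistics.stdev(means):.2f}")
```

Individual sample means land above and below the population mean, while their average stays close to it.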
Confidence Intervals: Provide a range of values where the population mean is likely to fall.
Significance Tests: Determine if there is enough evidence to support a claim about a population mean.
Both concepts rely on sampling distributions.
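A sampling distribution is the theoretical distribution of sample means over all possible samples of a given size.
Sample means are typically modeled with a normal distribution, but this unit uses the T distribution because the population standard deviation is unknown and is estimated by the sample standard deviation (s).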
T distribution is similar to the normal distribution but accounts for additional variability.
The difference in shape is more pronounced with smaller sample sizes and diminishes with larger ones; the shape is set by the degrees of freedom (df = sample size - 1).
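To see the extra tail weight of the T distribution, the sketch below evaluates the standard Student's t density formula at a point three standard errors from the center and compares it with the standard normal density. The sample sizes used are hypothetical, chosen only to show the trend as degrees of freedom grow.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

normal_density = NormalDist().pdf(3.0)
for n in (5, 30, 100):  # hypothetical sample sizes
    df = n - 1
    print(f"n={n:>3}  df={df:>3}  t density at 3: {t_pdf(3.0, df):.5f}  "
          f"(normal: {normal_density:.5f})")
```

The t density in the tail is largest for the smallest sample and moves toward the normal value as n increases.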
Four-Step Process to Construct a Confidence Interval:
Name the procedure: Identify it as a one-sample T interval for the population mean with context.
Check Conditions:
Samples must be random to avoid bias.
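When sampling without replacement, sample size must be less than 10% of the population so observations can be treated as independent.
Sampling distribution of the sample mean must be approximately normal:
Any sample size is acceptable if the population is normal.
If sample size is 30 or more and the population shape is unknown or not normal, the central limit theorem justifies treating the sampling distribution as approximately normal.
If sample size is under 30 and the population is not known to be normal, graph the sample data and check for outliers and strong skewness.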
Construct the Interval: Use the formula \( \bar{x} \pm t^* \cdot \frac{s}{\sqrt{n}} \)
Where \( \bar{x} \) is the sample mean.
Margin of error is \( t^* \) (critical value) multiplied by the standard error \( \frac{s}{\sqrt{n}} \).
\( t^* \) is found from a T-table or an inverse T function on a calculator, using df = n - 1.
Interpret the Interval: State confidence level and context, e.g. "I am 95% confident that the true population mean for [context] is between X and Y."
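The shape condition from step 2 can be sketched as a small helper. This is only an illustration: the 1.5 x IQR fence used to flag outliers is a common convention assumed here rather than something stated in these notes, the function names are hypothetical, and the code does not assess skewness, which still needs a plot.

```python
import statistics

def iqr_outliers(data):
    """Return values outside the 1.5 * IQR fences (assumed convention)."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low or x > high]

def shape_condition(data, population_normal=False):
    """Walk through the normal/large-sample condition for a t procedure."""
    n = len(data)
    if population_normal:
        return "ok: population is normal, any sample size works"
    if n >= 30:
        return "ok: n >= 30, central limit theorem applies"
    flagged = iqr_outliers(data)
    if flagged:
        return f"caution: n < 30 and possible outliers {flagged}"
    return "ok so far: n < 30, no outliers flagged; still inspect a plot for skew"

sample = [12, 14, 15, 15, 16, 17, 18, 40]  # hypothetical data
print(shape_condition(sample))
```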
A larger sample leads to a smaller margin of error.
Higher confidence levels widen the confidence intervals, but larger samples can offset that effect.
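Both effects can be checked numerically. The sketch below uses a hypothetical sample standard deviation of 10 and approximate t-table critical values for df = 24; these numbers are assumptions for illustration. In the second loop the critical value is held fixed for simplicity, although in practice it also shrinks slightly as the sample grows.

```python
import math

def margin_of_error(t_star, s, n):
    """Margin of error: critical value times standard error."""
    return t_star * s / math.sqrt(n)

s = 10.0  # hypothetical sample standard deviation

# Approximate t-table critical values for df = 24 (n = 25).
t_star_df24 = {0.90: 1.711, 0.95: 2.064, 0.99: 2.797}

# Higher confidence level, wider interval (same n).
for level, t_star in t_star_df24.items():
    print(f"{level:.0%} confidence, n=25: ME = {margin_of_error(t_star, s, 25):.2f}")

# Larger sample, smaller margin of error (t* held fixed as a simplification).
for n in (25, 100, 400):
    print(f"95% confidence, n={n}: ME = {margin_of_error(2.064, s, n):.2f}")
```

Because the standard error divides by the square root of n, quadrupling the sample size roughly halves the margin of error.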
Example (confidence interval): Estimate the mean time that name-brand AAA batteries last, using a sample.
Sample mean \( \bar{x} \) = 348.50 min, sample standard deviation s = 23.8 min, sample size n = 55.
A 99% confidence interval runs from 340.13 min to 357.00 min.
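The battery interval can be recomputed from the summary statistics as a check. This sketch assumes a critical value of about 2.670 for 99% confidence with df = 54, read from a t-table; the function name is hypothetical. The endpoints it produces are close to, but not identical to, the 340.13 and 357.00 quoted above, so the quoted figures may reflect rounding or slightly different inputs.

```python
import math

def one_sample_t_interval(xbar, s, n, t_star):
    """Return (low, high) for xbar +/- t* * s / sqrt(n)."""
    standard_error = s / math.sqrt(n)
    margin = t_star * standard_error
    return xbar - margin, xbar + margin

# Assumed t-table value for 99% confidence, df = 55 - 1 = 54.
T_STAR_99_DF54 = 2.670

low, high = one_sample_t_interval(xbar=348.50, s=23.8, n=55, t_star=T_STAR_99_DF54)
print(f"99% CI for mean battery life: ({low:.2f}, {high:.2f}) min")
```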
Four-Step Process for a Significance Test:
Hypotheses: Null \( H_0 \) (population mean equals a specified value) and alternative \( H_a \) (population mean is not equal to, greater than, or less than that value).
Build Sampling Distribution: Constructed under the assumption that the null hypothesis is true.
Must check conditions (similar to confidence intervals).
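Calculate P-value: Using the T-score \( t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \) and the T distribution with n - 1 degrees of freedom.
P-value is the probability of observing a sample mean at least as extreme as the one obtained, assuming the null hypothesis is true.
Conclude: If P-value < significance level \( \alpha \) (typically 0.05), reject the null hypothesis; if not, fail to reject.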
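The test steps can be sketched end to end without a statistics library. The code below reuses the battery summary statistics from the earlier example, but the null value of 360 minutes is a hypothetical claim invented for illustration. The P-value comes from integrating the Student's t density numerically with Simpson's rule, which is one reasonable approach rather than the method a calculator necessarily uses.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p_value(t_stat, df, steps=2000):
    """P(|T| >= |t_stat|), using Simpson's rule for the area from 0 to |t_stat|."""
    b = abs(t_stat)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    return 2 * (0.5 - area_zero_to_t)

def one_sample_t_test(xbar, s, n, mu0):
    """Return the t statistic and two-sided P-value for H0: mu = mu0."""
    t_stat = (xbar - mu0) / (s / math.sqrt(n))
    return t_stat, two_sided_p_value(t_stat, n - 1)

# mu0 = 360 is a hypothetical null value, not from the notes.
t_stat, p_value = one_sample_t_test(xbar=348.50, s=23.8, n=55, mu0=360.0)
alpha = 0.05
print(f"t = {t_stat:.2f}, P-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

For a one-sided alternative, half of the two-sided value applies when the sample mean falls on the side named by the alternative hypothesis.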
Type I Error: Rejecting a true null hypothesis.
Type II Error: Failing to reject a false null hypothesis.
Power is the probability of correctly rejecting a false null hypothesis.
Increase power by increasing sample size.
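A rough calculation shows how power grows with sample size. To keep it short, this sketch treats the test statistic as normal rather than using the T distribution, so it is an approximation; the effect size of 5 units, the standard deviation of 20, and the one-sided alternative are all hypothetical choices.

```python
import math
from statistics import NormalDist

def approx_power(effect, sigma, n, alpha=0.05):
    """Approximate power of a one-sided test using a normal approximation."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)              # rejection cutoff under H0
    shift = effect * math.sqrt(n) / sigma      # how far the truth sits from H0
    return 1 - z.cdf(z_crit - shift)

# Hypothetical: true mean exceeds the null value by 5 units, sigma = 20.
for n in (20, 50, 100, 200):
    print(f"n={n:>3}: approximate power = {approx_power(5, 20, n):.2f}")
```

When the effect is zero the function returns alpha, which matches the Type I error rate: the chance of rejecting a null hypothesis that is actually true.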
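Two-sample T procedures for a difference between means follow the same steps:
State the procedure (and the hypotheses, if testing).
Check conditions for both samples.
Construct confidence interval for the difference: Same idea as one sample, but the point estimate and standard error combine statistics from both samples: \( (\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \)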
Interpret the interval in context.
Example (two-sample interval): Compare oak tree diameters in northern vs southern states.
Mean diameters: North = 36.6 in; South = 28.9 in. Calculate a 95% CI for the difference.
Conclusion: Interpret in context: the interval indicates that northern oak trees are likely larger in diameter, on average.
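The oak tree calculation can be sketched as follows. Only the two sample means come from the notes; the standard deviations and sample sizes below are hypothetical placeholders, since the notes do not record them, so the printed interval is illustrative only. The critical value assumes the conservative choice df = min(n1, n2) - 1 = 30, for which a t-table gives about 2.042 at 95% confidence.

```python
import math

def two_sample_t_interval(x1, s1, n1, x2, s2, n2, t_star):
    """Return (low, high) for (x1 - x2) +/- t* * sqrt(s1^2/n1 + s2^2/n2)."""
    diff = x1 - x2
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    margin = t_star * standard_error
    return diff - margin, diff + margin

# Means from the notes; s and n are HYPOTHETICAL values for illustration.
north_mean, south_mean = 36.6, 28.9
north_s, north_n = 5.0, 31
south_s, south_n = 4.5, 31

T_STAR_95_DF30 = 2.042  # assumed t-table value, conservative df = 30

low, high = two_sample_t_interval(north_mean, north_s, north_n,
                                  south_mean, south_s, south_n,
                                  T_STAR_95_DF30)
print(f"95% CI for (North - South): ({low:.2f}, {high:.2f}) in")
print("interval excludes 0" if low > 0 or high < 0 else "interval includes 0")
```

An interval that sits entirely above zero supports the conclusion that the first group has the larger mean.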
Follow the same four-step approach described earlier.
Example (paired data): Analyze whether exercise affects resting heart rate by working with the difference within each pair.
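With paired data, the analysis reduces to a one-sample T procedure on the differences. The sketch below uses made-up resting heart rates for eight people before and after an exercise program; the data and the function name are hypothetical. The null hypothesis is that the mean difference is zero.

```python
import math
import statistics

def paired_t_statistic(before, after):
    """Return (t statistic, df) for the mean of the paired differences."""
    diffs = [a - b for a, b in zip(after, before)]
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)          # sample standard deviation
    n = len(diffs)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1

# Hypothetical resting heart rates (beats per minute) for the same 8 people.
before = [72, 68, 75, 80, 66, 77, 70, 74]
after = [70, 66, 74, 76, 65, 75, 69, 70]

t_stat, df = paired_t_statistic(before, after)
print(f"t = {t_stat:.2f} with df = {df}")
```

The t statistic is then compared against the T distribution with n - 1 degrees of freedom, where n is the number of pairs rather than the number of individual measurements.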
Practice is essential; seek additional problems and resources like the Ultimate Review Packet.