Study Notes on One-sample Means with the t-distribution
One-sample Means with the ( t )-Distribution
Overview
- Like the sample proportion, the sample mean ( \bar{x} ) has a nearly normal sampling distribution when certain conditions are met.
- The mean of the sampling distribution is ( \mu ) and the standard error is ( \frac{\sigma}{\sqrt{n}} ).
Key Conditions
- Conditions that must be checked:
- Independence: Sample observations must be independent, e.g., drawn as a simple random sample. Data can also come from a random process (e.g., rolling a die).
- Sample Size:
- If ( n < 30 ) and there are no clear outliers, assume the data come from a nearly normal distribution.
- If ( n \geq 30 ) and there are no extreme outliers, assume the sampling distribution of ( \bar{x} ) is nearly normal, even if the individual observations are not.
The ( t )-Distribution
- We typically do not know ( \sigma ) (the population standard deviation), so we estimate it with ( s ) (the sample standard deviation).
- Because ( s ) is only an estimate of ( \sigma ), we use Student's ( t )-distribution instead of the normal distribution.
- Characteristics of the ( t )-distribution:
- Symmetric bell curve, similar in shape to the normal distribution but with thicker tails (higher kurtosis).
- Shape depends on the degrees of freedom, ( df = n - 1 ); as ( df ) increases, the ( t )-distribution approaches the normal distribution.
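To make the "thicker tails" point concrete, here is a minimal sketch that compares the density of the ( t )-distribution with the standard normal density at a point far from the center. The function name `t_pdf` and the evaluation point `x = 3` are illustrative choices, not part of the original notes; the density formula is the standard one for Student's ( t ).

```python
import math
from statistics import NormalDist


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    # lgamma avoids overflow of the gamma function for large df.
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


normal_pdf = NormalDist().pdf  # standard normal density

# Tail density at x = 3: larger for small df, shrinking toward the normal value.
for df in (2, 5, 30, 100):
    print(f"df = {df:>3}: t density at 3 = {t_pdf(3, df):.5f}")
print(f"normal   : density at 3   = {normal_pdf(3):.5f}")
```

The printed values should decrease as ( df ) grows and settle near the normal density, which is the sense in which the shape "depends on the degrees of freedom."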
Confidence Intervals
- Confidence intervals utilize the following formula:
[ \bar{x} \pm t^*_{df} \times \frac{s}{\sqrt{n}} ]
- Steps to construct a confidence interval:
- Prepare:
- Identify:
- ( \mu ): parameter of interest (unknown value).
- ( \bar{x} ): sample mean (point estimate).
- ( s ): sample standard deviation.
- ( n ): sample size.
- Confidence level.
- Check: Verify conditions for ( \bar{x} ) being nearly normal.
- Calculate:
- Standard Error: ( \frac{s}{\sqrt{n}} )
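- Critical Value: Use ( t^*_{df} ), computed in Excel as ( =T.INV.2T(\alpha, n-1) ), where ( \alpha = 1 - ) confidence level. Note that ( =T.INV(\alpha/2, n-1) ) returns the lower-tail quantile, i.e., ( -t^*_{df} ), so its sign must be flipped.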
- Construct Interval: Calculate using ( \bar{x} \pm t^*_{df} \times \frac{s}{\sqrt{n}} )
- Conclude: Interpret the confidence interval in context.
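The Calculate and Construct steps above can be sketched in Python without Excel. This is only an illustration under stated assumptions: the helpers `t_pdf`, `t_cdf`, `t_critical`, and `t_confidence_interval` are hypothetical names, the CDF is approximated with Simpson's rule, and the critical value is found by bisection, so results are approximate (good to several decimals for moderate ( t^* ), less so for very small ( df ) combined with extreme confidence levels). The demo uses the Piano Lessons numbers from Example 3 at an assumed 95% confidence level.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_coef) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """Approximate P(T <= x) with Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area


def t_critical(confidence, df):
    """t* such that the central area under the t curve equals `confidence`."""
    target = 1 - (1 - confidence) / 2  # upper quantile, 1 - alpha/2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:  # expand the bracket until it contains t*
        hi *= 2
    for _ in range(60):  # bisection
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_confidence_interval(xbar, s, n, confidence=0.95):
    """Return (lower, upper) for xbar +/- t* * s / sqrt(n)."""
    se = s / math.sqrt(n)                    # standard error
    margin = t_critical(confidence, n - 1) * se
    return xbar - margin, xbar + margin


# Example 3 data: n = 20, xbar = 4.6, s = 2.2 (95% level assumed here).
low, high = t_confidence_interval(4.6, 2.2, 20, 0.95)
print(f"{low:.3f} < mu < {high:.3f}")
```

A numerical CDF is used only to keep the sketch free of third-party dependencies; a statistics library's ( t ) quantile function would serve the same purpose.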
Hypothesis Testing
- Steps for conducting a one-mean hypothesis test:
- Prepare:
- Identify:
- ( \mu ): parameter of interest (unknown value).
- Null Hypothesis ( H_0 ).
- Alternative Hypothesis ( H_1 ) or ( H_A ).
- ( \bar{x} ): sample mean (point estimate).
- ( s ): sample standard deviation.
- ( n ): sample size.
- ( \alpha ): significance level (probability of Type I error).
- Check: Verify conditions to ensure ( \bar{x} ) is nearly normal.
- Calculate:
- Standard Error: ( \frac{s}{\sqrt{n}} )
- Test Statistic: ( T = \frac{\bar{x} - \mu_0}{\frac{s}{\sqrt{n}}} )
- ( p )-value: Based on the ( t )-distribution.
- Conclude: Compare ( p )-value to ( \alpha ) and provide a conclusion based on the problem context.
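The Calculate and Conclude steps above can be sketched as follows. The function names (`t_pdf`, `t_upper_tail`, `one_sample_t_test`) and the `alternative` argument are hypothetical, and the tail probability is approximated by Simpson's rule rather than taken from a statistics library, so very small ( p )-values are approximate. The demo uses the Duolingo numbers from Example 2.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_coef) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=10_000):
    """Approximate P(T >= |t|) as 0.5 minus the Simpson's-rule area on [0, |t|]."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 0.5 - total * h / 3)


def one_sample_t_test(xbar, mu0, s, n, alternative="two-sided"):
    """Return (T, p-value) for H0: mu = mu0 against the chosen alternative."""
    se = s / math.sqrt(n)          # standard error
    t_stat = (xbar - mu0) / se     # test statistic
    tail = t_upper_tail(t_stat, n - 1)
    if alternative == "two-sided":
        p = 2 * tail
    elif alternative == "greater":  # H_A: mu > mu0
        p = tail if t_stat >= 0 else 1 - tail
    elif alternative == "less":     # H_A: mu < mu0
        p = tail if t_stat <= 0 else 1 - tail
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return t_stat, p


# Example 2 data: xbar = 174.5221, s = 10.8801, n = 36, H_A: mu > 160.
alpha = 0.01
t_stat, p_value = one_sample_t_test(174.5221, 160, 10.8801, 36, "greater")
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"T = {t_stat:.5f}, p = {p_value:.3e}, {decision}")
```

Calling `one_sample_t_test(4.6, 5, 2.2, 20)` with the default two-sided alternative should likewise land close to the Example 3 values.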
Choosing a Sample Size
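- To calculate a required sample size, default to the normal distribution, using:
[ n \geq \left( \frac{z \cdot \sigma}{E} \right)^2 ]
- Where ( z ) is the critical value for the chosen confidence level, ( \sigma ) is the (known or estimated) population standard deviation, and ( E ) is the desired margin of error.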
- Always round up when determining ( n ).
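As a quick illustration of the sample-size rule, the sketch below uses Python's standard-library `statistics.NormalDist` to obtain ( z ) and rounds up with `math.ceil`. The function name `required_sample_size` and the margin of error of 0.5 inches are assumptions made for the example; the value of ( \sigma ) is borrowed from Example 1.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest integer n with z * sigma / sqrt(n) <= margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    return math.ceil((z * sigma / margin) ** 2)          # always round up


# sigma from Example 1; E = 0.5 inches is a hypothetical target.
n = required_sample_size(sigma=2.6660, margin=0.5, confidence=0.95)
print(n)
```

Rounding up rather than to the nearest integer guarantees the margin of error is no larger than ( E ).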
Examples
Example 1: Heights of NFL Players
- Mean height: ( \mu = 74.1684 ) inches, Standard Deviation: ( \sigma = 2.6660 ) inches, Sample size: ( n = 100 ).
- Applying the Central Limit Theorem: ( \bar{X} \sim N(74.1684, \frac{2.6660}{\sqrt{100}}) ), where the second parameter is the standard deviation of ( \bar{X} ) (the standard error).
- Standard Error for the sample mean ( \bar{X} ): ( \sigma / \sqrt{n} = \frac{2.6660}{10} = 0.26660 ).
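The arithmetic in Example 1 can be checked with a few lines of Python. The probability question at the end (how often a sample mean of 100 players falls within half an inch of ( \mu )) is a hypothetical addition for illustration and is not part of the original example.

```python
import math
from statistics import NormalDist

mu, sigma, n = 74.1684, 2.6660, 100

se = sigma / math.sqrt(n)          # standard error of the sample mean
xbar_dist = NormalDist(mu, se)     # approximate sampling distribution of X-bar

# Hypothetical question: P(mu - 0.5 < X-bar < mu + 0.5)
prob = xbar_dist.cdf(mu + 0.5) - xbar_dist.cdf(mu - 0.5)

print(f"SE = {se:.5f}")
print(f"P(|X-bar - mu| < 0.5) = {prob:.4f}")
```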
Example 2: Duolingo Match Madness
- Sample mean of matches: ( \bar{x} = 174.5221 ), standard deviation ( s = 10.8801 ), sample size ( n=36 ).
- Testing at significance level ( \alpha = 0.01 ), hypotheses:
- Null: ( H_0: \mu = 160 )
- Alternative: ( H_A: \mu > 160 )
- Test statistic: ( T = \frac{174.5221-160}{10.8801/\sqrt{36}} = \frac{14.5221}{1.81335} \approx 8.00844 ), with ( df = 35 ).
- ( p )-value: approx. ( 1.00313 \times 10^{-9} ). Since this is far below ( \alpha = 0.01 ), we reject ( H_0 ).
Critical Values (Textbook Problems)
- Critical Values for Various Confidence Levels and Sample Sizes:
- (a) ( n = 6, CL = 90\% ): ( df = 5 ), ( t^* = 2.015048 )
- (b) ( n = 21, CL = 98\% ): ( df = 20 ), ( t^* = 2.527977 )
- (c) ( n = 29, CL = 95\% ): ( df = 28 ), ( t^* = 2.048407 )
- (d) ( n = 12, CL = 99\% ): ( df = 11 ), ( t^* = 3.105807 )
Example 3: Piano Lessons
- Sample of children: ( n = 20 ), ( \bar{x} = 4.6 ) years, ( s = 2.2 ) years.
- Testing Georgianna's claim:
- Null: ( H_0: \mu = 5 )
- Alternative: ( H_A: \mu \neq 5 )
- Two-tailed test:
- Test statistic using Excel: ( T = -0.81312… )
- ( p )-value: ( 0.426224… )
- Conclusion: Fail to reject ( H_0 ); there is insufficient evidence that the population mean differs from 5.
- Constructed confidence interval (the bounds correspond to a 95% confidence level): ( 3.570 < \mu < 5.630 )
- The hypothesis test and the confidence interval agree: 5 lies inside the interval, so 5 years is a plausible value for the population mean.