Study Notes on One-sample Means with the t-distribution

One-sample Means with the ( t )-Distribution

Overview

  • When the conditions below are met, the sample mean ( \bar{x} ) has a nearly normal sampling distribution, much like the sample proportion.
  • The mean of the sampling distribution is ( \mu ) and the standard error is ( \frac{\sigma}{\sqrt{n}} ).

Key Conditions

  • Conditions that must be checked:
    • Independence: Sample observations must be independent, e.g., drawn as a simple random sample. Data can also come from a random process (e.g., rolling a die).
    • Sample Size:
      • If ( n < 30 ) and there are no clear outliers, presume the data come from a nearly normal distribution.
      • If ( n \geq 30 ) and there are no extreme outliers, assume the distribution of ( \bar{x} ) is nearly normal, even if the individual observations are not.

The ( t )-Distribution

  • We typically do not know ( \sigma ) (the population standard deviation), thus we use ( s ) (the sample standard deviation).
  • Substituting ( s ) for ( \sigma ) requires us to use Student's ( t )-distribution instead of the normal distribution.
  • Characteristics of the ( t )-distribution:
    • Symmetric, bell-shaped curve similar to the normal distribution, but with thicker tails (higher kurtosis).
    • Shape depends on the degrees of freedom, ( df = n - 1 ); as ( df ) grows, the ( t )-distribution approaches the normal distribution.
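
To make the "thicker tails" point concrete, here is a minimal sketch that evaluates the ( t ) density directly from its formula and compares it with the standard normal density three units from the center. The helper name `t_pdf` and the choices of ( x = 3 ), ( df = 5 ), and ( df = 30 ) are illustrative assumptions, not part of the original notes.

```python
import math
from statistics import NormalDist


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


# Density three units from the center: the t curves sit above the normal
# curve there, and the gap shrinks as the degrees of freedom grow.
for label, density in [
    ("t, df = 5", t_pdf(3, 5)),
    ("t, df = 30", t_pdf(3, 30)),
    ("normal", NormalDist().pdf(3)),
]:
    print(f"{label:>10}: {density:.5f}")
```

The ordering of the three printed values (largest for ( df = 5 ), smallest for the normal) is what "thicker tails" means in practice.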

Confidence Intervals

  • Confidence intervals utilize the following formula:
    [ \bar{x} \pm t^*_{df} \times \frac{s}{\sqrt{n}} ]
  • Steps to construct a confidence interval:
    1. Prepare:
      • Identify:
        • ( \mu ): parameter of interest (unknown value).
        • ( \bar{x} ): sample mean (point estimate).
        • ( s ): sample standard deviation.
        • ( n ): sample size.
        • Confidence level.
    2. Check: Verify the conditions for ( \bar{x} ) being nearly normal.
    3. Calculate:
      • Standard Error: ( \frac{s}{\sqrt{n}} )
      • Critical Value: ( t^*_{df} ) with ( df = n - 1 ), computed in Excel as ( =T.INV(1 - \alpha/2, n-1) ) or, equivalently, ( =T.INV.2T(\alpha, n-1) ), where ( \alpha = 1 - ) confidence level. Note that ( =T.INV(\alpha/2, n-1) ) returns the negative of this value.
      • Construct Interval: Calculate using ( \bar{x} \pm t^*_{df} \times \frac{s}{\sqrt{n}} )
    4. Conclude: Interpret the confidence interval in context.
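
The calculation step can be sketched in Python using only the standard library. This is an illustrative sketch, not the notes' own method (the notes use Excel): the function names `t_upper_tail`, `t_critical`, and `t_confidence_interval` are hypothetical, and the tail area is approximated by numerical integration (Simpson's rule) with the critical value found by bisection. The sample call reuses the numbers from Example 3 in these notes and assumes a 95% confidence level.

```python
import math


def t_upper_tail(t: float, df: int, steps: int = 2000) -> float:
    """Approximate P(T > t) for Student's t with df degrees of freedom."""
    if t < 0:
        return 1.0 - t_upper_tail(-t, df, steps)
    # Substituting x = sqrt(df) * tan(theta) turns the tail area into
    # const * integral of cos(theta)**(df - 1) from atan(t / sqrt(df)) to pi/2,
    # a bounded range that Simpson's rule handles well (steps must be even).
    a, b = math.atan(t / math.sqrt(df)), math.pi / 2
    h = (b - a) / steps
    total = math.cos(a) ** (df - 1) + math.cos(b) ** (df - 1)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * math.cos(a + i * h) ** (df - 1)
    const = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    return const / math.sqrt(math.pi) * total * h / 3


def t_critical(conf_level: float, df: int) -> float:
    """Positive t* whose central area under the t curve equals conf_level."""
    tail = (1 - conf_level) / 2
    lo, hi = 0.0, 1e6
    for _ in range(80):  # bisection: the upper tail shrinks as t grows
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > tail:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_confidence_interval(xbar: float, s: float, n: int, conf_level: float = 0.95):
    """Return (lower, upper) for xbar +/- t* * s / sqrt(n), with df = n - 1."""
    margin = t_critical(conf_level, n - 1) * s / math.sqrt(n)
    return xbar - margin, xbar + margin


# Numbers from Example 3 (piano lessons), assuming a 95% confidence level.
low, high = t_confidence_interval(xbar=4.6, s=2.2, n=20, conf_level=0.95)
print(f"{low:.3f} < mu < {high:.3f}")
```

As a sanity check, `t_critical(0.90, 5)` should come out close to the textbook value listed for part (a) of the critical-value problems later in these notes.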

Hypothesis Testing

  • Steps for conducting a one-mean hypothesis test:
    1. Prepare:
      • Identify:
        • ( \mu ): parameter of interest (unknown value).
        • Null Hypothesis ( H_0 ).
        • Alternative Hypothesis ( H_1 ) or ( H_A ).
        • ( \bar{x} ): sample mean (point estimate).
        • ( s ): sample standard deviation.
        • ( n ): sample size.
        • ( \alpha ): significance level (probability of Type I error).
    2. Check: Verify the conditions to ensure ( \bar{x} ) is nearly normal.
    3. Calculate:
      • Standard Error: ( \frac{s}{\sqrt{n}} )
      • Test Statistic: ( T = \frac{\bar{x} - \mu_0}{\frac{s}{\sqrt{n}}} ), where ( \mu_0 ) is the value of ( \mu ) stated in ( H_0 ).
      • ( p )-value: Based on the ( t )-distribution with ( df = n - 1 ) and the direction of ( H_A ).
    4. Conclude: Compare the ( p )-value to ( \alpha ). Reject ( H_0 ) if the ( p )-value is less than ( \alpha ); otherwise, fail to reject ( H_0 ). State the conclusion in the context of the problem.
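
The calculate and conclude steps can be sketched in Python with the standard library alone. This is an illustrative sketch under stated assumptions: `t_upper_tail` and `one_sample_t_test` are hypothetical names, the ( p )-value is approximated by numerical integration (Simpson's rule) rather than taken from Excel, and the sample call reuses the numbers from Example 2 in these notes.

```python
import math


def t_upper_tail(t: float, df: int, steps: int = 2000) -> float:
    """Approximate P(T > t) for Student's t with df degrees of freedom."""
    if t < 0:
        return 1.0 - t_upper_tail(-t, df, steps)
    # With x = sqrt(df) * tan(theta), the tail area becomes
    # const * integral of cos(theta)**(df - 1) from atan(t / sqrt(df)) to pi/2.
    a, b = math.atan(t / math.sqrt(df)), math.pi / 2
    h = (b - a) / steps  # Simpson's rule; steps must be even
    total = math.cos(a) ** (df - 1) + math.cos(b) ** (df - 1)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * math.cos(a + i * h) ** (df - 1)
    const = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    return const / math.sqrt(math.pi) * total * h / 3


def one_sample_t_test(xbar, mu0, s, n, alternative="two-sided"):
    """Return (T, p-value) for H0: mu = mu0 using df = n - 1."""
    t_stat = (xbar - mu0) / (s / math.sqrt(n))
    df = n - 1
    if alternative == "greater":      # HA: mu > mu0
        p_value = t_upper_tail(t_stat, df)
    elif alternative == "less":       # HA: mu < mu0 (by symmetry)
        p_value = t_upper_tail(-t_stat, df)
    elif alternative == "two-sided":  # HA: mu != mu0
        p_value = 2 * t_upper_tail(abs(t_stat), df)
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return t_stat, p_value


# Numbers from Example 2 (Duolingo Match Madness), right-tailed at alpha = 0.01.
alpha = 0.01
t_stat, p_value = one_sample_t_test(174.5221, 160, 10.8801, 36, "greater")
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"T = {t_stat:.5f}, p-value = {p_value:.3e}: {decision}")
```

Passing `alternative="two-sided"` with the Example 3 numbers (`4.6, 5, 2.2, 20`) should give a test statistic and ( p )-value close to the ones reported there.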

Choosing a Sample Size

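  • When calculating a sample size, default to the normal distribution (the degrees of freedom for ( t ) depend on the still-unknown ( n )), using: [ n \geq \left( \frac{z \cdot \sigma}{E} \right)^2 ]
    • Where ( z ) is the critical value for the chosen confidence level, ( \sigma ) is the (known or estimated) population standard deviation, and ( E ) is the desired margin of error.
    • Always round up when determining ( n ).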
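
A minimal sketch of the sample-size formula, using the standard library's `statistics.NormalDist` for the critical value. The function name `required_sample_size` and the target margin of error of 0.5 are hypothetical choices for illustration; ( \sigma = 2.666 ) is borrowed from Example 1 below.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma: float, margin_of_error: float,
                         conf_level: float = 0.95) -> int:
    """Smallest n satisfying n >= (z * sigma / E)^2 for a normal critical value z."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return math.ceil((z * sigma / margin_of_error) ** 2)  # always round up


# Hypothetical target: estimate the mean to within 0.5 units at 95% confidence.
print(required_sample_size(sigma=2.666, margin_of_error=0.5))
```

`math.ceil` implements the "always round up" rule: rounding down would leave the margin of error slightly larger than the target.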

Examples

Example 1: Heights of NFL Players

  • Population mean height: ( \mu = 74.1684 ) inches, population standard deviation: ( \sigma = 2.6660 ) inches, sample size: ( n = 100 ).
  • Applying the Central Limit Theorem: ( \bar{X} \sim N\left(74.1684, \frac{2.6660}{\sqrt{100}}\right) ), where the second parameter is the standard deviation of ( \bar{X} ).
  • Standard error of the sample mean ( \bar{X} ): ( \sigma / \sqrt{n} = \frac{2.6660}{10} = 0.26660 ) inches.
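
The arithmetic above can be reproduced with a short sketch. The final probability, the chance that a sample mean lands within 0.5 inches of ( \mu ), is a hypothetical follow-up question added only to show how the sampling distribution would be used; it is not part of the original example.

```python
import math
from statistics import NormalDist

mu, sigma, n = 74.1684, 2.6660, 100

# Standard error of the sample mean: sigma / sqrt(n)
se = sigma / math.sqrt(n)

# Sampling distribution of the sample mean under the Central Limit Theorem;
# NormalDist takes the mean and the standard deviation (here, the standard error).
sampling_dist = NormalDist(mu, se)

# Hypothetical question: how likely is a sample mean within 0.5 inches of mu?
p_within = sampling_dist.cdf(mu + 0.5) - sampling_dist.cdf(mu - 0.5)

print(f"SE = {se:.5f}")
print(f"P(|sample mean - mu| < 0.5) = {p_within:.4f}")
```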

Example 2: Duolingo Match Madness

  • Sample mean of matches: ( \bar{x} = 174.5221 ), sample standard deviation ( s = 10.8801 ), sample size ( n = 36 ).
  • Testing at significance level ( \alpha = 0.01 ), with hypotheses:
    • Null: ( H_0: \mu = 160 )
    • Alternative: ( H_A: \mu > 160 )
  • Standard error: ( \frac{10.8801}{\sqrt{36}} \approx 1.81335 ), with ( df = 35 ).
  • Test statistic: ( T = \frac{174.5221 - 160}{1.81335} \approx 8.00844 )
  • ( p )-value: approx. ( 1.00313 \times 10^{-9} ). Since this is far below ( \alpha = 0.01 ), reject ( H_0 ).

Critical Values (Textbook Problems)

  • Critical Values for Various Confidence Levels and Sample Sizes:
    • (a) ( n = 6, CL = 90\% ): ( df = 5 ), ( t^* = 2.015048 )
    • (b) ( n = 21, CL = 98\% ): ( df = 20 ), ( t^* = 2.527977 )
    • (c) ( n = 29, CL = 95\% ): ( df = 28 ), ( t^* = 2.048407 )
    • (d) ( n = 12, CL = 99\% ): ( df = 11 ), ( t^* = 3.105807 )

Example 3: Piano Lessons

  • Sample of children: ( n = 20 ), ( \bar{x} = 4.6 ) years, ( s = 2.2 ) years.
  • Testing Georgianna's claim:
    • Null: ( H_0: \mu = 5 )
    • Alternative: ( H_A: \mu \neq 5 )
  • Two-tailed test:
    • Test statistic using Excel: ( T = -0.81312… )
    • ( p )-value: ( 0.426224… )
    • Conclusion: Fail to reject ( H_0 ); there is insufficient evidence that the population mean differs from 5.
    • Constructed 95% confidence interval (( df = 19 )): ( 3.570 < \mu < 5.630 )
    • The hypothesis test and the confidence interval agree: the interval contains 5, so 5 years is a plausible value for the population mean.