Confidence Intervals for Means (SD Unknown, T Interval) Study Notes

10.1: Confidence Intervals for Means (Standard Deviation Unknown, T Interval)

Historical Context

  • William Sealy Gosset's Contribution (1908)

    • Worked in quality control division for Guinness Brewery in Dublin, Ireland.

    • Published a paper applying the t-distribution to test barley quality using small samples.

    • To maintain secrecy from competitors concerning the t-test, he published as "Student."

    • The t-distribution is often called Student’s t-distribution for this reason.

Why the T Distribution?

  • Central Limit Theorem states that the sampling distribution of means is approximately normal when sample size is large enough.

  • Probability calculations for certain sample means can be conducted using z-scores defined as: $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

    • Where $\bar{x}$ is the sample mean, $\mu$ is the population mean, $\sigma$ is the population standard deviation, and $n$ is the sample size.

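  • Since the population standard deviation $\sigma$ ($\text{SD}_{\text{population}}$) is rarely known, we substitute the sample standard deviation $s$. Because $s$ varies from sample to sample, this adds extra variability, and the resulting statistic $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$ follows a t-distribution rather than the standard normal (z) distribution.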

Distinctions Between T and Z Distributions

  • The t-distribution has heavier (thicker) tails and is more spread out than the z-distribution.

  • The exact shape of the t-distribution depends on the degrees of freedom (df).

    • Formula for degrees of freedom (df):
      df = n - 1

    • As the degrees of freedom increase, the t-distribution approaches the normal distribution.
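The convergence toward the normal distribution can be seen by comparing critical values. The sketch below is only an illustration: the t* values are assumed to be copied from a standard printed t-table (two-sided, 95% confidence, rounded to three decimals), and the name `T_STAR_95` is made up for this example.

```python
from statistics import NormalDist

# Two-sided 95% critical values t*, keyed by degrees of freedom.
# Assumption: values as printed in a standard t-table, rounded to 3 decimals.
T_STAR_95 = {1: 12.706, 2: 4.303, 5: 2.571, 10: 2.228, 30: 2.042, 100: 1.984}

# z* for 95% confidence, computed from the standard normal distribution.
z_star = NormalDist().inv_cdf(0.975)

for df, t_star in T_STAR_95.items():
    gap = t_star - z_star
    print(f"df = {df:>3}: t* = {t_star:.3f}, z* = {z_star:.3f}, gap = {gap:.3f}")
```

The gap between t* and z* shrinks as df grows, which is why the choice between the two distributions matters most for small samples.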

Conditions for Using the T-Distribution

  • The t-distribution is used under the following circumstances:

    1. Sample size is small.

    2. Population standard deviation is unknown.

  • The required conditions to calculate a confidence interval using the t-distribution are:

    1. The sample is drawn from a normally distributed population.

    2. Each observation must be independent.

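    3. The sample size should not be overly small. When the normality of the population is in doubt, a larger sample (preferably $n > 30$) lets the Central Limit Theorem justify the interval.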

One-Sample T-Interval

  • Because the population standard deviation is rarely known in practice, the one-sample z interval is seldom used.

  • The one-sample t interval is commonly employed to estimate population means.

    • Construction involves:

    • Using the sample mean $\bar{x}$ instead of the population mean.

    • Using the sample standard deviation $s$ rather than the population standard deviation $\sigma$.

  • The formula to calculate the one-sample t interval is as follows: $\bar{x} \pm t^* \frac{s}{\sqrt{n}}$, where $\bar{x}$ is the sample mean and $t^* \frac{s}{\sqrt{n}}$ is the margin of error.

    • Where $t^*$ is the critical t-value obtained from statistical tables or calculators.

Practical Scenarios Involving T-Intervals

  1. From a sample of 10 adults, the resting heart rates (bpm) are:

    • 50, 60, 62, 62, 70, 85, 89, 90, 93, 95.

    • Sketch a histogram to visualize the data, then estimate the mean heart rate.

  2. Estimating the mean expenditure for veterinary services:

    • Owners of 500 dogs report an average expenditure of $250 per year with a standard deviation of $50.

    • Construct a 95% confidence interval for the mean amount spent.

  3. Estimating mean GPA:

    • GPAs of 50 students:

      • 0.5, 1.0, 1.0, 1.2, 1.2, 1.5, 1.7, 2.0, 2.0, 2.0, 2.3, 2.3, 2.4, 2.5, 2.5, 2.5, 2.5,

      • 2.7, 2.8, 2.8, 2.9, 2.9, 2.9, 3.0, 3.0, 3.0, 3.1, 3.2, 3.2, 3.3, 3.4, 3.4, 3.4,

      • 3.4, 3.5, 3.6, 3.6, 3.6, 3.6, 3.7, 3.7, 3.7, 3.8, 3.8, 3.8, 3.8, 3.9, 4.0, 4.0, 4.0.

    • Construct a 90% confidence interval for the mean GPA, rounded to two decimal places.
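The three scenarios above can be worked through with one small helper. This is a minimal sketch, not a definitive solution: `t_interval` is a name invented here, the critical values are assumed to be read from a t-table (rounded), and the 95% level for the heart-rate data is an assumption because the scenario does not state a confidence level.

```python
from math import sqrt
from statistics import mean, stdev

def t_interval(x_bar, s, n, t_star):
    """Return (lower, upper) for x_bar ± t* · s / sqrt(n)."""
    margin = t_star * s / sqrt(n)
    return x_bar - margin, x_bar + margin

# Critical values t* (assumptions, rounded t-table values):
#   df = 9,   95% confidence -> 2.262
#   df = 499, 95% confidence -> 1.965
#   df = 49,  90% confidence -> 1.677

# Scenario 1: raw heart-rate data, n = 10 (95% level assumed).
heart_rates = [50, 60, 62, 62, 70, 85, 89, 90, 93, 95]
hr_interval = t_interval(mean(heart_rates), stdev(heart_rates),
                         len(heart_rates), 2.262)

# Scenario 2: summary statistics only (mean 250, SD 50, n = 500).
vet_interval = t_interval(250, 50, 500, 1.965)

# Scenario 3: raw GPA data, n = 50, 90% confidence.
gpas = [
    0.5, 1.0, 1.0, 1.2, 1.2, 1.5, 1.7, 2.0, 2.0, 2.0, 2.3, 2.3, 2.4, 2.5, 2.5, 2.5, 2.5,
    2.7, 2.8, 2.8, 2.9, 2.9, 2.9, 3.0, 3.0, 3.0, 3.1, 3.2, 3.2, 3.3, 3.4, 3.4, 3.4,
    3.4, 3.5, 3.6, 3.6, 3.6, 3.6, 3.7, 3.7, 3.7, 3.8, 3.8, 3.8, 3.8, 3.9, 4.0, 4.0, 4.0,
]
gpa_interval = t_interval(mean(gpas), stdev(gpas), len(gpas), 1.677)

for label, (lower, upper) in [("heart rate (bpm)", hr_interval),
                              ("vet spending ($)", vet_interval),
                              ("GPA", gpa_interval)]:
    print(f"{label}: ({lower:.2f}, {upper:.2f})")
```

`statistics.stdev` divides by $n - 1$, which matches the sample standard deviation $s$ used in the t interval. With 499 degrees of freedom the critical value is already very close to the z value of 1.96, so the t and z intervals nearly coincide in the second scenario.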

Expressing Confidence Intervals

  • Transforming a confidence interval expressed as $(30.5, 40.7)$ into the form: $\bar{x} \pm ME$

    • Where ME is the margin of error.
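The conversion follows from the interval being symmetric about the sample mean: the midpoint is $\bar{x}$ and half the width is the margin of error. A short sketch, with `to_mean_and_margin` as a hypothetical helper name:

```python
def to_mean_and_margin(lower, upper):
    """Convert an interval (lower, upper) into (x_bar, ME)."""
    x_bar = (lower + upper) / 2   # midpoint of the interval
    margin = (upper - lower) / 2  # half the interval width
    return x_bar, margin

x_bar, margin = to_mean_and_margin(30.5, 40.7)
print(f"{x_bar:.1f} ± {margin:.1f}")  # prints 35.6 ± 5.1
```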

Increasing Precision of Estimates

  • To increase the precision of our estimates (that is, to narrow the confidence interval), consider two approaches:

    1. Aim for a smaller margin of error (ME) directly, for example by accepting a lower confidence level. However, this approach may not always be feasible or practical.

    2. Increase the sample size: the margin of error shrinks as $n$ grows, according to the formula:
      $E = z^* \frac{\sigma}{\sqrt{n}}$

    • Where $E$ represents the margin of error, $z^*$ is the critical value based on the confidence level, $\sigma$ is the population standard deviation, and $n$ is the sample size.
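To find the sample size needed for a target margin of error, the formula can be solved for $n$. The rearrangement below is a sketch that assumes the population standard deviation $\sigma$ (or a reasonable estimate of it) is available before sampling:

$$E = z^* \frac{\sigma}{\sqrt{n}} \;\Longrightarrow\; \sqrt{n} = \frac{z^* \sigma}{E} \;\Longrightarrow\; n = \left( \frac{z^* \sigma}{E} \right)^2$$

Because $n$ must be a whole number and the margin of error must not exceed $E$, the result is rounded up to the next integer.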

Practical Example for Confidence Interval Estimation

  • A cookbook author needs to estimate the average baking time for pies. Given that the standard deviation is 5 minutes, determine the sample size needed for a 95% confidence level with a margin of error of at most ±2 minutes.
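One way to carry out this calculation is sketched below. It assumes the 5-minute standard deviation is treated as the population value, so a z critical value applies; `required_sample_size` is a hypothetical helper name.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sigma, margin, confidence):
    """Smallest whole n with z* · sigma / sqrt(n) <= margin."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Solve E = z* · sigma / sqrt(n) for n, then round up.
    return ceil((z_star * sigma / margin) ** 2)

n = required_sample_size(sigma=5, margin=2, confidence=0.95)
print(n)  # prints 25
```

Rounding up rather than to the nearest integer keeps the margin of error within the 2-minute limit.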