Confidence Intervals and Sample Sizes

Handout Overview

  • The handout has three columns:

    • Section 7.2

    • Section 7.3 (today's topic)

    • Section 7.1

  • Sections 7.2 and 7.3 both deal with numerical data and estimate a population mean (average).

  • The primary difference between sections 7.2 and 7.3 will be the availability of the population standard deviation.

Section 7.2: Minimum Sample Size

  • We didn't complete Section 7.2, specifically how to calculate the minimum sample size.

  • Formula is based on the margin of error:
    \text{Margin of Error} = z \cdot \frac{\text{Standard Deviation}}{\sqrt{n}}

  • We solve for n in this equation.

  • The formula includes a z-score, determined by the desired confidence level.

  • For 90% confidence:

    • Consider the sampling distribution of the sample mean.

    • 90% lies in the middle, bounded by lower and upper limits.

    • In terms of x, these are lower and upper bounds; in terms of z, negative and positive z values for standard normal distribution.

Finding Z-Scores

  • Use the inverse normal function to find z-scores.

  • For 90% confidence:

    • 90% inside, remainder outside.

    • 100 - 90 = 10

    • 10% is outside, split in half.

    • 10 / 2 = 5

    • 5% on each tail.

  • Negative z-score calculation:

    • Inverse Norm(left tail, mean, standard deviation)

    • Inverse Norm(0.05, 0, 1)

  • Result: z-score = -1.645. By symmetry, the positive z-score is +1.645.

  • A table provides pre-calculated z-scores for common confidence levels.

  • For 90% confidence, the z-score of 1.645 is readily available on the table.

  • Using the table is generally easier than calculating inverse norms.
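
The tail-splitting and inverse normal steps above can be sketched in Python. This is a minimal illustration, not the calculator's own routine; the function name `z_for_confidence` is hypothetical, and `NormalDist.inv_cdf` from the standard library stands in for the calculator's inverse normal function.

```python
from statistics import NormalDist


def z_for_confidence(confidence):
    """Return the positive z-score that bounds the middle `confidence` area."""
    outside = 1 - confidence   # area outside the interval, e.g. 0.10 for 90%
    left_tail = outside / 2    # split evenly between the two tails
    # Inverse normal with mean 0 and standard deviation 1 gives the negative z-score.
    negative_z = NormalDist(mu=0, sigma=1).inv_cdf(left_tail)
    return -negative_z


print(round(z_for_confidence(0.90), 3))  # 1.645
```

Passing 0.95 instead of 0.90 gives a value that rounds to 1.96, matching the table.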

Minimum Sample Size Formula

  • Formula:
    n = \left( \frac{z \cdot \text{Standard Deviation}}{\text{Margin of Error}} \right)^2

  • Aim is to determine the necessary amount of data to collect.

  • Example: Estimating the amount of money people spend at a store.

  • Margin of error:

    • Indicates desired accuracy.

    • Keyword: "within".

    • Example: "Accurate to within $1" implies margin of error is 1.
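
The sample size formula comes from solving the margin of error equation for n. A short derivation is sketched below; the symbols E (margin of error) and \sigma (standard deviation) are shorthand introduced here and are not notation from the handout.

```latex
\begin{aligned}
E &= z \cdot \frac{\sigma}{\sqrt{n}} \\
\sqrt{n} &= \frac{z \cdot \sigma}{E} \\
n &= \left( \frac{z \cdot \sigma}{E} \right)^2
\end{aligned}
```

A smaller margin of error E sits in the denominator, so asking for more accuracy increases the required sample size.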

Example Problem (Number 8)

  • Estimate the mean number of minutes (confidence interval for average).

  • Determine the required sample size.

  • Use the sample size formula.

  • Given:

    • 95% confidence

    • Margin of error: 7 minutes

    • Standard deviation: 47

  • 95% confidence implies z-score = 1.96.

  • Plugging in values:
    n = (1.96 * 47 / 7)^2

  • Result: n ≈ 173.19

  • Round up to 174, as you cannot sample a fraction of a person.
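
The calculation for this problem can be checked with a few lines of Python. This is a sketch under the assumption that the z-score is looked up beforehand; the function name `minimum_sample_size` is hypothetical.

```python
import math


def minimum_sample_size(z, standard_deviation, margin_of_error):
    """Apply n = (z * standard deviation / margin of error)^2 and round up."""
    raw = (z * standard_deviation / margin_of_error) ** 2
    # Always round up: rounding down would give a margin of error larger than requested.
    return math.ceil(raw)


# Problem 8: 95% confidence (z = 1.96), standard deviation 47, margin of error 7.
print(minimum_sample_size(1.96, 47, 7))  # 174
```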

Verifying Z-Score with Table

  • For 95% confidence, the table confirms the z-score of 1.96.

  • Alternatively, calculate using inverse norm with a left tail of 0.025.

Homework Questions: Problems 7.1 and 7.2

  • Problem Seven

Problem Seven Set-Up

  • Sample of 68 corridors, mean of 61.2 decibels, population standard deviation of 9.7.

  • Construct a 95% confidence interval.

  • This is a confidence interval for the mean.

  • Differentiating Z and T Interval

    • Z-interval (7.2 question): population standard deviation is known.

    • T-interval (7.3 question): will be discussed later.

  • Since this problem provides population standard deviation, use a z-interval.

Z-Interval Procedure

  • On the calculator, go to Stat, then Tests, and choose option 7 (Z-interval).

  • Input:

    • Summary Stats

    • Population Standard Deviation = 9.7

    • Mean = 61.2

    • Sample Size = 68

    • Confidence Level = 95

  • Enter the values into the calculator. The results are:

    • Lower bound: 58.9

    • Upper bound: 63.5
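
For readers without the calculator at hand, the same interval can be reproduced in Python. This is a minimal sketch of a z-interval from summary statistics, not the calculator's implementation; the function name `z_interval` is hypothetical.

```python
import math
from statistics import NormalDist


def z_interval(sample_mean, population_sd, n, confidence):
    """Confidence interval for a mean when the population standard deviation is known."""
    # Positive z-score with (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * population_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Problem Seven: 68 corridors, mean 61.2 decibels, population standard deviation 9.7.
lower, upper = z_interval(61.2, 9.7, 68, 0.95)
print(round(lower, 1), round(upper, 1))  # 58.9 63.5
```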

Using Raw Data: Problem Number 5

  • Enter raw data into a list (Stat, Edit, option 1).

  • Go to Stat, Tests, option 7 (Z-interval).

  • Select "Data" instead of "Stats."

  • Input population standard deviation (e.g., 2.1).

  • Specify the list where data is entered (e.g., L2).

  • Set frequency to 1.

  • Choose confidence level (e.g., 99%).

  • Calculate. You get the results.

  • Example interpretation: The mean length of time that first-time married couples stay together is estimated to be between roughly 5.4 and 7.4 years.

Section 7.3

  • Focus: Constructing confidence intervals.

  • No minimum sample size calculations.

  • Two methods:

    • Formula

    • Calculator function

  • The calculator procedure is similar to 7.2, but it uses a different function (the t-interval).

    • Rationale: Lack of population standard deviation.

    • Instead, use sample standard deviation.

    • This creates more uncertainty because the sample standard deviation is only an estimate of the population value.

T-Distribution

  • Incorporates added uncertainty.

  • Formula:
    \bar{x} \pm t \cdot \frac{s}{\sqrt{n}}

    • Instead of the population standard deviation, use the sample standard deviation s.

    • Instead of Z-score, use the t-score.
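
The t-interval formula translates directly into code. The sketch below assumes the t-score has already been found from the table or a calculator; the function name `t_interval` and its parameter names are hypothetical.

```python
import math


def t_interval(sample_mean, sample_sd, n, t_score):
    """Confidence interval for a mean using the sample standard deviation."""
    # Margin of error: t * (s / sqrt(n)), the same shape as the z version.
    margin = t_score * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin
```

The only differences from the z-interval are the inputs: the sample standard deviation replaces the population value, and the t-score replaces the z-score.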

T-Distribution Table

  • Provides t-scores.

  • Example: 95% confidence interval, sample size of 20.

  • Degrees of freedom = n - 1 (one degree of freedom is lost because the sample is used to estimate an unknown population value).

    • 20 - 1 = 19

  • Table usage:

    • Locate the 95% confidence level column.

    • Find the intersection with 19 degrees of freedom.

    • T-score: Approximately 2.093.

Using Calculator for T-Scores

  • Function: inverse t.

    • Input the left tail, 0.025, and the degrees of freedom, 19.
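
To see what an inverse t function does, here is a pure-Python sketch that needs no extra libraries. It is an assumption-laden illustration, not how the calculator works internally: it integrates the t density with Simpson's rule and then searches by bisection for the t-value with the requested left-tail area. All function names are hypothetical. In practice a library routine such as SciPy's `scipy.stats.t.ppf` would normally be used instead.

```python
import math


def t_pdf(x, df):
    """Density of the t-distribution with `df` degrees of freedom."""
    log_coeff = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coeff = math.exp(log_coeff) / math.sqrt(df * math.pi)
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """Area to the left of x: Simpson's rule on [0, |x|] plus symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def inverse_t(left_tail, df):
    """Find the t-value whose left-tail area equals `left_tail` by bisection."""
    lo, hi = -100.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < left_tail:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(inverse_t(0.025, 19), 3))  # -2.093
```

A left tail of 0.025 returns the negative t-score; by symmetry the positive value, 2.093, is the one used in the interval formula.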

T-Distribution characteristics

  • Bell shaped

  • Symmetric about its center; the mean, median, and mode are all 0.

  • Approaches the axis but doesn't touch it.

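  • The standard deviation (and variance) is not 1; it is greater than 1 and depends on the degrees of freedom.

  • By comparison, the standard normal distribution has mean 0 and standard deviation 1.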

Problem Solving Techniques

  • Construct Confidence Intervals

  • The calculator function is very easy; a problem typically takes about 15 seconds.

  • Raw Data:

    • Put Raw data into the list

    • Go to Stat, Tests, option 8 (T-interval)

Confidence Level

  • Use the confidence level column of the t-table; in this example the sample size is n = 10.

  • Degrees of freedom will be 10 - 1 = 9

  • t-score: 2.262 (the table value for 95% confidence with 9 degrees of freedom)

Important Notes

  • Using the t-interval calculator function is encouraged.

  • The t-distribution is bell-shaped, symmetric, and centered at zero.

  • The calculator is more accurate than the table because it uses the exact t-score, whereas table values are rounded.

  • The t-interval function is fast and accurate, whereas the formula is slower.