Confidence Intervals and Sample Sizes
Handout Overview
The handout has three columns:
Section 7.2
Section 7.3 (today's topic)
Section 7.1
Sections 7.2 and 7.3 concern means and numerical data, estimating the average or mean.
The primary difference between sections 7.2 and 7.3 will be the availability of the population standard deviation.
Section 7.2: Minimum Sample Size
We didn't complete Section 7.2, specifically how to calculate the minimum sample size.
Formula is based on the margin of error:
Margin \space of \space Error = z * (Standard \space Deviation / \sqrt{n})
We solve this equation for n.
The formula includes a z-score, determined by the desired confidence level.
For 90% confidence:
Sampling distribution is considered.
90% lies in the middle, bounded by lower and upper limits.
In terms of x, these are lower and upper bounds; in terms of z, negative and positive z values for standard normal distribution.
Finding Z-Scores
Use the inverse norm function to find z-scores.
For 90% confidence:
90% inside, remainder outside.
100 - 90 = 10
10% is outside, split in half.
10 / 2 = 5
5% on each tail.
Negative z-score calculation:
Inverse \space Norm(left \space tail, mean, standard \space deviation)
Inverse \space Norm(0.05, 0, 1)
Result: z-score = -1.645. By symmetry, the upper z-score is +1.645.
A table provides pre-calculated z-scores for common confidence levels.
For 90% confidence, the z-score of 1.645 is readily available on the table.
Using the table is generally easier than calculating inverse norms.
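The inverse norm steps above can be sketched in Python as a quick check on the table values. This is only an illustration, not the calculator procedure from class: it assumes Python's standard-library statistics.NormalDist, and the helper name z_for_confidence is made up for this example.

```python
from statistics import NormalDist


def z_for_confidence(confidence: float) -> float:
    """Return the positive critical z-value for a two-sided confidence level.

    z_for_confidence is a hypothetical helper name used only in this sketch.
    """
    # Area left outside the interval, split evenly between the two tails,
    # e.g. (1 - 0.90) / 2 = 0.05 for 90% confidence.
    tail = (1 - confidence) / 2
    # inv_cdf takes the area to the LEFT, like Inverse Norm(left tail, 0, 1),
    # so it returns the negative z-score; flip the sign for the upper bound.
    return -NormalDist(mu=0, sigma=1).inv_cdf(tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_for_confidence(level):.3f}")
```

For 90% and 95% confidence this should agree with the table values 1.645 and 1.96 used in these notes.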
Minimum Sample Size Formula
Formula:
n = (z * Standard \space Deviation / Margin \space of \space Error)^2
Aim is to determine the necessary amount of data to collect.
Example: Estimating the amount of money people spend at a store.
Margin of error:
Indicates desired accuracy.
Keyword: "within".
Example: "Accurate to within $1" implies margin of error is 1.
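For reference, the sample size formula comes from solving the margin of error formula for n. The symbols here are shorthand introduced for this sketch only: E is the margin of error and \sigma is the standard deviation.

```latex
E = z \cdot \frac{\sigma}{\sqrt{n}}
\quad\Longrightarrow\quad
\sqrt{n} = \frac{z \cdot \sigma}{E}
\quad\Longrightarrow\quad
n = \left( \frac{z \cdot \sigma}{E} \right)^{2}
```

A smaller margin of error E sits in the denominator, so asking for more accuracy increases the required sample size.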
Example Problem (Number 8)
Estimate the mean number of minutes (confidence interval for average).
Determine the required sample size.
Using the sample size formula
Given:
95% confidence
Margin of error: 7 minutes
Standard deviation: 47
95% confidence implies z-score = 1.96.
Plugging in values:
n = (1.96 * 47 / 7)^2
Result: n ≈ 173.19
Round up to 174, as you cannot sample a fraction of a person.
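The arithmetic for this problem can be checked with a short Python sketch. The function name min_sample_size is an assumption made for illustration; the inputs are the values given in the problem.

```python
import math


def min_sample_size(z: float, sigma: float, margin: float) -> int:
    """Minimum sample size n = (z * sigma / margin)^2, always rounded UP.

    min_sample_size is a hypothetical helper name used only in this sketch.
    """
    raw = (z * sigma / margin) ** 2
    # Rounding down would give a margin of error slightly larger than
    # requested, so use the ceiling rather than ordinary rounding.
    return math.ceil(raw)


raw_n = (1.96 * 47 / 7) ** 2
print(round(raw_n, 2))                 # unrounded value, about 173.19
print(min_sample_size(1.96, 47, 7))    # 174
```

Note the ceiling: even a result such as 173.01 would still round up to 174.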
Verifying Z-Score with Table
For 95% confidence, the table confirms the z-score of 1.96.
Alternatively, calculate using inverse norm with a left tail of 0.025.
Homework Questions: Problems 7.1 and 7.2
Problem Seven
Problem Seven Set-Up
Sample of 68 corridors, mean of 61.2 decibels, population standard deviation of 9.7.
Construct a 95% confidence interval.
This is a confidence interval for the mean.
Differentiating Z and T Interval
Z-interval (7.2 question): population standard deviation is known.
T-interval (7.3 question): will be discussed later.
Since this problem provides population standard deviation, use a z-interval.
Z-Interval Procedure
On the calculator, go to Stat, then Tests, and choose option 7 (Z-interval).
Input:
Summary Stats
Population \space Standard \space Deviation = 9.7
Mean = 61.2
Sample \space Size = 68
Confidence \space Level = 95
Enter the values into the calculator. The results are:
Lower bound: 58.9
Upper bound: 63.5
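The calculator result can be reproduced from the formula, mean plus or minus z times (standard deviation / \sqrt{n}). The sketch below is a cross-check under that assumption, not the calculator's own implementation; z_interval is a name chosen for this example.

```python
import math
from statistics import NormalDist


def z_interval(mean: float, sigma: float, n: int, confidence: float):
    """Confidence interval for a mean when the population sigma is known.

    z_interval is a hypothetical helper name used only in this sketch.
    """
    # Positive critical value: area to the left is 1 - (tail area).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    return mean - margin, mean + margin


# Problem Seven: 68 corridors, mean 61.2 decibels, sigma 9.7, 95% confidence.
lower, upper = z_interval(61.2, 9.7, 68, 0.95)
print(f"{lower:.1f} to {upper:.1f}")
```

Rounded to one decimal place, the bounds match the calculator output above.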
Using Raw Data: Problem Number 5
Enter raw data into a list (Stat, Edit, option 1).
Go to Stat, Tests, option 7 (Z-interval).
Select "Data" instead of "Stats."
Input population standard deviation (e.g., 2.1).
Specify the list where data is entered (e.g., L2).
Set frequency to 1.
Choose confidence level (e.g., 99%).
Calculate to get the lower and upper bounds.
Example interpretation: First-time married couples stay together between roughly 5.4 and 7.4 years.
Section 7.3
Focus: Constructing confidence intervals.
No minimum sample size calculations.
Two methods:
Formula
Calculator function
Calculator function is similar to 7.2, but uses a different function (T-interval instead of Z-interval).
Rationale: Lack of population standard deviation.
Instead, use sample standard deviation.
Creates more uncertainty because the sample standard deviation is only an estimate of the population value.
T-Distribution
Incorporates added uncertainty.
Formula: \bar{x} \pm t * (s / \sqrt{n})
Instead of the population standard deviation, use the sample standard deviation s.
Instead of the z-score, use the t-score.
T-Distribution Table
Provides t-scores.
Example: 95% confidence interval, sample size of 20.
Degrees of freedom = n - 1 (loss of one degree of freedom due to the guess involved).
20 - 1 = 19
Table usage:
Locate the 95% column under the confidence level heading.
Find the intersection with 19 degrees of freedom.
T-score: Approximately 2.093.
Using Calculator for T-Scores
Function: inverse t.
Input the left tail, 0.025, and the degrees of freedom, 19.
The result is the negative t-score, about -2.093; use the positive value, 2.093, in the formula.
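For anyone curious where the t-scores come from, the sketch below approximates one without a table, using only Python's standard library. It is an illustrative numerical approach (Simpson's rule for the area under the t curve, then bisection), not what the calculator does internally; the function names are made up for this example, and the search range of 0 to 50 is an assumption that covers the cases in these notes.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Height of the t-distribution curve with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """Area to the left of x (for x >= 0), using Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The curve is symmetric about 0, so half the area lies left of 0.
    return 0.5 + total * h / 3


def t_for_confidence(confidence: float, df: int) -> float:
    """Positive critical t-value, found by bisection on the area."""
    target = 1 - (1 - confidence) / 2    # e.g. 0.975 for 95% confidence
    lo, hi = 0.0, 50.0                   # assumed search range
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(f"{t_for_confidence(0.95, 19):.3f}")   # compare with the table value 2.093
```

The same function with 9 degrees of freedom should land close to 2.262, the other table value that appears in these notes.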
T-Distribution characteristics
Bell shaped
Symmetric about its center: mean, median, and mode are all 0.
Approaches the axis but doesn't touch it.
Standard deviation (and variance) is not 1: it depends on the degrees of freedom and is larger than 1.
By comparison, the standard normal has mean 0 and standard deviation 1.
Problem Solving Techniques
Construct Confidence Intervals
Calculator function is very easy; problems typically take about 15 seconds.
Raw Data:
Put the raw data into a list.
Go to Stat, Tests, option 8 (T-interval).
Confidence level: find the matching column in the t-table.
Sample size: n = 10
Degrees of freedom: 10 - 1 = 9
T Score: 2.262
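The formula version of a raw-data T-interval can be sketched as follows. The ten data values are hypothetical placeholders, not the homework data; the t-score 2.262 is the table value quoted above for 9 degrees of freedom (the 95% confidence column), and t_interval is a name chosen for this example.

```python
import math
from statistics import mean, stdev


def t_interval(data, t_score: float):
    """Confidence interval x_bar +/- t * (s / sqrt(n)) from raw data.

    t_interval is a hypothetical helper name used only in this sketch.
    """
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)    # SAMPLE standard deviation (divides by n - 1)
    margin = t_score * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical sample of n = 10 values, so degrees of freedom = 9.
sample = [12.1, 9.8, 11.4, 10.6, 13.0, 10.2, 11.9, 9.5, 12.4, 11.1]
lower, upper = t_interval(sample, 2.262)
print(f"{lower:.2f} to {upper:.2f}")
```

The calculator's T-interval function does the same job in one step and uses an unrounded t-score, so its bounds may differ slightly in the last decimal place.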
Important Notes
Use of the calculator's T-interval function is encouraged.
T-distribution: bell-shaped, symmetric, centered at zero.
The calculator is more accurate than the table because it uses the exact t-score rather than a rounded table value.
The T-interval function is fast and accurate, whereas the formula is slower.