Confidence Intervals (Part II)

48 Terms

New cards

t = (X̄ - μ)/(s/√n)

t-score formula
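As a quick check of the formula, here is a minimal Python sketch. The function name `t_score` and the numbers (X̄ = 52, μ = 50, s = 4, n = 16) are made up for illustration.

```python
import math

def t_score(sample_mean, mu, s, n):
    """t = (sample mean - mu) / (s / sqrt(n))."""
    return (sample_mean - mu) / (s / math.sqrt(n))

# Hypothetical numbers: the standard error is 4 / sqrt(16) = 1
t = t_score(52, 50, 4, 16)
print(t)  # 2.0
```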

New cards

Use the z-score

σ is known OR
n is huge and σ is credibly known from process history

New cards

Use the t-score

σ is unknown and data are roughly normal
OR n ≥ ~30

New cards

Degrees of Freedom (df)

The number of independent values in a data set that are free to vary when estimating a parameter.

Used in t-distribution calculations

Represents the number of independent pieces of information available to estimate variability after one parameter (the mean) has been estimated from the data.

New cards

df = n - 1

Formula for df (one-sample t)

New cards

t_(α/2, n - 1)

Symbol that represents the t-critical value that cuts off an area of α/2 in each tail of the t-distribution with n-1 degrees of freedom.

Determines how far the sample mean can fall from the true mean when constructing a confidence interval for small samples.

New cards

X̄ ± t_(α/2, n - 1) (s/√n)

Formula for one-sample t confidence interval
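A minimal sketch of this interval in Python, using only the standard library. The data are invented, and the critical value 2.262 is the usual table value for t_(0.025, 9), hard-coded here because the standard library has no t-distribution; with a statistics package you would look it up instead.

```python
import math
import statistics

def t_interval(data, t_crit):
    """One-sample t confidence interval: mean +/- t_crit * s / sqrt(n)."""
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)          # sample SD (divides by n - 1)
    moe = t_crit * s / math.sqrt(n)     # margin of error
    return xbar - moe, xbar + moe

# Hypothetical sample of n = 10 measurements, so df = 9
data = [9.8, 10.2, 10.4, 9.9, 10.1, 10.3, 9.7, 10.0, 10.2, 10.4]
T_CRIT_95_DF9 = 2.262                   # table value, assumed for 95% confidence
lo, hi = t_interval(data, T_CRIT_95_DF9)
print(lo, hi)
```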

New cards

The Assumptions that a Sample must meet for the t-confidence interval

For the t confidence interval, the sample should come from a distribution that is not extremely skewed and should have no extreme outliers (otherwise n should be larger). Check plots (histogram/QQ) and context.

New cards

Bootstrap Confidence Interval

A method for estimating the uncertainty of a statistic (like the mean or median) by resampling the original data many times with replacement and recalculating the statistic for each resample.
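The resampling idea can be sketched in a few lines of Python. This uses the percentile method, which is one common way to turn the resampled statistics into an interval and is an assumption here, not something the card specifies. The function name, the data, and the default settings are illustrative.

```python
import random
import statistics

def bootstrap_ci(data, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample with replacement, recompute stat each time."""
    rng = random.Random(seed)           # fixed seed so the result is repeatable
    n = len(data)
    stats = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return stats[lo_idx], stats[hi_idx]

# Hypothetical right-skewed sample
data = [2.1, 2.4, 2.2, 3.0, 2.8, 2.5, 6.9, 2.3, 2.7, 2.6]
print(bootstrap_ci(data))                          # interval for the mean
print(bootstrap_ci(data, stat=statistics.median))  # interval for the median
```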

New cards

trimmed mean

For moderate skew or outliers, consider a ____________, a transformation, or a Bootstrap Confidence Interval (if allowed)

New cards

Yes, it is true

Is it true that you must never “trim” outliers just to shrink the interval, and should remove them only for documented errors?

New cards

Margin of Error (MOE)

Half-width of a Confidence Interval

New cards

E = z_(α/2) (σ/√n)

Margin of Error of a z confidence interval

New cards

E = t_(α/2, n - 1) (s/√n)

Margin of Error of a t confidence interval

New cards

n = ((z_(α/2)σ)/E)^2

Formula for the sample size (n) when estimating a population mean using a z-confidence interval, needed to achieve a desired margin of error (E)

Always round up n to the next whole number
Use a pilot study or a historical value of σ if σ is unknown.
The formula ensures the confidence interval has the specified precision (E) at the chosen confidence level (1 - α)
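A short Python sketch of the sample size calculation, including the round-up step. The function name and the inputs (σ = 15, E = 3, 95% confidence) are hypothetical; `statistics.NormalDist` supplies the z critical value.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, E, confidence=0.95):
    """n = (z_(alpha/2) * sigma / E)^2, rounded up to the next whole number."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for 95%
    return math.ceil((z * sigma / E) ** 2)

n = sample_size_for_mean(sigma=15, E=3)
print(n)  # 97, since (1.96 * 15 / 3)^2 is about 96.04
```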

New cards

Pilot Study Sample Standard Deviation (s)

A small, preliminary study done before the main one.
If the population standard deviation is unknown, the sample standard deviation (s) calculated from the preliminary data is used as an estimate of σ in the sample size formula:
n = ((z_(α/2)s)/E)^2
This gives a realistic idea of how variable the data are, which helps plan how many samples are needed in the full study.

New cards

Historical Population Standard Deviation (σ)

Also called published value of the population standard deviation (σ)
Example: a value taken from past experiments, industry data, or technical reports and used as an estimate of variability.
This approach assumes the new data behave similarly to the older or related data.

New cards

Finite Population Correction (FPC)

The correction that prevents overestimating the variability when a large portion of the population is sampled.

When sampling without replacement from a finite population of size N, the variability of the sample mean is slightly smaller than in infinite populations.

New cards

σ_(X̄, FPC) = (σ/√n) × √((N - n)/(N - 1))

Finite Population Correction formula.

Used if the sampling fraction n/N > 0.05

Adjusts the Standard Error
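A minimal Python sketch of the corrected standard error, applying the correction only when n/N > 0.05 as the card states. The function name and numbers are made up for illustration.

```python
import math

def se_mean_fpc(sigma, n, N):
    """Standard error of the mean, with the FPC applied when n/N > 0.05."""
    se = sigma / math.sqrt(n)
    if n / N > 0.05:
        se *= math.sqrt((N - n) / (N - 1))   # finite population correction
    return se

# Hypothetical: sampling 100 of 500 units (20% of the population)
print(se_mean_fpc(sigma=10, n=100, N=500))    # smaller than the uncorrected 1.0
# Hypothetical: sampling 25 of 10000 units (0.25%), so no correction
print(se_mean_fpc(sigma=10, n=25, N=10000))   # 2.0
```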

New cards

Sample Proportion

Represents the fraction of successes in a sample.

New cards

X

Symbol that represents the number of successes in the sample (the numerator of the sample proportion)

New cards

p̂ = X/n

Formula for the sample proportion, which is used as an estimate of the population proportion p.

New cards

Wilson Interval

Used when the sample size is small, or
when np̂ or n(1 - p̂) is less than 10,
where the normal approximation is unreliable.
Adjusts both the center (the point estimate) and the width of the confidence interval
to provide a more accurate estimate of the population proportion p for small samples.
Tends to produce intervals that are tighter and more balanced around the true proportion than the standard normal-based method.
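The card does not give the formula, so the sketch below uses the standard Wilson score formula: center (p̂ + z²/(2n))/(1 + z²/n) and half-width (z/(1 + z²/n))·√(p̂(1 - p̂)/n + z²/(4n²)). The function name and the example counts (4 successes in 20 trials) are hypothetical.

```python
import math
from statistics import NormalDist

def wilson_interval(x, n, confidence=0.95):
    """Wilson score interval for a proportion with x successes in n trials."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = x / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom        # shifted toward 0.5
    half = (z / denom) * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
    return center - half, center + half

# Hypothetical small sample: n * p_hat = 4 < 10, so the large-sample interval is doubtful
lo, hi = wilson_interval(4, 20)
print(round(lo, 3), round(hi, 3))
```

Note that the center is not p̂ itself, which is what "adjusts the center" refers to.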

New cards

Agresti-Coull Interval

Used when the sample size is small, or
when np̂ or n(1 - p̂) is less than 10.
Improves accuracy by adding a small correction,
usually 2 artificial successes and 2 artificial failures,
before computing p̂.
Increases stability in the estimated proportion and generally provides better coverage probability than the large-sample (normal) confidence interval.
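A minimal Python sketch of the "add 2 successes and 2 failures" version described above. That version is tied to 95% confidence (the general form adds z²/2 to each count, which is about 2 when z ≈ 1.96). The function name and example counts are hypothetical.

```python
import math
from statistics import NormalDist

def agresti_coull_plus_four(x, n, confidence=0.95):
    """Add 2 successes and 2 failures, then use the usual normal-based interval."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_adj = n + 4                      # 2 artificial successes + 2 artificial failures
    p_adj = (x + 2) / n_adj
    moe = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clip to [0, 1] because a proportion cannot fall outside that range
    return max(0.0, p_adj - moe), min(1.0, p_adj + moe)

# Hypothetical: 4 successes in 20 trials, so the adjusted estimate is 6/24 = 0.25
lo, hi = agresti_coull_plus_four(4, 20)
print(round(lo, 3), round(hi, 3))
```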

New cards

Clopper-Pearson Interval

An exact binomial confidence interval
Used when the normal approximation doesn’t hold.
Guarantees that the true confidence level is at least what is stated
(never underestimates coverage)
Often conservative, meaning the interval is wider than necessary
But ensures high reliability for small or discrete samples.
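One way to compute this interval without a statistics package is to solve the two binomial tail equations numerically: the lower limit p_L satisfies P(X ≥ x | p_L) = α/2 and the upper limit p_U satisfies P(X ≤ x | p_U) = α/2. The sketch below does this by bisection; the helper names are hypothetical, and libraries normally use beta-distribution quantiles instead.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def _root_increasing(g, iters=200):
    """Bisection for the root of an increasing function g on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(x, n, alpha=0.05):
    """Exact binomial interval from the two tail equations."""
    lower = 0.0 if x == 0 else _root_increasing(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    upper = 1.0 if x == n else _root_increasing(
        lambda p: alpha / 2 - binom_cdf(x, n, p))
    return lower, upper

# Hypothetical: 4 successes in 20 trials
lo, hi = clopper_pearson(4, 20)
print(round(lo, 4), round(hi, 4))
```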

New cards

p̂ ± z_(α/2) √(p̂(1 - p̂)/n)

Large-Sample Confidence Interval for a Proportion (p)

New cards

np̂ ≥ 10 and n(1 - p̂) ≥ 10

Conditions for treating the sample as large, which ensure that the sampling distribution of p̂ is approximately normal
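A minimal Python sketch that checks these conditions before computing the large-sample interval p̂ ± z_(α/2) √(p̂(1 - p̂)/n). The function name, the choice to raise an error when the conditions fail, and the example counts are all illustrative.

```python
import math
from statistics import NormalDist

def large_sample_prop_interval(x, n, confidence=0.95):
    """Normal-based CI for a proportion; refuses to run if the conditions fail."""
    p_hat = x / n
    if n * p_hat < 10 or n * (1 - p_hat) < 10:
        raise ValueError("need n*p_hat >= 10 and n*(1 - p_hat) >= 10")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    moe = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - moe, p_hat + moe

# Hypothetical: 40 successes in 100 trials (40 and 60 are both at least 10)
lo, hi = large_sample_prop_interval(40, 100)
print(round(lo, 3), round(hi, 3))
```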

New cards

two-sided

Use _________ Confidence Intervals unless only a minimum or maximum matters (for example, meeting a specification)

New cards

100(1 - α)%: X̄ - z_α SE

One-sided lower bound

New cards

100(1 - α)%: X̄ + z_α SE

One-sided upper bound
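A short Python sketch of both one-sided bounds. Note that they use z_α (all of α in one tail) rather than z_(α/2). The function name and the numbers (X̄ = 100, SE = 2) are hypothetical; in practice you would report only the bound that matters.

```python
from statistics import NormalDist

def one_sided_bounds(xbar, se, confidence=0.95):
    """Return (lower bound, upper bound); each is a separate one-sided statement."""
    alpha = 1 - confidence
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # about 1.645 for 95%
    return xbar - z_alpha * se, xbar + z_alpha * se

lower, upper = one_sided_bounds(xbar=100, se=2)
print(round(lower, 2), round(upper, 2))
```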

New cards

Paired Sample

The same units are measured twice (before/after, left/right).
Analyze the differences D_i: build a Confidence Interval for the mean of D using t with df = n - 1, where n is the number of pairs.
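A minimal Python sketch of the paired approach: reduce the two measurements to one column of differences, then treat it as a one-sample t problem. The data are invented, and 2.776 is the usual table value for t_(0.025, 4), hard-coded as an assumption.

```python
import math
import statistics

def paired_t_interval(before, after, t_crit):
    """CI for the mean difference (after - before) on the same units."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)        # SD of the differences
    moe = t_crit * s_d / math.sqrt(n)
    return d_bar - moe, d_bar + moe

# Hypothetical before/after measurements on 5 units (df = 4)
before = [12, 15, 11, 14, 13]
after = [14, 18, 12, 17, 14]
T_CRIT_95_DF4 = 2.776                    # table value, assumed for 95% confidence
print(paired_t_interval(before, after, T_CRIT_95_DF4))
```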

New cards

Two-sample

Used when there are independent groups.

New cards

Pooled t-Test

A two-sample t-test used when the population variances are approximately equal.
Combines (or “pools”) the two samples
Into a single, common estimate of variance to compute the standard error
This increases precision when the equal-variance assumption holds.

New cards

(s_p)^2 = (((n1 - 1)(s1)^2) + ((n2 - 1)(s2)^2))/(n1 + n2 - 2)

Pooled variance formula
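A quick Python check of the pooled variance formula, which is a weighted average of the two sample variances with weights n1 - 1 and n2 - 1. The function name and numbers are made up for illustration.

```python
def pooled_variance(n1, s1, n2, s2):
    """s_p^2 = ((n1 - 1) s1^2 + (n2 - 1) s2^2) / (n1 + n2 - 2)."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)

# Hypothetical groups: n1 = 10, s1 = 2 and n2 = 12, s2 = 3
sp2 = pooled_variance(10, 2, 12, 3)
print(sp2)  # 6.75, between s1^2 = 4 and s2^2 = 9
```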

New cards

Welch t-Test

A two-sample t-test used when the population variances are not equal (heterogeneous variances).
Does not assume equal variances
and instead adjusts both the standard error and the degrees of freedom accordingly.
Is a more robust and reliable version when sample sizes or variances differ.
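The card does not list the formulas, so this sketch uses the standard ones: SE = √(s1²/n1 + s2²/n2) and the Welch-Satterthwaite degrees of freedom, (s1²/n1 + s2²/n2)² / ((s1²/n1)²/(n1 - 1) + (s2²/n2)²/(n2 - 1)). The function name and inputs are hypothetical.

```python
import math

def welch_se_and_df(n1, s1, n2, s2):
    """Unpooled standard error and Welch-Satterthwaite degrees of freedom."""
    a = s1**2 / n1
    b = s2**2 / n2
    se = math.sqrt(a + b)
    df = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))
    return se, df

# Hypothetical groups: n1 = 10, s1 = 2 and n2 = 12, s2 = 3
se, df = welch_se_and_df(10, 2, 12, 3)
print(round(se, 3), round(df, 2))
```

The resulting df is usually not a whole number and lies between min(n1, n2) - 1 and n1 + n2 - 2.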

New cards

Q-Q Plot (Quantile-Quantile Plot)

A graphical tool used to check whether a dataset follows a specified distribution (most commonly the normal distribution)
Plots the quantiles of the sample data against the quantiles of a theoretical normal distribution.
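The points of a normal QQ plot can be computed without any plotting library. This sketch pairs each sorted data value with the normal quantile at plotting position (i - 0.5)/n; that position is one common convention and is an assumption here, as are the function name and the data.

```python
from statistics import NormalDist, mean, stdev

def qq_pairs(data):
    """Return (theoretical normal quantile, sample quantile) pairs."""
    xs = sorted(data)
    n = len(xs)
    m, s = mean(xs), stdev(xs)
    std_normal = NormalDist()
    pairs = []
    for i, x in enumerate(xs, start=1):
        p = (i - 0.5) / n                         # assumed plotting position
        theoretical = m + s * std_normal.inv_cdf(p)
        pairs.append((theoretical, x))
    return pairs

# Hypothetical sample; points near the line y = x suggest approximate normality
for theoretical, observed in qq_pairs([3.1, 2.9, 3.0, 3.4, 2.7]):
    print(round(theoretical, 3), observed)
```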

New cards

straight diagonal line

In a QQ plot, if the points fall roughly along a ____________________, the data are approximately normal.

New cards

skewness, non-normality

Systematic curves or deviations in a QQ plot indicate ___________ or ____________

New cards

Multiplicative Data

Dataset where the peaks and troughs of the pattern become larger as the trend increases

New cards

Data Transformation

A mathematical modification applied to each data point to make the data more normal, stabilize variance, or improve model fit.

New cards

Count Data

Dataset consisting of non-negative, integer values that represent the number of times an event occurs within a specific unit of time or space

New cards

log(x)

Transformation used for right-skewed or multiplicative data

New cards

x^(1/2) (square root)

Transformation used for count data

New cards

1/x (reciprocal)

Transformation used for strong right skew
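The three transformations (log, square root, reciprocal) can be sketched in Python as below. The function names and data are hypothetical, and the natural log is assumed for log(x). Each transformation has a domain restriction noted in the comments.

```python
import math

def log_transform(xs):
    return [math.log(x) for x in xs]      # requires x > 0

def sqrt_transform(xs):
    return [math.sqrt(x) for x in xs]     # requires x >= 0

def reciprocal_transform(xs):
    return [1 / x for x in xs]            # requires x != 0; reverses the ordering

# Hypothetical right-skewed data: one large value dominates the raw scale
skewed = [1, 2, 3, 5, 8, 120]
print(log_transform(skewed))              # the large value is pulled in strongly
print(sqrt_transform([0, 4, 9]))          # [0.0, 2.0, 3.0], typical for counts
print(reciprocal_transform([2, 4]))       # [0.5, 0.25]
```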

New cards

n ≥ 30, Central Limit Theorem

If _______, then the t-interval works well due to the _____________________, unless the data have extreme skewness or outliers

New cards

n < 30, roughly symmetric

If ______, inspect the histogram or QQ plot.
If the data are _________________, proceed with t.
If highly skewed, then try a data transformation.

New cards

SE Mean

The estimated standard deviation of the sample mean (s/√n), also known as the standard error of the mean.

New cards

X̄ ± (critical value) × (SE Mean)

Equation for the Endpoints of the Confidence Interval