GES 500 Engineering Statistics: Confidence Intervals

7.1. Interval estimation: Confidence Level and Confidence Interval

  • Confidence Level and Confidence Interval

    • For a variable with a known distribution, we define the confidence level and confidence interval.

    • Consider a random variable X with a known cumulative distribution function (CDF) F(x), and a parameter \alpha with 0 \leq \alpha \leq 1.

    • We want to find an interval [a, b] where the probability of X falling within this interval is 1 - \alpha, expressed as:
      P(a \leq X \leq b) = 1 - \alpha.

    • This interval is called the confidence interval corresponding to the confidence level A, where A = 100(1 - \alpha)\%.

    • Typically, small \alpha values (e.g., \alpha = 0.01 or 0.05) and correspondingly high confidence levels (e.g., A = 99\% or 95\%) are sought, so that most values of X fall within the interval.

  • Solving Confidence Interval Initial Setup

    • The equation P(a \leq X \leq b) = 1 - \alpha does not have a unique solution: it is one equation in two unknowns, a and b, so we must also decide how to split the remaining probability \alpha between the two tails.

    • Since P(a \leq X \leq b) = F(b) - F(a) for a continuous variable, two different intervals [a, b] and [c, d] correspond to the same confidence level whenever F(b) - F(a) = F(d) - F(c) = 1 - \alpha.
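
The non-uniqueness can be checked numerically. The sketch below is only an illustration: it assumes a standard normal variable (chosen here as an example of a known CDF) and two arbitrary ways of splitting \alpha = 0.05 between the tails. Both intervals have the same coverage F(b) - F(a), but different boundaries and widths.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, assumed here only as an example of a known CDF
alpha = 0.05

# Two hypothetical ways to split alpha between the left and right tails.
splits = {"equal tails": (alpha / 2, alpha / 2), "unequal tails": (0.01, 0.04)}

intervals = {}
for name, (left_tail, right_tail) in splits.items():
    a = Z.inv_cdf(left_tail)        # boundary with F(a) = left_tail
    b = Z.inv_cdf(1 - right_tail)   # boundary with F(b) = 1 - right_tail
    coverage = Z.cdf(b) - Z.cdf(a)  # P(a <= X <= b) = F(b) - F(a)
    intervals[name] = (a, b, coverage)
    print(f"{name}: [{a:.3f}, {b:.3f}], width = {b - a:.3f}, coverage = {coverage:.3f}")
```

For this symmetric distribution the equal-tail split gives the narrower of the two intervals.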

  • Finding Boundaries of Confidence Interval

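    • To compute the confidence interval boundaries, fix the cumulative probabilities at the two boundaries and solve the equations:
      P(X \leq a) = F(a) = P(a), \quad P(X \leq b) = F(b) = P(b), \quad \text{with } P(b) - P(a) = 1 - \alpha.

    • To obtain a unique solution, the two tail probabilities can be made equal by setting P(a) = \frac{\alpha}{2} and P(b) = 1 - \frac{\alpha}{2}. This is the natural choice for symmetric distributions like the normal distribution, where it yields an interval centered on the mean.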

    • For non-symmetrical distributions, choosing P(a) and P(b) is less straightforward.
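
To see why the choice is less obvious for a skewed distribution, the following sketch compares two intervals with the same 95% coverage. It assumes an exponential distribution with rate 1, which is not part of the notes and is used only because its CDF, F(x) = 1 - e^{-x}, can be inverted by hand; the helper names are hypothetical.

```python
import math

def exp_cdf(x, rate=1.0):
    """CDF of the exponential distribution: F(x) = 1 - exp(-rate * x) for x > 0."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def exp_quantile(p, rate=1.0):
    """Inverse CDF: the x that solves F(x) = p."""
    return -math.log(1.0 - p) / rate

alpha = 0.05
candidates = {
    # Equal tails: F(a) = alpha/2 and F(b) = 1 - alpha/2.
    "equal tails": (exp_quantile(alpha / 2), exp_quantile(1 - alpha / 2)),
    # All of alpha in the right tail: F(a) = 0 and F(b) = 1 - alpha.
    "right tail only": (0.0, exp_quantile(1 - alpha)),
}

results = {}
for name, (a, b) in candidates.items():
    coverage = exp_cdf(b) - exp_cdf(a)
    results[name] = (a, b, coverage)
    print(f"{name}: [{a:.3f}, {b:.3f}], width = {b - a:.3f}, coverage = {coverage:.3f}")
```

Here the equal-tail interval is not the shortest one, so a criterion other than symmetry may be preferred.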

  • Cumulative Distribution Function Requirement

    • Interval estimation requires knowing the cumulative distribution function (CDF). This distinguishes it from point estimation, which does not depend on the form of the distribution.

    • As an example, take a normal variable X with known mean \mu and standard deviation \sigma. Its symmetric 100(1 - \alpha)\% confidence interval is found by converting to the standard normal variable Z and solving for its critical value.

  • Using Standard Normal Variable

    • Introduce the standardized normal variable:
      Z = \frac{X - \mu}{\sigma}.

    • Let \Phi denote the CDF of Z. By the symmetry of the normal distribution, the boundaries are \pm z_{\alpha/2}, where z_{\alpha/2} solves \Phi(z_{\alpha/2}) = 1 - \frac{\alpha}{2}, so that P(-z_{\alpha/2} \leq Z \leq z_{\alpha/2}) = 1 - \alpha.

  • Calculation Example

    • Example: if \alpha = 0.05, then z_{\alpha/2} \approx 1.96, so the 95\% confidence interval is \mu \pm 1.96\sigma.

    • Note that this example does not involve actual data sample values, as distribution parameters \mu and \sigma are assumed known for theoretical background.
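
The critical value in this example can be reproduced with the inverse CDF of the standard normal distribution. The sketch below uses Python's standard-library `statistics.NormalDist`; the values \mu = 10 and \sigma = 2 are hypothetical and serve only to turn \mu \pm z_{\alpha/2}\sigma into numbers.

```python
from statistics import NormalDist

mu, sigma = 10.0, 2.0   # hypothetical known parameters, for illustration only
alpha = 0.05

# z_{alpha/2} solves Phi(z) = 1 - alpha/2 for the standard normal CDF Phi.
z = NormalDist().inv_cdf(1 - alpha / 2)

# Transform back from Z = (X - mu) / sigma to X.
lower, upper = mu - z * sigma, mu + z * sigma

# Check: the probability that X falls in [lower, upper] should be 1 - alpha.
X = NormalDist(mu, sigma)
coverage = X.cdf(upper) - X.cdf(lower)

print(f"z = {z:.4f}, interval = [{lower:.3f}, {upper:.3f}], coverage = {coverage:.4f}")
```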

  • Practical Impact of Unknown Parameters

    • In practice, \mu and \sigma are often unknown, which complicates the calculation of confidence intervals: they must then be approximated from sample data.

7.2. Large-Sample Confidence Intervals

  • Foundational Concepts

    • For large samples, where N \to \infty, the distribution of the sample mean \bar{X} converges to a normal distribution by the Central Limit Theorem (CLT).
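
A small simulation can make the CLT statement concrete. The sketch below assumes a skewed exponential population with rate 1 (a hypothetical choice, with exact mean and standard deviation both equal to 1) and checks how often the standardized sample mean falls within \pm 1.96, which would be about 95% of the time if \bar{X} were exactly normal.

```python
import math
import random

rng = random.Random(0)          # fixed seed so the run is repeatable
rate = 1.0                      # hypothetical exponential population
mu, sigma = 1 / rate, 1 / rate  # its exact mean and standard deviation
n, trials = 100, 5000           # sample size and number of repeated samples

inside = 0
for _ in range(trials):
    x_bar = sum(rng.expovariate(rate) for _ in range(n)) / n
    z = (x_bar - mu) / (sigma / math.sqrt(n))  # standardized sample mean
    if abs(z) <= 1.96:
        inside += 1

fraction_inside = inside / trials
print(f"fraction of standardized means within +/-1.96: {fraction_inside:.3f}")
```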

  • Mean Confidence Interval Calculation

    • Use properties of large samples to express the confidence interval for the mean \mu, as
      P\left(\bar{X} - z_{\alpha/2}\frac{s}{\sqrt{N}} \leq \mu \leq \bar{X} + z_{\alpha/2}\frac{s}{\sqrt{N}} \right) = 1 - \alpha,

    • where s represents the sample standard deviation.
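
As a minimal sketch, the large-sample interval \bar{X} \pm z_{\alpha/2} s/\sqrt{N} can be computed from data as follows. The function name `large_sample_ci` and the simulated data set are assumptions made for this illustration, not part of the notes.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def large_sample_ci(data, alpha=0.05):
    """Return (lower, upper) for the mean using x_bar +/- z_{alpha/2} * s / sqrt(N)."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)                          # sample standard deviation (N - 1 divisor)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    half_width = z * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Hypothetical data: 400 draws from a skewed population whose true mean is 2.
rng = random.Random(42)
data = [rng.expovariate(0.5) for _ in range(400)]

lower, upper = large_sample_ci(data)
print(f"95% large-sample CI for the mean: [{lower:.3f}, {upper:.3f}]")
```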

  • Desired Confidence Level and Sample Size Dynamics

    • There is a trade-off between the confidence level and the required sample size: for fixed N, raising the confidence level increases z_{\alpha/2} and therefore widens the interval.

    • The sample size determines the width w of the interval:
      w = 2z_{\alpha/2}\frac{\sigma}{\sqrt{N}}.

    • Thus, to reach a desired width w at a given confidence level, N must be chosen accordingly: N = \left(\frac{2z_{\alpha/2}\sigma}{w}\right)^2, rounded up to the next integer.
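
Solving w = 2z_{\alpha/2}\sigma/\sqrt{N} for N gives N = (2z_{\alpha/2}\sigma/w)^2. The sketch below evaluates this for a few confidence levels; the function name `required_sample_size` and the values \sigma = 2 and w = 0.5 are hypothetical, and \sigma is assumed known (in practice an estimate would have to stand in for it).

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, width, alpha=0.05):
    """Smallest N for which 2 * z_{alpha/2} * sigma / sqrt(N) <= width."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil((2 * z * sigma / width) ** 2)

sigma, width = 2.0, 0.5  # hypothetical standard deviation and target interval width
for alpha in (0.10, 0.05, 0.01):
    n = required_sample_size(sigma, width, alpha)
    print(f"confidence {100 * (1 - alpha):.0f}%: N >= {n}")
```

Because N grows with the square of z_{\alpha/2}/w, halving the width requires roughly four times as many observations.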

  • Interpretation of Confidence Intervals

    • A confidence interval [L, U] with P(L \leq \mu \leq U) = 1 - \alpha means that, if the sampling procedure is repeated many times, about 100(1 - \alpha)\% of the resulting intervals capture the true parameter \mu. The boundaries L and U are random and change from sample to sample, while \mu is fixed.
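
The repeated-sampling interpretation can be illustrated by simulation. The sketch below assumes a normal population with \mu = 5 and \sigma = 2 and samples of size N = 50 (all hypothetical values), builds a large-sample 95% interval from each sample, and counts how often the interval contains \mu.

```python
import math
import random
from statistics import NormalDist, mean, stdev

rng = random.Random(7)            # fixed seed so the run is repeatable
mu, sigma = 5.0, 2.0              # hypothetical true parameters
n, repeats, alpha = 50, 2000, 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)

hits = 0
for _ in range(repeats):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    x_bar = mean(sample)
    half_width = z * stdev(sample) / math.sqrt(n)
    # The interval [L, U] differs in every repeat; mu stays fixed.
    if x_bar - half_width <= mu <= x_bar + half_width:
        hits += 1

coverage = hits / repeats
print(f"fraction of intervals containing mu: {coverage:.3f}")
```

The observed fraction should be close to, though not exactly, the nominal 0.95, since both the simulation and the large-sample approximation introduce some error.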

7.3. Intervals Based on a Normal Population Distribution

  • Problem Statement

    • This section considers finding confidence interval limits when N is small and the Central Limit Theorem cannot be relied upon.

    • Formula:
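      P\left(\bar{X} - t_{\alpha/2, N-1}\frac{s}{\sqrt{N}} \leq \mu \leq \bar{X} + t_{\alpha/2, N-1}\frac{s}{\sqrt{N}} \right) = 1 - \alpha,

    • with critical values t_{\alpha/2, N-1} derived from Student's t-distribution, provided the underlying data are normally distributed.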

  • Understanding Student's T-distribution

    • The statistic T = \frac{\bar{X} - \mu}{S/\sqrt{N}} follows a t-distribution with N - 1 degrees of freedom, which has heavier tails (more spread) than the standard normal distribution.
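
The extra spread of T can be seen in a short simulation. The sketch below assumes standard normal data and a small sample size N = 5 (hypothetical settings) and estimates how often |T| exceeds 1.96; for a standard normal variable that would happen about 5% of the time.

```python
import math
import random
from statistics import mean, stdev

rng = random.Random(1)                      # fixed seed so the run is repeatable
mu, sigma, n, trials = 0.0, 1.0, 5, 20000   # hypothetical settings

exceed = 0
for _ in range(trials):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    # T = (x_bar - mu) / (S / sqrt(N)), with S the sample standard deviation.
    t_stat = (mean(sample) - mu) / (stdev(sample) / math.sqrt(n))
    if abs(t_stat) > 1.96:
        exceed += 1

tail_fraction = exceed / trials
print(f"fraction with |T| > 1.96: {tail_fraction:.3f} (normal reference: 0.050)")
```

The simulated fraction is noticeably larger than 0.05, which is why the normal critical value is too small for small N.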

  • Finding Exact Confidence Intervals

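    • Find the critical t-value t_{\alpha/2, N-1} by solving F(t_{\alpha/2, N-1}) = 1 - \alpha/2, where F is the CDF of the t-distribution with N - 1 degrees of freedom. This yields the exact confidence interval \bar{X} \pm t_{\alpha/2, N-1}\frac{s}{\sqrt{N}}, computed from the sample estimates \bar{X} and s.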

  • Sample Size Considerations

    • The approach is not limited to small samples: under the assumption of normality, it remains exact for any sample size.

  • Computational Steps for Estimation

    • 1. Compute the critical t-value t_{\alpha/2, N-1} from the CDF of the t-distribution.

    • 2. Compute the sample mean \bar{X}.

    • 3. Compute the sample standard deviation s.

    • 4. Compute the bounds of the confidence interval, \bar{X} \pm t_{\alpha/2, N-1}\frac{s}{\sqrt{N}}.
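
The four steps can be sketched as follows. To stay within the Python standard library, this illustration integrates the t density with Simpson's rule and solves F(t) = 1 - \alpha/2 by bisection; where SciPy is available, `scipy.stats.t.ppf` would normally be used for the critical value instead. All function names and the sample values are hypothetical.

```python
import math
from statistics import mean, stdev

def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(t * t / df))

def t_cdf(t, df, steps=2000):
    """CDF F(t), via Simpson's rule on [0, |t|] and symmetry about zero."""
    x = abs(t)
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half = total * h / 3  # integral of the density from 0 to |t|
    return 0.5 + half if t >= 0 else 0.5 - half

def t_critical(alpha, df):
    """Solve F(t) = 1 - alpha/2 for t by bisection."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:  # widen the bracket until it contains the root
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_interval(data, alpha=0.05):
    """Exact CI for the mean of normally distributed data."""
    n = len(data)
    t_crit = t_critical(alpha, n - 1)       # step 1: critical t-value
    x_bar = mean(data)                      # step 2: sample mean
    s = stdev(data)                         # step 3: sample standard deviation
    half_width = t_crit * s / math.sqrt(n)  # step 4: bounds
    return x_bar - half_width, x_bar + half_width

sample = [9.8, 10.2, 10.4, 9.9, 10.1, 10.3, 9.7, 10.0]  # hypothetical measurements
lower, upper = t_interval(sample)
print(f"t critical (N - 1 = {len(sample) - 1}): {t_critical(0.05, len(sample) - 1):.3f}")
print(f"95% CI for the mean: [{lower:.3f}, {upper:.3f}]")
```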


This section covers the theoretical background and the practice of interval estimation, highlighting the main factors that influence confidence bounds in different data scenarios: known distributions, large samples, and small samples from a normal population.