Chapter 16 - Geometric and Binomial Probability Models

Introduction to Probability Models

  • Foundational Context: The study of Geometric and Binomial models falls under the broader category of "Modeling the World."
  • The Role of Conditions: Statistical models are not universal; each is built on a specific set of conditions or assumptions. If these conditions are not met, the model may not be appropriate for the data.
     * Normal Model: Requires that the data be unimodal and symmetric.
     * Linear Model: Requires that the data exhibit a linear relationship.
     * Geometric and Binomial Models: Both are built on Bernoulli trials.

Bernoulli Trials

  • Definition: A Bernoulli trial is a single random trial with exactly two possible outcomes. It is the basis for the probability models examined in this chapter.
  • Criteria for Bernoulli Trials: To qualify as Bernoulli trials, the following three conditions must be satisfied:
     1. Two Possible Outcomes: Each trial must have exactly two possible outcomes, categorized as "success" and "failure."
     2. Constant Probability ($p$): The probability of success, denoted by $p$, must remain constant across all trials. Consequently, the probability of failure, $q = 1 - p$, is also constant.
     3. Independent Trials: The outcome of one trial must not change the probability of success on any other trial.
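
The three criteria can be illustrated with a small simulation. This is a minimal sketch, not part of the original notes: the function name `bernoulli_trials` and the value p = 0.3 are hypothetical choices. Each trial has two outcomes (True or False), uses the same p, and draws a fresh random number so no trial depends on another.

```python
import random

def bernoulli_trials(p, n, seed=None):
    """Simulate n independent trials, each a success with probability p.

    Hypothetical helper for illustration only.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    rng = random.Random(seed)
    # rng.random() is uniform on [0, 1), so "< p" is True with probability p.
    return [rng.random() < p for _ in range(n)]

# Assumed example value: p = 0.3 over 10,000 trials.
trials = bernoulli_trials(p=0.3, n=10_000, seed=1)
print(sum(trials) / len(trials))  # observed proportion of successes
```

With many trials, the observed proportion of successes should land close to p.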

Independence and the 10% Condition

  • The Independence Requirement: A central assumption for Bernoulli trials is that the trials are independent of one another.
  • Challenges with Finite Populations: In practice, when sampling from a population that is not infinite, removing individuals for the sample changes the probability for subsequent trials, meaning the trials are technically not independent.
  • The Rule of Thumb (The 10% Condition):
     * Specific Rule: It is acceptable to treat the trials as independent and proceed with Bernoulli-based models, even though independence is technically violated, provided the sample is small relative to the population.
     * The Threshold: The sample must be smaller than 10% of the total population.
     * Implication: Keeping the sample under 10% keeps the probability $p$ close enough to constant that the trials can be treated as independent for calculation purposes.
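
To see why population size matters, the sketch below compares how much the probability of success shifts after one success is drawn without replacement. The function names and the two populations (3 successes in 10, and 300 successes in 1000) are hypothetical, chosen only so that both start at p = 0.3.

```python
from fractions import Fraction

def meets_ten_percent_condition(sample_size, population_size):
    """True when the sample is smaller than 10% of the population."""
    return 10 * sample_size < population_size

def p_after_one_success(successes, population_size):
    """P(success) on the second draw, given that the first draw was a
    success and was not replaced."""
    return Fraction(successes - 1, population_size - 1)

# Hypothetical populations, both starting at p = 3/10.
for successes, population_size in [(3, 10), (300, 1000)]:
    p_before = Fraction(successes, population_size)
    p_after = p_after_one_success(successes, population_size)
    print(population_size, p_before, p_after, float(p_before - p_after))
```

In the population of 10, a single draw moves the probability from 3/10 to 2/9; in the population of 1000 it moves only to 299/999, which is why a small sample from a large population can be treated as a set of Bernoulli trials.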

The Geometric Probability Model

  • Model Designation: Written as $Geom(p)$, where $p$ is the probability of success.
  • Core Objective: The Geometric model gives the probability that the first success occurs on a given trial; that is, it describes how many trials are needed to reach the first success.
  • Variable Definitions:
     * $p$: The probability of a success.
     * $q$: The probability of a failure, where $q = 1 - p$.
     * $X$: The random variable counting the number of trials up to and including the first success.
  • The "Failure Until Final Success" Logic: The model describes a sequence of events where a series of failures occurs, concluding with a single, final success.
  • Probability Mass Function (PMF):
     * The probability that the first success occurs on the $x$-th trial is:
         $P(X = x) = q^{x-1} p$
     * Substituting for $q$, this can also be written as:
         $P(X = x) = (1 - p)^{x-1} p$
     * Conceptual Breakdown: The formula is $(\text{Probability of Failure})^{x-1} \times (\text{Probability of Success})^{1}$, that is, $x - 1$ failures followed by one success.
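
The PMF translates directly into code. The sketch below is illustrative only: `geometric_pmf` is a hypothetical name, and p = 0.2 is an assumed example value that does not come from the notes.

```python
def geometric_pmf(x, p):
    """P(X = x) for Geom(p): x - 1 failures followed by one success."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must satisfy 0 < p <= 1")
    if x < 1 or int(x) != x:
        raise ValueError("x must be a positive integer")
    q = 1.0 - p
    return q ** (x - 1) * p

# Assumed example: p = 0.2, first success on the third trial,
# i.e. two failures then a success: 0.8 * 0.8 * 0.2.
print(geometric_pmf(3, 0.2))

# The probabilities over x = 1, 2, 3, ... should add up to (nearly) 1.
print(sum(geometric_pmf(x, 0.2) for x in range(1, 201)))
```

Because the first success must happen on some trial, summing the PMF over enough values of x gives a total very close to 1, which is a quick sanity check on the formula.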

Statistics of the Geometric Model

  • Mean ($\mu$):
     * Also known as the Expected Value, $E(X)$.
     * Formula: $\mu = E(X) = \frac{1}{p}$.
     * Definition: The expected number of trials or "picks" required to achieve the first success.
  • Standard Deviation ($\sigma$):
     * Formula: $\sigma = \frac{\sqrt{q}}{p}$.
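
As a rough check on these two formulas, the sketch below computes them for an assumed p = 0.25 and compares the mean with a simulated average. All function names are hypothetical, and the simulation is only an illustration, not part of the original notes.

```python
import math
import random

def geom_expected_trials(p):
    """Mean of Geom(p): mu = 1 / p."""
    return 1.0 / p

def geom_sd(p):
    """Standard deviation of Geom(p): sigma = sqrt(q) / p, with q = 1 - p."""
    return math.sqrt(1.0 - p) / p

def simulate_trials_until_success(p, rng):
    """Count trials up to and including the first success."""
    count = 1
    while rng.random() >= p:  # this trial failed, so run another
        count += 1
    return count

# Assumed example value: p = 0.25, so mu = 4 and sigma = sqrt(0.75) / 0.25.
p = 0.25
rng = random.Random(0)
draws = [simulate_trials_until_success(p, rng) for _ in range(20_000)]
sample_mean = sum(draws) / len(draws)
print(geom_expected_trials(p), geom_sd(p), sample_mean)
```

The simulated average should fall near 1/p, though it will vary slightly with the random seed.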