Chapter 16 - Geometric and Binomial Probability Models
Introduction to Probability Models
Foundational Context: The study of Geometric and Binomial models falls under the broader category of "Modeling the World."
The Role of Conditions: Statistical models are not universal; they are built upon specific sets of conditions or assumptions. If these conditions are not met, the model may not be appropriate for the data.
* Normal Model: Requires that data be unimodal and symmetric.
* Linear Model: Requires that the data exhibit a linear relationship.
* Geometric and Binomial Models: These models are built upon the foundation of Bernoulli trials.
Bernoulli Trials
Definition: The Bernoulli trial is the fundamental basis for the probability models examined in this chapter.
Criteria for Bernoulli Trials: To qualify as Bernoulli trials, the following three conditions must be satisfied:
1. Two Possible Outcomes: There must be exactly two possible outcomes for each trial, categorized as "success" and "failure."
2. Constant Probability (p): The probability of success, denoted by p, must remain constant across all trials. Consequently, the probability of failure (q) is also constant (q=1−p).
3. Independent Trials: The outcome of one trial must not influence or change the outcome or probability of any other trial.
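The three criteria can be illustrated with a small simulation. This is a minimal sketch, not part of the chapter: the function name bernoulli_trials, the value p = 0.3, and the seed are all assumptions chosen for illustration. Each trial has two outcomes (True or False), uses the same p every time, and is drawn without reference to earlier trials.

```python
import random


def bernoulli_trials(p, n, seed=None):
    """Simulate n independent Bernoulli trials with success probability p.

    Returns a list of booleans: True is a "success", False is a "failure".
    (Hypothetical helper for illustration only.)
    """
    rng = random.Random(seed)
    # Every trial compares a fresh uniform draw against the same p,
    # so p stays constant and no trial depends on another.
    return [rng.random() < p for _ in range(n)]


trials = bernoulli_trials(p=0.3, n=10_000, seed=1)
success_rate = sum(trials) / len(trials)
print(f"Observed success rate: {success_rate:.3f}")
```

Over many trials the observed success rate should settle near p, which is what a constant success probability implies.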
Independence and the 10% Condition
The Independence Requirement: A central assumption for Bernoulli trials is that the trials are independent of one another.
Challenges with Finite Populations: In practice, when sampling from a population that is not infinite, removing individuals for the sample changes the probability for subsequent trials, meaning the trials are technically not independent.
The Rule of Thumb (The 10% Condition):
* Specific Rule: It is acceptable to "pretend" that trials are independent and proceed with Bernoulli-based models even if the true independence assumption is technically violated, provided the sample size is small.
* The Threshold: The sample must be smaller than 10% of the total population.
* Implication: Keeping the sample under 10% of the population means that removing individuals changes p very little, so p stays close enough to constant for the trials to be treated as independent in calculations.
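To see why the size of the sample relative to the population matters, the sketch below compares the exact probability of drawing five successes in a row without replacement against the value obtained by pretending the draws are independent. The populations (1,000 with 300 successes, and 20 with 6 successes) and the function names are assumptions made up for this illustration.

```python
from fractions import Fraction


def prob_all_success_without_replacement(successes, population, n):
    """Exact probability that n draws without replacement are all successes."""
    prob = Fraction(1)
    for i in range(n):
        # After i successes are removed, both counts shrink by i.
        prob *= Fraction(successes - i, population - i)
    return prob


def prob_all_success_independent(successes, population, n):
    """Probability of n successes when the draws are treated as independent."""
    return Fraction(successes, population) ** n


def relative_error(successes, population, n):
    """How far the independence shortcut is from the exact answer."""
    exact = prob_all_success_without_replacement(successes, population, n)
    approx = prob_all_success_independent(successes, population, n)
    return float(abs(approx - exact) / approx)


# Sample of 5 from 1,000 (0.5% of the population): condition satisfied.
large_population_error = relative_error(300, 1000, 5)
# Sample of 5 from 20 (25% of the population): condition violated.
small_population_error = relative_error(6, 20, 5)

print(f"Relative error, 5 of 1000: {large_population_error:.1%}")
print(f"Relative error, 5 of 20:   {small_population_error:.1%}")
```

In both cases the true proportion of successes is 0.3, but the independence shortcut is only a reasonable approximation when the sample is a small fraction of the population.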
The Geometric Probability Model
Model Designation: Represented as Geom(p), where p is the probability of success.
Core Objective: The Geometric model gives the probability that the first success occurs on a particular trial; that is, it describes how many trials are needed to reach the first success.
Variable Definitions:
* p: The probability of a success.
* q: The probability of a failure, where q=1−p.
* X: The random variable representing the number of trials until the first success occurs, counting the successful trial itself.
The "Failure Until Final Success" Logic: The model describes a sequence of events where a series of failures occurs, concluding with a single, final success.
Probability Mass Function (PMF):
* The probability that the first success occurs on the x-th trial is calculated as:
P(X = x) = q^{x-1} p
* This can also be written substituting for q:
P(X = x) = (1 - p)^{x-1} p
* Conceptual Breakdown: The formula signifies (Probability of Failure)^{x-1} × (Probability of Success)^1.
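The PMF translates directly into code. The following is a minimal sketch; the function name geometric_pmf and the example value p = 0.2 are assumptions for illustration, not taken from the chapter.

```python
def geometric_pmf(x, p):
    """Return P(X = x) for Geom(p): x - 1 failures followed by one success."""
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    if x < 1 or x != int(x):
        raise ValueError("x must be a positive whole number of trials")
    q = 1 - p
    return q ** (x - 1) * p


# Probability that the first success arrives on the third trial when p = 0.2:
# two failures (0.8 * 0.8) followed by one success (0.2), roughly 0.128.
prob_third_trial = geometric_pmf(3, 0.2)

# Summed over many trials, the probabilities should come very close to 1.
total_probability = sum(geometric_pmf(x, 0.2) for x in range(1, 201))

print(f"P(X = 3) = {prob_third_trial:.4f}")
print(f"Sum of P(X = x) for x = 1..200: {total_probability:.6f}")
```

Because each additional failure multiplies the probability by q, the probabilities shrink steadily as x grows, and X = 1 is always the single most likely outcome.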
Statistics of the Geometric Model
Mean (μ):
* Also known as the Expected Value, E(X).
* Formula: μ = E(X) = 1/p.
* Definition: This represents the expected number of trials or "picks" required until you achieve your first success.
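As a numerical check on the mean, the sketch below adds up x · P(X = x) over a long run of trials and compares the result with 1/p. The function name, the choice p = 0.2, and the cutoff of 1,000 terms are assumptions for illustration; the sum is truncated, so for very small p a larger cutoff would be needed.

```python
def geometric_mean_by_summation(p, max_trials=1000):
    """Approximate E(X) for Geom(p) by summing x * P(X = x) up to max_trials."""
    q = 1 - p
    return sum(x * q ** (x - 1) * p for x in range(1, max_trials + 1))


p = 0.2
approx_mean = geometric_mean_by_summation(p)
exact_mean = 1 / p

print(f"Summed mean:  {approx_mean:.6f}")
print(f"Formula 1/p:  {exact_mean:.6f}")
```

With p = 0.2, both values come out at about 5: on average it takes five trials to see the first success.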