Understandable Statistics, 13e: Chapter 5 - The Binomial Probability Distribution and Related Topics (Section 5.2)

Chapter 5: The Binomial Probability Distribution and Related Topics

5.2 Binomial Probabilities

Section Objectives

By the end of this section, one should be able to:

List the five defining features of a binomial experiment.
Compute binomial probabilities using a formula.
Find P(r) using the binomial table or technology.
Apply the binomial probability distribution to find probabilities in real-world situations.

Introduction to Probability Distributions

Section 5.1 introduced the fundamental concept of probability distributions, which encompass all possible values of a random variable along with their associated probabilities. Certain probability distributions are frequently observed in natural phenomena and real-world scenarios, earning them specific names. These distributions serve as valuable models for various situations. The Binomial Distribution is the first of these common distributions to be studied in this module.

Motivating the Binomial Distribution

Consider the following random variables; they share common characteristics:

The number of individuals, out of 200 , who vote 'yes' in a referendum.
The number of heads obtained when tossing a coin 10 times.
The number of students in a class of 30 who successfully pass a test.

Common Characteristics:

Each scenario involves a series of identical trials (e.g., asking individual voters, observing each coin flip, checking each student's test result).
Each trial yields precisely two possible outcomes (e.g., 'yes' or 'no', 'heads' or 'tails', 'pass' or 'fail').
The trials are independent, meaning the outcome of one trial does not influence the outcome of any other trial (e.g., one person's vote doesn't affect another's; each coin toss is independent).

These types of situations are aptly modeled using the Binomial Distribution.

Defining Features of an Experiment Following the Binomial Distribution

An experiment qualifies as a binomial experiment if it meets five specific criteria:

Fixed Number of Trials (n): There must be a predetermined and constant number of trials. This number is denoted by n .
Two Possible Outcomes: Each individual trial must result in exactly one of two outcomes: a "success" ( S ) or a "failure" ( F ).
Independent and Identical Trials: The n trials must be independent of each other and conducted under identical conditions.
Constant Probability of Success (p) and Failure (q): For each individual trial, the probability of success, denoted by p , remains the same. Consequently, the probability of failure, denoted by q , is also constant. Since there are only two outcomes, p + q = 1 , which implies q = 1 - p .
Central Problem: The primary objective of a binomial experiment is to determine the probability of obtaining exactly r successes out of the n total trials. Probabilities for a range of r values (such as "12 or more") are found by adding the probabilities of the individual values.
- Examples: What is the probability of getting exactly 3 heads when a coin is flipped 10 times? What is the probability that 12 or more students out of 20 pass a test?

Notation for Binomial Distribution

If a random variable X adheres to a binomial distribution with n trials and a probability p of success on any single trial, this is symbolically represented as:
X \sim B(n, p)
or
X \sim Bin(n, p)
Here, n and p are referred to as the parameters of the distribution. The symbol " \sim " signifies "follows the distribution of."

Example 1: Confirming Binomial Distribution for a TV Game Show

Scenario: A TV game show features a wheel of fortune with 36 equal slots, one of which is gold, awarding \$50,000 . No other slot pays. The goal is to check the likelihood that 10 out of 100 contestants will win the prize.

Solution (Confirming Binomial Requirements):

a) Fixed Number of Trials (n): There are n = 100 contestants, each having a trial at the wheel.
b) Independent Trials: Assuming the wheel is fair, each spin is independent, meaning one contestant's result does not influence another's.
c) Two Outcomes (Success/Failure): For each spin, the ball either lands on the gold slot (designated as "success", S ) or it does not (designated as "failure", F ). It's important to note that "success" and "failure" are arbitrary terms for the outcomes of interest and do not imply good or bad results in a general sense.
d) Constant Probability of Success (p) and Failure (q): The probability of success p (landing on gold) is 1/36 , as there is 1 gold slot out of 36 . Consequently, the probability of failure q is 1 - p = 1 - (1/36) = 35/36 .
e) Central Problem (r successes): We are interested in the probability of 10 successes out of 100 trials, so r = 10 .

Since all five conditions are met, the Binomial Distribution can be used to model this situation.
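
As a rough illustration of where this setup leads, the probability of exactly 10 winners can be evaluated with the binomial formula P(r) = C_{n,r} p^r q^{n-r} given later in this section. The sketch below is not part of the original example. It uses only Python's standard library and keeps p = 1/36 as an exact fraction so that nothing is rounded before the final step.

```python
from fractions import Fraction
from math import comb

n = 100                 # contestants (trials)
r = 10                  # number of winners we are asking about
p = Fraction(1, 36)     # one gold slot out of 36
q = 1 - p               # 35/36

# P(r) = C(n, r) * p^r * q^(n - r), kept exact until the final conversion
prob_exact = comb(n, r) * p**r * q**(n - r)
prob = float(prob_exact)

print(f"P(10 winners out of 100) = {prob:.6f}")
```

The result is well under 1 in 1,000, which is what one would expect when each contestant has only a 1-in-36 chance of winning.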

Example 2: Determining Binomial Parameters from a Blood Type Study

Scenario: 9\% of the UK population has blood type B. If 18 people are chosen randomly and tested for blood type B, what is the probability that 3 of them have blood type B?

Solution (Determining p, q, n, r):

p = 0.09 (probability of 'success' - having blood type B).
q = 0.91 (probability of 'failure' - not having blood type B).
n = 18 (fixed number of trials - people sampled).
r = 3 (number of successes we are interested in).

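This can be written as X \sim B(18, 0.09) . Strictly speaking, the 18 people are sampled without replacement, so the trials are not perfectly independent. Because 18 is tiny relative to the UK population, treating the trials as independent with constant p = 0.09 is a very good approximation.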
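
The scenario asks for the probability that 3 of the 18 people have blood type B, which the solution above sets up but does not evaluate. A minimal sketch of the remaining step, assuming the binomial formula from the next subsection; the helper name binomial_pmf is chosen here for illustration and does not come from the text.

```python
from math import comb

def binomial_pmf(r: int, n: int, p: float) -> float:
    """Probability of exactly r successes in n independent trials."""
    q = 1 - p
    return comb(n, r) * p**r * q**(n - r)

# Blood type B example: n = 18 people, p = 0.09, r = 3
p3 = binomial_pmf(3, 18, 0.09)
print(f"P(3) = {p3:.4f}")
```

This comes out to roughly 0.14, so there is about a 14\% chance that exactly 3 of the 18 people have blood type B.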

The Binomial Distribution Formula

For a binomial experiment with n trials, a probability of success p , and a probability of failure q for each trial, the probability of obtaining exactly r successes (where 0 \leq r \leq n ) is given by:

P(r) = C_{n,r} p^r q^{n-r}

Alternatively, written with the combination notation:

P(r) = \binom{n}{r} p^r q^{n-r}

where C_{n,r} = \frac{n!}{r!(n-r)!} .

Breakdown of the Formula:

p^r : The probability that r specified trials all result in success. Because the trials are independent, the r factors of p multiply.
q^{n-r} : The probability that the remaining n-r trials all result in failure.
C_{n,r} (or \binom{n}{r} ): This is the number of ways to arrange r successes and n-r failures within n trials. For example, if tossing a coin 4 times and wanting 2 heads, the possible orderings (HHTT, HTHT, HTTH, THHT, THTH, TTHH) are 6 ways, which is equivalent to C_{4,2} = \frac{4!}{2!(4-2)!} = \frac{24}{2 \times 2} = 6 .

This formula is a probability function because it provides the probability for each possible value in the sample space.
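
The counting argument in the breakdown can be checked by brute force. The sketch below (an illustration, not part of the text) lists every sequence of four tosses, keeps those with exactly two heads, and compares the count with C_{4,2}. The last lines assume a fair coin, p = q = 0.5, which the text does not state explicitly.

```python
from itertools import product
from math import comb

n, r = 4, 2

# All 2^4 = 16 sequences of H/T, then keep those with exactly 2 heads
sequences = ["".join(s) for s in product("HT", repeat=n)]
two_heads = [s for s in sequences if s.count("H") == r]

print(two_heads)                    # the orderings with exactly two heads
print(len(two_heads), comb(n, r))   # both counts should agree

# Assumption: fair coin, so every sequence has probability (1/2)^4.
# P(2 heads) = C(4, 2) * (1/2)^2 * (1/2)^2
p_two_heads = comb(n, r) * 0.5**r * 0.5**(n - r)
print(p_two_heads)
```

Each of the six orderings has the same probability p^2 q^2, which is why the formula multiplies that single product by the number of orderings.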

Example 3: Calculating Probability of Online Privacy Concern

Scenario: 59\% of people are concerned about the confidentiality of their online personal information. For a random sample of 10 people, what is the probability that exactly 6 are concerned?

Solution:
This is a binomial experiment with:

n = 10 (number of trials).
p = 0.59 (probability of success - being concerned).
q = 0.41 (probability of failure - not being concerned).
r = 6 (number of successes we are interested in).

Using the formula:
P(6) = C_{10,6} (0.59)^6 (0.41)^{10-6}
P(6) = 210 (0.59)^6 (0.41)^4
P(6) \approx 210 (0.0421805) (0.0282576)
P(6) \approx 0.250

There is approximately a 25\% chance that exactly 6 out of 10 randomly sampled people are concerned about their online privacy.
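
The arithmetic in this example can be reproduced step by step with technology. The following is a minimal sketch using Python's standard library; the variable names are illustrative only.

```python
from math import comb

n, r, p = 10, 6, 0.59
q = 1 - p

coefficient = comb(n, r)      # C(10, 6): number of ways to place 6 successes
success_part = p**r           # 0.59^6
failure_part = q**(n - r)     # 0.41^4
prob = coefficient * success_part * failure_part

print("C(10, 6)  =", coefficient)
print("0.59^6    =", round(success_part, 7))
print("0.41^4    =", round(failure_part, 7))
print("P(6)      =", round(prob, 3))
```

Printing the three factors separately makes it easy to spot a slip in any one intermediate value before it reaches the final answer.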

Example 4: Tomato Seed Germination

Scenario: A biologist studies a hybrid tomato, whose seeds have a 0.70 probability of germinating. The biologist plants 6 seeds.

a) Probability that exactly four seeds will germinate:
Solution:
This is a binomial experiment with:

n = 6 (number of trials - seeds planted).
p = 0.70 (probability of success - germination).
q = 0.30 (probability of failure - no germination).
r = 4 (number of successes we are interested in).

Using the formula:
P(4) = C_{6,4} (0.70)^4 (0.30)^{6-4}
P(4) = 15 (0.70)^4 (0.30)^2
P(4) = 15 (0.2401) (0.09)
P(4) \approx 0.3241

b) Probability that at least four seeds will germinate:
Solution:
"At least four seeds" means r \geq 4 , which includes r=4 , r=5 , or r=6 successes. Since these events are mutually exclusive, we use the addition rule:
P(r \geq 4) = P(4) + P(5) + P(6)

While P(4) has already been calculated ( 0.3241 ), calculating P(5) and P(6) manually can be tedious. Instead, we can utilize a binomial table (e.g., Appendix II, Table 3).

To find P(5) and P(6) from the table, locate the row for n=6 trials. Then, find the column for p=0.70 . In this column:

For r=5 , the table gives P(5) = 0.303 .
For r=6 , the table gives P(6) = 0.118 .

Now, sum the probabilities, with P(4) rounded to three decimal places to match the table:
P(r \geq 4) = 0.324 + 0.303 + 0.118 = 0.745
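
The table lookup can be cross-checked with technology. The sketch below is an illustration, not the textbook's method; binomial_pmf is a helper name introduced here. It computes the three terms directly from the formula and adds them.

```python
from math import comb

def binomial_pmf(r: int, n: int, p: float) -> float:
    """Probability of exactly r successes in n independent trials."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 6, 0.70   # seeds planted, germination probability

# "At least four" means r = 4, 5, or 6; the events are mutually exclusive
terms = {r: binomial_pmf(r, n, p) for r in (4, 5, 6)}
at_least_four = sum(terms.values())

for r, value in terms.items():
    print(f"P({r}) = {value:.6f}")
print(f"P(r >= 4) = {at_least_four:.5f}")
```

Summing the unrounded terms gives about 0.7443, slightly below the 0.745 obtained from the table. The small gap arises because each table entry is already rounded to three decimal places before the addition.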

Critical Thinking Activity: Using the Complement Rule

The complement rule of probability, P(A^c) = 1 - P(A) , is a powerful tool to simplify binomial probability computations, especially when dealing with a range of successes. For a binomial experiment with n=7 trials, the sample space for the number of successes r is {0, 1, 2, 3, 4, 5, 6, 7} .

Example: To find P(1 \leq r \leq 7) (i.e., at least 1 success), one could sum P(1), P(2), \dots, P(7) . However, it is more efficient to recognize that r = 0 successes is the complement of r \geq 1 (or 1 \leq r \leq 7 ) successes. By the complement rule:
P(r \geq 1) = 1 - P(r = 0)
Calculating or looking up P(r=0) and subtracting from 1 is significantly faster than calculating seven individual probabilities.

Application Examples with n=7:

i. P(r \geq 2) : This includes r = 2, 3, 4, 5, 6, 7 . The complement is P(r < 2) = P(0) + P(1) . It is more efficient to calculate the complement ( 2 terms) than the original probability ( 6 terms).
ii. P(r > 2) : This includes r = 3, 4, 5, 6, 7 . The complement is P(r \leq 2) = P(0) + P(1) + P(2) . The original probability ( 5 terms) is less efficient to calculate than the complement ( 3 terms).
iii. P(r \leq 5) : This includes r = 0, 1, 2, 3, 4, 5 . The complement is P(r > 5) = P(6) + P(7) . It is more efficient to calculate the complement ( 2 terms) than the original probability ( 6 terms).
iv. P(r < 5) : This includes r = 0, 1, 2, 3, 4 . The complement is P(r \geq 5) = P(5) + P(6) + P(7) . The complement ( 3 terms) is still somewhat more efficient than the original ( 5 terms), although the saving is smaller than in cases i and iii.
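
The equivalence between the direct sum and the complement can be confirmed numerically. The sketch below is illustrative only: the activity fixes n = 7 but gives no value of p, so p = 0.30 is a hypothetical choice, and binomial_pmf is a helper name introduced here.

```python
from math import comb, isclose

def binomial_pmf(r: int, n: int, p: float) -> float:
    """Probability of exactly r successes in n independent trials."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n = 7
p = 0.30   # hypothetical value; the activity does not specify p

# Case i: P(r >= 2), direct sum of 6 terms vs complement using 2 terms
direct = sum(binomial_pmf(r, n, p) for r in range(2, n + 1))
via_complement = 1 - (binomial_pmf(0, n, p) + binomial_pmf(1, n, p))

# P(r >= 1): the complement needs only the single term P(0) = q^n
at_least_one = 1 - binomial_pmf(0, n, p)

print(f"direct sum     : {direct:.6f}")
print(f"via complement : {via_complement:.6f}")
print(f"P(r >= 1)      : {at_least_one:.6f}")
```

Both routes give the same probability up to floating-point rounding; the complement simply requires fewer terms.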

What a Binomial Probability Distribution Tells Us

In summary, a binomial probability distribution provides critical information about experiments involving repeated trials:

Sample Space: The sample space for the number of possible successes, r , out of a fixed number of binomial trials, n , is {0, 1, 2, \dots, n} .
Trial Characteristics: Trials are independent and conducted under identical conditions.
Outcome Dichotomy: Each trial has only two possible outcomes: success ( S ) or failure ( F ).
Constant Probabilities: The probability of success ( p ) remains constant for each trial. The probability of failure ( q ) is also constant, with q = 1 - p .
Probability of r Successes: The distribution provides the values of P(r) , representing the probability of achieving exactly r successes out of n trials, for every r from 0 to n .
Dependence on Parameters: The calculation of P(r) is dependent on three inputs: the probability of success p , the specific number of successes sought r , and the total number of trials n .
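
To make the summary concrete, the sketch below tabulates a complete binomial distribution and checks that the probabilities over the sample space {0, 1, \dots, n} add up to 1. It reuses n = 6 and p = 0.70 from the tomato seed example; the function name binomial_distribution is an illustrative choice, not something defined in the text.

```python
from math import comb, isclose

def binomial_distribution(n: int, p: float) -> list[float]:
    """Return [P(0), P(1), ..., P(n)] for n trials with success probability p."""
    q = 1 - p
    return [comb(n, r) * p**r * q**(n - r) for r in range(n + 1)]

dist = binomial_distribution(6, 0.70)

print(" r   P(r)")
for r, prob in enumerate(dist):
    print(f"{r:2d}   {prob:.3f}")

total = sum(dist)
print("sum of P(r):", round(total, 10))
```

Changing either argument produces a different table, which reflects the dependence of P(r) on n, p, and r described above.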