Understandable Statistics, 13e: Chapter 5 - The Binomial Probability Distribution and Related Topics (Section 5.2)

Chapter 5: The Binomial Probability Distribution and Related Topics

5.2 Binomial Probabilities
Section Objectives

By the end of this section, one should be able to:

  • List the five defining features of a binomial experiment.

  • Compute binomial probabilities using a formula.

  • Find $P(r)$ using the binomial table or technology.

  • Apply the binomial probability distribution to find probabilities in real-world situations.

Introduction to Probability Distributions

Section 5.1 introduced the fundamental concept of probability distributions, which encompass all possible values of a random variable along with their associated probabilities. Certain probability distributions are frequently observed in natural phenomena and real-world scenarios, earning them specific names. These distributions serve as valuable models for various situations. The Binomial Distribution is the first of these common distributions to be studied in this module.

Motivating the Binomial Distribution

Consider the following random variables; they share common characteristics:

  • The number of individuals, out of 200, who vote 'yes' in a referendum.

  • The number of heads obtained when tossing a coin 10 times.

  • The number of students in a class of 30 who successfully pass a test.

Common Characteristics:

  • Each scenario involves a series of identical trials (e.g., asking individual voters, observing each coin flip, checking each student's test result).

  • Each trial yields precisely two possible outcomes (e.g., 'yes' or 'no', 'heads' or 'tails', 'pass' or 'fail').

  • The trials are independent, meaning the outcome of one trial does not influence the outcome of any other trial (e.g., one person's vote doesn't affect another's; each coin toss is independent).

These types of situations are aptly modeled using the Binomial Distribution.

Defining Features of an Experiment Following the Binomial Distribution

An experiment qualifies as a binomial experiment if it meets five specific criteria:

  1. Fixed Number of Trials (n): There must be a predetermined and constant number of trials. This number is denoted by $n$.

  2. Two Possible Outcomes: Each individual trial must result in exactly one of two outcomes: a "success" ($S$) or a "failure" ($F$).

  3. Independent and Identical Trials: The $n$ trials must be independent of each other and conducted under identical conditions.

  4. Constant Probability of Success (p) and Failure (q): For each individual trial, the probability of success, denoted by $p$, remains the same. Consequently, the probability of failure, denoted by $q$, is also constant. Since there are only two outcomes, $p + q = 1$, which implies $q = 1 - p$.

  5. Central Problem: The primary objective of a binomial experiment is to determine the probability of obtaining exactly $r$ successes out of the $n$ total trials.

    • Examples: What is the probability of getting exactly 3 heads when a coin is flipped 10 times? What is the probability that 12 or more students out of 20 pass a test? (The second question is answered by adding the exact probabilities for $r = 12$ through $r = 20$.)

Notation for Binomial Distribution

If a random variable $X$ follows a binomial distribution with $n$ trials and a probability $p$ of success on any single trial, this is symbolically represented as:
$$X \sim B(n, p)$$
or
$$X \sim \text{Bin}(n, p)$$
Here, $n$ and $p$ are referred to as the parameters of the distribution. The symbol "$\sim$" signifies "follows the distribution of."

Example 1: Confirming Binomial Distribution for a TV Game Show

Scenario: A TV game show features a wheel of fortune with 36 equal slots, one of which is gold and awards \$50,000. No other slot pays. The goal is to find the probability that exactly 10 out of 100 contestants will win the prize.

Solution (Confirming Binomial Requirements):

  • a) Fixed Number of Trials (n): There are $n = 100$ contestants, each having a trial at the wheel.

  • b) Independent Trials: Assuming the wheel is fair, each spin is independent, meaning one contestant's result does not influence another's.

  • c) Two Outcomes (Success/Failure): For each spin, the ball either lands on the gold slot (designated as "success", $S$) or it does not (designated as "failure", $F$). It's important to note that "success" and "failure" are arbitrary terms for the outcomes of interest and do not imply good or bad results in a general sense.

  • d) Constant Probability of Success (p) and Failure (q): The probability of success $p$ (landing on gold) is $1/36$, as there is 1 gold slot out of 36. Consequently, the probability of failure is $q = 1 - p = 1 - (1/36) = 35/36$.

  • e) Central Problem (r successes): We are interested in the probability of 10 successes out of 100 trials, so $r = 10$.

Since all five conditions are met, the Binomial Distribution can be used to model this situation.
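
As a rough illustration of where this leads, the probability in question can be evaluated with the binomial formula presented later in this section. The sketch below is not part of the original example; it assumes Python's standard `fractions` and `math` modules, and the variable names are chosen here purely for illustration.

```python
from fractions import Fraction
from math import comb

# Values taken from Example 1: 100 contestants, 1 gold slot out of 36.
n = 100
r = 10
p = Fraction(1, 36)  # probability of landing on gold
q = 1 - p            # probability of any other slot, i.e. 35/36

# Binomial formula P(r) = C(n, r) * p^r * q^(n - r), kept as an exact fraction.
prob_exact = comb(n, r) * p**r * q**(n - r)

print(f"P({r}) = {float(prob_exact):.6f}")
```

Using exact fractions avoids rounding $1/36$ before raising it to a power; converting to a float only at the end keeps the printed value readable.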

Example 2: Determining Binomial Parameters from a Blood Type Study

Scenario: 9% of the UK population has blood type B. If 18 people are chosen randomly and tested for blood type B, what is the probability that 3 of them have blood type B?

Solution (Determining p, q, n, r):

  • $p = 0.09$ (probability of 'success' - having blood type B).

  • $q = 0.91$ (probability of 'failure' - not having blood type B).

  • $n = 18$ (fixed number of trials - people sampled).

  • $r = 3$ (number of successes we are interested in).

This can be written as $X \sim B(18, 0.09)$. Strictly speaking, sampling people without replacement makes the trials slightly dependent, but because 18 people is a tiny fraction of the UK population, $p$ is effectively constant and the trials can be treated as independent.
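
The example stops at identifying the parameters. As a hedged sketch of the remaining step, substituting them into the binomial formula given in the next subsection would look like this (the final value is rounded to three decimal places):

$$P(3) = C_{18,3}\,(0.09)^3 (0.91)^{15} = 816 \times 0.000729 \times (0.91)^{15} \approx 0.145$$

Here $C_{18,3} = \frac{18 \times 17 \times 16}{3 \times 2 \times 1} = 816$.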

The Binomial Distribution Formula

For a binomial experiment with $n$ trials, a probability of success $p$, and a probability of failure $q$ for each trial, the probability of obtaining exactly $r$ successes (where $0 \leq r \leq n$) is given by:

$$P(r) = C_{n,r} p^r q^{n-r}$$

Alternatively, written with the combination notation:

$$P(r) = \binom{n}{r} p^r q^{n-r}$$

where $C_{n,r} = \frac{n!}{r!(n-r)!}$.
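
The formula translates almost directly into code. The following is a minimal sketch, assuming Python 3.8 or later for `math.comb`; the function name `binomial_pmf` and its argument order are illustrative choices, not notation from the text.

```python
from math import comb


def binomial_pmf(r: int, n: int, p: float) -> float:
    """Return P(r) = C(n, r) * p**r * q**(n - r) for a binomial experiment."""
    if not 0 <= r <= n:
        raise ValueError("r must satisfy 0 <= r <= n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    q = 1.0 - p  # probability of failure on a single trial
    return comb(n, r) * p**r * q**(n - r)


# Example call: exactly 3 heads in 10 tosses of a fair coin.
p_three_heads = binomial_pmf(3, 10, 0.5)
print(round(p_three_heads, 4))  # 0.1172
```

The input checks mirror the conditions stated with the formula: $r$ must lie between 0 and $n$, and $p$ must be a valid probability.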

Breakdown of the Formula:

  • $p^r$: Represents the probability of achieving $r$ successes.

  • $q^{n-r}$: Represents the probability of achieving $n-r$ failures.

  • $C_{n,r}$ (or $\binom{n}{r}$): This is the number of ways to arrange $r$ successes and $n-r$ failures within $n$ trials. For example, when tossing a coin 4 times and wanting 2 heads, there are 6 possible orderings (HHTT, HTHT, HTTH, THHT, THTH, TTHH), which agrees with $C_{4,2} = \frac{4!}{2!(4-2)!} = \frac{24}{2 \times 2} = 6$.
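
The counting argument in the last bullet can be checked by brute force. The sketch below, which assumes only Python's standard library, lists every sequence of 4 tosses and keeps those with exactly 2 heads; the variable names are illustrative.

```python
from itertools import product
from math import comb

n, r = 4, 2  # 4 coin tosses, 2 heads wanted

# Generate every H/T sequence of length n and keep those with exactly r heads.
orderings = ["".join(seq) for seq in product("HT", repeat=n) if seq.count("H") == r]

print(orderings)       # ['HHTT', 'HTHT', 'HTTH', 'THHT', 'THTH', 'TTHH']
print(len(orderings))  # 6
print(comb(n, r))      # 6
```

Enumeration is only practical for small $n$, since the number of sequences doubles with every extra trial; the combination formula gives the same count without listing anything.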

This formula is a probability function because it provides the probability for each possible value in the sample space.

Example 3: Calculating Probability of Online Privacy Concern

Scenario: 59% of people are concerned about the confidentiality of their online personal information. For a random sample of 10 people, what is the probability that exactly 6 are concerned?

Solution:
This is a binomial experiment with:

  • $n = 10$ (number of trials).

  • $p = 0.59$ (probability of success - being concerned).

  • $q = 0.41$ (probability of failure - not being concerned).

  • $r = 6$ (number of successes we are interested in).

Using the formula:
$$P(6) = C_{10,6} (0.59)^6 (0.41)^{10-6}$$
$$P(6) = 210 (0.59)^6 (0.41)^4$$
$$P(6) \approx 210 (0.0421805) (0.0282576)$$
$$P(6) \approx 0.250$$

There is approximately a 25% chance that exactly 6 out of 10 randomly sampled people are concerned about their online privacy.
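
The intermediate factors in this calculation are easy to mistype, so it can help to recompute them separately. This is a small sketch using Python's standard library; the variable names are illustrative and not part of the example.

```python
from math import comb

n, r, p = 10, 6, 0.59
q = 1 - p

coefficient = comb(n, r)     # C(10, 6)
success_factor = p**r        # (0.59)^6
failure_factor = q**(n - r)  # (0.41)^4

prob_six = coefficient * success_factor * failure_factor

print(coefficient)
print(f"{success_factor:.7f}")
print(f"{failure_factor:.7f}")
print(f"{prob_six:.3f}")
```

Printing each factor on its own line makes it straightforward to compare against the hand calculation step by step.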

Example 4: Tomato Seed Germination

Scenario: A biologist studies a hybrid tomato, whose seeds have a 0.70 probability of germinating. The biologist plants 6 seeds.

a) Probability that exactly four seeds will germinate:
Solution:
This is a binomial experiment with:

  • $n = 6$ (number of trials - seeds planted).

  • $p = 0.70$ (probability of success - germination).

  • $q = 0.30$ (probability of failure - no germination).

  • $r = 4$ (number of successes we are interested in).

Using the formula:
$$P(4) = C_{6,4} (0.70)^4 (0.30)^{6-4}$$
$$P(4) = 15 (0.70)^4 (0.30)^2$$
$$P(4) = 15 (0.2401) (0.09)$$
$$P(4) \approx 0.3241$$

b) Probability that at least four seeds will germinate:
Solution:
"At least four seeds" means $r \geq 4$, which includes $r = 4$, $r = 5$, or $r = 6$ successes. Since these events are mutually exclusive, we use the addition rule:
$$P(r \geq 4) = P(4) + P(5) + P(6)$$

While $P(4)$ has already been calculated (0.3241), calculating $P(5)$ and $P(6)$ manually can be tedious. Instead, we can utilize a binomial table (e.g., Appendix II, Table 3).

To find $P(5)$ and $P(6)$ from the table, locate the row for $n = 6$ trials. Then, find the column for $p = 0.70$. In this column:

  • For $r = 5$, the table gives $P(5) = 0.303$.

  • For $r = 6$, the table gives $P(6) = 0.118$.

Now, sum the probabilities:
$$P(r \geq 4) = 0.324 + 0.303 + 0.118 = 0.745$$
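
Where technology is available instead of a table, the same sum can be computed directly. The following is a sketch under the assumption that a small helper implementing the binomial formula is acceptable; `binomial_pmf` is a name introduced here for illustration.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(r) = C(n, r) * p**r * (1 - p)**(n - r)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)


n, p = 6, 0.70

# Addition rule: the events r = 4, r = 5 and r = 6 are mutually exclusive.
terms = {r: binomial_pmf(r, n, p) for r in range(4, n + 1)}
p_at_least_four = sum(terms.values())

for r, value in terms.items():
    print(f"P({r}) = {value:.3f}")
print(f"P(r >= 4) = {p_at_least_four:.3f}")
```

The unrounded sum comes to about 0.744, one unit in the last place below the table-based 0.745. The difference arises because each table entry is rounded to three decimals before the values are added.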

Critical Thinking Activity: Using the Complement Rule

The complement rule of probability, $P(A^c) = 1 - P(A)$, is a powerful tool to simplify binomial probability computations, especially when dealing with a range of successes. For a binomial experiment with $n = 7$ trials, the sample space for the number of successes $r$ is $\{0, 1, 2, 3, 4, 5, 6, 7\}$.

Example: To find $P(1 \leq r \leq 7)$ (i.e., at least 1 success), one could sum $P(1), P(2), \dots, P(7)$. However, it is more efficient to recognize that $r = 0$ successes is the complement of $r \geq 1$ (or $1 \leq r \leq 7$) successes. By the complement rule:
$$P(r \geq 1) = 1 - P(r = 0)$$
Calculating or looking up $P(r = 0)$ and subtracting from 1 is significantly faster than calculating seven individual probabilities.

Application Examples with $n = 7$:

  • i. $P(r \geq 2)$: This includes $r = 2, 3, 4, 5, 6, 7$. The complement is $P(r < 2) = P(0) + P(1)$. It is more efficient to calculate the complement (2 terms) and subtract it from 1 than to calculate the original probability (6 terms).

  • ii. $P(r > 2)$: This includes $r = 3, 4, 5, 6, 7$. The complement is $P(r \leq 2) = P(0) + P(1) + P(2)$. The original probability (5 terms) is less efficient to calculate than the complement (3 terms).

  • iii. $P(r \leq 5)$: This includes $r = 0, 1, 2, 3, 4, 5$. The complement is $P(r > 5) = P(6) + P(7)$. It is more efficient to calculate the complement (2 terms) and subtract it from 1 than to calculate the original probability (6 terms).

  • iv. $P(r < 5)$: This includes $r = 0, 1, 2, 3, 4$. The complement is $P(r \geq 5) = P(5) + P(6) + P(7)$. The complement (3 terms) still needs fewer terms than the original (5 terms), although the saving is smaller than in cases i and iii.
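
The term-counting idea in these four cases can be sketched in code. The example below assumes an illustrative success probability of 0.30, which is not given in the text, and uses hypothetical helper names; it sums whichever side (the event or its complement) has fewer terms and confirms that both routes agree.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(r) = C(n, r) * p**r * (1 - p)**(n - r)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)


def prob_at_least(k, n, p):
    """P(r >= k), summing whichever of the event or its complement has fewer terms."""
    direct_terms = n - k + 1  # r = k, ..., n
    complement_terms = k      # r = 0, ..., k - 1
    if direct_terms <= complement_terms:
        return sum(binomial_pmf(r, n, p) for r in range(k, n + 1))
    return 1.0 - sum(binomial_pmf(r, n, p) for r in range(k))


n, p = 7, 0.30  # p = 0.30 is an assumed value chosen only for illustration

direct = sum(binomial_pmf(r, n, p) for r in range(2, n + 1))  # case i: 6 terms
via_complement = prob_at_least(2, n, p)                       # 1 - [P(0) + P(1)]

print(abs(direct - via_complement) < 1e-12)
```

The other inequalities reduce to the same helper: $P(r > 2)$ is `prob_at_least(3, n, p)`, while $P(r \leq 5)$ and $P(r < 5)$ are one minus `prob_at_least(6, n, p)` and `prob_at_least(5, n, p)` respectively.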

What a Binomial Probability Distribution Tells Us

In summary, a binomial probability distribution provides critical information about experiments involving repeated trials:

  • Sample Space: The sample space for the number of possible successes, $r$, out of a fixed number of binomial trials, $n$, is $\{0, 1, 2, \dots, n\}$.

  • Trial Characteristics: Trials are independent and conducted under identical conditions.

  • Outcome Dichotomy: Each trial has only two possible outcomes: success ($S$) or failure ($F$).

  • Constant Probabilities: The probability of success ($p$) remains constant for each trial. The probability of failure ($q$) is also constant, with $q = 1 - p$.

  • Probability of r Successes: The distribution provides the values of $P(r)$, representing the probability of achieving exactly $r$ successes out of $n$ trials, for every $r$ from 0 to $n$.

  • Dependence on Parameters: The calculation of $P(r)$ is dependent on three inputs: the probability of success $p$, the specific number of successes sought $r$, and the total number of trials $n$.
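
To make the summary concrete, the sketch below tabulates a complete binomial distribution, reusing the tomato-seed parameters ($n = 6$, $p = 0.70$) from Example 4. It relies only on Python's standard library, and the dictionary name is an illustrative choice.

```python
from math import comb

n, p = 6, 0.70  # parameters reused from Example 4
q = 1 - p

# One probability for every value in the sample space {0, 1, ..., n}.
distribution = {r: comb(n, r) * p**r * q**(n - r) for r in range(n + 1)}

for r, prob in distribution.items():
    print(f"r = {r}: P(r) = {prob:.3f}")

total = sum(distribution.values())
print(f"sum = {total:.3f}")
```

Because the values of $r$ from 0 to $n$ cover the whole sample space, the probabilities add up to 1, which is a quick sanity check on any hand-built table.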