Probability and Statistics: Discrete Probability Distributions
Binomial and Multinomial Distributions
An experiment often consists of repeated trials.
Each trial has two possible outcomes: success or failure.
We can define either outcome as a success.
This process is called a Bernoulli process.
Each trial is called a Bernoulli trial.
Binomial Distribution
The number $X$ of successes in $n$ Bernoulli trials is called a binomial random variable.
The probability distribution of this discrete random variable is the binomial distribution, denoted by $b(x; n, p)$ because its values depend on the number of trials $n$ and the probability of success $p$ on a given trial.
It gives the probability of $x$ successes in $n$ independent trials.
A Bernoulli trial can result in a success with probability $p$ and a failure with probability $q = 1 - p$.
The probability distribution of the binomial random variable $X$, the number of successes in $n$ independent trials, is:
$$b(x; n, p) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n.$$
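As a minimal sketch, the binomial probability $\binom{n}{x} p^x q^{n-x}$ with $q = 1 - p$ can be evaluated using only the Python standard library. The function name `binomial_pmf` and the values $n = 10$, $p = 0.3$ are illustrative choices, not part of the notes' definition.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent Bernoulli trials."""
    if not 0 <= x <= n:
        return 0.0
    q = 1.0 - p  # probability of failure on a single trial
    return comb(n, x) * p**x * q**(n - x)

# Sanity check: the probabilities over x = 0, ..., n should add up to 1.
total = sum(binomial_pmf(x, 10, 0.3) for x in range(11))
print(total)
```

Summing the function over all $x$ is a quick way to confirm that an implementation is a valid probability distribution.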
The Bernoulli Process
Consider selecting three items at random from a manufacturing process, inspecting them, and classifying them as defective (success) or nondefective.
The number of successes (defective items) is a random variable $X$ with values from 0 to 3.
Example
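If the process produces 25% defectives and the items are selected independently, then the probability of each outcome is a product of per-item probabilities. For example, with N for nondefective and D for defective:
$$P(NDN) = P(N)P(D)P(N) = \left(\tfrac{3}{4}\right)\left(\tfrac{1}{4}\right)\left(\tfrac{3}{4}\right) = \tfrac{9}{64}.$$
The probability distribution of $X$ is:
| x | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| f(x) | 27/64 | 27/64 | 9/64 | 1/64 |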
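The distribution of the number of defectives among three inspected items can be checked by brute force. The sketch below enumerates all eight outcomes, assuming independent selections and a 25% defective rate; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

p_defective = Fraction(1, 4)  # the process produces 25% defectives

# Enumerate every outcome of three inspections: D = defective, N = nondefective.
distribution = {x: Fraction(0) for x in range(4)}
for outcome in product("DN", repeat=3):
    prob = Fraction(1)
    for item in outcome:
        prob *= p_defective if item == "D" else 1 - p_defective
    distribution[outcome.count("D")] += prob  # group outcomes by number of defectives

for x, fx in distribution.items():
    print(x, fx)
```

Using `Fraction` keeps the results exact, so they can be compared directly with the fractions in the table.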
Example: Binomial Distribution
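The probability that a component survives a shock test is $p$.
Find the probability that exactly 2 of the next 4 components tested survive.
Solution: Assuming the tests are independent, the number of survivors is a binomial random variable with $n = 4$, so
$$b(2; 4, p) = \binom{4}{2} p^2 (1 - p)^2 = 6 p^2 (1 - p)^2.$$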
Where Does the Name Binomial Come From?
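The binomial distribution derives its name from the binomial expansion of $(q + p)^n$, where the $n + 1$ terms correspond to the values of $b(x; n, p)$ for $x = 0, 1, 2, \ldots, n$:
$$(q + p)^n = \binom{n}{0} q^n + \binom{n}{1} p q^{n-1} + \cdots + \binom{n}{n} p^n = b(0; n, p) + b(1; n, p) + \cdots + b(n; n, p).$$
Since $p + q = 1$:
$$\sum_{x=0}^{n} b(x; n, p) = 1.$$
Binomial sums $B(r; n, p) = \sum_{x=0}^{r} b(x; n, p)$ are often used to find $P(X < r)$ or $P(a \leq X \leq b)$.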
Example 1
The probability that a patient recovers from a rare blood disease is 0.4. If 15 people are known to have contracted this disease, what is the probability that:
(a) at least 10 survive
(b) from 3 to 8 survive
(c) exactly 5 survive?
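Solution: Let $X$ be the number of people who survive.
(a) $P(X \geq 10) = 1 - P(X < 10) = 1 - \sum_{x=0}^{9} b(x; 15, 0.4) = 1 - 0.9662 = 0.0338$
(b) $P(3 \leq X \leq 8) = \sum_{x=0}^{8} b(x; 15, 0.4) - \sum_{x=0}^{2} b(x; 15, 0.4) = 0.9050 - 0.0271 = 0.8779$
(c) $P(X = 5) = b(5; 15, 0.4) = \sum_{x=0}^{5} b(x; 15, 0.4) - \sum_{x=0}^{4} b(x; 15, 0.4) = 0.4032 - 0.2173 = 0.1859$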
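The three parts of this example can be reproduced without a table of binomial sums. The sketch below computes the sums directly; `binomial_pmf` and `binomial_cdf` are illustrative helper names.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomial_cdf(r, n, p):
    """P(X <= r): the binomial sum from x = 0 to r."""
    return sum(binomial_pmf(x, n, p) for x in range(r + 1))

n, p = 15, 0.4
at_least_10 = 1 - binomial_cdf(9, n, p)                      # part (a)
from_3_to_8 = binomial_cdf(8, n, p) - binomial_cdf(2, n, p)  # part (b)
exactly_5 = binomial_pmf(5, n, p)                            # part (c)
print(at_least_10, from_3_to_8, exactly_5)
```

Answers obtained by subtracting four-decimal table entries can differ from the directly computed value in the last digit, because each table entry is itself rounded.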
Example 2
A retailer purchases electronic devices with a 3% defective rate.
(a) If the inspector picks 20 items, what is the probability of at least one defective item?
(b) If the retailer receives 10 shipments and tests 20 devices per shipment, what is the probability of exactly 3 shipments containing at least one defective device?
Solution:
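(a) Let $X$ be the number of defective devices among the 20, so $X$ follows the binomial distribution $b(x; 20, 0.03)$. Therefore, $P(X \geq 1) = 1 - P(X = 0) = 1 - b(0; 20, 0.03) = 1 - (0.97)^{20} = 0.4562$.
(b) Let $Y$ be the number of shipments containing at least one defective item, so $Y$ follows the binomial distribution $b(y; 10, 0.4562)$. Therefore, $P(Y = 3) = \binom{10}{3} (0.4562)^3 (1 - 0.4562)^7 = 0.1602$.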
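This example chains two binomial models: the success probability in part (b) is the answer to part (a). A short sketch of that two-stage calculation, with illustrative variable names:

```python
from math import comb

p_defective = 0.03

# (a) P(at least one defective among 20) = 1 - P(no defectives among 20)
p_shipment_flagged = 1 - (1 - p_defective) ** 20

# (b) Each of the 10 shipments is a Bernoulli trial whose "success"
#     is containing at least one defective device.
p_three_flagged = (comb(10, 3)
                   * p_shipment_flagged**3
                   * (1 - p_shipment_flagged)**7)

print(round(p_shipment_flagged, 4), round(p_three_flagged, 4))
```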
Areas of Application
Theorem 5.1
The mean and variance of the binomial distribution $b(x; n, p)$ are $\mu = np$ and $\sigma^2 = npq$.
Example 1
A rural community conjectures that 30% of drinking wells have an impurity. 10 wells are randomly selected for testing.
(a) What is the probability that exactly 3 wells have the impurity, assuming the conjecture is correct?
(b) What is the probability that more than 3 wells are impure?
Solution:
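(a) $b(3; 10, 0.3) = \sum_{x=0}^{3} b(x; 10, 0.3) - \sum_{x=0}^{2} b(x; 10, 0.3) = 0.6496 - 0.3828 = 0.2668$
(b) $P(X > 3) = 1 - P(X \leq 3) = 1 - 0.6496 = 0.3504$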
Example 2
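Using the earlier blood-disease example, where $n = 15$ and $p = 0.4$, find the mean and variance, and interpret the interval $\mu \pm 2\sigma$ using Chebyshev's theorem.
Solution: $\mu = np = (15)(0.4) = 6$ and $\sigma^2 = npq = (15)(0.4)(0.6) = 3.6$, so $\sigma = \sqrt{3.6} \approx 1.897$.
The interval is $6 \pm (2)(1.897)$, or from 2.206 to 9.794.
By Chebyshev's theorem, the number of recoveries among 15 patients has a probability of at least $3/4$ of falling between 2.206 and 9.794 or, because the count is an integer, between 2 and 10 inclusive.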
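Chebyshev's theorem gives only a lower bound. The sketch below computes the mean, variance, and two-standard-deviation interval for $n = 15$, $p = 0.4$, and then compares the bound $1 - 1/k^2$ (with $k = 2$) against the exact binomial probability of landing inside the interval. The comparison is an added illustration, not part of the original solution.

```python
from math import comb, sqrt

n, p = 15, 0.4
mean = n * p                # mu = np
variance = n * p * (1 - p)  # sigma^2 = npq
sigma = sqrt(variance)
low, high = mean - 2 * sigma, mean + 2 * sigma

# Chebyshev's lower bound for k = 2 standard deviations.
chebyshev_bound = 1 - 1 / 2**2

# Exact binomial probability that X falls strictly inside (low, high).
exact = sum(comb(n, x) * p**x * (1 - p)**(n - x)
            for x in range(n + 1) if low < x < high)

print(mean, variance, (low, high), chebyshev_bound, exact)
```

The exact probability is well above the bound, which is typical: Chebyshev's theorem holds for any distribution and is therefore conservative.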
Multinomial Distribution
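If a trial can result in $k$ outcomes $E_1, E_2, \ldots, E_k$ with probabilities $p_1, p_2, \ldots, p_k$, then the probability distribution of the random variables $X_1, X_2, \ldots, X_k$, representing the number of occurrences for $E_1, E_2, \ldots, E_k$ in $n$ independent trials, is:
$$f(x_1, x_2, \ldots, x_k; p_1, p_2, \ldots, p_k, n) = \binom{n}{x_1, x_2, \ldots, x_k} p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k},$$
where $\binom{n}{x_1, x_2, \ldots, x_k} = \frac{n!}{x_1! \, x_2! \cdots x_k!}$, $\sum_{i=1}^{k} x_i = n$, and $\sum_{i=1}^{k} p_i = 1$.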
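A minimal sketch of the multinomial probability, $\frac{n!}{x_1! \cdots x_k!} \, p_1^{x_1} \cdots p_k^{x_k}$, follows. The function name `multinomial_pmf` is illustrative, and the equal probabilities in the usage line are hypothetical values chosen only to exercise the function.

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """P(X_1 = x_1, ..., X_k = x_k) in n = sum(counts) independent trials."""
    n = sum(counts)
    coefficient = factorial(n)
    for x in counts:
        coefficient //= factorial(x)  # builds n! / (x_1! x_2! ... x_k!)
    return coefficient * prod(p**x for p, x in zip(probs, counts))

# Hypothetical check: three equally likely outcomes, six trials.
equal_case = multinomial_pmf([2, 1, 3], [1/3, 1/3, 1/3])
print(equal_case)
```

With $k = 2$ the function reduces to the binomial probability, which is a convenient way to test it.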
Example
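An airport with three runways has the following probabilities for runway access by a randomly arriving jet: Runway 1: $p_1$, Runway 2: $p_2$, Runway 3: $p_3$, with $p_1 + p_2 + p_3 = 1$.
What is the probability that 6 randomly arriving airplanes are distributed as follows: 2 on Runway 1, 1 on Runway 2, and 3 on Runway 3?
Solution: With $x_1 = 2$, $x_2 = 1$, $x_3 = 3$, and $n = 6$:
$$f(2, 1, 3; p_1, p_2, p_3, 6) = \binom{6}{2, 1, 3} p_1^2 \, p_2 \, p_3^3 = 60 \, p_1^2 \, p_2 \, p_3^3.$$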
Hypergeometric Distribution
Areas of Application
Applications for the hypergeometric distribution are found in acceptance sampling, electronic testing, and quality assurance.
The hypergeometric distribution gives the probability of selecting $x$ successes from the $k$ items labeled as successes and $n - x$ failures from the $N - k$ items labeled as failures when a random sample of size $n$ is selected from $N$ items.
Hypergeometric Experiment Properties
A random sample of size $n$ is selected without replacement from $N$ items.
Of the $N$ items, $k$ are classified as successes and $N - k$ are classified as failures.
Hypergeometric Random Variable
The number $X$ of successes in a hypergeometric experiment is called a hypergeometric random variable.
Its probability distribution is denoted by $h(x; N, n, k)$.
Hypergeometric Distribution in Acceptance Sampling
Like the binomial distribution, the hypergeometric distribution is used in acceptance sampling to decide whether to accept a lot of materials or parts.
Example
A part is sold in lots of 10. A lot is acceptable if it has no more than one defective. A sampling plan tests 3 parts out of 10, and the lot is accepted if none are defective.
Comment on the utility of this plan.
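Solution: Suppose the lot has 2 defective parts. The probability of accepting the lot is:
$$h(0; 10, 3, 2) = \frac{\binom{2}{0}\binom{8}{3}}{\binom{10}{3}} = \frac{56}{120} \approx 0.467.$$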
Thus, the plan allows acceptance of an unacceptable lot (with 2 defectives) about 47% of the time, so it is considered faulty.
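The acceptance probability for this plan can be computed for any number of defectives in the lot. The sketch below assumes the hypergeometric model (sampling 3 of 10 parts without replacement); `hypergeom_pmf` is an illustrative helper name.

```python
from math import comb

def hypergeom_pmf(x, N, n, k):
    """P(x successes in a sample of n drawn without replacement
    from N items, k of which are successes)."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

# Lot of 10 with 2 defectives, sample of 3, accepted only if no defective is found.
p_accept = hypergeom_pmf(0, N=10, n=3, k=2)
print(round(p_accept, 3))

# Acceptance probability as the number of defectives in the lot varies.
for defectives in range(5):
    print(defectives, round(hypergeom_pmf(0, 10, 3, defectives), 3))
```

Tabulating the acceptance probability against the true number of defectives shows how slowly this plan starts rejecting bad lots.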
Hypergeometric Distribution
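The probability distribution of the hypergeometric random variable $X$ (the number of successes in a random sample of size $n$ selected from $N$ items, of which $k$ are successes and $N - k$ are failures) is:
$$h(x; N, n, k) = \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}},$$
with $\max\{0, n - (N - k)\} \leq x \leq \min\{n, k\}$.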
Example
Lots of 40 components are unacceptable if they contain 3 or more defectives. The sampling procedure is to select 5 components and reject the lot if a defective is found.
What is the probability of finding exactly 1 defective in the sample if there are 3 defectives in the lot?
Solution: $h(1; 40, 5, 3) = \frac{\binom{3}{1}\binom{37}{4}}{\binom{40}{5}} = 0.3011$
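The sample turns up exactly 1 defective only about 30% of the time, and at least 1 defective (the rejection criterion) only about 34% of the time, since $1 - \binom{37}{5}/\binom{40}{5} \approx 0.338$. A plan that so rarely detects a bad lot (3 defectives) is not desirable.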
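The example asks for the probability of exactly one defective, while the plan rejects a lot whenever any defective is found. The sketch below computes both quantities for $N = 40$, $n = 5$, $k = 3$; the second quantity is an added illustration.

```python
from math import comb

N, n, k = 40, 5, 3  # lot size, sample size, defectives in the lot

# P(exactly one defective in the sample)
exactly_one = comb(k, 1) * comb(N - k, n - 1) / comb(N, n)

# P(at least one defective) = 1 - P(all five sampled components are good)
at_least_one = 1 - comb(N - k, n) / comb(N, n)

print(round(exactly_one, 4), round(at_least_one, 4))
```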
Theorem 5.2
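The mean and variance of the hypergeometric distribution $h(x; N, n, k)$ are:
$$\mu = \frac{nk}{N} \quad \text{and} \quad \sigma^2 = \frac{N - n}{N - 1} \cdot n \cdot \frac{k}{N}\left(1 - \frac{k}{N}\right).$$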
Example
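Find the mean and variance for the previous example ($N = 40$, $n = 5$, $k = 3$), and interpret the interval $\mu \pm 2\sigma$ using Chebyshev's theorem.
Solution: $\mu = \frac{(5)(3)}{40} = \frac{3}{8} = 0.375$ and $\sigma^2 = \left(\frac{40 - 5}{39}\right)(5)\left(\frac{3}{40}\right)\left(1 - \frac{3}{40}\right) = 0.3113$, so $\sigma = \sqrt{0.3113} = 0.558$.
The interval is $0.375 \pm (2)(0.558)$, or from -0.741 to 1.491.
By Chebyshev's theorem, the number of defectives obtained has a probability of at least $3/4$ of falling between -0.741 and 1.491. Therefore, at least three-fourths of the time, the 5 components include fewer than 2 defectives.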
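The closed-form mean $nk/N$ and variance $\frac{N-n}{N-1} \cdot n \cdot \frac{k}{N}\left(1 - \frac{k}{N}\right)$ can be checked against the distribution itself. This sketch recomputes the mean from the probabilities and also evaluates the exact probability of fewer than 2 defectives, as an added comparison with the Chebyshev bound.

```python
from math import comb, sqrt

N, n, k = 40, 5, 3

# Closed-form mean and variance of the hypergeometric distribution.
mean = n * k / N
variance = (N - n) / (N - 1) * n * (k / N) * (1 - k / N)
sigma = sqrt(variance)

# Full distribution of the number of defectives in the sample.
pmf = {x: comb(k, x) * comb(N - k, n - x) / comb(N, n)
       for x in range(min(n, k) + 1)}
mean_from_pmf = sum(x * f for x, f in pmf.items())

# Exact probability of falling inside mu +/- 2 sigma (that is, x = 0 or 1).
within = sum(f for x, f in pmf.items()
             if mean - 2 * sigma < x < mean + 2 * sigma)

print(mean, round(variance, 4), round(sigma, 3), round(within, 4))
```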
Multivariate Hypergeometric Distribution
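If $N$ items can be partitioned into the $k$ cells $A_1, A_2, \ldots, A_k$ with $a_1, a_2, \ldots, a_k$ elements, respectively, then the probability distribution of the random variables $X_1, X_2, \ldots, X_k$, representing the number of elements selected from $A_1, A_2, \ldots, A_k$ in a random sample of size $n$, is:
$$f(x_1, x_2, \ldots, x_k; a_1, a_2, \ldots, a_k, N, n) = \frac{\binom{a_1}{x_1}\binom{a_2}{x_2}\cdots\binom{a_k}{x_k}}{\binom{N}{n}},$$
where $\sum_{i=1}^{k} x_i = n$ and $\sum_{i=1}^{k} a_i = N$.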
Example
A group of 10 individuals is used for a biological case study: 3 with blood type O, 4 with blood type A, and 3 with blood type B.
What is the probability that a random sample of 5 will contain 1 person with blood type O, 2 people with blood type A, and 2 people with blood type B?
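Solution: With $x_1 = 1$, $x_2 = 2$, $x_3 = 2$, $a_1 = 3$, $a_2 = 4$, $a_3 = 3$, $N = 10$, and $n = 5$:
$$f(1, 2, 2; 3, 4, 3, 10, 5) = \frac{\binom{3}{1}\binom{4}{2}\binom{3}{2}}{\binom{10}{5}} = \frac{54}{252} = \frac{3}{14} \approx 0.2143.$$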
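The blood-type calculation can be sketched as a product of binomial coefficients, one per cell, divided by the number of possible samples. The function name `multivariate_hypergeom_pmf` is illustrative.

```python
from math import comb, prod

def multivariate_hypergeom_pmf(selected, cell_sizes):
    """Probability of drawing selected[i] items from a cell of size
    cell_sizes[i], for every cell, in one sample without replacement."""
    N, n = sum(cell_sizes), sum(selected)
    ways = prod(comb(a, x) for a, x in zip(cell_sizes, selected))
    return ways / comb(N, n)

# 10 people: 3 type O, 4 type A, 3 type B; sample 1 O, 2 A, 2 B.
p_sample = multivariate_hypergeom_pmf([1, 2, 2], [3, 4, 3])
print(round(p_sample, 4))
```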
Negative Binomial and Geometric Distributions
Instead of the probability of $x$ successes in $n$ trials, where $n$ is fixed, we are now interested in the probability that the $k$th success occurs on the $x$th trial.
Experiments of this kind are called negative binomial experiments.
What is the Negative Binomial Random Variable?
The number $X$ of trials required to produce $k$ successes in a negative binomial experiment is called a negative binomial random variable.
Its probability distribution is called the negative binomial distribution and is denoted by $b^*(x; k, p)$.
Negative Binomial Distribution
If repeated independent trials can result in a success with probability $p$ and a failure with probability $q = 1 - p$, then the probability distribution of the random variable $X$, the number of the trial on which the $k$th success occurs, is:
$$b^*(x; k, p) = \binom{x-1}{k-1} p^k q^{x-k}, \quad x = k, k+1, k+2, \ldots$$
Example
In an NBA championship series (best of 7), teams A and B face each other. Team A has a probability of 0.55 of winning a game.
(a) What is the probability that team A will win the series in 6 games?
(b) What is the probability that team A will win the series?
(c) If teams A and B were facing each other in a regional playoff series that is decided by winning three out of five games, what is the probability that team A would win the series?
Solution:
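(a) $b^*(6; 4, 0.55) = \binom{5}{3} (0.55)^4 (1 - 0.55)^{6-4} = 0.1853$
(b) $P(\text{team A wins the series}) = b^*(4; 4, 0.55) + b^*(5; 4, 0.55) + b^*(6; 4, 0.55) + b^*(7; 4, 0.55) = 0.0915 + 0.1647 + 0.1853 + 0.1668 = 0.6083$
(c) $P(\text{team A wins the playoff}) = b^*(3; 3, 0.55) + b^*(4; 3, 0.55) + b^*(5; 3, 0.55) = 0.1664 + 0.2246 + 0.2021 = 0.5931$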
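Winning a series means recording the final required win on some game, so each part is a sum of negative binomial probabilities $\binom{x-1}{k-1} p^k (1-p)^{x-k}$. A minimal sketch, assuming independent games with a constant win probability of 0.55; `neg_binomial_pmf` is an illustrative name.

```python
from math import comb

def neg_binomial_pmf(x, k, p):
    """P(the k-th success occurs on trial x)."""
    return comb(x - 1, k - 1) * p**k * (1 - p)**(x - k)

p = 0.55
win_in_6 = neg_binomial_pmf(6, 4, p)                                # part (a)
win_best_of_7 = sum(neg_binomial_pmf(x, 4, p) for x in range(4, 8))  # part (b)
win_best_of_5 = sum(neg_binomial_pmf(x, 3, p) for x in range(3, 6))  # part (c)

print(round(win_in_6, 4), round(win_best_of_7, 4), round(win_best_of_5, 4))
```

The longer series favors the stronger team slightly more, which is why the best-of-7 probability exceeds the best-of-5 probability.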
Geometric Distribution
If repeated independent trials can result in a success with probability $p$ and a failure with probability $q = 1 - p$, then the probability distribution of the random variable $X$, the number of the trial on which the first success occurs, is:
$$g(x; p) = p q^{x-1}, \quad x = 1, 2, 3, \ldots$$
Example 1
For a manufacturing process, 1 in every 100 items is defective. What is the probability that the fifth item inspected is the first defective item found?
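Solution: Using the geometric distribution with $x = 5$ and $p = 0.01$: $g(5; 0.01) = (0.01)(0.99)^4 = 0.0096$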
Example 2
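At a busy time, the probability of a telephone connection on any one attempt is $p$. What is the probability that 5 attempts are necessary for a successful call?
Solution: Using the geometric distribution with $x = 5$: $P(X = 5) = g(5; p) = p(1 - p)^4$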
Theorem 5.3
The mean and variance of a random variable following the geometric distribution are:
$$\mu = \frac{1}{p} \quad \text{and} \quad \sigma^2 = \frac{1 - p}{p^2}.$$
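The geometric probability $p(1-p)^{x-1}$ and the closed forms $1/p$ and $(1-p)/p^2$ for the mean and variance can be checked numerically. The sketch below uses the defective-item example ($p = 0.01$) and approximates the infinite sums by truncating at 5000 terms, an assumption that is safe here because the remaining tail is negligible.

```python
def geometric_pmf(x, p):
    """P(the first success occurs on trial x)."""
    return p * (1 - p) ** (x - 1)

p = 0.01
fifth_is_first = geometric_pmf(5, p)  # fifth item is the first defective

# Truncated sums approximating the mean and variance.
trials = range(1, 5001)
mean = sum(x * geometric_pmf(x, p) for x in trials)
variance = sum((x - mean) ** 2 * geometric_pmf(x, p) for x in trials)

print(round(fifth_is_first, 4), mean, variance)
print(1 / p, (1 - p) / p**2)  # closed forms for comparison
```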
Poisson Distribution and the Poisson Process
Experiments yielding numerical values of a random variable $X$, the number of outcomes occurring during a given time interval or in a specified region, are called Poisson experiments.
A Poisson experiment is derived from the Poisson process and possesses the following properties.
Properties of the Poisson Process
The number of outcomes occurring in one time interval or specified region of space is independent of the number that occurs in any other disjoint time interval or region (no memory).
The probability that a single outcome will occur during a very short time interval or in a small region is proportional to the length of the time interval or the size of the region.
The probability that more than one outcome will occur in a short time interval or small region is negligible.
Poisson Random Variable
The number of outcomes occurring during a Poisson experiment is called a Poisson random variable, and its probability distribution is called the Poisson distribution.
The rate of occurrence of outcomes is denoted by $\lambda$, so the mean number of outcomes in a time interval or region of size $t$ is $\lambda t$.
Poisson Distribution
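The probability distribution of the Poisson random variable $X$, representing the number of outcomes occurring in a given time interval or specified region denoted by $t$, is:
$$p(x; \lambda t) = \frac{e^{-\lambda t} (\lambda t)^x}{x!}, \quad x = 0, 1, 2, \ldots,$$
where $\lambda$ is the average number of outcomes per unit time, distance, area, or volume, and $e = 2.71828\ldots$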
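A minimal sketch of the Poisson probability $e^{-\lambda t}(\lambda t)^x / x!$ follows. The function name `poisson_pmf`, its parameter names, and the rate of 4 outcomes per hour observed for half an hour are all hypothetical, chosen only for illustration.

```python
from math import exp, factorial

def poisson_pmf(x: int, rate: float, t: float = 1.0) -> float:
    """P(X = x) for a Poisson process with the given rate over an interval of size t."""
    mu = rate * t  # mean number of outcomes in the interval
    return exp(-mu) * mu**x / factorial(x)

# Hypothetical numbers: 4 outcomes per hour on average, observed for half an hour.
p_none = poisson_pmf(0, rate=4, t=0.5)
p_at_most_2 = sum(poisson_pmf(x, rate=4, t=0.5) for x in range(3))
print(p_none, p_at_most_2)
```

Only the product of the rate and the interval size matters: doubling the rate while halving the interval leaves every probability unchanged.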
Approximation of Binomial Distribution by a Poisson Distribution
If $n$ is large and $p$ is close to 0, the Poisson distribution can be used, with $\mu = np$, to approximate binomial probabilities.
If $p$ is close to 1, interchange success and failure to change $p$ to a value close to 0.
Theorem 5.4
Both the mean and the variance of the Poisson distribution $p(x; \lambda t)$ are $\lambda t$.
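The equality of the Poisson mean and variance can be checked numerically. The sketch below uses a hypothetical mean of 2 and truncates the infinite sums at $x = 60$, where the remaining terms are negligible for this value.

```python
from math import exp, factorial

mu = 2.0  # hypothetical value of lambda * t, chosen only for illustration

# Poisson probabilities for x = 0, ..., 60.
pmf = [exp(-mu) * mu**x / factorial(x) for x in range(61)]

mean = sum(x * f for x, f in enumerate(pmf))
variance = sum((x - mean) ** 2 * f for x, f in enumerate(pmf))
print(mean, variance)
```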
Nature of the Poisson Probability Function
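Plots of the Poisson probability function for three different values of the mean $\mu$ show how the shape of the distribution changes: it is strongly skewed for small $\mu$ and becomes more symmetric as $\mu$ grows.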
Approximation of Binomial Distribution
If $n$ is quite large and $p$ is small, the conditions simulate the continuous space or time implications of the Poisson process.
The independence among Bernoulli trials is consistent with the Poisson process.
Theorem 5.5
Let $X$ be a binomial random variable with probability distribution $b(x; n, p)$. When $n \to \infty$, $p \to 0$, and $np$ remains constant at $\mu$,
$$b(x; n, p) \to p(x; \mu).$$
Example 1
In an industrial facility, the probability of an accident on any given day is 0.005, and accidents are independent.
(a) What is the probability that in any given period of 400 days there will be an accident on exactly one day?
(b) What is the probability that there are at most three days with an accident?
Solution:
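Let $X$ be a binomial random variable with $n = 400$ and $p = 0.005$. Thus, $np = 2$. Using the Poisson approximation:
(a) $P(X = 1) = e^{-2} 2^1 = 0.271$
(b) $P(X \leq 3) = \sum_{x=0}^{3} \frac{e^{-2} 2^x}{x!} = 0.857$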
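To see how good the Poisson approximation is for $n = 400$ and $p = 0.005$, the sketch below computes both the exact binomial probabilities and their Poisson approximations with mean $np = 2$. The side-by-side comparison is an added illustration.

```python
from math import comb, exp, factorial

n, p = 400, 0.005
mu = n * p  # mean used for the Poisson approximation

def binomial_pmf(x):
    """Exact binomial probability of x accident days in n days."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x):
    """Poisson approximation with mean mu = np."""
    return exp(-mu) * mu**x / factorial(x)

exact_one, approx_one = binomial_pmf(1), poisson_pmf(1)           # part (a)
exact_at_most_3 = sum(binomial_pmf(x) for x in range(4))          # part (b)
approx_at_most_3 = sum(poisson_pmf(x) for x in range(4))

print(exact_one, approx_one)
print(exact_at_most_3, approx_at_most_3)
```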
Example 2
In a manufacturing process where glass products are made, defects or bubbles occur, and on average 1 in every 1000 items produced has one or more bubbles.
What is the probability that a random sample of 8000 will yield fewer than 7 items possessing bubbles?
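Solution: This is a binomial experiment with $n = 8000$ and $p = 0.001$. Since $p$ is very close to 0 and $n$ is quite large, approximate using the Poisson distribution with $\mu = (8000)(0.001) = 8$.
$P(X < 7) = \sum_{x=0}^{6} b(x; 8000, 0.001) \approx \sum_{x=0}^{6} p(x; 8) = 0.3134$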