Chapter 3: Discrete Random Variables and Probability Distributions

Chapter Outline

3.1 Probability Distributions and Probability Mass Functions
3.2 Cumulative Distribution Functions
3.3 Mean and Variance of a Discrete Random Variable
3.4 Discrete Uniform Distribution
3.5 Binomial Distribution
3.6 Geometric and Negative Binomial Distributions
3.7 Hypergeometric Distribution
3.8 Poisson Distribution

Learning Objectives

Determine Probabilities: Calculate probabilities from probability mass functions and vice versa.
Cumulative Distribution Functions: Determine probabilities and probability mass functions from cumulative distribution functions and vice versa.
Calculate Means and Variances: Compute means and variances for discrete random variables.
Understand Assumptions: Grasp the assumptions underlying discrete probability distributions.
Select Appropriate Distributions: Choose the correct discrete probability distribution for probability calculations.
Calculate Probabilities and Determine Means and Variances: For common discrete probability distributions.

Probability Distributions and Mass Functions

A random variable is a function assigning a real number to each outcome in a random experiment's sample space.
The probability distribution of a random variable $X$ describes the probabilities associated with its possible values.
For a discrete random variable, the probability distribution is specified either by a list of the possible values of $X$ along with their probabilities, or by a function or formula.

Example 3.1: Flash Recharge Time

Testing recharge time in three cellphone cameras.
Probability of a camera passing the test is 0.8, and cameras perform independently.
Table 3.1 shows the sample space and probabilities.
- Example: Probability of first two cameras passing and third failing (ppf) is $P(ppf) = (0.8)(0.8)(0.2) = 0.128$ .
The random variable $X$ denotes the number of cameras passing the test.
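As a rough illustration, the distribution of $X$ can be recovered by enumerating the eight outcomes in the sample space and multiplying the per-camera probabilities, which the independence assumption permits. The variable names below are illustrative and do not come from the text.

```python
from itertools import product

p_pass = 0.8  # probability that a single camera passes (from the example)

# pmf[x] accumulates P(X = x), where X counts the passing cameras among three.
pmf = {x: 0.0 for x in range(4)}
outcome_prob = {}
for outcome in product("pf", repeat=3):
    prob = 1.0
    for result in outcome:
        prob *= p_pass if result == "p" else 1 - p_pass
    outcome_prob["".join(outcome)] = prob
    pmf[outcome.count("p")] += prob

print(round(outcome_prob["ppf"], 3))  # the outcome worked out in the example
for x, prob in pmf.items():
    print(x, round(prob, 3))
```

The eight outcome probabilities sum to 1, as any probability distribution must.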

Example 3.3: Digital Channel

A bit transmitted through a digital channel may be received in error.
Let $X$ equal the number of bits received in error in the next 4 bits transmitted.
The probability distribution of $X$ is given by possible values and their probabilities.
- $P(X = 0) = 0.6561$
- $P(X = 1) = 0.2916$
- $P(X = 2) = 0.0486$
- $P(X = 3) = 0.0036$
- $P(X = 4) = 0.0001$

Probability Mass Function

For a discrete random variable $X$ with possible values $x_1, x_2, …, x_n$ , a probability mass function satisfies:

$f(x_i) ≥ 0$
$\sum_{i=1}^{n} f(x_i) = 1$
$f(x_i) = P(X = x_i)$

Example 3.4: Wafer Contamination

Let $X$ be the number of wafers analyzed to detect a large particle of contamination.
Probability that a wafer contains a large particle is 0.01, and wafers are independent.
Sample space: $S = \{p, ap, aap, aaap, …\}$ , where $p$ denotes a wafer with a large particle and $a$ denotes a wafer without.
Range of values of $X$ is $x = 1, 2, 3, 4, …$
- $P(X = 1) = 0.01$
- $P(X = 2) = (0.99) * 0.01 = 0.0099$
- $P(X = 3) = (0.99)^2 * 0.01 = 0.009801$
- $P(X = 4) = (0.99)^3 * 0.01 = 0.009703$
General formula: $P(X = x) = P(aa…ap) = (0.99)^{x-1}(0.01)$ , for $x = 1, 2, 3, …$

Cumulative Distribution Functions

The cumulative distribution function is the probability that a random variable $X$ , with a given probability distribution, will be found at a value less than or equal to $x$ .
Symbolically, $F(x) = P(X ≤ x) = \sum_{x_i ≤ x} f(x_i)$

Example 3.5

Consider the probability distribution for the digital channel example. Find the probability of three or fewer bits in error.

The event $(X ≤ 3)$ is the union of the mutually exclusive events $(X = 0)$ , $(X = 1)$ , $(X = 2)$ , and $(X = 3)$ , so its probability is the sum of their probabilities.
For the probability calculation, see the shaded row in the table.
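Since the table is not reproduced here, the sum can be sketched in a few lines using the probabilities listed in Example 3.3. The list layout is an assumption made for illustration.

```python
from itertools import accumulate

# f(0), f(1), ..., f(4) for the digital channel example.
pmf = [0.6561, 0.2916, 0.0486, 0.0036, 0.0001]

# Running totals give F(0), F(1), ..., F(4).
cdf = list(accumulate(pmf))

p_three_or_fewer = cdf[3]  # P(X <= 3)
print(round(p_three_or_fewer, 4))
```

The same value follows from the complement, $1 - P(X = 4)$ .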

Example 3.6

Determine the probability mass function of $X$ from the following cumulative distribution function:

$F(x) = \begin{cases} 0 & x < -2 \\ 0.2 & -2 ≤ x < 0 \\ 0.7 & 0 ≤ x < 2 \\ 1 & 2 ≤ x \end{cases}$

$f(-2) = 0.2 - 0 = 0.2$

$f(0) = 0.7 - 0.2 = 0.5$

$f(2) = 1.0 - 0.7 = 0.3$

Note: Even if the random variable $X$ can assume only integer values, the cumulative distribution function is defined at non-integer values.
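The jump-size reasoning in Example 3.6 can be sketched as follows. The representation of the cumulative distribution function as a list of (jump point, value) pairs is an assumption chosen for illustration.

```python
# Each pair is (x, F(x)) at a point where F jumps, taken from Example 3.6.
cdf_steps = [(-2, 0.2), (0, 0.7), (2, 1.0)]

pmf = {}
previous = 0.0  # F(x) = 0 to the left of the first jump
for x, F in cdf_steps:
    # The probability mass at x is the size of the jump in F at x.
    pmf[x] = round(F - previous, 10)
    previous = F

print(pmf)
```

The rounding only removes floating-point noise from the subtractions.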

Cumulative Distribution Function and Properties

For a discrete random variable $X$ , $F(x)$ satisfies the following properties:

$F(x) = P(X ≤ x) = \sum_{x_i ≤ x} f(x_i)$
$0 ≤ F(x) ≤ 1$
If $x ≤ y$ , then $F(x) ≤ F(y)$

Mean and Variance of a Discrete Random Variable

Used to summarize a probability distribution
Mean: measure of center or middle of the probability distribution
For a discrete random variable, a weighted average of possible values with weights equal to probabilities
Variance: measure of the dispersion, or variability in the distribution
For a discrete random variable, a weighted measure of each possible squared deviation with weights equal to probabilities
Mean or expected value: $\mu = E[X] = \sum_{x} xf(x)$
Variance: $\sigma^2 = V(X) = E[(X - \mu)^2] = \sum_{x} (x - \mu)^2 f(x) = \sum_{x} x^2 f(x) - \mu^2$
Standard deviation: $\sigma = \sqrt{\sigma^2}$

Example 3.7: Digital Channel

$X$ is the number of bits received in error of the next 4 transmitted. Use the table to calculate the mean & variance.

Mean: $\mu = E[X] = 0f(0) + 1f(1) + 2f(2) + 3f(3) + 4f(4) = 0(0.6561) + 1(0.2916) + 2(0.0486) + 3(0.0036) + 4(0.0001) = 0.4$

Variance: $\sigma^2 = V(X) = \sum_{i=1}^{5} f(x_i)(x_i - 0.4)^2 = 0.36$
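A minimal sketch of the same calculation, assuming the probability mass function is stored as a dictionary from values to probabilities. It evaluates the variance with both the definition and the shortcut formula so the two can be compared.

```python
# Probability mass function of the digital channel example.
pmf = {0: 0.6561, 1: 0.2916, 2: 0.0486, 3: 0.0036, 4: 0.0001}

# Mean: probability-weighted average of the possible values.
mean = sum(x * f for x, f in pmf.items())

# Variance from the definition E[(X - mu)^2].
var_definition = sum((x - mean) ** 2 * f for x, f in pmf.items())

# Variance from the shortcut E[X^2] - mu^2.
var_shortcut = sum(x**2 * f for x, f in pmf.items()) - mean**2

print(round(mean, 4), round(var_definition, 4), round(var_shortcut, 4))
```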

Expected Value of a Function of a Discrete Random Variable

If $X$ is a discrete random variable with probability mass function $f(x)$ , then the expected value of a function $h(X)$ is

$E[h(X)] = \sum_{x} h(x)f(x)$

The variance can be considered as the expected value of a specific function of $X$ , namely $h(X) = (X - \mu)^2$ .

Example 3.9: Digital Channel

What is the expected value of the square of the number of bits in error? $h(X) = X^2$
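The worked answer is not shown here. One way to evaluate it is a small helper that applies the formula for $E[h(X)]$ to the digital channel probabilities. The helper name `expected_value` is hypothetical.

```python
def expected_value(h, pmf):
    """Return E[h(X)] as the sum of h(x) * f(x) over the possible values."""
    return sum(h(x) * f for x, f in pmf.items())

# Probability mass function of the digital channel example.
pmf = {0: 0.6561, 1: 0.2916, 2: 0.0486, 3: 0.0036, 4: 0.0001}

e_x_squared = expected_value(lambda x: x**2, pmf)
print(round(e_x_squared, 4))
```

Note that $E[X^2]$ differs from $(E[X])^2 = 0.16$ ; the gap between the two is the variance.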

Discrete Uniform Distribution

Let $X$ be a discrete random variable whose possible values are the consecutive integers $a, a+1, a+2, …, b$ , for $a ≤ b$ , each equally likely.
There are $b - (a - 1) = b - a + 1$ values in the inclusive interval.
Therefore, $f(x) = \frac{1}{b-a+1}$ , for $x = a, a+1, …, b$

Mean and Variance of Discrete Uniform Distribution

Mean: $\mu = \frac{a+b}{2}$
Variance: $\sigma^2 = \frac{(b-a+1)^2 - 1}{12}$

Example 3.11: Number of Voice Lines

Let the random variable $X$ denote the number of the 48 voice lines that are in use at a particular time. Assume that $X$ is a discrete uniform random variable with a range of 0 to 48. Find $E(X)$ & $\sigma$ .

Practical Interpretation: The average number of lines in use is 24, but the dispersion (as measured by $\sigma$ ) is large. Therefore, at many times far more or fewer than 24 lines are used.
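The numbers behind this interpretation can be sketched as below, using the closed-form mean and variance of the discrete uniform distribution and cross-checking them against the definitions. Variable names are illustrative.

```python
import math

a, b = 0, 48  # range of the discrete uniform variable in Example 3.11

# Closed-form results for the discrete uniform distribution.
mean = (a + b) / 2
variance = ((b - a + 1) ** 2 - 1) / 12
sigma = math.sqrt(variance)

# Cross-check against the definitions, using f(x) = 1 / (b - a + 1).
f = 1 / (b - a + 1)
mean_direct = sum(x * f for x in range(a, b + 1))
variance_direct = sum((x - mean_direct) ** 2 * f for x in range(a, b + 1))

print(mean, round(sigma, 2))  # 24.0 14.14
```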

Binomial Distribution

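In a random experiment of $n$ independent Bernoulli trials, each with the same probability of success $p$ , the random variable $X$ that equals the number of trials that result in a success is a binomial random variable with parameters $0 < p < 1$ and $n = 1, 2, …$ . The probability mass function is: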

$f(x) = \binom{n}{x} p^x (1-p)^{n-x}$ , for $x = 0, 1, …, n$

For constants $a$ and $b$ , the binomial expansion is:

$(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k}$

Example 3.14: Binomial Coefficient

Exercises in binomial coefficient calculation:

$\binom{10}{3} = \frac{10!}{3!7!} = \frac{10 \cdot 9 \cdot 8 \cdot 7!}{3 \cdot 2 \cdot 1 \cdot 7!} = 120$

$\binom{15}{10} = \frac{15!}{10!5!} = \frac{15 \cdot 14 \cdot 13 \cdot 12 \cdot 11 \cdot 10!}{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1 \cdot 10!} = 3003$

$\binom{100}{4} = \frac{100!}{96!4!} = \frac{100 \cdot 99 \cdot 98 \cdot 97 \cdot 96!}{4 \cdot 3 \cdot 2 \cdot 1 \cdot 96!} = 3921225$

Recall: $0! = 1$

Example 3.15a: Organic Pollution

Each sample of water has a 10% chance of containing a particular organic pollutant. Assume that the samples are independent with regard to the presence of the pollutant. Find the probability that, in the next 18 samples, exactly 2 contain the pollutant.

Answer: Let $X$ denote the number of samples that contain the pollutant in the next 18 samples analyzed. Then $X$ is a binomial random variable with $p = 0.1$ and $n = 18$ .

$P(X = 2) = \binom{18}{2} (0.1)^2 (1 - 0.1)^{18-2} = 153 (0.1)^2 (0.9)^{16} = 0.2835$

Example 3.15b: Organic Pollution

Determine the probability that at least 4 samples contain the pollutant.

Answer: The problem calls for calculating $P(X ≥ 4)$ , but it is easier to work with the complementary event $(X ≤ 3)$ , so that:

$P(X ≥ 4) = 1 - \sum_{x=0}^{3} \binom{18}{x} (0.1)^x (0.9)^{18-x} = 1 - (0.150 + 0.300 + 0.284 + 0.168) = 0.098$

Example 3.15c: Organic Pollution

Determine the probability that $3 ≤ X < 7$ .

Answer:

$P(3 ≤ X < 7) = \sum_{x=3}^{6} \binom{18}{x} (0.1)^x (0.9)^{18-x} = 0.168 + 0.070 + 0.022 + 0.005 = 0.265$
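The three parts of Example 3.15 can be checked with a short sketch built on the binomial probability mass function. The function name `binomial_pmf` is an assumption, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 18, 0.1  # 18 water samples, 10% chance of pollutant in each

p_exactly_2 = binomial_pmf(2, n, p)                              # part a
p_at_least_4 = 1 - sum(binomial_pmf(x, n, p) for x in range(4))  # part b
p_3_to_6 = sum(binomial_pmf(x, n, p) for x in range(3, 7))       # part c

print(round(p_exactly_2, 4), round(p_at_least_4, 3), round(p_3_to_6, 3))
```

Using the complement in part b means summing 4 terms rather than 15.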

Binomial Mean and Variance

If $X$ is a binomial random variable with parameters $p$ and $n$ ,

The mean of $X$ is: $\mu = E(X) = np$
The variance of $X$ is: $\sigma^2 = V(X) = np(1-p)$
These quantities are derived by summing Bernoulli random variables and using the definitions of the mean and variance of discrete random variables.

Example 3.16: Binomial Mean and Variance

For the number of transmitted bits received in error in Example 3.13, $n = 4$ and $p = 0.1$ . Find the mean and variance of the binomial random variable.

Answer:

$\mu = E(X) = np = 4 \cdot 0.1 = 0.4$
$\sigma^2 = V(X) = np(1-p) = 4 \cdot 0.1 \cdot 0.9 = 0.36$

Geometric Distribution

Binomial distribution has
- Fixed number of trials
- Random number of successes
Geometric distribution has reversed roles
- Random number of trials
- Fixed number of successes, in this case 1
$f(x) = (1-p)^{x-1}p$ , for $x = 1, 2, 3, …$

Example 3.18: Wafer Contamination

The probability that a wafer contains a large particle of contamination is 0.01. Assume that the wafers are independent. What is the probability that exactly 125 wafers need to be analyzed before a particle is detected?

Let $X$ denote the number of samples analyzed until a large particle is detected. Then $X$ is a geometric random variable with parameter $p = 0.01$ .

$P(X = 125) = (0.99)^{124}(0.01) = 0.00288$
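A minimal sketch for evaluating this probability numerically. The function name `geometric_pmf` is hypothetical.

```python
def geometric_pmf(x, p):
    """P(X = x): x - 1 failures followed by the first success."""
    return (1 - p) ** (x - 1) * p

p = 0.01  # probability that a wafer contains a large particle
p_125 = geometric_pmf(125, p)
print(f"{p_125:.6f}")

# The probabilities over x = 1, 2, 3, ... form a geometric series summing to 1;
# a long partial sum shows the tail shrinking.
partial_sum = sum(geometric_pmf(x, p) for x in range(1, 5001))
```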

Geometric Mean and Variance

Mean: $\mu = E(X) = \frac{1}{p}$
Variance: $\sigma^2 = V(X) = \frac{1-p}{p^2}$

Example 3.19: Mean and Standard Deviation

Consider the transmission of bits in Example 3.17. The probability that a bit transmitted through a digital transmission channel is received in error is $p = 0.1$ . Assume that the transmissions are independent events, and let the random variable $X$ denote the number of bits transmitted until the first error. Find the mean and standard deviation.

Mean: $\mu = E(X) = \frac{1}{p} = \frac{1}{0.1} = 10$

Variance: $\sigma^2 = V(X) = \frac{1-p}{p^2} = \frac{0.9}{0.01} = 90$

Standard deviation: $\sqrt{90} = 9.49$

Lack of Memory Property

For a geometric random variable, the trials are independent.
The count of the number of trials until the next success can therefore be started at any trial without changing the probability distribution of the random variable.
Implication: the system presumably will not wear out.
For all transmissions the probability of an error remains constant.
Hence, the geometric distribution is said to lack any memory.

Example 3.20: Lack of Memory Property

In Example 3.17, the probability that a bit is transmitted in error is $p = 0.1$ . Suppose 50 bits have been transmitted. What is the mean number of bits transmitted until the next error?

The mean number of bits transmitted until the next error, after 50 bits have already been transmitted, is $\frac{1}{0.1} = 10$ , the same result as the mean number of bits until the first error.
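The lack of memory property can also be checked numerically. The sketch below approximates both means with truncated sums. The truncation point `horizon` is an assumption, chosen large enough that the neglected tail is negligible.

```python
p = 0.1            # per-bit error probability from the example
already_sent = 50  # bits transmitted so far without an error
horizon = 2000     # truncation point for the infinite sums

def geometric_pmf(x, p):
    """P(X = x): x - 1 failures followed by the first success."""
    return (1 - p) ** (x - 1) * p

# Unconditional mean number of bits until the first error.
mean_first = sum(x * geometric_pmf(x, p) for x in range(1, horizon + 1))

# Mean number of additional bits until the next error,
# conditioned on no error in the first 50 bits: P(X > 50) = (1 - p)^50.
p_no_error_yet = (1 - p) ** already_sent
mean_next = sum(
    k * geometric_pmf(already_sent + k, p) / p_no_error_yet
    for k in range(1, horizon + 1)
)

print(round(mean_first, 6), round(mean_next, 6))
```

Both sums agree with $1/p = 10$ to within floating-point error.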

Negative Binomial Distribution

A generalization of a geometric distribution in which the random variable is the number of Bernoulli trials required to obtain $r$ successes

$f(x) = \binom{x-1}{r-1} p^r (1-p)^{x-r}$ , for $x = r, r+1, r+2, …$

Mean & Variance of Negative Binomial

Mean: $\mu = E(X) = \frac{r}{p}$
Variance: $\sigma^2 = V(X) = \frac{r(1-p)}{p^2}$

Example 3.22: Camera Flashes

The probability that a camera passes a particular test is 0.8, and the cameras perform independently. What is the probability that the third failure is obtained in five or fewer tests?

Let $X$ denote the number of cameras tested until three failures have been obtained. The requested probability is $P(X ≤ 5)$ . Here $X$ has a negative binomial distribution with $p = 0.2$ and $r = 3$ .
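The requested probability is the sum of the negative binomial probabilities for $x = 3, 4, 5$ . A sketch of that sum follows; the function name is hypothetical, and here a "success" in the formula is a camera failure, which is why $p = 0.2$ .

```python
from math import comb

def negative_binomial_pmf(x, r, p):
    """P(X = x): the r-th success occurs on trial x."""
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)

r, p = 3, 0.2  # third failure; each camera fails with probability 1 - 0.8

p_within_5 = sum(negative_binomial_pmf(x, r, p) for x in range(r, 6))
print(round(p_within_5, 4))
```

With $r = 1$ the same function reduces to the geometric probability mass function.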

Hypergeometric Distribution

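Samples are selected from a finite population without replacement.
A population of $N$ objects contains $K$ objects classified as successes and $N - K$ classified as failures. A sample of $n$ objects is selected at random without replacement, and $X$ denotes the number of successes in the sample.

$f(x) = \frac{\binom{K}{x} \binom{N-K}{n-x}}{\binom{N}{n}}$ , for $x = \max(0, n - (N - K)), …, \min(n, K)$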

Example 3.23: Sampling without replacement

A day’s production of 850 manufactured parts contains 50 parts that do not conform to customer requirements. Two parts are selected at random without replacement from the day’s production. Let $A$ and $B$ denote the events that the first and second parts are nonconforming, respectively. What are the probabilities that both parts conform, that exactly one part does not conform, and that both parts do not conform?

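Let $X$ denote the number of parts in the sample that do not conform. Then $X$ is a hypergeometric random variable with $N = 850$ , $K = 50$ , and $n = 2$ , and the requested probabilities are $P(X = 0)$ , $P(X = 1)$ , and $P(X = 2)$ .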
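One way to obtain the three probabilities is to multiply conditional probabilities for the events $A$ and $B$ directly. The sketch below does this with exact fractions. Variable names are illustrative.

```python
from fractions import Fraction

N, K = 850, 50  # parts in the day's production, nonconforming parts

p_a = Fraction(K, N)                   # P(A): first part nonconforming
p_b_given_a = Fraction(K - 1, N - 1)   # P(B | A)
p_b_given_not_a = Fraction(K, N - 1)   # P(B | A'), first part conforming

p_two = p_a * p_b_given_a                                       # P(X = 2)
p_one = p_a * (1 - p_b_given_a) + (1 - p_a) * p_b_given_not_a   # P(X = 1)
p_zero = (1 - p_a) * (1 - p_b_given_not_a)                      # P(X = 0)

for label, prob in (("P(X=0)", p_zero), ("P(X=1)", p_one), ("P(X=2)", p_two)):
    print(label, round(float(prob), 3))
```

The three probabilities sum to exactly 1, which the fraction arithmetic confirms without rounding.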

Example 3.24a: Parts from Suppliers

A batch of parts contains 100 parts from a local supplier of circuit boards and 200 parts from a supplier in the next state. If 4 parts are selected randomly, without replacement, what is the probability that they are all from the local supplier?

Example 3.24b: Parts from Suppliers

What is the probability that two or more parts in the sample are from the local supplier?

Example 3.24c: Parts from Suppliers

What is the probability that at least one part in the sample is from the local supplier?
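The worked answers to Example 3.24 are not shown here. The sketch below evaluates the three parts with the hypergeometric probability mass function, taking "success" to mean a part from the local supplier, so $N = 300$ , $K = 100$ , and $n = 4$ . The function name is hypothetical.

```python
from math import comb

def hypergeometric_pmf(x, N, K, n):
    """P(X = x) successes in a sample of n drawn without replacement."""
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)

N, K, n = 300, 100, 4  # batch size, local-supplier parts, sample size

p_all_local = hypergeometric_pmf(4, N, K, n)                                  # part a
p_two_or_more = sum(hypergeometric_pmf(x, N, K, n) for x in range(2, n + 1))  # part b
p_at_least_one = 1 - hypergeometric_pmf(0, N, K, n)                           # part c

print(round(p_all_local, 4), round(p_two_or_more, 3), round(p_at_least_one, 3))
```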

Hypergeometric Mean & Variance

Mean: $\mu = E(X) = n \frac{K}{N}$
Variance: $\sigma^2 = V(X) = n \frac{K}{N} (1 - \frac{K}{N}) \frac{N-n}{N-1}$

Poisson Distribution

The random variable $X$ that equals the number of events in a Poisson process over an interval of length $T$ is a Poisson random variable with parameter $\lambda > 0$ , and

$f(x) = \frac{e^{-\lambda T} (\lambda T)^x}{x!}$

for $x = 0, 1, 2, …$

Example 3.27a: Wire Flaws

Flaws occur at random along the length of a thin copper wire. Let $X$ denote the random variable that counts the number of flaws in a length of $T$ mm of wire and suppose that the average number of flaws is 2.3 per mm. Find the probability of exactly 10 flaws in 5 mm of wire.

Let $X$ denote the number of flaws in 5 mm of wire. Then $X$ has the Poisson distribution with $\lambda T = (2.3)(5) = 11.5$ flaws, so $P(X = 10) = \frac{e^{-11.5} (11.5)^{10}}{10!} = 0.113$

Example 3.27b: Wire Flaws

Find the probability of at least 1 flaw in 2 mm of wire.

Let $X$ denote the number of flaws in 2 mm of wire. Then $X$ has the Poisson distribution with $\lambda T = (2.3)(2) = 4.6$ flaws, so $P(X ≥ 1) = 1 - P(X = 0) = 1 - e^{-4.6} = 0.9899$
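Both parts of Example 3.27 can be evaluated with a short sketch of the Poisson probability mass function, where the mean number of flaws is the rate of 2.3 per mm multiplied by the length of wire. The function and variable names are assumptions.

```python
from math import exp, factorial

def poisson_pmf(x, mean):
    """P(X = x) for a Poisson random variable with the given mean (lambda * T)."""
    return exp(-mean) * mean**x / factorial(x)

rate = 2.3  # average number of flaws per mm of wire

p_10_in_5mm = poisson_pmf(10, rate * 5)            # part a: mean 11.5
p_at_least_1_in_2mm = 1 - poisson_pmf(0, rate * 2)  # part b: mean 4.6

print(round(p_10_in_5mm, 3), round(p_at_least_1_in_2mm, 4))
```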

Poisson Mean & Variance

The mean and variance of the Poisson model are the same.
For example, if particle counts follow a Poisson distribution with a mean of 25 particles per square centimeter, the variance is also 25 and the standard deviation of the counts is 5 per square centimeter.
If the variance of the data is much greater than the mean, then the Poisson distribution would not be a good model for the distribution of the random variable.