Discrete Probability Distributions
Chapter 5
In this chapter, you will learn:
Properties of a probability distribution
How to calculate expected value and variance of a probability distribution
How to calculate covariance and understand its use in finance
How to calculate probabilities for the following distributions: Binomial, Hypergeometric, and Poisson
How to use Binomial, Hypergeometric, and Poisson distributions to solve business problems
Discrete Variables: Outcomes arise from a counting process (e.g., the number of courses you choose to take).
Continuous Variables: Results arise from a measurement (e.g., your annual salary or your weight).
A discrete variable can take only a countable number of values.
Examples:
Rolling a die twice: Let X be the number of times 4 occurs (X could be 0, 1, or 2).
Flipping a coin 5 times: Let X be the number of "heads" (X = 0, 1, 2, 3, 4, or 5).
A probability distribution for a discrete variable is a mutually exclusive list of all possible numerical results for that variable along with the probability of occurrence of each outcome.
Example: Number of network outages per day
Outages per day | Probability |
---|---|
0 | 0.35 |
1 | 0.25 |
2 | 0.20 |
3 | 0.10 |
4 | 0.05 |
5 | 0.05 |
Probability distributions are often represented graphically.
The expected value (or mean) of a discrete variable is the weighted average of its values:
$$\mu = E(X) = \sum_{i} x_i P(X = x_i)$$
Daily Outages (xi) | Probability P(X=xi) | xiP(X=xi) |
---|---|---|
0 | 0.35 | 0.00 |
1 | 0.25 | 0.25 |
2 | 0.20 | 0.40 |
3 | 0.10 | 0.30 |
4 | 0.05 | 0.20 |
5 | 0.05 | 0.25 |
Total | 1.00 | 1.40 |
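The weighted-average calculation in the table can be checked with a few lines of Python. This is a minimal sketch; the names `outage_pmf` and `expected_value` are illustrative and do not come from the slides.

```python
# Outage distribution from the table above: number of outages -> probability.
outage_pmf = {0: 0.35, 1: 0.25, 2: 0.20, 3: 0.10, 4: 0.05, 5: 0.05}

def expected_value(pmf):
    """Return the probability-weighted average of the values in pmf."""
    return sum(x * p for x, p in pmf.items())

total_probability = sum(outage_pmf.values())
mean_outages = expected_value(outage_pmf)

print(f"Sum of probabilities: {total_probability:.2f}")  # 1.00
print(f"E(X) = {mean_outages:.2f}")                      # 1.40
```

Checking that the probabilities sum to 1 is a quick way to confirm that the list of outcomes is complete.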
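Variance measures the dispersion of a discrete variable:
$$\sigma^2 = \sum_{i} [x_i - E(X)]^2 P(X = x_i)$$
The standard deviation of a discrete variable is the square root of the variance:
$$\sigma = \sqrt{\sigma^2} = \sqrt{\sum_{i} [x_i - E(X)]^2 P(X = x_i)}$$
where:
E(X) = expected value of discrete variable X
xi = the i-th value of X
P(X=xi) = probability of the i-th value of X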
Daily network outages:
Daily Outages (xi) | Probability P(X=xi) | [xi - E(X)]² | [xi - E(X)]²P(X=xi) |
---|---|---|---|
0 | 0.35 | 1.96 | 0.686 |
1 | 0.25 | 0.16 | 0.040 |
2 | 0.20 | 0.36 | 0.072 |
3 | 0.10 | 2.56 | 0.256 |
4 | 0.05 | 6.76 | 0.338 |
5 | 0.05 | 12.96 | 0.648 |
Variance = σ² = 2.04, Standard Deviation = σ = 1.4283
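The variance table can be reproduced the same way. The sketch below assumes the outage probabilities given earlier; the function name `mean_and_variance` is a hypothetical helper, not part of the original material.

```python
import math

# Outage distribution: number of outages -> probability.
outage_pmf = {0: 0.35, 1: 0.25, 2: 0.20, 3: 0.10, 4: 0.05, 5: 0.05}

def mean_and_variance(pmf):
    """Return (mean, variance) of a discrete distribution given as a dict."""
    mean = sum(x * p for x, p in pmf.items())
    # Each squared deviation from the mean is weighted by its probability.
    variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
    return mean, variance

mean, variance = mean_and_variance(outage_pmf)
std_dev = math.sqrt(variance)

print(f"Variance = {variance:.2f}")           # 2.04
print(f"Standard deviation = {std_dev:.4f}")  # 1.4283
```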
Covariance measures the strength and direction of the linear relationship between two discrete variables X and Y.
Positive covariance indicates a positive relationship.
Negative covariance indicates a negative relationship.
Covariance formula:
$$\sigma_{XY} = \sum_{i} [x_i - E(X)][y_i - E(Y)] P(X = x_i, Y = y_i)$$
where:
X = discrete variable X
xi = the i-th value of X
Y = discrete variable Y
yi = the i-th value of Y
P(X=xi,Y=yi) = probability of simultaneous occurrence of the i-th value of X and the i-th value of Y
Consider the returns of two investments of $1000 each under three different economic conditions.
Economic Conditions | Probability | Investment A | Investment B |
---|---|---|---|
Recession | 0.2 | -$25 | -$200 |
Stable Economy | 0.5 | +$50 | +$60 |
Growing Economy | 0.3 | +$100 | +$350 |
Expected returns for each investment:
E(X) = μX = (-25)(.2) + (50)(.5) + (100)(.3) = 50
E(Y) = μY = (-200)(.2) + (60)(.5) + (350)(.3) = 95
Interpretation: Fund A has an average return of $50, and Fund B has an average return of $95 for each $1,000 investment.
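Standard deviation of each investment:
σX = sqrt[(-25 - 50)²(.2) + (50 - 50)²(.5) + (100 - 50)²(.3)] = 43.30
σY = sqrt[(-200 - 95)²(.2) + (60 - 95)²(.5) + (350 - 95)²(.3)] = 193.71
Interpretation: Although Fund B has a higher average return, it is far more volatile, which increases the chance of a loss.
Covariance of the two investments:
σXY = (-25 - 50)(-200 - 95)(.2) + (50 - 50)(60 - 95)(.5) + (100 - 50)(350 - 95)(.3) = 8250
Interpretation: Because the covariance is large and positive, the two investments have a positive relationship: their returns tend to rise or fall together.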
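The expected returns, standard deviations, and covariance of the two investments can be checked numerically from the returns table. This is a sketch under the table's stated probabilities; the helper names `mean` and `covariance` are illustrative.

```python
import math

# Returns table: probabilities of recession, stable, and growing economy.
probs = [0.2, 0.5, 0.3]
returns_a = [-25, 50, 100]    # Investment A (X), dollars per $1,000
returns_b = [-200, 60, 350]   # Investment B (Y), dollars per $1,000

def mean(values, probs):
    """Expected value of a discrete variable."""
    return sum(v * p for v, p in zip(values, probs))

def covariance(xs, ys, probs):
    """Probability-weighted sum of the products of deviations from the means."""
    mx, my = mean(xs, probs), mean(ys, probs)
    return sum((x - mx) * (y - my) * p for x, y, p in zip(xs, ys, probs))

mean_a = mean(returns_a, probs)
mean_b = mean(returns_b, probs)
# The covariance of a variable with itself is its variance.
sd_a = math.sqrt(covariance(returns_a, returns_a, probs))
sd_b = math.sqrt(covariance(returns_b, returns_b, probs))
cov_ab = covariance(returns_a, returns_b, probs)

print(f"E(X) = {mean_a:.2f}, E(Y) = {mean_b:.2f}")    # 50.00, 95.00
print(f"sigma_X = {sd_a:.2f}, sigma_Y = {sd_b:.2f}")  # 43.30, 193.71
print(f"sigma_XY = {cov_ab:.2f}")                     # 8250.00
```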
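The expected value of the sum of two variables:
$$E(X + Y) = E(X) + E(Y)$$
The variance of the sum of two variables:
$$\mathrm{Var}(X + Y) = \sigma^2_{X+Y} = \sigma^2_X + \sigma^2_Y + 2\sigma_{XY}$$
The standard deviation of the sum of two variables:
$$\sigma_{X+Y} = \sqrt{\sigma^2_{X+Y}}$$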
Investment portfolios usually combine several different funds (variables).
The expected return and standard deviation of a portfolio of two funds can be computed together.
Investment Goal: Maximize average return while minimizing risk (standard deviation).
Expected portfolio return (weighted average return):
$$E(P) = wE(X) + (1 - w)E(Y)$$
Portfolio risk (weighted volatility):
$$\sigma_P = \sqrt{w^2\sigma_X^2 + (1 - w)^2\sigma_Y^2 + 2w(1 - w)\sigma_{XY}}$$
where w = proportion of the portfolio value in investment X and (1 - w) = proportion of the portfolio value in investment Y.
Investment X: μX = 50 σX = 43.30
Investment Y: μY = 95 σY = 193.71
σXY = 8250
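If investment X makes up 40% of the portfolio and Y 60% (w = 0.4), the portfolio return and risk (volatility) are:
E(P) = (0.4)(50) + (0.6)(95) = 77
σP = sqrt[(0.4)²(43.30)² + (0.6)²(193.71)² + 2(0.4)(0.6)(8250)] = 133.30
Interpretation: Both the portfolio return and the portfolio risk fall between the values for investments X and Y taken individually.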
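The 40/60 portfolio can be evaluated with a short sketch. The variances used here (1875 and 37525) are computed from the returns table earlier in the chapter; the function names are illustrative, not from the slides.

```python
import math

def portfolio_return(w, mean_x, mean_y):
    """Weighted average return, with weight w on X and (1 - w) on Y."""
    return w * mean_x + (1 - w) * mean_y

def portfolio_risk(w, var_x, var_y, cov_xy):
    """Standard deviation of the weighted combination of X and Y."""
    variance = w**2 * var_x + (1 - w)**2 * var_y + 2 * w * (1 - w) * cov_xy
    return math.sqrt(variance)

# Values derived from the returns table (assumed inputs for this sketch).
mean_x, mean_y = 50, 95
var_x, var_y = 1875, 37525   # squares of the two standard deviations
cov_xy = 8250

expected_p = portfolio_return(0.4, mean_x, mean_y)
risk_p = portfolio_risk(0.4, var_x, var_y, cov_xy)

print(f"E(P) = {expected_p:.2f}")   # 77.00
print(f"sigma_P = {risk_p:.2f}")    # 133.30
```

Varying `w` between 0 and 1 shows how the trade-off between return and risk changes as more of the portfolio moves into investment X.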
Discrete probability distributions: Binomial, Hypergeometric, Poisson
Continuous probability distributions: Normal, Uniform, Exponential
Fixed number of observations, n (e.g., 15 coin tosses; 10 light bulbs from a warehouse).
Each observation is classified into one of two mutually exclusive categories according to whether the "event of interest" occurred (e.g., heads or tails on each flip).
The probability of an observation belonging to the category of the event of interest is denoted p; the probability that the event does not occur is (1 - p).
The probability of an event occurring (p) is constant across all observations.
Observations are independent.
The outcome of one observation does not affect the outcome of another.
Two sampling methods provide independence:
Infinite population without replacement
Finite population with replacement
A manufacturing plant labels items as either defective or acceptable.
A firm bidding for contracts will either get a contract or not.
A market research firm receives survey responses of "yes, I will buy" or "no, I will not buy".
New job applicants either accept the offer or reject it.
Suppose the event of interest is obtaining heads on the toss of a fair coin, and the coin is tossed three times.
How many different ways can you achieve two heads? - Possible Ways: HHT, HTH, THH, thus there are three ways to get two heads.
This case is simple. We need to be able to count ways for more complex situations.
The number of combinations of x elements chosen from a population of n elements is given by:
$$_nC_x = \frac{n!}{x!(n - x)!}$$
where:
n! = (n)(n - 1)(n - 2) ... (2)(1)
x! = (x)(x - 1)(x - 2) ... (2)(1)
0! = 1 (by definition)
How many possible combinations of 3 ice cream scoops can you create at an ice cream shop if you can choose from 31 flavors and no flavor can be used more than once among the 3 scoops?
Total choices are n = 31, and we are choosing x = 3, so the number of combinations is 31! / (3! 28!) = (31)(30)(29) / ((3)(2)(1)) = 4,495.
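Both counting examples can be verified directly. The sketch below implements the factorial formula in a hypothetical helper named `combinations` and compares it with `math.comb` from the Python standard library (available in Python 3.8 and later).

```python
from math import comb, factorial

def combinations(n, x):
    """Number of ways to choose x items from n when order does not matter."""
    return factorial(n) // (factorial(x) * factorial(n - x))

print(combinations(3, 2))    # 3 ways to get two heads in three tosses
print(combinations(31, 3))   # 4495 three-scoop combinations from 31 flavors
print(comb(31, 3))           # 4495, same result from the standard library
```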
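Binomial distribution formula:
$$P(X = x \mid n, p) = \frac{n!}{x!(n - x)!} p^x (1 - p)^{n - x}$$
where:
P(X=x|n,p) = probability of x events of interest in n observations, each with probability p of the event of interest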
x = number of occurrences of the event of interest in the sample, (x = 0, 1, 2, ..., n)
n = number of observations (sample size)
p = probability of occurrence of the event of interest
Example: Suppose x = # occurrences of "heads" in a coin toss: n = 4 p = 0.5 (1 - p) = (1 - 0.5) = 0.5 x = 0, 1, 2, 3, 4
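What is the probability of one success in five trials if the probability of the event is 0.1? x = 1, n = 5, and p = 0.1: P(X=1|5,0.1) = [5! / (1! 4!)](0.1)¹(0.9)⁴ = (5)(0.1)(0.6561) = 0.32805
Suppose the probability of buying a defective computer is 0.02. What is the probability of buying two defective computers in a sample of 10 computers? x = 2, n = 10, and p = 0.02: P(X=2|10,0.02) = [10! / (2! 8!)](0.02)²(0.98)⁸ = (45)(0.0004)(0.8508) = 0.01531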
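The two binomial questions above can be answered with a small function. This is a minimal sketch of the binomial formula; `binomial_pmf` is an illustrative name, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x | n, p): probability of x events of interest in n observations."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

one_in_five = binomial_pmf(1, 5, 0.1)
two_defective = binomial_pmf(2, 10, 0.02)

print(f"P(X=1 | 5, 0.1)   = {one_in_five:.5f}")    # 0.32805
print(f"P(X=2 | 10, 0.02) = {two_defective:.5f}")  # 0.01531

# The probabilities over all possible x should sum to 1.
total = sum(binomial_pmf(x, 10, 0.02) for x in range(11))
```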
The shape of the binomial distribution depends on the values of p and n.
Figures: binomial distributions for n = 5, p = 0.1 and for n = 5, p = 0.5.
Binomial table (excerpt) for n = 10 and p = 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50.
Examples: n = 10, p = 0.35, x = 3: P(X=3|10,0.35) = 0.2522
n = 10, p = 0.75, x = 8: P(X=8|10,0.75) = 0.2816
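Mean, variance, and standard deviation of the binomial distribution:
$$\mu = E(X) = np, \qquad \sigma^2 = np(1 - p), \qquad \sigma = \sqrt{np(1 - p)}$$
where
n = sample size
p = probability of occurrence of the event in each trial
(1 - p) = probability that the event does not occur in each trial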
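As a cross-check, the shortcut formulas for the binomial mean and standard deviation should agree with the general definition applied to the full distribution. The sketch below uses n = 5 and p = 0.1 as an assumed example; all function names are illustrative.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(X = x | n, p) from the binomial formula."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_mean_sd(n, p):
    """Shortcut formulas: mean = np, standard deviation = sqrt(np(1 - p))."""
    return n * p, sqrt(n * p * (1 - p))

n, p = 5, 0.1
mean, sd = binomial_mean_sd(n, p)

# Same quantities from the general definitions of E(X) and variance.
mean_direct = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
var_direct = sum((x - mean_direct) ** 2 * binomial_pmf(x, n, p)
                 for x in range(n + 1))

print(f"mean = {mean:.4f}, sd = {sd:.4f}")  # mean = 0.5000, sd = 0.6708
```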
Various shapes of the binomial distribution are observed.
Both Excel and Minitab can be used to compute binomial distributions.
Use the Poisson distribution when interested in the number of times an event occurs in a given opportunity area.
An opportunity area is a continuous unit or interval of time, volume, or area in which more than one event can occur.
Examples: the number of scratches in a car's paint, the number of mosquito bites on a person, the number of computer crashes in a day.
Apply Poisson distribution when:
You want to measure how many times an event occurs in a given opportunity area.
The probability of an event occurring in an opportunity area is the same for all opportunity areas.
The number of events occurring in one opportunity area is independent of the number of events occurring in any other area.
The probability of two or more events occurring in a single opportunity area approaches zero as the area becomes smaller.
The average number of events per unit is λ (lambda).
Poisson distribution formula:
$$P(X = x \mid \lambda) = \frac{e^{-\lambda}\lambda^x}{x!}$$
Where:
x = number of events in an opportunity area
λ = expected number of events
e = base of the natural logarithm (2.71828...)
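Mean, variance, and standard deviation of the Poisson distribution:
$$\mu = \lambda, \qquad \sigma^2 = \lambda, \qquad \sigma = \sqrt{\lambda}$$
where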
λ = expected number of events
Poisson probability tables are available online.
Example: Find P(X = 2 | λ = 0.50): P(X = 2 | 0.50) = e^(-0.50)(0.50)² / 2! = 0.0758
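Instead of a table lookup, the Poisson probability can be computed directly from the formula. This is a minimal sketch; `poisson_pmf` is an illustrative name.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x | lambda): probability of x events when lam events are expected."""
    return exp(-lam) * lam**x / factorial(x)

p_two = poisson_pmf(2, 0.50)
print(f"P(X=2 | lambda=0.50) = {p_two:.4f}")  # 0.0758

# Summing x * P(X = x) over enough terms should recover the mean, lambda.
approx_mean = sum(x * poisson_pmf(x, 0.50) for x in range(50))
```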
Both Excel and Minitab can be used for Poisson distribution computations.
The shape of the Poisson distribution depends on the parameter λ (figures: distributions for λ = 0.50 and λ = 3.00).
The binomial distribution applies when the sample is chosen with replacement from a finite population or without replacement from an infinite population.
Hypergeometric distribution applies when sampling without replacement from a finite population.
Sample size "n" from a finite population size "N"
The sampling is done without replacement.
The results of observations are dependent.
We want the probability of x events of interest in the sample when there are E events of interest in the population:
$$P(X = x \mid n, N, E) = \frac{\binom{E}{x}\binom{N - E}{n - x}}{\binom{N}{n}}$$
Where
N = population size
E = number of events of interest in the population
N - E = number of non-events of interest in the population
n = sample size
x = number of events of interest in the sample
n - x = number of non-events of interest in the sample
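The mean of the hypergeometric distribution is:
$$\mu = \frac{nE}{N}$$
The standard deviation is:
$$\sigma = \sqrt{\frac{nE(N - E)}{N^2}} \cdot \sqrt{\frac{N - n}{N - 1}}$$
where $\sqrt{(N - n)/(N - 1)}$ is the finite population correction factor, which arises from sampling without replacement from a finite population.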
Example: Examining 3 different computers from a population of 10 computers. - 4 out of the 10 computers use illegal software. - What is the probability that 2 out of the 3 selected computers use illegal software?
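N = 10, n = 3, E = 4, x = 2
$$P(X = 2 \mid n = 3, N = 10, E = 4) = \frac{\binom{4}{2}\binom{6}{1}}{\binom{10}{3}} = \frac{(6)(6)}{120} = 0.30$$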
Probability that 2 out of the 3 selected computers use illegal software is 0.30, or 30%.
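The illegal-software example can be reproduced from the hypergeometric formula. In this sketch the parameter `E` is the number of events of interest in the population (4 computers); the function names are illustrative.

```python
from math import comb, sqrt

def hypergeometric_pmf(x, n, N, E):
    """P(X = x): x events of interest in a sample of n drawn without
    replacement from a population of N that contains E events of interest."""
    return comb(E, x) * comb(N - E, n - x) / comb(N, n)

def hypergeometric_mean_sd(n, N, E):
    """Mean n*E/N and standard deviation with the finite population correction."""
    mean = n * E / N
    sd = sqrt(n * E * (N - E) / N**2) * sqrt((N - n) / (N - 1))
    return mean, sd

prob_two = hypergeometric_pmf(2, n=3, N=10, E=4)
mean, sd = hypergeometric_mean_sd(n=3, N=10, E=4)

print(f"P(X=2) = {prob_two:.2f}")           # 0.30
print(f"mean = {mean:.2f}, sd = {sd:.4f}")  # mean = 1.20, sd = 0.7483
```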
Both Excel and Minitab can be used for hypergeometric distribution calculations.
The use of Poisson distribution as an approximation of the binomial distribution can be found online.
In this chapter, we covered:
The probability distribution of a discrete variable
Covariance and its application in finance
The binomial distribution
The Poisson distribution
The hypergeometric distribution
Online Topic Chapter 5
Understand the application of Poisson distribution as an approximation of the binomial distribution.
The binomial distribution is discrete while the normal distribution is continuous.
Applying a continuity correction improves the accuracy of the normal approximation to the binomial distribution.
Example: If X is discrete in a binomial distribution, P(X = 4 | n, p) can be approximated with a continuous normal distribution by finding P(3.5 < X < 4.5).
As p approaches 0.5, the normal approximation to the binomial improves.
The larger the sample size n, the better the normal approximation to the binomial.
General rule: The normal distribution can be used to approximate the binomial if np ≥ 5 and n(1 - p) ≥ 5.
The mean and standard deviation of the binomial distribution are:
$$\mu = np, \qquad \sigma = \sqrt{np(1 - p)}$$
The binomial variable X is transformed to the standard normal variable Z by:
$$Z = \frac{X - \mu}{\sigma} = \frac{X - np}{\sqrt{np(1 - p)}}$$
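Example: If n = 1000 and p = 0.2, what is P(X ≤ 180)?
Approximate P(X ≤ 180 | 1000, 0.2) using the continuity correction: P(X ≤ 180.5)
Transform to the standard normal distribution:
Z = (180.5 - (1000)(0.2)) / sqrt[(1000)(0.2)(1 - 0.2)] = -19.5 / 12.65 = -1.54
P(Z ≤ -1.54) = 0.0618.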
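The approximation can be compared with the exact binomial sum. The sketch below builds the standard normal CDF from `math.erf`, which is one common way to do it without extra libraries; the function names are illustrative and not part of the original material.

```python
from math import comb, erf, sqrt

def standard_normal_cdf(z):
    """P(Z <= z) for a standard normal variable, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def binomial_cdf_normal_approx(x, n, p):
    """Approximate P(X <= x | n, p) with a continuity correction of +0.5."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    z = (x + 0.5 - mu) / sigma
    return z, standard_normal_cdf(z)

def binomial_cdf_exact(x, n, p):
    """Exact P(X <= x | n, p) by summing the binomial formula."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))

z, approx = binomial_cdf_normal_approx(180, 1000, 0.2)
exact = binomial_cdf_exact(180, 1000, 0.2)

print(f"z = {z:.2f}")  # -1.54
print(f"normal approximation = {approx:.4f}")
print(f"exact binomial sum   = {exact:.4f}")
```

The computed approximation can differ slightly from the table value of 0.0618 because the table lookup rounds Z to two decimals first.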
Finding approximations of probability distributions using the normal distribution.