Probability Topics and Discrete Random Variables
Probability Topics
- Introduction
- Probability is used to predict the likelihood of an event occurring in order to make a decision.
- Deals with the chance of an event occurring.
- Instructor will survey class to record P(change), P(bus), P(change AND bus), P(change|bus).
Terminology
- Probability: A measure associated with how certain we are of outcomes of an experiment.
- Experiment: A planned operation carried out under controlled conditions.
- Chance Experiment: An experiment whose result is not predetermined.
- Flipping a fair coin twice is an example of an experiment.
- Outcome: A result of an experiment.
- Sample Space (S): The set of all possible outcomes.
- Represented by listing outcomes, tree diagram, or Venn diagram.
- Example: Flipping one fair coin, S = {H, T}, where H = heads, T = tails.
- Event: Any combination of outcomes. Represented by uppercase letters (e.g., A, B).
- Example: Flipping one fair coin, event A might be getting at most one head.
- Probability of an Event A: Denoted as P(A).
- The long-term relative frequency of that outcome.
- Probabilities are between 0 and 1, inclusive.
- P(A) = 0: Event A can never happen.
- P(A) = 1: Event A always happens.
- P(A) = 0.5: Event A is equally likely to occur or not.
- Example: Flipping a fair coin repeatedly, the relative frequency of heads approaches 0.5.
- Equally Likely: Each outcome of an experiment occurs with equal probability.
- Example: Tossing a fair, six-sided die, each face (1, 2, 3, 4, 5, or 6) is as likely to occur as any other face.
- Calculating Probability When Outcomes Are Equally Likely
- Count the number of outcomes for event A and divide by the total number of outcomes in the sample space.
- Example: Tossing a fair dime and a fair nickel, S = {HH, TH, HT, TT}. A = getting one head. There are two outcomes {HT, TH}, so P(A) = \frac{2}{4} = 0.5.
- The long-term relative frequency of obtaining this result would approach the theoretical probability as the number of repetitions grows larger.
- Law of Large Numbers: As the number of repetitions of an experiment increases, the relative frequency obtained in the experiment tends to become closer and closer to the theoretical probability.
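The Law of Large Numbers can be seen in a short simulation; a minimal sketch in Python (the seed and flip counts are illustrative choices):

```python
import random

# Simulate flipping a fair coin and track the relative frequency of heads.
# As the number of flips grows, the relative frequency should approach
# the theoretical probability of 0.5 (Law of Large Numbers).
random.seed(42)  # fixed seed so the run is reproducible

def relative_frequency_of_heads(n_flips):
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

for n in (10, 1_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

With only 10 flips the relative frequency can stray far from 0.5; with 100,000 it is typically within a fraction of a percent.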
- Not Equally Likely: In many situations, the outcomes are not equally likely. A coin or die may be unfair or biased.
- "OR" Event: An outcome is in the event A OR B if the outcome is in A or is in B or is in both A and B.
- Example: If A = {1, 2, 3, 4, 5} and B = {4, 5, 6, 7, 8}, then A OR B = {1, 2, 3, 4, 5, 6, 7, 8}.
- "AND" Event: An outcome is in the event A AND B if the outcome is in both A and B at the same time.
- Example: If A = {1, 2, 3, 4, 5} and B = {4, 5, 6, 7, 8}, then A AND B = {4, 5}.
- Complement of Event A: Denoted as A′. Consists of all outcomes that are NOT in A.
- P(A) + P(A') = 1.
- Example: If S = {1, 2, 3, 4, 5, 6} and A = {1, 2, 3, 4}, then A' = {5, 6}. P(A) = \frac{4}{6}, P(A') = \frac{2}{6}, and P(A) + P(A') = \frac{4}{6} + \frac{2}{6} = 1
- Conditional Probability of A Given B: Written as P(A|B). The probability that event A will occur given that event B has already occurred.
- Conditioning reduces the sample space: the probability of A is calculated within the reduced sample space B.
- Formula: P(A|B) = \frac{P(A \text{ AND } B)}{P(B)} where P(B) > 0.
- Example: Toss a fair, six-sided die. S = {1, 2, 3, 4, 5, 6}. Let A = face is 2 or 3 and B = face is even (2, 4, 6). To calculate P(A|B), count the outcomes that are 2 or 3 within the reduced sample space B = {2, 4, 6} (only the 2 qualifies), then divide by the number of outcomes in B (rather than in S): P(A|B) = \frac{1}{3}.
- Remember that understanding the wording is the first very important step in solving probability problems.
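The conditional-probability counting in the die example above can be checked directly; a short sketch using the same events:

```python
# Conditional probability by counting outcomes, using the die example:
# A = face is 2 or 3, B = face is even.
S = {1, 2, 3, 4, 5, 6}
A = {2, 3}
B = {2, 4, 6}

# P(A|B): restrict the sample space to B, then count outcomes of A inside it.
p_a_given_b = len(A & B) / len(B)

# The formula P(A|B) = P(A AND B) / P(B) gives the same answer.
p_a_and_b = len(A & B) / len(S)
p_b = len(B) / len(S)
print(p_a_given_b)            # 1/3
print(p_a_and_b / p_b)        # 1/3
```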
Independent and Mutually Exclusive Events
- Independent and mutually exclusive do not mean the same thing.
Independent Events
- Two events are independent if the knowledge that one occurred does not affect the chance the other occurs.
- Conditions:
- P(A|B) = P(A)
- P(B|A) = P(B)
- P(A \text{ AND } B) = P(A)P(B)
- To show two events are independent, you need only show one of the above conditions.
- If two events are NOT independent, then we say that they are dependent.
- Sampling
- With Replacement: Each member of a population is replaced after it is picked and has the possibility of being chosen more than once. Events are considered to be independent.
- Without Replacement: Each member of a population may be chosen only once. Events are considered to be dependent or not independent.
- If it is not known whether A and B are independent or dependent, assume they are dependent until you can show otherwise.
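The effect of sampling with versus without replacement can be made concrete with exact fractions; a sketch using a hypothetical urn of 3 red and 2 blue marbles:

```python
from fractions import Fraction

# Urn with 3 red and 2 blue marbles. Compare the probability that the
# second draw is red, given the first was red, with and without replacement.
red, blue = 3, 2
total = red + blue

# With replacement: the first marble is put back, so the second draw is
# independent of the first and the probability is unchanged.
p_with = Fraction(red, total)

# Without replacement: one red marble is gone, so the draws are dependent.
p_without = Fraction(red - 1, total - 1)

print(p_with)     # 3/5
print(p_without)  # 1/2
```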
Mutually Exclusive Events
- A and B are mutually exclusive events if they cannot occur at the same time.
- They do not share any outcomes and P(A \text{ AND } B) = 0.
- Example: S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. Let A = {1, 2, 3, 4, 5}, B = {4, 5, 6, 7, 8}, and C = {7, 9}.
- A \text{ AND } B = {4, 5}. P(A \text{ AND } B) = \frac{2}{10} \neq 0. Therefore, A and B are not mutually exclusive.
- A and C do not have any numbers in common so P(A \text{ AND } C) = 0. Therefore, A and C are mutually exclusive.
- If it is not known whether A and B are mutually exclusive, assume they are not until you can show otherwise.
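The mutual-exclusivity checks in the example above amount to testing whether two sets share outcomes; a short sketch with the same sets:

```python
from fractions import Fraction

# Sets from the example: S = {1..10}, A, B, and C.
S = set(range(1, 11))
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}
C = {7, 9}

def p(event):
    return Fraction(len(event), len(S))

print(p(A & B))  # 1/5 != 0, so A and B are NOT mutually exclusive
print(p(A & C))  # 0, so A and C ARE mutually exclusive
```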
Two Basic Rules of Probability
- When calculating probability, there are two basic rules; which form of each rule applies depends on whether the events are independent or dependent, and whether they are mutually exclusive.
Multiplication Rule
- If A and B are two events defined on a sample space, then: P(A \text{ AND } B) = P(B)P(A|B). This rule may also be written as: P(A|B) = \frac{P(A \text{ AND } B)}{P(B)}
- If A and B are independent, then P(A|B) = P(A). Then P(A \text{ AND } B) = P(A|B)P(B) becomes P(A \text{ AND } B) = P(A)P(B).
Addition Rule
- If A and B are defined on a sample space, then: P(A \text{ OR } B) = P(A) + P(B) - P(A \text{ AND } B).
- If A and B are mutually exclusive, then P(A \text{ AND } B) = 0. Then P(A \text{ OR } B) = P(A) + P(B) - P(A \text{ AND } B) becomes P(A \text{ OR } B) = P(A) + P(B).
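Both rules can be verified on the earlier die example (A = face is 2 or 3, B = face is even); a minimal sketch:

```python
from fractions import Fraction

# Check the two basic rules on the die example.
S = set(range(1, 7))
A = {2, 3}
B = {2, 4, 6}

def p(event):
    return Fraction(len(event), len(S))

# Multiplication rule: P(A AND B) = P(B) * P(A|B)
p_a_given_b = Fraction(len(A & B), len(B))
assert p(A & B) == p(B) * p_a_given_b

# Addition rule: P(A OR B) = P(A) + P(B) - P(A AND B)
assert p(A | B) == p(A) + p(B) - p(A & B)
print(p(A | B))  # 2/3
```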
Contingency Tables
- A contingency table provides a way of portraying data that can facilitate calculating probabilities. The table helps in determining conditional probabilities quite easily. The table displays sample values in relation to two different variables that may be dependent or contingent on one another.
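As an illustration, a contingency table like the class bus/change survey mentioned in the introduction can be read off directly for conditional probabilities. The counts below are entirely hypothetical:

```python
from fractions import Fraction

# Hypothetical contingency table: 100 students classified by whether they
# ride the bus and whether they carry change.
#                 change   no change   total
# bus                20          30       50
# no bus             10          40       50
# total              30          70      100

total = 100
bus_and_change = 20
bus = 50
change = 30

# P(change | bus): restrict attention to the "bus" row of the table.
p_change_given_bus = Fraction(bus_and_change, bus)
# P(change): use the column total over the grand total.
p_change = Fraction(change, total)

print(p_change_given_bus)  # 2/5
print(p_change)            # 3/10
```

Because P(change | bus) differs from P(change), riding the bus and carrying change would be dependent events in this (made-up) data.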
Tree and Venn Diagrams
- Sometimes, when the probability problems are complex, it can be helpful to graph the situation. Tree diagrams and Venn diagrams are two tools that can be used to visualize and solve conditional probabilities.
Tree Diagrams
- A tree diagram is a special type of graph used to determine the outcomes of an experiment. It consists of "branches" that are labeled with either frequencies or probabilities. Tree diagrams can make some probability problems easier to visualize and solve.
Venn Diagram
- A Venn diagram is a picture that represents the outcomes of an experiment. It generally consists of a box that represents the sample space S together with circles or ovals. The circles or ovals represent events.
Discrete Random Variables
- A random variable describes the outcomes of a statistical experiment in words.
- Values can vary with each repetition of an experiment.
- Notation
- Upper case letters (e.g., X, Y) denote a random variable.
- Lower case letters (e.g., x, y) denote the value of a random variable.
- If X is a random variable, then X is written in words, and x is given as a number.
Probability Distribution Function (PDF) for a Discrete Random Variable
- Two characteristics:
- Each probability is between zero and one, inclusive.
- The sum of the probabilities is one.
- P(x) = probability that X takes on a value x.
Mean or Expected Value and Standard Deviation
- The expected value is often referred to as the "long-term" average or mean.
- Law of Large Numbers: As the number of trials in a probability experiment increases, the difference between the theoretical probability of an event and the relative frequency approaches zero.
- The “long-term average” is known as the mean or expected value of the experiment and is denoted by the Greek letter \mu.
- To find the expected value or long term average, \mu, simply multiply each value of the random variable by its probability and add the products.
- To calculate the standard deviation (\sigma) of a probability distribution, find each deviation from its expected value, square it, multiply it by its probability, add the products, and take the square root.
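The recipe above can be carried out in a few lines; a sketch for a hypothetical discrete distribution (the values and probabilities are illustrative):

```python
import math

# Mean and standard deviation of a discrete random variable.
values = [0, 1, 2, 3]
probs  = [0.2, 0.5, 0.2, 0.1]

# A valid PDF: each probability in [0, 1], and the probabilities sum to one.
assert all(0 <= p <= 1 for p in probs)
assert abs(sum(probs) - 1) < 1e-12

# Expected value: multiply each value by its probability and add.
mu = sum(x * p for x, p in zip(values, probs))

# Standard deviation: square each deviation from the mean, weight by its
# probability, add, then take the square root.
sigma = math.sqrt(sum((x - mu) ** 2 * p for x, p in zip(values, probs)))

print(mu)     # 1.2
print(sigma)
```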
Binomial Distribution
- Three characteristics of a binomial experiment.
- There are a fixed number of trials. Think of trials as repetitions of an experiment. The letter n denotes the number of trials.
- There are only two possible outcomes, called "success" and "failure," for each trial. The letter p denotes the probability of a success on one trial, and q denotes the probability of a failure on one trial. p + q = 1.
- The n trials are independent and are repeated using identical conditions. Because the n trials are independent, the outcome of one trial does not help in predicting the outcome of another trial. Another way of saying this is that for each individual trial, the probability, p, of a success and probability, q, of a failure remain the same.
- The outcomes of a binomial experiment fit a binomial probability distribution.
- The random variable X = the number of successes obtained in the n independent trials.
Bernoulli Trial
- Any experiment that has characteristics two and three and where n = 1 is called a Bernoulli Trial.
- A binomial experiment takes place when the number of successes is counted in one or more Bernoulli Trials.
- For the binomial probability distribution:
- \mu = np
- \sigma^2 = npq
- \sigma = \sqrt{npq}
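The shortcut formulas \mu = np and \sigma^2 = npq can be checked against the general definitions of mean and variance; a sketch with hypothetical parameters n = 20, p = 0.3:

```python
import math

# Verify mu = np and sigma^2 = npq directly from the binomial pmf.
n, p = 20, 0.3
q = 1 - p

def binom_pmf(x):
    # P(X = x) = C(n, x) * p^x * q^(n - x)
    return math.comb(n, x) * p**x * q**(n - x)

mu = sum(x * binom_pmf(x) for x in range(n + 1))
var = sum((x - mu) ** 2 * binom_pmf(x) for x in range(n + 1))

print(round(mu, 10))   # n * p = 6.0
print(round(var, 10))  # n * p * q = 4.2
```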
Notation for the Binomial
- B = Binomial Probability Distribution Function
- X \sim B(n, p)
- Read this as "X is a random variable with a binomial distribution."
- The parameters are n and p;
- n = number of trials,
- p = probability of a success on each trial.
- To calculate P(x = value): binompdf(n, p, number). If "number" is left out, the result is the binomial probability table.
- To calculate P(x ≤ value): binomcdf(n, p, number). If "number" is left out, the result is the cumulative binomial probability table.
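The calculator functions above have straightforward Python equivalents; a sketch (the example values n = 10, p = 0.5 are illustrative):

```python
import math

def binompdf(n, p, x):
    """P(X = x) for X ~ B(n, p)."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def binomcdf(n, p, x):
    """P(X <= x) for X ~ B(n, p): sum the pmf from 0 up through x."""
    return sum(binompdf(n, p, k) for k in range(x + 1))

print(binompdf(10, 0.5, 5))  # 0.24609375
print(binomcdf(10, 0.5, 5))  # 0.623046875
```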
Geometric Distribution
- Three main characteristics of a geometric experiment.
- There are one or more Bernoulli trials with all failures except the last one, which is a success. In other words, you keep repeating what you are doing until the first success. Then you stop.
- In theory, the number of trials could go on forever. There must be at least one trial.
- The probability, p, of a success and the probability, q, of a failure is the same for each trial. p + q = 1 and q = 1 − p.
- X = the number of independent trials until the first success.
Notation for the Geometric
- G = Geometric Probability Distribution Function
- X \sim G(p)
- Read this as "X is a random variable with a geometric distribution."
- The parameter is p;
- p = the probability of a success for each trial.
- \mu = \frac{1}{p}
- \sigma^2 = \frac{1}{p}(\frac{1}{p} - 1)
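The geometric pmf is P(X = x) = q^{x-1}p (x − 1 failures, then a success), and the mean and variance formulas above follow from it. A quick sanity check with a hypothetical p = 1/4:

```python
from fractions import Fraction

# Geometric distribution with success probability p = 1/4.
p = Fraction(1, 4)
q = 1 - p

def geom_pmf(x):
    # x = the trial on which the first success occurs (x = 1, 2, 3, ...)
    return q ** (x - 1) * p

mu = 1 / p                    # 1/p
var = (1 / p) * (1 / p - 1)   # (1/p)(1/p - 1) = q / p^2

print(mu)   # 4
print(var)  # 12
# The probabilities sum toward 1 as more trials are included.
print(float(sum(geom_pmf(x) for x in range(1, 200))))
```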
Hypergeometric Distribution
- Five characteristics of a hypergeometric experiment.
- Take samples from two groups.
- Concerned with a group of interest, called the first group.
- Sample without replacement from the combined groups.
- Each pick is not independent, since sampling is without replacement.
- Not dealing with Bernoulli Trials.
- The outcomes of a hypergeometric experiment fit a hypergeometric probability distribution.
- The random variable X = the number of items from the group of interest.
Notation for the Hypergeometric
- H = Hypergeometric Probability Distribution Function
- X \sim H(r, b, n)
- Read this as "X is a random variable with a hypergeometric distribution."
- The parameters are r, b, and n;
- r = the size of the group of interest (first group),
- b = the size of the second group,
- n = the size of the chosen sample.
- \mu = n(\frac{r}{r + b})
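The hypergeometric probability of x successes is obtained by direct counting: choose x items from the group of interest and n − x from the second group, out of all ways to choose n from the combined groups. A sketch with hypothetical sizes r = 6, b = 14, n = 5:

```python
import math

# Hypergeometric pmf by counting combinations.
r, b, n = 6, 14, 5

def hypergeom_pmf(x):
    # P(X = x) = C(r, x) * C(b, n - x) / C(r + b, n)
    return math.comb(r, x) * math.comb(b, n - x) / math.comb(r + b, n)

# The mean matches the formula mu = n * (r / (r + b)) = 5 * 6/20 = 1.5.
mu = sum(x * hypergeom_pmf(x) for x in range(0, min(r, n) + 1))
print(round(mu, 10))  # 1.5
```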
Poisson Distribution
- Two main characteristics of a Poisson experiment.
- The Poisson probability distribution gives the probability of a number of events occurring in a fixed interval of time or space if these events happen with a known average rate and independently of the time since the last event.
- The Poisson distribution may be used to approximate the binomial if the probability of success is "small" (such as 0.01) and the number of trials is "large" (such as 1,000).
- n is the number of trials, and p is the probability of a "success."
- The random variable X = the number of occurrences in the interval of interest.
Notation for the Poisson
- P = Poisson Probability Distribution Function
- X \sim P(\mu)
- Read this as "X is a random variable with a Poisson distribution."
- The parameter is \mu
- \mu (or \lambda) = the mean for the interval of interest.
- \sigma = \sqrt{\mu}
- \sigma^2 = \mu
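The Poisson pmf is P(X = x) = \mu^x e^{-\mu} / x!, and both the mean and the variance equal \mu. A sketch with a hypothetical mean of \mu = 3 events per interval:

```python
import math

# Poisson distribution with mean mu = 3 events per interval.
mu = 3.0

def poisson_pmf(x):
    # P(X = x) = mu^x * e^(-mu) / x!
    return mu**x * math.exp(-mu) / math.factorial(x)

# Truncate the infinite sum at 100 terms; the tail is negligible for mu = 3.
mean = sum(x * poisson_pmf(x) for x in range(100))
var = sum((x - mean) ** 2 * poisson_pmf(x) for x in range(100))

print(round(mean, 6), round(var, 6))  # both equal mu = 3.0
```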