Chapter 6: Random Variables
Random Variable represents the possible outcomes of a random process
AKA Random Result
Ex: X = number of heads in three tosses
Always a capital, italic letter (like X or Y)
The Probability distribution of a random variable gives its possible values along with their probabilities.
Two Requirements:
Every probability pi is a number between 0 and 1, inclusive.
The sum of the probabilities is 1: p1 + p2 + p3 + . . . = 1.
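The two requirements can be checked mechanically. A minimal Python sketch, assuming a fair coin for the three-toss example; the function name is_valid_distribution and its tolerance are illustrative choices, not something from these notes:

```python
import math

def is_valid_distribution(probs, tol=1e-9):
    """Return True if every probability is in [0, 1] and they sum to 1."""
    in_range = all(0 <= p <= 1 for p in probs)
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return in_range and sums_to_one

# X = number of heads in three tosses of a fair coin (assumed fair): values 0, 1, 2, 3
heads_probs = [1/8, 3/8, 3/8, 1/8]
print(is_valid_distribution(heads_probs))  # True
print(is_valid_distribution([0.5, 0.6]))   # False (sums to 1.1)
```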
A discrete random variable has specific, separate, and often whole number values / outcomes
Like (1, 2, 3, etc.) on a die
Discrete usually counts
Digital input in CS
Ex: like number of siblings
For random variables (probability in general), use mean instead of median
Mean is represented as mu or μ (usually with a subscript of the random variable)
Ex: Random Variable X, mean = μx
In a Histogram, analyze: shape, center, and variability
Shape Options: Skewed Left/Right & Symmetric
Symmetric can be bimodal or unimodal
Normal curve is always symmetric unimodal
Ex: The graph is roughly symmetric and has a single peak at $600
Mean (expected value) of a discrete random variable is the long-run average value over many repetitions of the random process
Important Part: Expected Value takes frequency into consideration
Don’t take a plain average of the values of X, since each value carries a different weight / frequency (its probability).
Formula: μ_x = E(X) = x1p1 + x2p2 + x3p3 + … = ∑ xi * pi
TI-Nspire Usage:
Spreadsheet → Enter Data into 2 Columns (values x, probabilities y)
Menu → Statistics → Stat Calculations → Two-Variable Statistics
Answer is the value of ∑ xy
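The weighted sum ∑ xi * pi can be sketched in Python. The sibling distribution below is made up for illustration (it is not data from these notes); it shows why the expected value differs from a plain average of the possible values:

```python
def expected_value(values, probs):
    """Mean of a discrete random variable: sum of value * probability."""
    return sum(x * p for x, p in zip(values, probs))

# Hypothetical distribution: X = number of siblings
values = [0, 1, 2]
probs = [0.5, 0.25, 0.25]

mean_x = expected_value(values, probs)
plain_average = sum(values) / len(values)  # ignores the probabilities

print(mean_x)         # 0.75
print(plain_average)  # 1.0
```

The plain average treats 0, 1, and 2 as equally likely, which is why it overstates the mean here.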
Median is the smallest value at which the cumulative probability equals or exceeds 0.5
Standard Deviation aka σ (sigma)
TI-Nspire Usage:
Spreadsheet → Enter Data into 2 Columns (values, probabilities)
Menu → Statistics → Stat Calculations → One-Variable Statistics (X1 List = values, Frequency List = probabilities)
Answer is the value of σx
Variance is (Standard Deviation)²
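As a rough check on calculator output, the median rule, variance, and standard deviation can be computed directly. This sketch uses the standard definition of variance, ∑ (xi - μ)² * pi, and assumes X = number of heads in two tosses of a fair coin; the function names are illustrative:

```python
from math import sqrt

def median_of(values, probs):
    """Smallest value whose cumulative probability reaches 0.5."""
    cumulative = 0.0
    for x, p in sorted(zip(values, probs)):
        cumulative += p
        if cumulative >= 0.5:
            return x

def variance_of(values, probs):
    """Variance: probability-weighted average squared distance from the mean."""
    mean = sum(x * p for x, p in zip(values, probs))
    return sum((x - mean) ** 2 * p for x, p in zip(values, probs))

# Assumed example: X = number of heads in two fair coin tosses
values = [0, 1, 2]
probs = [0.25, 0.5, 0.25]

variance = variance_of(values, probs)
print(median_of(values, probs))   # 1
print(variance)                   # 0.5
print(round(sqrt(variance), 3))   # 0.707 (standard deviation)
```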
A Continuous Random Variable takes any value in an interval on the number line (range)
Continuous usually measures
Analog input in Computer Science
Ex: height of a student or time to run a mile
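To find its probabilities, use a Density Curve (probability = area under the curve over the interval)
Total area under a density curve is always 1. Ex: a uniform curve of width 9 has height 1/9, since 1/9 × 9 = 1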
Question: Would it be faster to take the train or walk? Walking takes 4 mins. What is the probability that Selena's train journey time to work is shorter than her walking time, given that her train journey time (Y) follows a uniform distribution from 2 to 5 minutes?
Solution:
Identify Continuous vs Discrete. This situation involves time (measured), so it is continuous.
Form the density curve: width = 5.0 - 2.0 = 3, so height = 1/3 (total area 3 × 1/3 = 1)
Calculate Shaded Area: base × height = (4.0 - 2.0) × 1/3 = 2/3
→ P(Y < 4) = 2/3 = 0.667
Sentence: There is a 66.7% chance that it will be quicker for Selena to take the train to work on a randomly selected day.
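The base × height calculation for a uniform density curve can be sketched as follows. The function name and the clipping of the interval to [low, high] are illustrative assumptions:

```python
def uniform_probability(low, high, a, b):
    """P(a <= Y <= b) for Y uniform on [low, high], as base times height."""
    height = 1 / (high - low)            # makes the total area equal 1
    left, right = max(a, low), min(b, high)
    base = max(0.0, right - left)        # overlap of [a, b] with [low, high]
    return base * height

# Train time Y is uniform from 2 to 5 minutes; walking takes 4 minutes
p = uniform_probability(2, 5, 2, 4)
print(round(p, 3))  # 0.667
```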
Question: The heights of young women can be modeled by a Normal distribution with mean µ = 64 inches and standard deviation σ = 2.7 inches. Suppose we choose a young woman at random and let Y = her height (in inches). Find P(68 ≤ Y ≤ 70). Interpret this value.
TODO: Insert Normal Curve Drawing
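Answer: normalCdf(lower: 68, upper: 70, mean: 64, SD: 2.7) = 0.056
Sentence: There is about a 5.6% chance that a randomly selected young woman is between 68 and 70 inches tall.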
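The same area can be computed without a calculator. A minimal sketch using NormalDist from Python's standard statistics module (Python 3.8+), as the difference of two cumulative probabilities:

```python
from statistics import NormalDist

# Heights of young women: Normal with mean 64 inches and SD 2.7 inches
heights = NormalDist(mu=64, sigma=2.7)

# P(68 <= Y <= 70) = area to the left of 70 minus area to the left of 68
p = heights.cdf(70) - heights.cdf(68)
print(round(p, 3))  # 0.056
```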
Adding the same positive number a to (subtracting a from) each value of a random variable:
Adds a to (subtracts a from) measures of center and location (mean, median, quartiles, percentiles).
Does not change measures of variability (range, IQR, standard deviation).
Does not change the shape of the probability distribution.
Sum/Difference of Random Variables X and Y:
Mean of Sum/Difference: μ(X+Y) = μX + μY and μ(X-Y) = μX - μY
Variance (X and Y independent): σ²(X+Y) = σ²(X-Y) = σ²X + σ²Y. Variances add for both sums and differences; standard deviations do not add / subtract.
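For two independent random variables, the rules for means and variances of a sum or difference can be checked exactly by building the distribution of X + Y and X - Y. The two small distributions below are made up for illustration, and the helper names are assumptions:

```python
from itertools import product
from math import sqrt

def mean_var(dist):
    """Mean and variance of a discrete distribution given as {value: prob}."""
    mean = sum(x * p for x, p in dist.items())
    var = sum((x - mean) ** 2 * p for x, p in dist.items())
    return mean, var

def combine(dist_x, dist_y, sign=1):
    """Distribution of X + sign*Y, assuming X and Y are independent."""
    out = {}
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        out[x + sign * y] = out.get(x + sign * y, 0.0) + px * py
    return out

# Hypothetical distributions (not from the notes)
dist_x = {2: 0.25, 3: 0.5, 4: 0.25}   # mean 3, variance 0.5
dist_y = {1: 0.5, 2: 0.5}             # mean 1.5, variance 0.25

print(mean_var(combine(dist_x, dist_y, +1)))  # (4.5, 0.75)
print(mean_var(combine(dist_x, dist_y, -1)))  # (1.5, 0.75)
print(round(sqrt(0.5) + sqrt(0.25), 3), round(sqrt(0.75), 3))  # 1.207 0.866
```

The last line shows that adding the two standard deviations (1.207) does not give the standard deviation of the sum (0.866).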
Multiplying (or dividing) each value of a random variable by the same positive number b:
Multiplies (divides) measures of center and location (mean, median, quartiles, percentiles) by b.
Multiplies (divides) measures of variability (range, IQR, standard deviation) by b.
Does not change the shape of the distribution.
Linear Transformation: If Y = a + bX is a linear transformation of the random variable X,
The distribution of Y has the same shape as the probability distribution of X if b > 0
μY = a + bμX
σY = |b|σX (because b could be a negative number)
Question: The score X of a randomly selected student on a test worth 50 points can be modeled by a Normal distribution with mean 35 and standard deviation 5. The professor decides to add 5 points to each student’s score. Let Y be the scaled test score of the randomly selected student. Describe the shape, center, and variability of the probability distribution of Y.
Answer:
Shape: Approximately Normal
Center: µY = µX + 5 = 35 + 5 = 40
Variability: σY = σX = 5
Question: A student is full-time if taking between 12 and 18 units. Let X = the number of units that a randomly selected El Dorado Community College full-time student is taking (see histogram). The mean is µX = 14.65 and the standard deviation is σX = 2.056. Tuition is $50 per unit. If T = tuition charge for a randomly selected full-time student, then T = 50X. Describe the shape, center, and variability of the probability distribution of T.
Answer:
Shape: Same shape as the probability distribution of X, roughly symmetric with three peaks.
Center: µT = 50µX = 50(14.65) = $732.50
Variability: σT = 50σX = 50(2.056) = $102.80
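Both worked answers follow from the linear transformation rules μY = a + bμX and σY = |b|σX. A small sketch that applies them to the two examples; the function name linear_transform is an illustrative assumption:

```python
def linear_transform(mean_x, sd_x, a=0.0, b=1.0):
    """Mean and SD of Y = a + bX, given the mean and SD of X."""
    return a + b * mean_x, abs(b) * sd_x

# Test scores: Y = X + 5 (a = 5, b = 1)
mean_y, sd_y = linear_transform(35, 5, a=5)
print(f"{mean_y:.2f} {sd_y:.2f}")  # 40.00 5.00

# Tuition: T = 50X (a = 0, b = 50)
mean_t, sd_t = linear_transform(14.65, 2.056, b=50)
print(f"{mean_t:.2f} {sd_t:.2f}")  # 732.50 102.80
```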
Question: Let X = the number of passengers on a randomly selected trip at Pete’s Jeep Tours. Let Y = the number of passengers on a randomly selected trip at Erin’s Adventures. What is the sum S = X + Y of the number of passengers Pete and Erin will have on their tours on a randomly selected day?
Answer:
Binomial Setting is when n independent trials of the same chance process are performed and we count how many times a particular outcome (aka success) occurs. Test using BINS
Binary? Possible outcomes should be “success” or “failure”
Independent? Trials must be independent. Knowing the outcome of one trial must not tell us anything about the outcome of another trial.
Number? The number of trials must be fixed in advance
Same Probability? The probability of success p must be the same on each trial.
Binomial Random Variable is the count of successes X in a binomial setting (literally a discrete variable)
Binomial Distribution is the probability distribution of X.
n trials and probability p of success on each trial.
Binomial Coefficient counts the number of ways to arrange k successes among n trials.
(n choose k) = n! / ( k! * (n-k)! )
Binomial Probability Formula uses the Binomial Coefficient
Actual formula on equation sheet - P(X = x) = (n choose x)…
binompdf (n,p,k) computes P(X = k)
binomcdf (n,p,k) computes P(X ≤ k)
Steps to find Binomial Probabilities:
State the distribution and the values of interest. Specify a binomial distribution with the number of trials n, success probability p, and the values of the variable clearly identified.
Perform the calculations, show calculator steps (or formula)
Be sure to answer the question
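For checking calculator results, here is a sketch of what binompdf and binomcdf compute, using the standard binomial probability formula P(X = k) = (n choose k) p^k (1 - p)^(n - k). The Python function names are assumptions chosen to mirror the calculator commands and their (n, p, k) argument order:

```python
from math import comb

def binom_pmf(n, p, k):
    """P(X = k): exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(n, p, k):
    """P(X <= k): at most k successes in n trials."""
    return sum(binom_pmf(n, p, i) for i in range(k + 1))

# X = number of heads in three tosses, assuming a fair coin: n = 3, p = 0.5
print(binom_pmf(3, 0.5, 2))  # 0.375
print(binom_cdf(3, 0.5, 2))  # 0.875
```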
Center & Variability (Only for Binomial Distributions):
Mean: μX = E(X) = np
SD: σX = √( np(1 - p) )
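The shortcut formulas agree with the general definitions of mean and standard deviation applied to the binomial distribution. A sketch comparing the two, with illustrative function names and the three-toss example (coin assumed fair):

```python
from math import comb, sqrt

def binom_mean_sd(n, p):
    """Shortcut formulas: mean = np, SD = sqrt(np(1 - p))."""
    return n * p, sqrt(n * p * (1 - p))

def mean_sd_from_definition(n, p):
    """Same quantities computed from the full probability distribution."""
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    mean = sum(k * pk for k, pk in enumerate(pmf))
    var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
    return mean, sqrt(var)

mean_x, sd_x = binom_mean_sd(3, 0.5)
print(mean_x, round(sd_x, 3))  # 1.5 0.866
```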
10% Rule: When sampling without replacement, the count of successes is approximately binomial as long as the sample size n is less than 10% of the population (trials can then be treated as independent for BINS)
# of trials < 0.10 * population
Ex: A Starburst bag comes with 200 candies. 25% chance one candy is red. Drawing candies without replacement is not independent, so it strictly fails BINS. But since there are far more than 2,000 Starbursts in the world (200 is less than 10% of the population), the 10% rule applies. This distribution is approximately binomial.
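To see why the 10% rule works, compare the exact probability when sampling without replacement (a hypergeometric calculation, which is not covered in these notes) with the binomial approximation. The population size of 10,000 below is a made-up number for illustration, and the function names are assumptions:

```python
from math import comb

def ten_percent_condition(n, population):
    """True when the sample is less than 10% of the population."""
    return n < 0.10 * population

def hypergeom_pmf(N, K, n, k):
    """Exact P(X = k) drawing n without replacement from N items, K successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binom_pmf(n, p, k):
    """Binomial P(X = k), which treats the draws as independent."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Hypothetical: 10,000 candies, 25% red, one bag of 200 drawn without replacement
N, K, n = 10_000, 2_500, 200
print(ten_percent_condition(n, N))  # True (200 < 1000)

exact = hypergeom_pmf(N, K, n, 50)
approx = binom_pmf(n, K / N, 50)
print(round(exact, 4), round(approx, 4))  # the two values should be close
```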
Large Counts Condition: the probability distribution of a binomial random variable X is approximately Normal if:
np ≥ 10 AND n(1 - p) ≥ 10
AKA: Expected numbers of successes and failures are both at least 10
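The condition is a pair of simple inequalities. A minimal sketch, with an illustrative function name; the first call reuses n = 200 and p = 0.25 from the Starburst example and the second uses made-up values:

```python
def large_counts_ok(n, p, threshold=10):
    """True when expected successes and expected failures are both >= threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

print(large_counts_ok(200, 0.25))  # True: np = 50 and n(1 - p) = 150
print(large_counts_ok(20, 0.1))    # False: np = 2
```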
Geometric setting exists when independent trials are performed until the first success occurs, so the number of trials is not fixed in advance. Trials must be independent and the probability p of success must be the same on each trial.
Geometric Random Variable is the number of trials Y it takes to get a success
Geometric Probability Formula: Y has a geometric distribution, p is the probability of success, and k is the specific trial on which the first success occurs
Actual formula on equation sheet - P(X = x) = ( 1 - p)…
geometpdf (p,k) computes P(Y = k)
geometcdf (p,k) computes P(Y ≤ k)
Shape: Every Geometric Distribution is skewed right (no exceptions)
Center: μY = E(Y) = 1/p
SD: σY = √(1 - p) / p
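A sketch of what geometpdf and geometcdf compute, using the standard geometric results: the first success on trial k requires k - 1 failures followed by one success, and P(Y ≤ k) is the complement of k straight failures. The function names and the coin example (assumed fair) are illustrative:

```python
from math import sqrt

def geom_pmf(p, k):
    """P(Y = k): first success occurs on trial k."""
    return (1 - p) ** (k - 1) * p

def geom_cdf(p, k):
    """P(Y <= k): first success occurs on or before trial k."""
    return 1 - (1 - p) ** k

def geom_mean_sd(p):
    """Mean 1/p and SD sqrt(1 - p)/p of a geometric random variable."""
    return 1 / p, sqrt(1 - p) / p

# Hypothetical: toss a fair coin until the first head, p = 0.5
print(geom_pmf(0.5, 3))  # 0.125
print(geom_cdf(0.5, 3))  # 0.875
mean_y, sd_y = geom_mean_sd(0.5)
print(mean_y, round(sd_y, 3))  # 2.0 1.414
```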