Module 4, Section 4: Probability Model

Probability Model: Fundamental Concepts and Applications

1. Introduction to Probability Models and Expected Value

Definition: A probability model describes the possible outcomes of a probabilistic experiment and assigns a probability to each outcome, and therefore to each event built from those outcomes.
Expected Value (E(X)): Represents the long-run average outcome of a random variable. It is a weighted average of all possible values, where the weights are the probabilities of those values.
Example: Investment Strategies
- Strategy 1: 20\% chance to profit 1,000, 80\% chance to lose 100.
  - Expected Profit: E(\text{profit with strategy 1}) = (-100)(0.8) + (1000)(0.2) = -80 + 200 = 120.
- Strategy 2: 15\% chance to profit 12,000, 85\% chance to lose 2,000.
  - Expected Profit: E(\text{profit with strategy 2}) = (-2000)(0.85) + (12000)(0.15) = -1700 + 1800 = 100.
- Conclusion: Judged by expected profit alone, Strategy 1 is better in the long run: its expected profit of 120 exceeds the 100 for Strategy 2.
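The expected-profit comparison above can be checked with a few lines of Python. This is only a minimal sketch: the helper name `expected_value` and the list-of-pairs representation are illustrative choices, not part of the original notes.

```python
def expected_value(outcomes):
    """Return the probability-weighted average of (value, probability) pairs."""
    total_probability = sum(p for _, p in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(value * p for value, p in outcomes)


# (profit, probability) pairs taken from the two strategies in the example
strategy_1 = [(1000, 0.20), (-100, 0.80)]
strategy_2 = [(12000, 0.15), (-2000, 0.85)]

ev_1 = expected_value(strategy_1)
ev_2 = expected_value(strategy_2)

print(f"Strategy 1 expected profit: {ev_1:.2f}")
print(f"Strategy 2 expected profit: {ev_2:.2f}")
```

The sketch compares expected profit only; it says nothing about how widely the outcomes of each strategy are spread.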

2. Properties of Discrete Random Variables

Notation: Let X be a discrete random variable.
Expected Value (Mean): Denoted by E(X), \mu_X, or \mu. It is calculated as the sum of each possible value multiplied by its probability: E(X) = \mu_X = \mu = \sum x_i P(x_i)
Variance: Denoted by \text{Var}(X), \sigma_X^2, or \sigma^2. It measures the spread or dispersion of the distribution around the mean. It is calculated as the sum of the squared differences between each value and the mean, weighted by their probabilities: \text{Var}(X) = \sigma_X^2 = \sigma^2 = \sum (x_i - \mu)^2 P(x_i)
Standard Deviation (SD): Denoted by SD(X), \sigma_X, or \sigma. It is the square root of the variance and provides a measure of variability in the original units of the random variable. SD(X) = \sigma_X = \sigma = \sqrt{\text{Var}(X)}
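As an illustration of the three formulas, the sketch below applies them to the Strategy 1 profit distribution from Section 1. The function names `mean`, `variance`, and `sd` are illustrative, not from the notes.

```python
import math


def mean(dist):
    """E(X): sum of x_i * P(x_i) over (value, probability) pairs."""
    return sum(x * p for x, p in dist)


def variance(dist):
    """Var(X): sum of (x_i - mu)^2 * P(x_i)."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist)


def sd(dist):
    """SD(X): square root of the variance, in the original units."""
    return math.sqrt(variance(dist))


# Strategy 1 from the investment example: profit 1000 w.p. 0.2, loss 100 w.p. 0.8
strategy_1 = [(1000, 0.20), (-100, 0.80)]

mu_1 = mean(strategy_1)
var_1 = variance(strategy_1)
sd_1 = sd(strategy_1)
print(mu_1, var_1, sd_1)
```

Worked by hand, the variance is (880)^2(0.2) + (-220)^2(0.8) = 154880 + 38720 = 193600, so the standard deviation is 440, which is large relative to the expected profit of 120.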

3. Transformations of Random Variables

These properties apply to both discrete and continuous random variables.

Shifting (Adding or Subtracting a Constant a):
- Expected Value: When a constant is added to or subtracted from a random variable, the mean shifts by the same constant.
  E(X \pm a) = E(X) \pm a
- Standard Deviation: Adding or subtracting a constant does not change the spread of the distribution, so the standard deviation remains the same.
  SD(X \pm a) = SD(X)
Scaling (Multiplying or Dividing by a Constant b):
- Expected Value: When a random variable is multiplied by a constant, the mean is also multiplied by that constant.
  E(bX) = bE(X)
- Standard Deviation: When a random variable is multiplied by a constant, the standard deviation is multiplied by the absolute value of that constant.
  SD(bX) = |b|SD(X)
- Variance: The variance is multiplied by the square of the constant.
  \text{Var}(bX) = b^2 \text{Var}(X)
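The shifting and scaling rules can be verified numerically by transforming every value of a small discrete distribution and recomputing the mean and standard deviation. The distribution below is hypothetical, chosen only for illustration, and the helper `transform` is an assumed name.

```python
import math


def mean(dist):
    return sum(x * p for x, p in dist)


def sd(dist):
    mu = mean(dist)
    return math.sqrt(sum((x - mu) ** 2 * p for x, p in dist))


def transform(dist, scale=1.0, shift=0.0):
    """Distribution of scale * X + shift: values change, probabilities do not."""
    return [(scale * x + shift, p) for x, p in dist]


# Hypothetical distribution (not from the notes): X takes 0, 1, 2
x = [(0, 0.25), (1, 0.50), (2, 0.25)]

shifted = transform(x, shift=5)    # X + 5
scaled = transform(x, scale=-3)    # -3X, a negative constant on purpose

print(mean(shifted), sd(shifted))  # mean moves by 5, SD unchanged
print(mean(scaled), sd(scaled))    # mean times -3, SD times |-3| = 3
```

A negative scale factor is used deliberately: it shows why the standard deviation rule needs the absolute value of b, while the mean keeps the sign.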

4. Combining Two Independent Random Variables (X and Y)

These properties apply to both discrete and continuous random variables. The expected-value rule holds whether or not X and Y are independent; the variance and standard deviation rules require independence.

Expected Value of a Sum or Difference: The expected value of the sum or difference of two random variables is the sum or difference of their individual expected values.
E(X \pm Y) = E(X) \pm E(Y)
Variance of a Sum or Difference (for Independent Variables): For independent random variables, the variance of their sum or difference is the sum of their individual variances. \text{Var}(X \pm Y) = \text{Var}(X) + \text{Var}(Y)
- Important Note: Even when calculating the variance of a difference, the individual variances are added.
Standard Deviation of a Sum or Difference: The standard deviation of the sum or difference is the square root of the sum of their variances. SD(X \pm Y) = \sqrt{\text{Var}(X \pm Y)} = \sqrt{\text{Var}(X) + \text{Var}(Y)}
- Crucial Distinction: The standard deviation of a sum or difference is not equal to the sum or difference of the individual standard deviations. That is, SD(X \pm Y) \neq SD(X) \pm SD(Y).
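One way to convince yourself that variances add even for a difference is to enumerate the joint distribution of two independent discrete variables. The two distributions below are hypothetical, and `combine_independent` is an illustrative helper name.

```python
import math
from itertools import product


def mean(dist):
    return sum(v * p for v, p in dist)


def variance(dist):
    mu = mean(dist)
    return sum((v - mu) ** 2 * p for v, p in dist)


def combine_independent(dist_x, dist_y, op):
    """Distribution of op(X, Y) for independent X and Y.

    Independence means each joint probability is the product P(x) * P(y).
    """
    return [(op(x, y), px * py) for (x, px), (y, py) in product(dist_x, dist_y)]


# Hypothetical distributions, for illustration only
x = [(0, 0.5), (2, 0.5)]   # Var(X) = 1,    SD(X) = 1
y = [(1, 0.2), (4, 0.8)]   # Var(Y) = 1.44, SD(Y) = 1.2

difference = combine_independent(x, y, lambda a, b: a - b)

var_diff = variance(difference)
sd_diff = math.sqrt(var_diff)
print(var_diff, variance(x) + variance(y))   # both about 2.44
print(sd_diff, math.sqrt(variance(x)) - math.sqrt(variance(y)))
```

Here SD(X - Y) is about 1.56, while SD(X) - SD(Y) would be -0.2, which is not even a valid standard deviation.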

5. Examples and Applications of Random Variable Properties

5.1 Students Watching Movies (Conceptual Example)

Let X be the number of students who like watching movies in a random sample of two students.
Assume that p is the proportion of students who like watching movies and that the two selections are independent.
- P(X=0) (Neither likes movies) = (1-p) \times (1-p)
- P(X=1) (One likes movies) = p \times (1-p) + (1-p) \times p = 2p(1-p)
- P(X=2) (Both like movies) = p \times p
Once a value for p is known, the probability distribution is fully specified, and E(X), \text{Var}(X), and SD(X) can be calculated using the formulas above.
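To make this concrete, the sketch below builds the distribution for an assumed value p = 0.6 and computes the three summaries. The notes do not specify p, so this value is purely hypothetical, and the function name is illustrative.

```python
import math


def two_student_distribution(p):
    """P(X = k) for k = 0, 1, 2 students who like movies, selections independent."""
    return {0: (1 - p) ** 2, 1: 2 * p * (1 - p), 2: p ** 2}


p = 0.6  # hypothetical proportion, not given in the notes
dist = two_student_distribution(p)

mean_x = sum(k * prob for k, prob in dist.items())
var_x = sum((k - mean_x) ** 2 * prob for k, prob in dist.items())
sd_x = math.sqrt(var_x)

print(dist)
print(mean_x, var_x, sd_x)
```

For any p, the results agree with the shortcuts E(X) = 2p and Var(X) = 2p(1 - p), which here give 1.2 and 0.48.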

5.2 Calculator Ownership Example (Applying Shifting and Scaling)

Scenario 1: Each Group A student buys an extra calculator.
- Let X_A be the original number of calculators for a Group A student.
- New number of calculators: X_A + 1.
- New average: E(X_A + 1) = E(X_A) + 1 (The average increases by 1).
- New standard deviation: SD(X_A + 1) = SD(X_A) (The spread remains the same).
Scenario 2: Each Group B student doubles their calculators.
- Let X_B be the original number of calculators for a Group B student.
- New number of calculators: 2X_B.
- New average: E(2X_B) = 2E(X_B) (The average doubles).
- New standard deviation: SD(2X_B) = 2SD(X_B) (The spread doubles).

5.3 Combining Calculator Ownership for Multiple Students

Scenario 1: Total calculators for 2 randomly selected independent Group B students (X_{B1} + X_{B2}).
- Mean of total: E(X_{B1} + X_{B2}) = E(X_{B1}) + E(X_{B2}).
- Standard deviation of total: SD(X_{B1} + X_{B2}) = \sqrt{\text{Var}(X_{B1}) + \text{Var}(X_{B2})} = \sqrt{SD(X_{B1})^2 + SD(X_{B2})^2}. (If X_{B1} and X_{B2} are independent and identically distributed, this simplifies to \sqrt{2} \times SD(X_B); otherwise, use each variable's own standard deviation.)
Scenario 2: Total calculators for 1 Group A and 1 Group B student (X_A + X_B), independent.
- Mean of total: E(X_A + X_B) = E(X_A) + E(X_B).
- Standard deviation of total: SD(X_A + X_B) = \sqrt{\text{Var}(X_A) + \text{Var}(X_B)} = \sqrt{SD(X_A)^2 + SD(X_B)^2}.
Scenario 3: Difference in calculators between 1 Group A and 1 Group B student (X_A - X_B), independent.
- Mean of difference: E(X_A - X_B) = E(X_A) - E(X_B).
- Standard deviation of difference: SD(X_A - X_B) = \sqrt{\text{Var}(X_A) + \text{Var}(X_B)} = \sqrt{SD(X_A)^2 + SD(X_B)^2}. (Note: variances are still added).

5.4 Heights of High School Students Example

School A: E(X_A) = 71 inches, SD(X_A) = 4.5 inches.
School B: E(X_B) = 70 inches, SD(X_B) = 3 inches.
Assume heights are normally distributed and independent.
Let D = X_A - X_B (height difference).
- Expected difference: E(D) = E(X_A) - E(X_B) = 71 - 70 = 1 inch.
  - Interpretation: On average, a student from School A is expected to be 1 inch taller than a student from School B.
- Variance of the height difference: \text{Var}(D) = \text{Var}(X_A) + \text{Var}(X_B) = (4.5)^2 + (3)^2 = 20.25 + 9 = 29.25 \text{ in}^2.
- Standard deviation of the height difference: SD(D) = \sqrt{\text{Var}(D)} = \sqrt{29.25} \approx 5.41 inches.
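A quick simulation offers an informal check on these numbers. The sketch below draws pairs of heights from the two stated normal distributions and compares the simulated mean and SD of the difference with the theoretical values; the sample size, seed, and function name are arbitrary illustrative choices.

```python
import math
import random


def simulate_height_difference(n_samples, seed=0):
    """Simulate D = X_A - X_B for independent normal heights (inches)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    diffs = [rng.gauss(71, 4.5) - rng.gauss(70, 3) for _ in range(n_samples)]
    mean_d = sum(diffs) / n_samples
    var_d = sum((d - mean_d) ** 2 for d in diffs) / n_samples
    return mean_d, math.sqrt(var_d)


theoretical_sd = math.hypot(4.5, 3)  # sqrt(4.5^2 + 3^2) = sqrt(29.25)
sim_mean, sim_sd = simulate_height_difference(50_000)

print(f"theoretical: mean 1.00, SD {theoretical_sd:.2f}")
print(f"simulated:   mean {sim_mean:.2f}, SD {sim_sd:.2f}")
```

The simulated values should land close to 1 inch and 5.41 inches, though not exactly on them, since they are estimates from random draws.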

5.5 Andy (X) and Benny (Y) Study Times Example

Both X and Y are independent random variables, normally distributed.
Given: E(X) = 4 hrs, SD(X) = 1 hr; E(Y) = 5 hrs, SD(Y) = 2 hrs.
Scenario 1: Benny increases study time by 2 hours (Y + 2).
- Mean: E(Y + 2) = E(Y) + 2 = 5 + 2 = 7 hours.
- Standard Deviation: SD(Y + 2) = SD(Y) = 2 hours.
Scenario 2: Andy doubles study time (2X).
- Mean: E(2X) = 2E(X) = 2 \times 4 = 8 hours.
- Standard Deviation: SD(2X) = 2SD(X) = 2 \times 1 = 2 hours.
Scenario 3: Total study time for Andy over 2 days (X_1 + X_2), assuming the two days are independent.
- Mean: E(X_1 + X_2) = E(X_1) + E(X_2) = 4 + 4 = 8 hours.
- Variance: \text{Var}(X_1 + X_2) = \text{Var}(X_1) + \text{Var}(X_2) = (1)^2 + (1)^2 = 1 + 1 = 2.
- Standard Deviation: SD(X_1 + X_2) = \sqrt{2} \approx 1.414 hours. (Compare Scenario 2: doubling a single day gives SD(2X) = 2, whereas summing two independent days gives only \sqrt{2}.)
Scenario 4: Difference between Andy's and Benny's study time (X - Y).
- Mean: E(X - Y) = E(X) - E(Y) = 4 - 5 = -1 hour.
- Variance: \text{Var}(X - Y) = \text{Var}(X) + \text{Var}(Y) = (1)^2 + (2)^2 = 1 + 4 = 5.
- Standard Deviation: SD(X - Y) = \sqrt{5} \approx 2.236 hours.
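Python's standard library class `statistics.NormalDist` (Python 3.8+) implements these same rules for normal variables, so the four scenarios can be reproduced directly. This is a sketch; the variable names are illustrative. Note that adding or subtracting two `NormalDist` objects treats them as independent.

```python
from statistics import NormalDist

andy = NormalDist(mu=4, sigma=1)    # X: Andy's daily study time (hours)
benny = NormalDist(mu=5, sigma=2)   # Y: Benny's daily study time (hours)

benny_plus_2 = benny + 2       # Scenario 1: shift, mean 7, SD unchanged
andy_doubled = andy * 2        # Scenario 2: scale, mean 8, SD doubles
andy_two_days = andy + andy    # Scenario 3: sum of two independent days
difference = andy - benny      # Scenario 4: variances still add

for label, dist in [
    ("Y + 2", benny_plus_2),
    ("2X", andy_doubled),
    ("X1 + X2", andy_two_days),
    ("X - Y", difference),
]:
    print(f"{label}: mean {dist.mean:.3f}, SD {dist.stdev:.3f}")
```

The contrast between `andy * 2` (SD 2) and `andy + andy` (SD about 1.414) mirrors the difference between doubling one day's study time and summing two independent days.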

5.6 Probabilities with Normally Distributed Study Times

Z-Table: A table that provides the cumulative probabilities for a standard normal distribution (mean 0, standard deviation 1). To use it, a random variable X must be converted into a Z-score: Z = (X - \mu) / \sigma.
Probability: Andy studies more than 6.5 hours today (P(X > 6.5))
- Z = (6.5 - E(X)) / SD(X) = (6.5 - 4) / 1 = 2.5.
- P(X > 6.5) = P(Z > 2.5) = 1 - P(Z \le 2.5).
- Using Z-table: P(Z \le 2.50) = 0.9938.
- Result: 1 - 0.9938 = 0.0062.
Cutoff: the study time x that Andy exceeds on only his longest 10% of days (P(X > x) = 0.10)
- This means P(X \le x) = 0.90.
- Find Z-score for cumulative probability 0.90 from the Z-table: Z \approx 1.28 (corresponding to 0.8997).
- Solve for x: x = E(X) + Z \times SD(X) = 4 + 1.28 \times 1 = 5.28 hours.
Probability: Benny studies less than 2 hours today (P(Y < 2))
- Z = (2 - E(Y)) / SD(Y) = (2 - 5) / 2 = -3 / 2 = -1.5.
- P(Y < 2) = P(Z < -1.5).
- Using Z-table: P(Z \le -1.50) = 0.0668.
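Probability: Andy studies more than Benny today (P(X > Y) = P(X - Y > 0))
- Because X and Y are independent and normally distributed, their difference X - Y is also normally distributed, so a Z-score can be used.
- From previous calculations: E(X - Y) = -1, SD(X - Y) = \sqrt{5} \approx 2.236.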
- Z = (0 - E(X - Y)) / SD(X - Y) = (0 - (-1)) / \sqrt{5} = 1 / \sqrt{5} \approx 0.447.
- P(X - Y > 0) = P(Z > 0.447) = 1 - P(Z \le 0.447).
- Using Z-table for Z = 0.45 (nearest value): P(Z \le 0.45) = 0.6736.
- Result: 1 - 0.6736 = 0.3264.
Probability: Andy studies less than 5 hours in total over the next two days (P(X_1 + X_2 < 5))
- From previous calculations: E(X_1 + X_2) = 8, SD(X_1 + X_2) = \sqrt{2} \approx 1.414. As a sum of independent normal variables, X_1 + X_2 is itself normally distributed.
- Z = (5 - E(X_1 + X_2)) / SD(X_1 + X_2) = (5 - 8) / \sqrt{2} = -3 / \sqrt{2} \approx -2.121.
- P(X_1 + X_2 < 5) = P(Z < -2.121).
- Using Z-table for Z = -2.12: P(Z \le -2.12) = 0.0170.
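The table lookups in this subsection can be cross-checked in software. The sketch below uses `statistics.NormalDist` from the Python standard library, whose `cdf` method returns P(X \le x) and whose `inv_cdf` method returns the value at a given cumulative probability. Variable names are illustrative.

```python
from statistics import NormalDist

andy = NormalDist(mu=4, sigma=1)    # X
benny = NormalDist(mu=5, sigma=2)   # Y

# P(X > 6.5): upper tail is 1 minus the cumulative probability
p_andy_over_6_5 = 1 - andy.cdf(6.5)

# Study time that Andy exceeds only 10% of the time: the 90th percentile
andy_top_10_cutoff = andy.inv_cdf(0.90)

# P(Y < 2)
p_benny_under_2 = benny.cdf(2)

# P(X > Y) = P(X - Y > 0); subtraction assumes independence
p_andy_more = 1 - (andy - benny).cdf(0)

# P(X1 + X2 < 5) for two independent days
p_two_days_under_5 = (andy + andy).cdf(5)

print(f"P(X > 6.5)      = {p_andy_over_6_5:.4f}")
print(f"90th percentile = {andy_top_10_cutoff:.2f} hours")
print(f"P(Y < 2)        = {p_benny_under_2:.4f}")
print(f"P(X > Y)        = {p_andy_more:.4f}")
print(f"P(X1 + X2 < 5)  = {p_two_days_under_5:.4f}")
```

The last two results come out near 0.327 and 0.0169 rather than the table-based 0.3264 and 0.0170, because the table method rounds Z to two decimal places before the lookup.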

5.7 Coke Filling Machine Example

Assume fills (X_i) are independent and normally distributed with E(X_i) = 12.1 oz and SD(X_i) = 0.2 oz.
Total Contents in a 6-pack (Total = X_1 + X_2 + \dots + X_6)
- Expected Value: E(\text{Total}) = E(X_1) + E(X_2) + \dots + E(X_6) = 6 \times E(X) = 6 \times 12.1 = 72.6 oz.
- Variance: \text{Var}(\text{Total}) = \text{Var}(X_1) + \text{Var}(X_2) + \dots + \text{Var}(X_6) = 6 \times \text{Var}(X) = 6 \times (0.2)^2 = 6 \times 0.04 = 0.24. (Since fills are independent)
- Standard Deviation: SD(\text{Total}) = \sqrt{\text{Var}(\text{Total})} = \sqrt{0.24} \approx 0.4899 oz.
Average Contents in a 6-pack (\bar{X} = (X_1 + X_2 + \dots + X_6) / 6)
- Expected Value: E(\bar{X}) = E( (1/6) \times \text{Total} ) = (1/6) \times E(\text{Total}) = (1/6) \times 72.6 = 12.1 oz.
- Variance: \text{Var}(\bar{X}) = \text{Var}( (1/6) \times \text{Total} ) = (1/6)^2 \times \text{Var}(\text{Total}) = (1/36) \times 0.24 = 0.006667.
- Standard Deviation: SD(\bar{X}) = \sqrt{\text{Var}(\bar{X})} = \sqrt{0.006667} \approx 0.0816 oz.
  - Alternative Formula for SD(\bar{X}): For a sample mean of n independent observations from a population with standard deviation \sigma, the standard deviation of the sample mean is given by SD(\bar{X}) = \sigma / \sqrt{n}. In this case, 0.2 / \sqrt{6} \approx 0.2 / 2.449 \approx 0.0816.
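The 6-pack calculations generalize to any pack size n. The sketch below wraps them in a small function and confirms that the scaling route, SD(Total) / n, agrees with the shortcut \sigma / \sqrt{n}. The function name `pack_summary` and its return layout are illustrative assumptions.

```python
import math


def pack_summary(n, mu, sigma):
    """Mean and SD of the total and of the average for n independent fills."""
    total_mean = n * mu
    total_sd = math.sqrt(n * sigma ** 2)   # variances add under independence
    average_mean = total_mean / n          # E((1/n) * Total)
    average_sd = total_sd / n              # SD((1/n) * Total) = (1/n) * SD(Total)
    return total_mean, total_sd, average_mean, average_sd


total_mean, total_sd, average_mean, average_sd = pack_summary(n=6, mu=12.1, sigma=0.2)

print(f"Total:   mean {total_mean:.1f} oz, SD {total_sd:.4f} oz")
print(f"Average: mean {average_mean:.1f} oz, SD {average_sd:.4f} oz")
print(f"sigma / sqrt(n) = {0.2 / math.sqrt(6):.4f} oz")
```

Both routes give about 0.0816 oz for the SD of the average, so the average fill of a pack varies much less than a single fill.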