Comprehensive Study Notes: Probability and the Sampling Distribution of the Sample Mean

Variability and Randomness

  • Statistics is fundamentally the study of variability.
  • Uncertainty is managed by investigating random behavior.
  • Random behavior is characterized by two distinct patterns relative to time:
    • Short-run: Outcomes are unpredictable and appear haphazard.
    • Long-run: Outcomes exhibit a regular, predictable distribution.
  • A phenomenon is defined as random if individual outcomes are uncertain, but a stable distribution of outcomes emerges over a large number of repetitions.
  • A random experiment is defined as any process or activity involving uncertainty that results in two or more possible outcomes.
  • In everyday language, "randomness" is often equated with chaos or haphazard events because we often do not observe the phenomenon enough times to perceive the emerging long-run pattern.

Understanding Probability

  • The foundation of probability lies in the fact that regular patterns emerge only after many repeated trials (e.g., rolling dice, tossing coins, or lottery outcomes).
  • The Coin Toss Experiment:
    • Assuming a fair coin, the likelihood of observing a Head is equal to that of observing a Tail (50% chance each).
    • In a sequence like $H, T, T, H, T, H$, the observed proportions of Heads after each toss are $1.0, 0.5, 0.33, 0.5, 0.4, 0.5$.
    • Proportions vary significantly in the early stages, but in the long run, the proportion of Heads will consistently stay very close to $0.5$.
    • Empirical Threshold: If a fair coin is tossed $10{,}000$ times, it is almost certain that one will observe between $4{,}800$ and $5{,}200$ Heads.
  • Definition of Probability: The probability of any outcome of a random phenomenon is the proportion of times that specific outcome would occur in an infinitely long series of trials.
  • Probability Theory: This branch of mathematics describes random behavior using mathematical models. Because we cannot repeat an experiment infinitely many times, we use models to describe what would happen theoretically.
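
The short-run versus long-run behaviour of the coin toss can be checked with a small simulation. The following is a minimal Python sketch and not part of the original notes; the function name `running_proportions` and the seed are illustrative assumptions.

```python
import random

def running_proportions(tosses):
    """Return the proportion of Heads after each toss in the sequence."""
    heads = 0
    proportions = []
    for i, toss in enumerate(tosses, start=1):
        heads += toss == "H"          # True counts as 1, False as 0
        proportions.append(heads / i)
    return proportions

# The short sequence from the notes: H, T, T, H, T, H
short_run = running_proportions(["H", "T", "T", "H", "T", "H"])
print([round(p, 2) for p in short_run])   # [1.0, 0.5, 0.33, 0.5, 0.4, 0.5]

# A long run of simulated fair tosses (seed chosen arbitrarily for repeatability)
rng = random.Random(1)
long_run = running_proportions(rng.choice("HT") for _ in range(10_000))
print(long_run[-1])                       # final proportion, expected near 0.5
```

The early proportions jump around, while the final proportion after 10,000 simulated tosses should sit close to 0.5.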

Proportions vs. Probability

  • Proportion: A value that is known or has been observed. It is spoken of in the present tense.
  • Probability: A theoretical value representing the proportion after an infinitely long series of trials. It relates to future events.

Probability Models and Sample Spaces

  • A probability model consists of two components:
    1. A list of all possible outcomes.
    2. A probability assigned to each outcome.
  • The Sample Space ($S$): The set of all possible outcomes for a random phenomenon.
    • Simple Examples:
      • Tossing a coin once: $S = \{H, T\}$.
      • Tossing a coin three times: $S = \{HHH, HHT, HTH, HTT, THH, THT, TTH, TTT\}$.
    • Complex Examples:
      • Lotto 6/49: Choosing six numbers from 49 leads to nearly 14,000,000 possible combinations.
      • Sports (Valour FC soccer): Considering the next two games ($W$ = Win, $T$ = Tie, $L$ = Loss), the order matters. $S = \{WW, WT, WL, TW, TT, TL, LW, LT, LL\}$. Note that winning first and then losing ($WL$) is distinct from losing first and then winning ($LW$).
      • Rolling Two Dice: The sample space contains 36 outcomes ($11, 12, \dots, 66$). If the variable of interest is the sum of the two dice, then $S = \{2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12\}$.
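
The sample spaces listed above are small enough to enumerate directly. This is a minimal Python sketch using only the standard library; the variable names are illustrative and not from the notes.

```python
from itertools import product
from math import comb

# Three coin tosses: every ordered sequence of H and T
three_tosses = ["".join(seq) for seq in product("HT", repeat=3)]
print(three_tosses)       # 8 outcomes, HHH through TTT

# Two soccer games: order matters, so WL and LW are separate outcomes
two_games = ["".join(seq) for seq in product("WTL", repeat=2)]
print(len(two_games))     # 9

# Two dice: 36 ordered outcomes, but only 11 distinct sums
two_dice = list(product(range(1, 7), repeat=2))
sums = sorted({a + b for a, b in two_dice})
print(len(two_dice), sums)

# Lotto 6/49: number of ways to choose 6 numbers from 49 (order ignored)
print(comb(49, 6))        # 13983816
```

The last count is the exact figure behind the "nearly 14,000,000" combinations quoted for Lotto 6/49.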

Rules and Probability Distributions

  • For a sample space $S = \{O_1, O_2, \dots, O_n\}$, let the probability of individual outcome $O_i$ be denoted as $p_i$.
  • Fundamental Conditions:
    1. Each individual probability must be between 0 and 1: $0 \le p_i \le 1$ for all $i = 1, 2, \dots, n$.
    2. The sum of all probabilities must equal exactly 1: $p_1 + p_2 + \dots + p_n = 1$.
  • Events:
    • An event is a subset of outcomes from the sample space.
    • Example (Rolling two dice): If event $A$ is "At Least One 4", then $A = \{14, 24, 34, 41, 42, 43, 44, 45, 46, 54, 64\}$. If event $B$ is "Sum is 9", then $B = \{36, 45, 54, 63\}$.
  • Probability of Events: Calculated by adding the probabilities of all individual outcomes contained within that event.
    • Example: $P(\text{Sum of dice} > 8) = P(X=9) + P(X=10) + P(X=11) + P(X=12) = \frac{4}{36} + \frac{3}{36} + \frac{2}{36} + \frac{1}{36} = \frac{10}{36} \approx 0.2778$.
  • Complements ($A^c$): The event containing all outcomes in the sample space not found in $A$.
    • Rule: $P(A^c) = 1 - P(A)$.
    • Examples: $P(\text{Win}) = 1 - P(\text{Tie or Lose})$; $P(\text{Sum} \ge 5) = 1 - P(\text{Sum} \le 4)$.
  • Probability Distribution: A table or rule that provides all possible values of a variable and the specific probability for each value.
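
The rule of adding outcome probabilities, and the complement rule, can be verified by enumerating the two-dice sample space. This is a minimal sketch that assumes fair dice, so each of the 36 ordered outcomes has probability 1/36; the helper `prob` is a hypothetical name.

```python
from fractions import Fraction
from itertools import product

# Equally likely model: each of the 36 ordered outcomes has probability 1/36
outcomes = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(outcomes))

def prob(event):
    """Probability of an event = sum of the probabilities of its outcomes."""
    return sum(p for outcome in outcomes if event(outcome))

at_least_one_4 = prob(lambda o: 4 in o)
sum_over_8 = prob(lambda o: sum(o) > 8)
sum_at_most_4 = prob(lambda o: sum(o) <= 4)

print(at_least_one_4)        # 11/36
print(sum_over_8)            # 5/18, i.e. 10/36
print(1 - sum_at_most_4)     # 5/6, P(Sum >= 5) via the complement rule
```

Using `Fraction` keeps the arithmetic exact, so the results can be compared directly with the hand calculation.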

Random Variables and Calculations

  • A random variable ($X$) provides a numerical description of the outcome of a statistical experiment.
  • Case Study: NHL Atlantic Division Winner:
    • Teams: Montreal, Ottawa, Toronto, and others.
    • Probabilities: Montreal ($k$), Ottawa ($0.05$), Toronto ($4k$), Team 4 ($2k$), Team 5 ($0.02$), Team 6 ($0.04$), Team 7 ($0.29$), Team 8 ($3k$).
    • Calculation for $k$: $k + 0.05 + 4k + 2k + 0.02 + 0.04 + 0.29 + 3k = 1 \Rightarrow 10k + 0.40 = 1 \Rightarrow 10k = 0.60 \Rightarrow k = 0.06$.
    • Probability a Canadian team wins: $P(\text{Montreal}) + P(\text{Ottawa}) + P(\text{Toronto}) = 0.06 + 0.05 + 4(0.06) = 0.35$.
  • Case Study: NHL Pacific Division (Complementary Logic):
    • Probabilities provided: Calgary ($0.08$), Edmonton ($0.27$), Vancouver ($0.09$). Others are incomplete.
    • Probability an American team wins: $1 - P(\text{Canadian team wins}) = 1 - (0.08 + 0.27 + 0.09) = 1 - 0.44 = 0.56$.
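
Both case studies rest on the condition that the probabilities sum to 1. The arithmetic can be reproduced with exact fractions, as in this minimal sketch; the variable names are illustrative, and the code assumes, as the notes do, that Calgary, Edmonton, and Vancouver are the only Canadian teams in the Pacific Division model.

```python
from fractions import Fraction

# Known probabilities and the multiples of k from the Atlantic Division example
known = [Fraction("0.05"), Fraction("0.02"), Fraction("0.04"), Fraction("0.29")]
k_multiples = [1, 4, 2, 3]   # Montreal, Toronto, Team 4, Team 8

# Probabilities must sum to 1, so k = (1 - sum of known) / (sum of multiples)
k = (1 - sum(known)) / sum(k_multiples)
print(float(k))                          # 0.06

canadian = k + Fraction("0.05") + 4 * k  # Montreal + Ottawa + Toronto
print(float(canadian))                   # 0.35

# Pacific Division: complement of the Canadian teams' total
american = 1 - (Fraction("0.08") + Fraction("0.27") + Fraction("0.09"))
print(float(american))                   # 0.56
```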

Continuous Random Variables

  • While discrete random variables (like dice sums or coin counts) take only certain values, continuous random variables can take any value in an interval.
  • Sample Space Example: Time for a light bulb to burn out, $S = \{\text{all values of } x \text{ such that } x \ge 0\}$.
  • Probability Assignment: Because there are infinitely many outcomes, probabilities are assigned to intervals of values rather than individual points.
  • Density Curves: The area under a density curve over a given interval represents the probability of observing an outcome in that interval. The total area under the curve is 1.
  • Normal Probability Distribution: Denoted as $X \sim N(\mu, \sigma)$, where probabilities correspond to the area under the Normal curve.
  • Example (Pulse Rates): Adult females have pulse rates with $\mu = 74\,\text{bpm}$ and $\sigma = 12\,\text{bpm}$.
    • $P(X > 57) = P\left(Z > \frac{57 - 74}{12}\right) = P(Z > -1.42) = 1 - P(Z < -1.42) = 1 - 0.0778 = 0.9222$.
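
The pulse-rate calculation can be cross-checked in software. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative. It computes the probability twice: once following the table steps (with z rounded to two decimals) and once without rounding.

```python
from statistics import NormalDist

pulse = NormalDist(mu=74, sigma=12)

z = (57 - 74) / 12                            # standardized value, about -1.42
p_table = 1 - NormalDist().cdf(round(z, 2))   # same steps as a Z-table lookup
p_exact = 1 - pulse.cdf(57)                   # P(X > 57) without rounding z

print(round(z, 2), round(p_table, 4), round(p_exact, 4))
```

The two probabilities agree to about three decimal places; the small gap comes only from rounding z before the table lookup.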

The Sampling Distribution of the Sample Mean ($\bar{X}$)

  • Instead of observing single individuals, researchers often take a Random Sample of size $n$ and calculate the sample mean ($\bar{x}$).
  • The Sampling Distribution of a Statistic is the distribution of values taken by that statistic in all possible samples of the same size from the same population.
  • Conceptual Experiment:
    • Repeatedly take samples of size $n$ from a population with mean $\mu$ and standard deviation $\sigma$.
    • Calculate $\bar{x}$ for each sample.
    • Plot the histogram of $\bar{x}$ values.
  • Key Characteristics of the Distribution of $\bar{X}$:
    1. The mean of the sampling distribution is equal to the population mean ($\mu_{\bar{X}} = \mu$).
    2. The standard deviation (Standard Error) of the sampling distribution is smaller than the population standard deviation whenever $n > 1$: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$.
    3. Averages are consistently less variable than individual observations.
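
The conceptual experiment above can be run as a simulation. This is a minimal sketch under stated assumptions: the population is taken to be Normal with $\mu = 50$ and $\sigma = 10$, the sample size is $n = 25$, and 5,000 samples are drawn. All of these numbers and the seed are hypothetical choices, not values from the notes.

```python
import random
from statistics import mean, pstdev

rng = random.Random(42)            # arbitrary seed for repeatability
mu, sigma, n = 50, 10, 25          # hypothetical population and sample size

# Draw many samples of size n and record each sample mean
sample_means = [
    mean([rng.gauss(mu, sigma) for _ in range(n)])
    for _ in range(5_000)
]

print(mean(sample_means))          # should be close to mu = 50
print(pstdev(sample_means))        # should be close to sigma / sqrt(n) = 2
```

The spread of the simulated sample means should be roughly one fifth of the population standard deviation, in line with $\sigma / \sqrt{n}$ for $n = 25$.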

The Central Limit Theorem (CLT)

  • The Theorem: When taking a Simple Random Sample (SRS) of size $n$ from any population with mean $\mu$ and standard deviation $\sigma$, the sampling distribution of $\bar{X}$ is approximately Normal if $n$ is sufficiently large.
  • Notation: $\bar{X} \approx N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$.
  • Significance: The original population distribution does not need to be symmetric or normal. As $n$ increases, the skewness of the original distribution is overcome.
  • Sample Size Guidelines:
    • For symmetric distributions, $\bar{X}$ is approximately Normal even for small $n$.
    • For strongly skewed distributions, a larger $n$ is required.
    • Course Rule: It is safe to apply the CLT when $n \ge 30$.
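
One way to see the skewness being "overcome" is to simulate sample means from a strongly right-skewed population and measure how skewed they remain. This is a minimal sketch, assuming an Exponential population with mean 1; the population choice, the helper names `skewness` and `sample_means`, the seed, and the number of repetitions are all illustrative assumptions.

```python
import random
from statistics import mean, pstdev

def skewness(values):
    """Sample skewness: average cubed z-score (near 0 for a symmetric shape)."""
    m, s = mean(values), pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])

rng = random.Random(7)   # arbitrary seed for repeatability

def sample_means(n, reps=5_000):
    """Means of `reps` samples of size n from an Exponential(1) population."""
    return [mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]

# Skewness of the sampling distribution of the mean for increasing n
skews = {n: skewness(sample_means(n)) for n in (1, 5, 30)}
for n, s in skews.items():
    print(n, round(s, 2))
```

The skewness should shrink as $n$ grows, which is the behaviour the $n \ge 30$ course rule relies on.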

Practical Examples and R Code

  • Male Heights Case Study: Population $X \sim N(178, 6)$.
    • Individual Probability: $P(X > 180) = P\left(Z > \frac{180 - 178}{6}\right) = P(Z > 0.33) = 0.3707$.
    • Sample Probability ($n = 10$): $P(\bar{X} > 180) = P\left(Z > \frac{180 - 178}{\frac{6}{\sqrt{10}}}\right) = P(Z > 1.05) \approx 0.1469$.
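    • R Code for Sample Mean: pnorm(180, 178, 6/sqrt(10), lower.tail = FALSE) yields 0.1459203. This differs slightly from the table value of 0.1469 because the table calculation rounds z to 1.05.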
  • Light Bulb Lifetimes Case Study:
    • Population is right-skewed with $\mu = 400\,\text{hr}$ and $\sigma = 250\,\text{hr}$.
    • For $n = 40$ bulbs, calculate the probability the mean lifetime exceeds $450\,\text{hr}$.
    • Since $n \ge 30$, use the CLT: $P(\bar{X} > 450) \approx P\left(Z > \frac{450 - 400}{\frac{250}{\sqrt{40}}}\right) = P(Z > 1.26) = 0.1038$.
  • Metal Bolts Case Study ($\mu = 1.25\,\text{cm}$, $\sigma = 0.05\,\text{cm}$):
    • Probability that an SRS of $n = 100$ has a mean diameter between $1.24\,\text{cm}$ and $1.26\,\text{cm}$.
    • $P(1.24 < \bar{X} < 1.26) \approx P\left(\frac{1.24 - 1.25}{\frac{0.05}{\sqrt{100}}} < Z < \frac{1.26 - 1.25}{\frac{0.05}{\sqrt{100}}}\right) = P(-2.00 < Z < 2.00) = 0.9544$.
    • Constraint: If we only select $n = 5$ bolts, and the underlying distribution of $X$ is unknown, we cannot calculate the probability because the sample size is too small for the CLT.
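
The three case studies can also be reproduced in Python as a cross-check on the table-based values. This is a minimal sketch using `statistics.NormalDist`; the helper `prob_mean_above` is a hypothetical name, and the code simply assumes $\bar{X} \sim N(\mu, \sigma/\sqrt{n})$, which is exact for the heights example and a CLT approximation for the other two.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_above(x, mu, sigma, n):
    """P(X-bar > x), assuming X-bar ~ N(mu, sigma / sqrt(n))."""
    return 1 - NormalDist(mu, sigma / sqrt(n)).cdf(x)

# Male heights: one individual (n = 1) versus the mean of n = 10
print(round(prob_mean_above(180, 178, 6, 1), 4))
print(round(prob_mean_above(180, 178, 6, 10), 4))    # 0.1459, matching pnorm

# Light bulbs: n = 40, Normal approximation justified by the CLT
print(round(prob_mean_above(450, 400, 250, 40), 4))

# Metal bolts: P(1.24 < X-bar < 1.26) with n = 100
bolts = NormalDist(1.25, 0.05 / sqrt(100))
print(round(bolts.cdf(1.26) - bolts.cdf(1.24), 4))
```

Software values can differ from the table-based answers in the third or fourth decimal place, because the table method rounds z to two decimals before the lookup.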

Summary Classification for $\bar{X}$

  • Scenario 1: Population is Normal:
    • $X \sim N(\mu, \sigma)$.
    • Result: $\bar{X}$ is exactly Normal for any sample size $n$.
  • Scenario 2: Population is Not Normal / Unknown:
    • If $n \ge 30$: $\bar{X}$ is approximately Normal by the CLT.
    • If $n < 30$: $\bar{X}$ cannot be assumed to be Normal, so Normal-based probability calculations cannot be applied.
  • Universal Truth: For any distribution, the mean of the sample mean is $\mu$ and the standard deviation is $\frac{\sigma}{\sqrt{n}}$.
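
The two formulas in the last point follow from the standard rules for means and variances of sums. The derivation below is a short sketch, not part of the original notes. It assumes the observations $X_1, \dots, X_n$ are independent and each has mean $\mu$ and standard deviation $\sigma$; the symbols $E[\cdot]$ and $\operatorname{Var}(\cdot)$ for expectation and variance are notation introduced here.

```latex
\begin{align*}
\mu_{\bar{X}}
  &= E\!\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right]
   = \frac{1}{n}\sum_{i=1}^{n} E[X_i]
   = \frac{1}{n}\,(n\mu)
   = \mu, \\
\sigma_{\bar{X}}^{2}
  &= \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
   = \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i)
   = \frac{n\sigma^{2}}{n^{2}}
   = \frac{\sigma^{2}}{n}, \\
\sigma_{\bar{X}}
  &= \frac{\sigma}{\sqrt{n}}.
\end{align*}
```

Neither step uses the shape of the population distribution, which is why the result holds whether or not the population is Normal.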