STAT 318 Lecture Notes: Probability - Discrete Random Variables, PMF, and CDF
Independent detection experiments: three independent trials and two detections
- Problem setup: impurity may be present or not; in each trial you either detect it or you do not. We consider presence vs absence as two states with their own detection behavior.
- Core question: Given that in three independent experiments you observe exactly two detections out of three, what is the probability the impurity is actually present? (i.e., P(Present | exactly 2 detections))
- Step 1: Determine the per-trial detection probability.
- There are two ways to get a detection in a single trial (two paths to a detection).
- The per-trial probability of detection is the sum over the two paths; in general this can be written as
p_{\text{det}} \,=\, P(\text{detect} \mid \text{present}) \cdot P(\text{present}) \,+\, P(\text{detect} \mid \text{not present}) \cdot P(\text{not present}).
- The lecture’s example uses numbers (e.g., a per-trial detection probability around 0.38 in a particular scenario) to illustrate calculation; the exact numbers depend on the underlying prior and detection model.
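The two-path formula can be checked numerically. The prior and conditional detection probabilities below are illustrative assumptions (chosen so that the result reproduces the lecture's example value of 0.38); the lecture's actual underlying numbers may differ:

```python
# Law of total probability for the per-trial detection probability.
# All three inputs below are assumed for illustration only.
p_present = 0.4        # prior P(present), assumed
p_det_present = 0.8    # P(detect | present), assumed
p_det_absent = 0.1     # P(detect | not present), assumed

# Sum over the two paths to a detection: present-and-detected,
# plus absent-and-(falsely)-detected.
p_det = p_det_present * p_present + p_det_absent * (1 - p_present)
print(round(p_det, 2))  # 0.38
```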
- Step 2: Use the binomial model for 3 independent trials with detection probability p.
- The probability of exactly two detections in three trials is
P(K=2) = \binom{3}{2} p^2 (1-p).
- There are 3 possible sequences that yield two detections among three trials (e.g., D D N, D N D, N D D).
- If p = 0.38 (per-trial detection probability in the example), then
P(K=2) = 3 \times (0.38)^2 \times (1-0.38) = 3 \times 0.1444 \times 0.62 \approx 0.269.
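The binomial calculation in Step 2 can be verified directly with Python's `math.comb`:

```python
from math import comb

p = 0.38  # per-trial detection probability from the lecture's example

# P(K = 2) for K ~ Binomial(n = 3, p): each of the comb(3, 2) = 3
# sequences with two detections has probability p^2 * (1 - p).
p_k2 = comb(3, 2) * p**2 * (1 - p)
print(round(p_k2, 4))  # 0.2686
```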
- Step 3: Bayesian posterior for presence given two detections.
- Let P(P) be the prior probability that the impurity is present. Let p_p = P(D \mid P) be the per-trial detection probability when present, and p_n = P(D \mid \neg P) when not present.
- Then
P(K=2 \mid P) = \binom{3}{2} p_p^{2} (1-p_p), \quad P(K=2 \mid \neg P) = \binom{3}{2} p_n^{2} (1-p_n).
- The posterior is
P(P \mid K=2) = \frac{P(K=2 \mid P) \; P(P)}{P(K=2 \mid P) \; P(P) + P(K=2 \mid \neg P) \; P(\neg P)}.
- Equivalently (cancelling the common factor \binom{3}{2} = 3),
P(P \mid K=2) = \frac{p_p^{2} (1-p_p) P(P)}{p_p^{2} (1-p_p) P(P) + p_n^{2} (1-p_n) (1-P(P))}.
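The posterior formula above translates into a short function. The numbers passed in at the bottom are illustrative assumptions, not values from the lecture:

```python
from math import comb

def posterior_present(prior, p_p, p_n, n=3, k=2):
    """P(present | k detections in n trials) via Bayes' rule.
    p_p = P(detect | present), p_n = P(detect | not present)."""
    like_present = comb(n, k) * p_p**k * (1 - p_p)**(n - k)
    like_absent = comb(n, k) * p_n**k * (1 - p_n)**(n - k)
    num = like_present * prior
    return num / (num + like_absent * (1 - prior))

# Illustrative (assumed) numbers: prior 0.4, p_p = 0.8, p_n = 0.1
print(round(posterior_present(0.4, 0.8, 0.1), 3))  # 0.905
```

Note that the binomial coefficient appears in both likelihoods, so it cancels in the ratio, exactly as the simplified formula above shows.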
- Quick interpretation and takeaway:
- More detections in multiple trials increases evidence for presence, but the strength depends on the relative likelihoods p_p and p_n and the prior P(P).
- If p_n is very small or zero (very unlikely to detect when not present), observing two detections in three trials strongly supports presence (high P(P|K=2)).
- Related notes from the lecture:
- Trials are assumed independent.
- The calculation illustrated the three-step process: (a) determine the per-trial detection probability from the model, (b) aggregate across 3 trials with the binomial distribution, and (c) optionally update with Bayes’ rule given a prior probability of presence.
Chapter 2: Discrete vs. Continuous Distributions and Random Variables
- Core idea: probability theory uses models for how numerical values arise from experiments; we categorize into discrete vs continuous frameworks.
- Discrete vs continuous (high-level distinction)
- Discrete: the variable can take on a countable set of values (finite or countably infinite). Examples: {0,1,2,3,4}, {1,2,3,4,5,6} for a die, number of heads in coin flips.
- Continuous: the variable can take on any value in an interval (uncountably many values). Examples: a real-valued measurement like height in cm, distance, time.
- Why this matters: choosing the right framework determines how we define probabilities, distributions, and densities.
- Random variable definition (in general terms)
- A random variable X is a function that assigns a numerical value to each outcome in the sampling space S:
X: S \to \mathbb{R}.
- In practice, we usually work with a particular kind of random variable called a discrete random variable when the set of possible values is discrete.
- Notation convention used in the lecture:
- The variable is written as a capital letter X.
- A realized value is written as a lowercase x (e.g., x ∈ {0,1,2,3,4}).
- Example: discrete random variable X = number of under-inflated tires on a car
- Possible values: X \in \{0,1,2,3,4\}. (0 = no under-inflated tires, 4 = all tires under-inflated)
- Note: the same physical experiment can yield multiple random variables (e.g., number of heads, number of tails, etc.).
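The "same experiment, multiple random variables" point can be made concrete with a tiny sample space (a two-coin-flip experiment, used here as an assumed example):

```python
from itertools import product

# Sample space S for two coin flips: (H,H), (H,T), (T,H), (T,T)
S = list(product("HT", repeat=2))

# Two different random variables defined on the *same* experiment,
# each a function S -> R represented as a lookup table:
num_heads = {s: s.count("H") for s in S}
num_tails = {s: s.count("T") for s in S}

print(num_heads[("H", "T")], num_tails[("H", "T")])  # 1 1
```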
- Example: height of a PSU student
- This is a continuous random variable (values form an interval, e.g., between 150 cm and 190 cm with potentially any real value in that range).
- Formal definition (via a random experiment and sampling space)
- A random experiment has a sampling space S.
- A random variable X is a function from S to the real numbers, assigning a real value to each outcome.
- Random variable outputs vs. ranges
- For a discrete variable, the set of possible outputs is a discrete set (e.g., {0,1,2,3,4}).
- For a continuous variable, outputs form an interval (e.g., [a,b]).
- Common vocabulary and notation
- PMF: Probability Mass Function, denoted f_X(x) or P(X = x). This is defined only for discrete X.
- PDF: Probability Density Function, used for continuous X. Probabilities are derived via integration, not summation.
- CDF: Cumulative Distribution Function, defined for both discrete and continuous, as F_X(x) = P(X ≤ x).
- Discrete random variables: outputs and their probabilities
- For a discrete X, the probability mass function f_X(x) gives the probability that X equals the value x:
f_X(x) = P(X = x).
- Properties of a PMF (for all x in the support):
f_X(x) \ge 0, \quad \sum_{x} f_X(x) = 1.
- If A is a subset of S, the probability of A is the sum of probabilities of the outcomes in A:
P(A) = \sum_{x \in A} f_X(x).
- Visual representations for discrete distributions
- Bar chart: plot a bar at each possible value x with height f_X(x).
- Histogram (for discrete distributions): bars fill the space between adjacent x-values; note that probability mass remains at the specific discrete points, not in the gaps.
- Examples of discrete distributions
- Uniform distribution on a finite set: if X takes values in a finite set with equal probability, then
f_X(x) = \frac{1}{|\text{support}|} \quad \text{for } x \text{ in the support},
and 0 otherwise.
- A discrete distribution with support {2,3,4} and a given PMF: e.g.,
f_X(2) = \frac{2}{9}, \quad f_X(3) = \frac{3}{9}, \quad f_X(4) = \frac{4}{9}.
- Other discrete examples mentioned: a die roll, sums of two dice, or a discrete score range like SAT scores (e.g., 200 to 1600).
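Exact fractions make it easy to verify that the {2,3,4} example is a valid PMF (note that the middle probability must be 3/9, not 2/9, for the total to equal 1):

```python
from fractions import Fraction as F

# The {2,3,4} example PMF, with f_X(3) = 3/9 so the probabilities
# sum exactly to 1, as a valid PMF must.
pmf = {2: F(2, 9), 3: F(3, 9), 4: F(4, 9)}

assert all(p >= 0 for p in pmf.values())  # nonnegativity
assert sum(pmf.values()) == 1             # total probability

# Event probability, e.g. P(X >= 3) = 3/9 + 4/9
p_ge_3 = sum(p for x, p in pmf.items() if x >= 3)
print(p_ge_3)  # 7/9
```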
- Continuous distributions (brief mention)
- Instead of a PMF, continuous distributions use a probability density function p_X(x) with
\int_{-\infty}^{\infty} p_X(x)\,dx = 1.
- Probabilities over intervals are calculated by integration rather than summation.
- The cumulative distribution function (CDF)
- Definition (for any X, discrete or continuous):
F_X(x) = P(X \le x) = \sum_{t \le x} f_X(t) (for discrete X; for continuous X, the sum becomes an integral).
- Relationship to the PMF (discrete case): F_X(x) is a step function that increases at each point in the support by an amount f_X(x).
- In the discrete case, one computes F_X at a value x by summing probabilities of all outcomes up to x, e.g.:
F_X(x) = \sum_{t \le x} f_X(t).
- Example calculations with a discrete tire problem (illustrative)
- Suppose X has support including 0, 1, 2, 3, 4 with PMF values f_X(0), f_X(1), f_X(2), f_X(3), f_X(4).
- Then:
F_X(0) = P(X \le 0) = f_X(0),
F_X(1) = P(X \le 1) = f_X(0) + f_X(1),
F_X(2) = P(X \le 2) = f_X(0) + f_X(1) + f_X(2),
- And so on; for x < 0, F_X(x) = 0; for x at or above the largest value, F_X(x) = 1.
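The running-total structure of the discrete CDF can be computed with `itertools.accumulate`. The PMF values below are assumed for demonstration (the lecture left them symbolic):

```python
from itertools import accumulate

# Illustrative PMF for X = number of under-inflated tires;
# these numeric values are assumed and sum to 1.
support = [0, 1, 2, 3, 4]
pmf_vals = [0.4, 0.3, 0.15, 0.1, 0.05]

# Discrete CDF as a running total of the PMF:
# F_X(x) = sum of f_X(t) for t <= x
cdf = dict(zip(support, accumulate(pmf_vals)))

print(round(cdf[2], 2))  # F_X(2) = f_X(0) + f_X(1) + f_X(2) = 0.85
```

This also makes the step-function behavior visible: the CDF jumps by f_X(x) at each support point and reaches 1 at the largest value.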
- Important conceptual notes
- The notation can be confusing (lowercase f for PMF vs uppercase F for CDF). In discrete settings, F_X(x) is the cumulative sum of the PMF, i.e. a running total of probabilities up to x.
- The piecewise nature of F_X makes it a step function for discrete X, and monotone nondecreasing for all X.
- Summary of key connections
- Discrete X: values are countable; probabilities are given by the PMF f_X(x); probabilities of events by summing f_X(x) over the appropriate x’s.
- CDF F_X(x) gives a single function that encodes the entire distribution via P(X ≤ x).
- Visualization through bars or histograms helps interpret the PMF, while the CDF provides cumulative probabilities and monotone behavior.
- Quick recall questions (conceptual)
- What are the two main types of random variables? (Discrete and Continuous)
- How do you compute P(X ∈ A) for A a subset of S? (Sum f_X(x) over x ∈ A)
- How do you compute F_X(x) in the discrete case? (Sum f_X(t) for t ≤ x)
- What is the relationship between PMF and CDF? (CDF is the cumulative sum of the PMF)
- Final note from the lecture
- More practice with discrete distributions and CDFs is coming up in the homework, including a problem that asks you to compare answers with a neighbor.
Key formulas (recap)
- Per-trial detection probability from a two-path detection model:
p_{\text{det}} \;=\; P(\text{detect} \mid \text{present}) P(\text{present}) \,+\, P(\text{detect} \mid \text{not present}) P(\text{not present}).
- Exactly two detections in three trials (binomial model):
P(K=2) \;=\; \binom{3}{2} p^{2} (1-p).
- Posterior probability (Bayes) that the impurity is present given K=2 detections:
P(P \mid K=2) \;=\; \frac{p_p^{2} (1-p_p) P(P)}{p_p^{2} (1-p_p) P(P) \,+\, p_n^{2} (1-p_n) (1-P(P))}.
- PMF for discrete X (definition):
f_X(x) = P(X = x).
Conditions: f_X(x) \ge 0 \quad \text{and} \quad \sum_x f_X(x) = 1.
- Event probability via PMF: for A ⊆ S,
P(A) = \sum_{x \in A} f_X(x).
- CDF (definition, discrete):
F_X(x) = P(X \le x) = \sum_{t \,:\, t \le x} f_X(t).
- For continuous X, replace sums with integrals:
F_X(x) = \int_{-\infty}^{x} p_X(t) \, dt.
- Uniform discrete distribution (example):
f_X(x) = \frac{1}{|\text{support}|} \quad \text{for } x \in \text{support}, \quad 0 \text{ otherwise}.
- Example discrete PMF (support {2,3,4}):
f_X(2) = \frac{2}{9}, \quad f_X(3) = \frac{3}{9}, \quad f_X(4) = \frac{4}{9}.