What is a discrete random variable?
A variable that can take only specified separate values, each with an associated probability.
What is a probability function P(X = x)?
A rule (table or formula) giving the probability that the random variable X takes each value x.
What two properties must hold for any probability distribution?
All probabilities are non-negative, and the probabilities sum to 1.
What is the expectation E(X) of a discrete random variable?
E(X) = Σ x·P(X = x), summed over all possible values x. It's the long-run mean.
Notation: what does μ mean in this context?
μ = E(X), the mean (expectation) of the random variable X.
Formula for the variance Var(X) of a discrete random variable
Var(X) = E((X − μ)²) = E(X²) − μ² = Σ x²·P(X = x) − (E(X))².
What is the standard deviation of X?
σ = √Var(X).
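As a worked check of the expectation, variance and standard deviation formulas, here is a minimal Python sketch. The probability table is a made-up example (not from the cards), and the helper names `expectation` and `variance` are illustrative only.

```python
from fractions import Fraction
from math import sqrt

# Hypothetical probability table: value x -> P(X = x)
table = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def expectation(dist):
    """E(X) = sum of x * P(X = x)."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Var(X) = E(X^2) - (E(X))^2."""
    mu = expectation(dist)
    return sum(x * x * p for x, p in dist.items()) - mu ** 2

# A valid distribution has non-negative probabilities summing to 1
assert all(p >= 0 for p in table.values())
assert sum(table.values()) == 1

mean = expectation(table)   # 1
var = variance(table)       # 1/2
sd = sqrt(var)              # square root of the variance
```

Using `Fraction` keeps the arithmetic exact, so the results can be compared directly with a hand calculation.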
E(aX + b) = ?
aE(X) + b.
Var(aX + b) = ?
a²Var(X). Note: the constant b disappears, and a is squared.
E(X ± Y) = ?
E(X) ± E(Y). True for ANY two random variables (independent or not).
Var(X ± Y) for independent X and Y
Var(X) + Var(Y). Note: the variances ADD even when subtracting.
E(aX ± bY) for any X, Y
aE(X) ± bE(Y).
Var(aX ± bY) for independent X and Y
a²Var(X) + b²Var(Y).
Why do variances add (not subtract) when computing Var(X − Y)?
Variance measures spread and is never negative. Since Var(−Y) = (−1)²Var(Y) = Var(Y), subtracting an independent variable adds its spread to the result rather than removing it.
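The claim that variances add for both X + Y and X − Y can be checked by brute force. The sketch below assumes two independent fair six-sided dice (an illustrative choice, not from the cards) and enumerates all 36 equally likely pairs; `var_of` is a hypothetical helper.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
pair_prob = Fraction(1, 36)  # independence: each ordered pair has probability 1/36

def var_of(values_with_probs):
    """Variance from an iterable of (value, probability) pairs."""
    pairs = list(values_with_probs)
    mu = sum(v * q for v, q in pairs)
    return sum(v * v * q for v, q in pairs) - mu ** 2

var_single = var_of((x, Fraction(1, 6)) for x in faces)
var_sum = var_of((x + y, pair_prob) for x, y in product(faces, faces))
var_diff = var_of((x - y, pair_prob) for x, y in product(faces, faces))

# Both the sum and the difference have variance Var(X) + Var(Y)
print(var_single, var_sum, var_diff)
```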
When is the discrete uniform distribution an appropriate model?
When a finite list of outcomes are all equally likely, e.g. a fair die or fair spinner.
For X uniform on {1, 2, …, n}, what is E(X)?
E(X) = (n + 1)/2.
For X uniform on {1, 2, …, n}, what is Var(X)?
Var(X) = (n² − 1)/12.
For X uniform on {a, a+1, …, b}, how do you find E(X)?
E(X) = (a + b)/2 (the average of the endpoints).
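The discrete uniform results can be confirmed by direct summation. This is a small sketch with a hypothetical helper `uniform_moments`; the choices n = 6 and the range 3 to 8 are arbitrary examples.

```python
from fractions import Fraction

def uniform_moments(a, b):
    """Mean and variance of X uniform on the integers a, a+1, ..., b."""
    values = range(a, b + 1)
    p = Fraction(1, len(values))          # every value equally likely
    mean = sum(x * p for x in values)
    var = sum(x * x * p for x in values) - mean ** 2
    return mean, var

n = 6
mean, var = uniform_moments(1, n)
assert mean == Fraction(n + 1, 2)         # (n + 1)/2
assert var == Fraction(n * n - 1, 12)     # (n^2 - 1)/12

# Shifting the range moves the mean to (a + b)/2 but leaves the spread unchanged
shifted_mean, shifted_var = uniform_moments(3, 8)
```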
What are the conditions for a binomial distribution X ~ B(n, p)?
Fixed number n of independent trials, each with two outcomes (success/failure), and a constant probability p of success in each trial.
Notation X ~ B(n, p) means…
X is binomial with n trials and probability p of success per trial.
For X ~ B(n, p), what is E(X)?
E(X) = np.
For X ~ B(n, p), what is Var(X)?
Var(X) = np(1 − p) = npq.
How is the mean of a binomial derived from Bernoulli trials?
Write X = X₁ + X₂ + … + Xₙ where Xᵢ is Bernoulli(p). Then E(Xᵢ) = p, so E(X) = np by linearity.
How is the variance of a binomial derived?
Each Xᵢ has Var(Xᵢ) = p(1 − p). Since the Xᵢ are independent, Var(X) = Σ Var(Xᵢ) = np(1 − p).
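The results np and np(1 − p) can also be checked numerically. The sketch below assumes the standard binomial probability function P(X = r) = C(n, r) p^r (1 − p)^(n − r), which the cards do not state, and uses arbitrary example values n = 10, p = 3/10.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """Table of P(X = r) for X ~ B(n, p), r = 0..n (standard formula, assumed here)."""
    q = 1 - p
    return {r: comb(n, r) * p ** r * q ** (n - r) for r in range(n + 1)}

n, p = 10, Fraction(3, 10)
pmf = binomial_pmf(n, p)

total = sum(pmf.values())                                   # should be 1
mean = sum(r * pr for r, pr in pmf.items())                 # should equal n * p
var = sum(r * r * pr for r, pr in pmf.items()) - mean ** 2  # should equal n * p * (1 - p)
```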
When is the Poisson distribution an appropriate model?
When events occur randomly, independently, at a constant average rate over a fixed interval of time or space.
Notation X ~ Po(λ) means…
X is Poisson distributed with mean rate λ events per interval.
For X ~ Po(λ), what is the probability function?
P(X = r) = e^(−λ) · λ^r / r!, for r = 0, 1, 2, …
For X ~ Po(λ), what is E(X)?
E(X) = λ.
For X ~ Po(λ), what is Var(X)?
Var(X) = λ. The mean equals the variance.
Quick test for whether Poisson might fit data
Check whether the sample mean and sample variance are approximately equal. If they differ a lot, Poisson is unlikely to be suitable.
If X ~ Po(λ) and Y ~ Po(μ) are independent, what is X + Y?
X + Y ~ Po(λ + μ). Independent Poissons sum to a Poisson with parameters added.
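A numerical sketch of the Poisson facts above: the probabilities sum to 1, the mean and variance both equal λ, and the sum of two independent Poissons matches Po(λ + μ). The rates 2.0 and 3.5 and the truncation point are arbitrary assumptions, and the infinite sums are only approximated.

```python
from math import exp, factorial

def poisson_pmf(lam, r):
    """P(X = r) = e^(-lam) * lam^r / r!"""
    return exp(-lam) * lam ** r / factorial(r)

lam, mu = 2.0, 3.5
R = 60  # truncation point; the tail beyond this is negligible for these rates

total = sum(poisson_pmf(lam, r) for r in range(R))
mean = sum(r * poisson_pmf(lam, r) for r in range(R))
var = sum(r * r * poisson_pmf(lam, r) for r in range(R)) - mean ** 2

def sum_pmf(k):
    """P(X + Y = k) for independent X ~ Po(lam), Y ~ Po(mu), by convolution."""
    return sum(poisson_pmf(lam, i) * poisson_pmf(mu, k - i) for i in range(k + 1))

# Largest difference between the convolution and Po(lam + mu) over k = 0..19
conv_gap = max(abs(sum_pmf(k) - poisson_pmf(lam + mu, k)) for k in range(20))
```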
When are the binomial and Poisson both reasonable models?
When n is large and p is small in a binomial setting, the Poisson with λ = np approximates the binomial. With modern calculators, the binomial is usually used directly.
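To see how close the approximation is, the sketch below compares B(n, p) with Po(np) term by term. The values n = 1000 and p = 0.002 are arbitrary examples of "large n, small p", and the binomial probability function used is the standard one rather than something quoted from the cards.

```python
from math import comb, exp, factorial

n, p = 1000, 0.002
lam = n * p  # Poisson parameter matched to the binomial mean

def binomial_pmf(r):
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

def poisson_pmf(r):
    return exp(-lam) * lam ** r / factorial(r)

# Largest pointwise difference over r = 0..20
largest_gap = max(abs(binomial_pmf(r) - poisson_pmf(r)) for r in range(21))

for r in range(6):
    print(r, round(binomial_pmf(r), 5), round(poisson_pmf(r), 5))
```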
When is the geometric distribution an appropriate model?
For the number of independent Bernoulli trials needed to get the first success, each with probability p of success.
Notation X ~ Geo(p) means…
X is geometric with success probability p; X counts trials up to AND including the first success.
For X ~ Geo(p), what is P(X = r)?
P(X = r) = (1 − p)^(r − 1) · p, for r = 1, 2, 3, … (the first (r−1) trials fail, then one succeeds).
For X ~ Geo(p), what is P(X > r)?
P(X > r) = (1 − p)^r — the first r trials are all failures.
For X ~ Geo(p), what is E(X)?
E(X) = 1/p.
For X ~ Geo(p), what is Var(X)?
Var(X) = (1 − p)/p².
Interpret E(X) = 1/p for a geometric distribution
On average, it takes 1/p trials to get the first success — so smaller p means more trials needed.
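The geometric formulas can be checked by summing the probability function directly. This sketch assumes p = 0.25 (an arbitrary example) and truncates the infinite sums after 500 terms, so the results are approximations.

```python
p = 0.25
q = 1 - p
TERMS = 500  # truncation point; q**500 is negligibly small

def geometric_pmf(r):
    """P(X = r) = (1 - p)^(r - 1) * p for r = 1, 2, 3, ..."""
    return q ** (r - 1) * p

mean = sum(r * geometric_pmf(r) for r in range(1, TERMS + 1))
var = sum(r * r * geometric_pmf(r) for r in range(1, TERMS + 1)) - mean ** 2

# P(X > 3) computed as 1 - P(X <= 3); should match (1 - p)^3
tail_r = 3
tail = 1 - sum(geometric_pmf(r) for r in range(1, tail_r + 1))

print(mean, 1 / p)          # both close to 4
print(var, q / p ** 2)      # both close to 12
print(tail, q ** tail_r)    # both close to 0.421875
```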
What is the alternative geometric definition (excluded from Y432)?
One where X counts the number of FAILURES before the first success — not used at OCR MEI A-level; trials-to-first-success is the convention.
How would you decide between Poisson and binomial in context?
If you have a fixed number of trials with constant success probability, use binomial. If you have a count of events in a continuous interval with no fixed n, use Poisson.
How do you find the mode of a discrete distribution from a table?
The value(s) of x with the highest probability P(X = x).
What does "linear combination of random variables" mean?
An expression like aX + bY (or more generally a₁X₁ + a₂X₂ + … + aₙXₙ) where the aᵢ are constants.
Why does Var(2X) = 4Var(X), not 2Var(X)?
Doubling every outcome doubles deviations from the mean, and variance squares deviations: (2·deviation)² = 4·(deviation)².
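The scaling rule can be verified on a small table. The distribution below is a made-up example and `transform` is a hypothetical helper; the sketch checks that Var(aX + b) = a²Var(X) and E(aX + b) = aE(X) + b, including for a negative a.

```python
from fractions import Fraction

# Hypothetical distribution: value x -> P(X = x)
dist = {1: Fraction(1, 5), 2: Fraction(3, 10), 4: Fraction(1, 2)}

def mean(d):
    return sum(x * p for x, p in d.items())

def variance(d):
    m = mean(d)
    return sum((x - m) ** 2 * p for x, p in d.items())

def transform(d, a, b):
    """Distribution of aX + b (a must be non-zero so values stay distinct)."""
    return {a * x + b: p for x, p in d.items()}

doubled = transform(dist, 2, 0)      # 2X
shifted = transform(dist, 2, 5)      # 2X + 5
flipped = transform(dist, -3, 1)     # -3X + 1

assert variance(doubled) == 4 * variance(dist)   # a = 2 gives a factor of 4
assert variance(shifted) == 4 * variance(dist)   # adding b changes nothing
assert variance(flipped) == 9 * variance(dist)   # (-3)^2 = 9
assert mean(shifted) == 2 * mean(dist) + 5
```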
For independent X and Y, what is E(XY)?
E(XY) = E(X)·E(Y) — but only when X and Y are independent.
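A short sketch of why independence matters here, assuming fair six-sided dice as the example. For two independent dice the product rule holds; taking Y = X (completely dependent) gives E(X²), which differs from (E(X))².

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
sixth = Fraction(1, 6)

e_x = sum(x * sixth for x in faces)                       # 7/2

# Independent X and Y: joint probability is the product 1/6 * 1/6
e_xy_independent = sum(x * y * sixth * sixth
                       for x, y in product(faces, repeat=2))

# Dependent case Y = X: E(XY) becomes E(X^2)
e_xx = sum(x * x * sixth for x in faces)

assert e_xy_independent == e_x * e_x   # 49/4
assert e_xx != e_x * e_x               # 91/6 is not 49/4
```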
What proofs are examinable in the variance results?
Proofs of E(aX + b) = aE(X) + b and Var(aX + b) = a²Var(X) using definitions; results for sums/differences of independent variables can be quoted.