Week 13: Central Limit Theorem and Moment-Generating Functions

Weak Law of Large Numbers

  • Setup: Let $X_1, X_2, \dots, X_i, \dots$ be a sequence of i.i.d. random variables with $\mathbb{E}(X_i) = \mu$ and $\mathrm{Var}(X_i) = \sigma^2$.
  • Define the sample mean:
    $\bar{X}_n = \frac{1}{n} \sum_{i=1}^n X_i.$
  • Weak law statement: For any $\varepsilon > 0$,
    $\lim_{n\to\infty} \Pr\left(|\bar{X}_n - \mu| > \varepsilon\right) = 0.$
  • Proof sketch (Chebyshev): Since $\mathbb{E}(\bar{X}_n) = \mu$ and $\mathrm{Var}(\bar{X}_n) = \frac{\sigma^2}{n}$,
    Chebyshev’s inequality gives
    $\Pr\left(|\bar{X}_n - \mu| > \varepsilon\right) \le \frac{\mathrm{Var}(\bar{X}_n)}{\varepsilon^2} = \frac{\sigma^2}{n\varepsilon^2} \xrightarrow{n\to\infty} 0.$
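The shrinking Chebyshev bound can be checked by simulation. A minimal sketch in Python (fair-die rolls are an arbitrary illustrative choice, not from the slides):

```python
import random

random.seed(0)
mu, var = 3.5, 35 / 12        # mean and variance of a single fair-die roll
eps, trials = 0.25, 1000

def deviation_prob(n):
    """Empirical estimate of Pr(|sample mean - mu| > eps)."""
    hits = 0
    for _ in range(trials):
        mean = sum(random.randint(1, 6) for _ in range(n)) / n
        if abs(mean - mu) > eps:
            hits += 1
    return hits / trials

for n in (10, 100, 1000):
    chebyshev = var / (n * eps**2)        # sigma^2 / (n * eps^2)
    print(n, deviation_prob(n), min(chebyshev, 1.0))
```

Both the empirical deviation probability and the Chebyshev bound shrink toward $0$ as $n$ grows, as the proof sketch predicts.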

Strong Law of Large Numbers

  • Setup: Let $X_1, X_2, \dots$ be i.i.d. with $\mathbb{E}(X_i) = \mu$ and $\mathrm{Var}(X_i) = \sigma^2$.
  • Define $\bar{X}_n = \frac{1}{n} \sum_{i=1}^n X_i.$
  • Strong law statement:
    $\Pr\left( \lim_{n\to\infty} \bar{X}_n = \mu \right) = 1.$
  • Interpretation: The sample mean converges to $\mu$ almost surely (with probability 1); this is a stronger claim than the WLLN, which only asserts convergence in probability.

Central Limit Theorem (CLT): introduction

  • Setup: $X_1, X_2, \dots$ i.i.d. with mean $\mu$ and variance $\sigma^2$; let $S_n = \sum_{i=1}^n X_i$ and $\bar{X}_n = S_n/n$.
  • Classical normalization:
    $Z_n = \frac{S_n - n\mu}{\sigma\sqrt{n}}.$
  • CLT: $Z_n$ converges in distribution to the standard normal $N(0,1)$:
    $\lim_{n\to\infty} \Pr\left(\frac{S_n - n\mu}{\sigma\sqrt{n}} \le x\right) = \Phi(x), \qquad -\infty < x < \infty.$
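A quick simulation sketch (Exponential(1) summands are an arbitrary choice, so $\mu = \sigma = 1$): the empirical CDF of $Z_n$ should be close to $\Phi(x)$.

```python
import math
import random

random.seed(1)

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def z_sample(n):
    """One draw of Z_n = (S_n - n*mu)/(sigma*sqrt(n)) for Exp(1)."""
    s = sum(random.expovariate(1.0) for _ in range(n))
    return (s - n) / math.sqrt(n)

n, reps = 200, 5000
zs = [z_sample(n) for _ in range(reps)]
for x in (-1.0, 0.0, 1.0):
    empirical = sum(z <= x for z in zs) / reps
    print(x, round(empirical, 3), round(phi(x), 3))
```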

Central Limit Theorem (mgf form)

  • With $M$ the mgf of $X$ existing in a neighbourhood of $0$, the CLT can also be written as
    $\lim_{n\to\infty} \Pr\left(S_n - n\mu \le x\, \sigma\sqrt{n}\right) = \Phi(x).$
  • (mgf condition): The mgf $M_X(t) = \mathbb{E}(e^{tX})$ exists in a neighbourhood of $0$.

Moment-Generating Function (MGF)

  • Definition: The MGF of a random variable $X$ is
    $M_X(t) = \mathbb{E}(e^{tX}).$
  • Existence: $M_X(t)$ may or may not exist for a given $t$; if it exists near $t=0$, it is useful for computing moments.
  • Discrete form: if $X$ takes values in a discrete set with pmf $p(x)$, then
    $M_X(t) = \sum_x e^{tx} p(x).$
  • Continuous form: if $X$ has pdf $f(x)$, then
    $M_X(t) = \int_{-\infty}^{\infty} e^{tx} f(x) \, dx.$
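The continuous-case formula can be sanity-checked by Monte Carlo. A sketch for $X \sim \text{Exponential}(\lambda)$ with $\lambda = 2$ (an arbitrary choice), whose closed-form mgf $\lambda/(\lambda - t)$ appears later in these notes:

```python
import math
import random

random.seed(2)
lam, reps = 2.0, 100_000
xs = [random.expovariate(lam) for _ in range(reps)]

def mgf_estimate(t):
    """Sample average of e^{tX}: a Monte Carlo estimate of M_X(t) = E(e^{tX})."""
    return sum(math.exp(t * x) for x in xs) / reps

for t in (-1.0, 0.0, 0.5):
    exact = lam / (lam - t)   # closed-form Exponential(lam) mgf, valid for t < lam
    print(t, round(mgf_estimate(t), 3), round(exact, 3))
```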

Properties of MGFs

  • The mgf, when it exists in a neighbourhood of $0$, uniquely determines the distribution.
  • Derivatives at zero give moments:
    $M'_X(0) = \mathbb{E}(X), \quad M''_X(0) = \mathbb{E}(X^2), \quad M^{(r)}_X(0) = \mathbb{E}(X^r).$
  • Transformation: If $Y = a + bX$, then
    $M_Y(t) = e^{at} M_X(bt).$
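The moment identities can be checked numerically with central finite differences on a closed-form mgf; a sketch using the Exponential(λ) mgf $\lambda/(\lambda - t)$ with $\lambda = 2$ (an arbitrary choice):

```python
lam = 2.0
M = lambda t: lam / (lam - t)    # Exponential(lam) mgf, valid for t < lam

h = 1e-4
m1 = (M(h) - M(-h)) / (2 * h)             # ~ M'(0)  = E(X)   = 1/lam
m2 = (M(h) - 2 * M(0) + M(-h)) / h**2     # ~ M''(0) = E(X^2) = 2/lam^2
var = m2 - m1**2                          # Var(X) = 1/lam^2
print(round(m1, 6), round(m2, 6), round(var, 6))
```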

MGFs of common distributions

  • Binomial$(n,p)$:
    $M_X(t) = (p e^{t} + 1 - p)^n.$

    • Moments: $M'_X(0) = np,\quad M''_X(0) = n(n-1)p^2 + np.$
    • Variance: $\mathrm{Var}(X) = M''_X(0) - [M'_X(0)]^2 = np(1-p).$
  • Poisson$(\lambda)$:
    $M_X(t) = e^{\lambda (e^{t} - 1)}.$

    • Moments: $M'_X(0) = \lambda,\quad M''_X(0) = \lambda^2 + \lambda.$
    • Variance: $\mathrm{Var}(X) = \lambda.$
  • Exponential$(\lambda)$:
    $M_X(t) = \frac{\lambda}{\lambda - t}, \quad t < \lambda.$

    • Moments: $M'_X(0) = \frac{1}{\lambda},\quad M''_X(0) = \frac{2}{\lambda^2}.$
    • Variance: $\mathrm{Var}(X) = \frac{1}{\lambda^2}.$
  • Standard Normal $N(0,1)$:
    $M_X(t) = e^{t^2/2}.$

    • Moments: $M'_X(0) = 0,\quad M''_X(0) = 1.$
    • Variance: $\mathrm{Var}(X) = 1.$
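The listed $M'_X(0)$ and $M''_X(0)$ values can all be verified the same way, by finite differences at $0$; a sketch with arbitrary parameter choices ($n = 10$, $p = 0.3$, $\lambda = 2$):

```python
import math

n, p, lam = 10, 0.3, 2.0
# (mgf, expected M'(0), expected M''(0)) for each distribution above
mgfs = {
    "binomial":    (lambda t: (p * math.exp(t) + 1 - p) ** n,
                    n * p, n * (n - 1) * p**2 + n * p),
    "poisson":     (lambda t: math.exp(lam * (math.exp(t) - 1)),
                    lam, lam**2 + lam),
    "exponential": (lambda t: lam / (lam - t),
                    1 / lam, 2 / lam**2),
    "std normal":  (lambda t: math.exp(t**2 / 2),
                    0.0, 1.0),
}
h = 1e-4
for name, (M, m1, m2) in mgfs.items():
    d1 = (M(h) - M(-h)) / (2 * h)              # ~ M'(0)
    d2 = (M(h) - 2 * M(0) + M(-h)) / h**2      # ~ M''(0)
    assert abs(d1 - m1) < 1e-2 and abs(d2 - m2) < 1e-2, name
    print(name, round(d1, 3), round(d2, 3))
```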

MGF for general Normal distribution

  • If $X \sim N(\mu, \sigma^2)$, then
    $M_X(t) = e^{\mu t + \frac{1}{2}\sigma^2 t^2}.$
  • Derivatives:
    $M'_X(t) = (\mu + \sigma^2 t)\, e^{\mu t + \frac{1}{2}\sigma^2 t^2},$
    $M''_X(t) = \sigma^2 e^{\mu t + \frac{1}{2}\sigma^2 t^2} + (\mu + \sigma^2 t)^2 e^{\mu t + \frac{1}{2}\sigma^2 t^2}.$
  • Moments: $\mathbb{E}(X) = M'_X(0) = \mu, \quad \mathbb{E}(X^2) = M''_X(0) = \mu^2 + \sigma^2.$
  • Variance: $\mathrm{Var}(X) = \sigma^2.$

Basic normal transformation result

  • If $Y = a + bX$ with $X \sim N(\mu, \sigma^2)$, then $Y \sim N(a + b\mu, b^2 \sigma^2)$, as implied by the mgf: $M_Y(t) = e^{at} M_X(bt) = e^{(a + b\mu)t + \frac{1}{2} b^2\sigma^2 t^2}.$
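A minimal empirical check of the transformation result (the parameter values are arbitrary):

```python
import random
import statistics

random.seed(3)
mu, sigma, a, b = 1.0, 2.0, 5.0, -3.0
ys = [a + b * random.gauss(mu, sigma) for _ in range(50_000)]
# Expect mean a + b*mu = 2.0 and variance b^2 * sigma^2 = 36.0.
print(round(statistics.mean(ys), 2), round(statistics.variance(ys), 2))
```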

Sum of independent random variables (MGF property)

  • Theorem: If $X$ and $Y$ are independent with mgfs $M_X$ and $M_Y$, and $Z = X + Y$, then
    $M_Z(t) = M_X(t)\, M_Y(t)$
    on the common interval where both mgfs exist.
  • Sketch of proof: Using independence,
    $M_Z(t) = \mathbb{E}(e^{t(X+Y)}) = \mathbb{E}(e^{tX} e^{tY}) = \mathbb{E}(e^{tX})\, \mathbb{E}(e^{tY}) = M_X(t)\, M_Y(t).$
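The theorem can be checked by Monte Carlo for any pair of independent variables; a sketch with $X \sim \text{Exp}(2)$ and $Y \sim \text{Uniform}(0,1)$ (arbitrary choices) at $t = 0.5$:

```python
import math
import random

random.seed(4)
reps, t = 100_000, 0.5
xs = [random.expovariate(2.0) for _ in range(reps)]
ys = [random.uniform(0.0, 1.0) for _ in range(reps)]

# E(e^{t(X+Y)}) versus E(e^{tX}) * E(e^{tY})
lhs = sum(math.exp(t * (x + y)) for x, y in zip(xs, ys)) / reps
rhs = (sum(math.exp(t * x) for x in xs) / reps) * \
      (sum(math.exp(t * y) for y in ys) / reps)
exact = (2.0 / (2.0 - t)) * ((math.exp(t) - 1.0) / t)   # product of the two mgfs
print(round(lhs, 3), round(rhs, 3), round(exact, 3))
```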

Examples: Sum of independent Poisson and Normal distributions

  • Poisson + Poisson: If $X\sim\text{Poisson}(\lambda)$ and $Y\sim\text{Poisson}(\mu)$ are independent, then $X+Y \sim \text{Poisson}(\lambda+\mu)$.

    • MGFs: $M_X(t) = e^{\lambda (e^{t}-1)},\quad M_Y(t) = e^{\mu (e^{t}-1)}.$
    • $M_{X+Y}(t) = M_X(t)\, M_Y(t) = e^{(\lambda+\mu)(e^{t}-1)}.$
  • Normal + Normal: If $X \sim N(\mu, \sigma^2)$ and $Y \sim N(\nu, \tau^2)$ are independent, then
    $X+Y \sim N(\mu+\nu,\, \sigma^2+\tau^2).$

    • MGFs: $M_X(t) = e^{\mu t + \frac{1}{2}\sigma^2 t^2},\quad M_Y(t) = e^{\nu t + \frac{1}{2}\tau^2 t^2}.$
    • $M_{X+Y}(t) = M_X(t)\, M_Y(t) = e^{(\mu+\nu)t + \frac{1}{2}(\sigma^2+\tau^2)t^2}.$
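The Poisson closure result can also be seen in simulation; a sketch with $\lambda = 2$, $\mu = 3$ (arbitrary values), using simple inverse-transform sampling:

```python
import math
import random

random.seed(5)

def poisson_draw(lam):
    """Inverse-transform sampling of one Poisson(lam) variate."""
    u, k = random.random(), 0
    p = math.exp(-lam)
    cdf = p
    while u > cdf:
        k += 1
        p *= lam / k
        cdf += p
    return k

reps = 50_000
sums = [poisson_draw(2.0) + poisson_draw(3.0) for _ in range(reps)]
for k in (3, 5, 7):
    empirical = sums.count(k) / reps
    exact = math.exp(-5.0) * 5.0**k / math.factorial(k)   # Poisson(5) pmf
    print(k, round(empirical, 3), round(exact, 3))
```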

Remarks about mgfs

  • Joint mgf: For random variables $(X,Y)$, the joint mgf is
    $M_{X,Y}(s,t) = \mathbb{E}(e^{sX + tY}).$
  • Independence criterion via mgfs: $X$ and $Y$ are independent iff $M_{X,Y}(s,t) = M_X(s)\, M_Y(t)$ for all $(s,t)$ in a neighbourhood of $(0,0)$.
  • Limitation: The mgf may not exist for all distributions or all $t$; this limits applicability.

Central Limit Theorem (recall) and mgf assumptions

  • Recall: If $X_1, X_2, \dots$ are i.i.d. with mean $\mu$ and variance $\sigma^2$, and the mgf exists in a neighbourhood of zero, then with $S_n = \sum_{i=1}^n X_i$,
    $\lim_{n\to\infty} \Pr\left( \frac{S_n - n\mu}{\sigma\sqrt{n}} \le x \right) = \Phi(x), \quad -\infty < x < \infty.$
  • The mgf existence assumption is a strong one; there are many versions of the CLT.

Continuity Theorem (convergence of mgfs implies distribution convergence)

  • Let $F_n$ be CDFs with mgfs $M_n$, and let $F$ be a CDF with mgf $M$. If $M_n(t) \to M(t)$ for all $t$ in an open interval containing $0$, then $F_n(x) \to F(x)$ at all continuity points of $F$.
  • Application: Since the CDF of $N(0,1)$ is continuous, it suffices to show that the mgf of the standardized sum $Z_n$ converges to the mgf of $N(0,1)$.

Proof sketch of the CLT via mgf (standardized version)

  • Aim: Prove the standard CLT for $\mu = 0$ and $\sigma = 1$.
  • Define $Z_n = S_n / \sqrt{n}$. Since $S_n$ is a sum of independent variables, its mgf is
    $M_{S_n}(t) = [M_X(t)]^n.$
  • For $Z_n$, the mgf is
    $M_{Z_n}(t) = \left[ M_X\!\left(\frac{t}{\sqrt{n}}\right) \right]^n.$
  • Expand $M_X(s)$ around $s=0$:
    $M_X(s) = M_X(0) + s M'_X(0) + \frac{1}{2} s^2 M''_X(0) + o(s^2)$
    as $s \to 0$.
  • With $\mathbb{E}(X) = 0$, we have $M'_X(0) = 0$, and with standardization $M''_X(0) = 1$; hence
    $M_X(s) = 1 + \frac{1}{2} s^2 + o(s^2).$
  • Substitute $s = t/\sqrt{n}$:
    $M_{Z_n}(t) = \left( 1 + \frac{t^2}{2n} + o(1/n) \right)^n \xrightarrow{n\to\infty} e^{t^2/2}.$
  • The limit mgf $e^{t^2/2}$ is the mgf of $N(0,1)$, so $Z_n$ converges in distribution to $N(0,1)$.
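The key limit in the last two steps can be checked numerically (a sketch at $t = 1$, ignoring the $o(1/n)$ term):

```python
import math

t = 1.0
target = math.exp(t**2 / 2)                  # mgf of N(0,1) at t
for n in (10, 100, 10_000, 1_000_000):
    approx = (1 + t**2 / (2 * n)) ** n       # the substituted expansion
    print(n, round(approx, 6), round(target, 6))
```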

Practical takeaways and exam relevance

  • The CLT explains why sums of many i.i.d., finite-variance variables tend to look normal, regardless of the original distribution, once properly centered and scaled.
  • MGFs provide a powerful, compact way to compute moments and to prove distributional convergence via the Continuity Theorem.
  • The mgf approach also yields closed-form results for sums of independent Poisson and normal variables, and shows how transformation and independence interact in distributional properties.

Endnotes from the slides

  • The slides emphasize that the mgf method is a tool for theory and for later courses; some mgf topics are not covered on the final exam.
  • The presented versions of CLT use mgf existence as a condition and acknowledge multiple versions exist.
  • The Continuity Theorem connects mgf convergence to distribution convergence and is a key step in mgf-based proofs of CLT.

Source alignment (slide references):

  • WLLN and proof via Chebyshev (Page 2)
  • SLLN statement (Page 3)
  • CLT normalization and statement (Pages 4–5)
  • CLT examples with dice and uniform variables (Pages 6–7)
  • MGFs: definition and basic properties (Pages 8–10)
  • MGFs of Binomial, Poisson, Exponential, Normal (Pages 11–20)
  • Sum of independent variables and examples (Pages 21–23)
  • Remarks on mgf, joint mgf, and independence (Page 24)
  • CLT recall, Continuity Theorem, and mgf-based proofs (Pages 25–28)
  • Final remarks about mgf applications and exam scope (Pages 29–31)