Week 13: Central Limit Theorem and Moment-Generating Functions

Weak Law of Large Numbers

  • Setup: Let $X_1, X_2, \dots, X_i, \dots$ be a sequence of i.i.d. random variables with $\mathbb{E}(X_i) = \mu$ and $\mathrm{Var}(X_i) = \sigma^2$.
  • Define the sample mean:
    $\bar{X}_n = \frac{1}{n} \sum_{i=1}^n X_i.$
  • Weak law statement: For any $\varepsilon > 0$,
    $\lim_{n\to\infty} \Pr\left(|\bar{X}_n - \mu| > \varepsilon\right) = 0.$
  • Proof sketch (Chebyshev): Since $\mathbb{E}(\bar{X}_n) = \mu$ and $\mathrm{Var}(\bar{X}_n) = \frac{\sigma^2}{n}$,
    Chebyshev’s inequality gives
    $\Pr\left(|\bar{X}_n - \mu| > \varepsilon\right) \le \frac{\mathrm{Var}(\bar{X}_n)}{\varepsilon^2} = \frac{\sigma^2}{n\varepsilon^2} \xrightarrow{n\to\infty} 0.$
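The shrinking Chebyshev bound can be checked by simulation. A minimal sketch in Python (fair-die rolls are an arbitrary illustrative choice, not from the slides):

```python
import random

random.seed(0)
mu, var = 3.5, 35 / 12        # mean and variance of a single fair-die roll
eps, trials = 0.25, 1000

def deviation_prob(n):
    """Empirical estimate of Pr(|sample mean - mu| > eps)."""
    hits = 0
    for _ in range(trials):
        mean = sum(random.randint(1, 6) for _ in range(n)) / n
        if abs(mean - mu) > eps:
            hits += 1
    return hits / trials

for n in (10, 100, 1000):
    chebyshev = var / (n * eps**2)        # sigma^2 / (n * eps^2)
    print(n, deviation_prob(n), min(chebyshev, 1.0))
```

Both the empirical deviation probability and the Chebyshev bound shrink toward $0$ as $n$ grows, as the proof sketch predicts.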

Strong Law of Large Numbers

  • Setup: Let $X_1, X_2, \dots$ be i.i.d. with $\mathbb{E}(X_i) = \mu$ and $\mathrm{Var}(X_i) = \sigma^2$.
  • Define $\bar{X}_n = \frac{1}{n} \sum_{i=1}^n X_i.$
  • Strong law statement:
    $\Pr\left( \lim_{n\to\infty} \bar{X}_n = \mu \right) = 1.$
  • Interpretation: The sample mean converges to $\mu$ almost surely (with probability 1); this is a stronger claim than the WLLN, which only asserts convergence in probability.

Central Limit Theorem (CLT): introduction

  • Setup: $X_1, X_2, \dots$ i.i.d. with mean $\mu$ and variance $\sigma^2$; let $S_n = \sum_{i=1}^n X_i$ and $\bar{X}_n = S_n/n$.
  • Classical normalization:
    $Z_n = \frac{S_n - n\mu}{\sigma\sqrt{n}}.$
  • CLT: $Z_n$ converges in distribution to the standard normal $N(0,1)$:
    $\lim_{n\to\infty} \Pr\left(\frac{S_n - n\mu}{\sigma\sqrt{n}} \le x\right) = \Phi(x), \qquad -\infty < x < \infty.$
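A quick simulation sketch (Exponential(1) summands are an arbitrary choice, so $\mu = \sigma = 1$): the empirical CDF of $Z_n$ should be close to $\Phi(x)$.

```python
import math
import random

random.seed(1)

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def z_sample(n):
    """One draw of Z_n = (S_n - n*mu)/(sigma*sqrt(n)) for Exp(1)."""
    s = sum(random.expovariate(1.0) for _ in range(n))
    return (s - n) / math.sqrt(n)

n, reps = 200, 5000
zs = [z_sample(n) for _ in range(reps)]
for x in (-1.0, 0.0, 1.0):
    empirical = sum(z <= x for z in zs) / reps
    print(x, round(empirical, 3), round(phi(x), 3))
```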

Central Limit Theorem (mgf form)

  • With $M$ the mgf of $X$ existing in a neighbourhood of $0$, the CLT can also be written as
    $\lim_{n\to\infty} \Pr\left(S_n - n\mu \le x\, \sigma\sqrt{n}\right) = \Phi(x).$
  • (mgf condition): The mgf $M_X(t) = \mathbb{E}(e^{tX})$ exists in a neighbourhood of $0$.

Moment-Generating Function (MGF)

  • Definition: The MGF of a random variable $X$ is
    $M_X(t) = \mathbb{E}(e^{tX}).$
  • Existence: $M_X(t)$ may or may not exist for a given $t$; if it exists near $t=0$, it is useful for computing moments.
  • Discrete form: if $X$ takes values in a discrete set with pmf $p(x)$, then
    $M_X(t) = \sum_x e^{tx} p(x).$
  • Continuous form: if $X$ has pdf $f(x)$, then
    $M_X(t) = \int_{-\infty}^{\infty} e^{tx} f(x) \, dx.$
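The continuous-case formula can be sanity-checked by Monte Carlo. A sketch for $X \sim \text{Exponential}(\lambda)$ with $\lambda = 2$ (an arbitrary choice), whose closed-form mgf $\lambda/(\lambda - t)$ appears later in these notes:

```python
import math
import random

random.seed(2)
lam, reps = 2.0, 100_000
xs = [random.expovariate(lam) for _ in range(reps)]

def mgf_estimate(t):
    """Sample average of e^{tX}: a Monte Carlo estimate of M_X(t) = E(e^{tX})."""
    return sum(math.exp(t * x) for x in xs) / reps

for t in (-1.0, 0.0, 0.5):
    exact = lam / (lam - t)   # closed-form Exponential(lam) mgf, valid for t < lam
    print(t, round(mgf_estimate(t), 3), round(exact, 3))
```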

Properties of MGFs

  • The mgf, when it exists in a neighbourhood of $0$, uniquely determines the distribution.
  • Derivatives at zero give moments:
    $M'_X(0) = \mathbb{E}(X), \quad M''_X(0) = \mathbb{E}(X^2), \quad M^{(r)}_X(0) = \mathbb{E}(X^r).$
  • Transformation: If $Y = a + bX$, then
    $M_Y(t) = e^{at} M_X(bt).$
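The moment identities can be checked numerically with central finite differences on a closed-form mgf; a sketch using the Exponential(λ) mgf $\lambda/(\lambda - t)$ with $\lambda = 2$ (an arbitrary choice):

```python
lam = 2.0
M = lambda t: lam / (lam - t)    # Exponential(lam) mgf, valid for t < lam

h = 1e-4
m1 = (M(h) - M(-h)) / (2 * h)             # ~ M'(0)  = E(X)   = 1/lam
m2 = (M(h) - 2 * M(0) + M(-h)) / h**2     # ~ M''(0) = E(X^2) = 2/lam^2
var = m2 - m1**2                          # Var(X) = 1/lam^2
print(round(m1, 6), round(m2, 6), round(var, 6))
```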

MGFs of common distributions

  • Binomial$(n,p)$:
    $M_X(t) = (p e^{t} + 1 - p)^n.$

    • Moments: $M'_X(0) = np,\quad M''_X(0) = n(n-1)p^2 + np.$
    • Variance: $\mathrm{Var}(X) = M''_X(0) - [M'_X(0)]^2 = np(1-p).$
  • Poisson$(\lambda)$:
    $M_X(t) = e^{\lambda (e^{t} - 1)}.$

    • Moments: $M'_X(0) = \lambda,\quad M''_X(0) = \lambda^2 + \lambda.$
    • Variance: $\mathrm{Var}(X) = \lambda.$
  • Exponential$(\lambda)$:
    $M_X(t) = \frac{\lambda}{\lambda - t}, \quad t < \lambda.$

    • Moments: $M'_X(0) = \frac{1}{\lambda},\quad M''_X(0) = \frac{2}{\lambda^2}.$
    • Variance: $\mathrm{Var}(X) = \frac{1}{\lambda^2}.$
  • Standard Normal $N(0,1)$:
    $M_X(t) = e^{t^2/2}.$

    • Moments: $M'_X(0) = 0,\quad M''_X(0) = 1.$
    • Variance: $\mathrm{Var}(X) = 1.$
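The listed $M'_X(0)$ and $M''_X(0)$ values can all be verified the same way, by finite differences at $0$; a sketch with arbitrary parameter choices ($n = 10$, $p = 0.3$, $\lambda = 2$):

```python
import math

n, p, lam = 10, 0.3, 2.0
# (mgf, expected M'(0), expected M''(0)) for each distribution above
mgfs = {
    "binomial":    (lambda t: (p * math.exp(t) + 1 - p) ** n,
                    n * p, n * (n - 1) * p**2 + n * p),
    "poisson":     (lambda t: math.exp(lam * (math.exp(t) - 1)),
                    lam, lam**2 + lam),
    "exponential": (lambda t: lam / (lam - t),
                    1 / lam, 2 / lam**2),
    "std normal":  (lambda t: math.exp(t**2 / 2),
                    0.0, 1.0),
}
h = 1e-4
for name, (M, m1, m2) in mgfs.items():
    d1 = (M(h) - M(-h)) / (2 * h)              # ~ M'(0)
    d2 = (M(h) - 2 * M(0) + M(-h)) / h**2      # ~ M''(0)
    assert abs(d1 - m1) < 1e-2 and abs(d2 - m2) < 1e-2, name
    print(name, round(d1, 3), round(d2, 3))
```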

MGF for general Normal distribution

  • If $X \sim N(\mu, \sigma^2)$, then
    $M_X(t) = e^{\mu t + \frac{1}{2}\sigma^2 t^2}.$
  • Derivatives:
    $M'_X(t) = (\mu + \sigma^2 t)\, e^{\mu t + \frac{1}{2}\sigma^2 t^2},$
    $M''_X(t) = \sigma^2 e^{\mu t + \frac{1}{2}\sigma^2 t^2} + (\mu + \sigma^2 t)^2 e^{\mu t + \frac{1}{2}\sigma^2 t^2}.$
  • Moments: $\mathbb{E}(X) = M'_X(0) = \mu, \quad \mathbb{E}(X^2) = M''_X(0) = \mu^2 + \sigma^2.$
  • Variance: $\mathrm{Var}(X) = \sigma^2.$

Basic normal transformation result

  • If $Y = a + bX$ with $X \sim N(\mu, \sigma^2)$, then $Y \sim N(a + b\mu, b^2 \sigma^2)$, as implied by the mgf: $M_Y(t) = e^{at} M_X(bt) = e^{(a + b\mu)t + \frac{1}{2} b^2\sigma^2 t^2}.$
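A minimal empirical check of the transformation result (the parameter values are arbitrary):

```python
import random
import statistics

random.seed(3)
mu, sigma, a, b = 1.0, 2.0, 5.0, -3.0
ys = [a + b * random.gauss(mu, sigma) for _ in range(50_000)]
# Expect mean a + b*mu = 2.0 and variance b^2 * sigma^2 = 36.0.
print(round(statistics.mean(ys), 2), round(statistics.variance(ys), 2))
```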

Sum of independent random variables (MGF property)

  • Theorem: If $X$ and $Y$ are independent with mgfs $M_X$ and $M_Y$, and $Z = X + Y$, then
    $M_Z(t) = M_X(t)\, M_Y(t)$
    on the common interval where both mgfs exist.
  • Sketch of proof: Using independence,
    $M_Z(t) = \mathbb{E}(e^{t(X+Y)}) = \mathbb{E}(e^{tX} e^{tY}) = \mathbb{E}(e^{tX})\, \mathbb{E}(e^{tY}) = M_X(t)\, M_Y(t).$
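The theorem can be checked by Monte Carlo for any pair of independent variables; a sketch with $X \sim \text{Exp}(2)$ and $Y \sim \text{Uniform}(0,1)$ (arbitrary choices) at $t = 0.5$:

```python
import math
import random

random.seed(4)
reps, t = 100_000, 0.5
xs = [random.expovariate(2.0) for _ in range(reps)]
ys = [random.uniform(0.0, 1.0) for _ in range(reps)]

# E(e^{t(X+Y)}) versus E(e^{tX}) * E(e^{tY})
lhs = sum(math.exp(t * (x + y)) for x, y in zip(xs, ys)) / reps
rhs = (sum(math.exp(t * x) for x in xs) / reps) * \
      (sum(math.exp(t * y) for y in ys) / reps)
exact = (2.0 / (2.0 - t)) * ((math.exp(t) - 1.0) / t)   # product of the two mgfs
print(round(lhs, 3), round(rhs, 3), round(exact, 3))
```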

Examples: Sum of independent Poisson and Normal distributions

  • Poisson + Poisson: If $X\sim\text{Poisson}(\lambda)$ and $Y\sim\text{Poisson}(\mu)$ are independent, then $X+Y \sim \text{Poisson}(\lambda+\mu)$.

    • MGFs: $M_X(t) = e^{\lambda (e^{t}-1)},\quad M_Y(t) = e^{\mu (e^{t}-1)}.$
    • $M_{X+Y}(t) = M_X(t)\, M_Y(t) = e^{(\lambda+\mu)(e^{t}-1)}.$
  • Normal + Normal: If $X \sim N(\mu, \sigma^2)$ and $Y \sim N(\nu, \tau^2)$ are independent, then
    $X+Y \sim N(\mu+\nu,\, \sigma^2+\tau^2).$

    • MGFs: $M_X(t) = e^{\mu t + \frac{1}{2}\sigma^2 t^2},\quad M_Y(t) = e^{\nu t + \frac{1}{2}\tau^2 t^2}.$
    • $M_{X+Y}(t) = M_X(t)\, M_Y(t) = e^{(\mu+\nu)t + \frac{1}{2}(\sigma^2+\tau^2)t^2}.$
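The Poisson closure result can also be seen in simulation; a sketch with $\lambda = 2$, $\mu = 3$ (arbitrary values), using simple inverse-transform sampling:

```python
import math
import random

random.seed(5)

def poisson_draw(lam):
    """Inverse-transform sampling of one Poisson(lam) variate."""
    u, k = random.random(), 0
    p = math.exp(-lam)
    cdf = p
    while u > cdf:
        k += 1
        p *= lam / k
        cdf += p
    return k

reps = 50_000
sums = [poisson_draw(2.0) + poisson_draw(3.0) for _ in range(reps)]
for k in (3, 5, 7):
    empirical = sums.count(k) / reps
    exact = math.exp(-5.0) * 5.0**k / math.factorial(k)   # Poisson(5) pmf
    print(k, round(empirical, 3), round(exact, 3))
```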

Remarks about mgfs

  • Joint mgf: For random variables $(X,Y)$, the joint mgf is
    $M_{X,Y}(s,t) = \mathbb{E}(e^{sX + tY}).$
  • Independence criterion via mgfs: $X$ and $Y$ are independent iff $M_{X,Y}(s,t) = M_X(s)\, M_Y(t)$ for all $(s,t)$ in a neighbourhood of $(0,0)$.
  • Limitation: The mgf may not exist for all distributions or all $t$; this limits applicability.

Central Limit Theorem (recall) and mgf assumptions

  • Recall: If $X_1, X_2, \dots$ are i.i.d. with mean $\mu$ and variance $\sigma^2$, and the mgf exists in a neighbourhood of zero, then with $S_n = \sum_{i=1}^n X_i$,
    $\lim_{n\to\infty} \Pr\left( \frac{S_n - n\mu}{\sigma\sqrt{n}} \le x \right) = \Phi(x), \quad -\infty < x < \infty.$
  • The mgf existence assumption is a strong one; there are many versions of the CLT.

Continuity Theorem (convergence of mgfs implies distribution convergence)

  • Let $F_n$ be CDFs with mgfs $M_n$, and let $F$ be a CDF with mgf $M$. If $M_n(t) \to M(t)$ for all $t$ in an open interval containing $0$, then $F_n(x) \to F(x)$ at all continuity points of $F$.
  • Application: Since the CDF of $N(0,1)$ is continuous, it suffices to show that the mgf of the standardized sum $Z_n$ converges to the mgf of $N(0,1)$.

Proof sketch of the CLT via mgf (standardized version)

  • Aim: Prove the standard CLT for $\mu = 0$ and $\sigma = 1$.
  • Define $Z_n = S_n / \sqrt{n}$. Since $S_n$ is a sum of independent variables, its mgf is
    $M_{S_n}(t) = [M_X(t)]^n.$
  • For $Z_n$, the mgf is
    $M_{Z_n}(t) = \left[ M_X\!\left(\frac{t}{\sqrt{n}}\right) \right]^n.$
  • Expand $M_X(s)$ around $s=0$:
    $M_X(s) = M_X(0) + s M'_X(0) + \frac{1}{2} s^2 M''_X(0) + o(s^2)$
    as $s \to 0$.
  • With $\mathbb{E}(X) = 0$, we have $M'_X(0) = 0$, and with standardization $M''_X(0) = 1$; hence
    $M_X(s) = 1 + \frac{1}{2} s^2 + o(s^2).$
  • Substitute $s = t/\sqrt{n}$:
    $M_{Z_n}(t) = \left( 1 + \frac{t^2}{2n} + o(1/n) \right)^n \xrightarrow{n\to\infty} e^{t^2/2}.$
  • The limit mgf $e^{t^2/2}$ is the mgf of $N(0,1)$, so $Z_n$ converges in distribution to $N(0,1)$.
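The key limit in the last two steps can be checked numerically (a sketch at $t = 1$, ignoring the $o(1/n)$ term):

```python
import math

t = 1.0
target = math.exp(t**2 / 2)                  # mgf of N(0,1) at t
for n in (10, 100, 10_000, 1_000_000):
    approx = (1 + t**2 / (2 * n)) ** n       # the substituted expansion
    print(n, round(approx, 6), round(target, 6))
```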

Practical takeaways and exam relevance

  • The CLT explains why sums of many i.i.d., finite-variance variables tend to look normal, regardless of the original distribution, once properly centered and scaled.
  • MGFs provide a powerful, compact way to compute moments and to prove distributional convergence via the Continuity Theorem.
  • The mgf approach also yields closed-form results for sums of independent Poisson and normal variables, and shows how transformation and independence interact in distributional properties.

Endnotes from the slides

  • The slides emphasize that the mgf method is a tool for theory and for later courses; some mgf topics are not covered on the final exam.
  • The presented versions of CLT use mgf existence as a condition and acknowledge multiple versions exist.
  • The Continuity Theorem connects mgf convergence to distribution convergence and is a key step in mgf-based proofs of CLT.

Source alignment (slide references):

  • WLLN and proof via Chebyshev (Page 2)
  • SLLN statement (Page 3)
  • CLT normalization and statement (Pages 4–5)
  • CLT examples with dice and uniform variables (Pages 6–7)
  • MGFs: definition and basic properties (Pages 8–10)
  • MGFs of Binomial, Poisson, Exponential, Normal (Pages 11–20)
  • Sum of independent variables and examples (Pages 21–23)
  • Remarks on mgf, joint mgf, and independence (Page 24)
  • CLT recall, Continuity Theorem, and mgf-based proofs (Pages 25–28)
  • Final remarks about mgf applications and exam scope (Pages 29–31)