Probability Bounds and Central Limit Theorems Study Notes

Probability and Stochastic Processes

Unit III: Probability Bounds and Central Limit Theorems

Department of Mathematics, SRM Institute of Science and Technology, Kattankulathur-603203

Introduction

  • This unit discusses probability bounds, which are inequalities applicable in various scenarios.

    • Purpose of using inequalities:

    • Insufficient information to calculate a desired quantity (e.g., probability of an event, expected value of a random variable).

    • Complicated problems where exact calculations are difficult.

    • Provide general results applicable to a wide range of problems.

  • In this section, µ and σ² represent the mean and variance of the random variable under consideration.


Markov’s Inequality

General Theme
  • Focus on upper bounding tail probabilities, i.e., probabilities of the forms

    • P(X \geq c\mu),

    • P(X \leq c\mu).

Definition
  • Markov’s Inequality:

    • Let X be a non-negative random variable.

    • Then, for any a > 0 :
      P(X \geq a) \leq \frac{E(X)}{a}.

Proof of Markov’s Inequality

Discrete Random Variable

  1. Assume X is a discrete random variable.

  2. The expected value is defined as: E[X] = \sum_x xP(X = x).

    • This can be split into two parts:
      \sum_{x<a} xP(X=x) + \sum_{x\geq a} xP(X=x).

  3. Since X is non-negative, the first sum is non-negative, and in the second sum x \geq a, so
    E[X] \geq \sum_{x\geq a} xP(X=x) \geq a \sum_{x\geq a} P(X=x) = a\,P(X \geq a).

  4. Dividing by a yields:
    P(X \geq a) \leq \frac{E(X)}{a}.

Continuous Random Variable

  1. Assume X is a continuous random variable.

  2. The expected value is defined as:
    E(X) = \int_{-\infty}^{\infty} xf(x)\,dx = \int_{0}^{\infty} xf(x)\,dx \quad (\text{since } X \geq 0).

  3. The inequality is derived similarly: dropping the integral over [0, a) and bounding x below by a gives
    E(X) \geq \int_{a}^{\infty} xf(x)\,dx \geq a \int_{a}^{\infty} f(x)\,dx = a\,P(X \geq a),
    so P(X \geq a) \leq \frac{E(X)}{a}.
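The proof can be sanity-checked numerically. A minimal sketch, assuming an exponential variable with rate 1 (so E(X) = 1) as an illustrative non-negative example; the threshold a = 3 is arbitrary:

```python
import random

# Empirical check of Markov's inequality for a non-negative variable.
# Illustrative choice: X ~ Exp(1), so E(X) = 1; a = 3 is arbitrary.
random.seed(0)
samples = [random.expovariate(1.0) for _ in range(100_000)]
a = 3.0
tail = sum(x >= a for x in samples) / len(samples)  # estimate of P(X >= a)
markov_bound = 1.0 / a                              # E(X)/a

assert tail <= markov_bound  # the empirical tail respects the bound
```

The gap between `tail` (about 0.05 here) and `markov_bound` (about 0.33) illustrates that Markov's inequality holds but can be quite loose.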


Examples of Markov’s Inequality

Example 1
  • Let X \sim B(n, p).

  • Apply Markov's inequality to find an upper bound on P(X \geq \alpha n), where p < \alpha < 1.

  • Solution:

    • E(X) = np.

    • P(X \geq \alpha n) \leq \frac{np}{\alpha n} = \frac{p}{\alpha}.

    • With p = \frac{1}{2} and \alpha = \frac{3}{4}, it follows that
      P\left(X \geq \frac{3n}{4}\right) \leq \frac{2}{3}.

Example 2
  • Let X \sim E(\lambda), the exponential distribution.

  • Determine an upper bound for P(X \geq a), where a > 0:

    • The p.d.f. of X is given by:
      f(x) = \lambda e^{-\lambda x} \text{ for } x \geq 0.

    • E(X) = \frac{1}{\lambda}.

    • Thus, using Markov's inequality:
      P(X \geq a) \leq \frac{1/\lambda}{a} = \frac{1}{\lambda a}.

  • The actual value is given by:
    P(X \geq a) = \int_{a}^{\infty} \lambda e^{-\lambda x}dx = e^{-\lambda a}.
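Example 2 can be checked directly. A short Python sketch (λ = 1 and a = 3 are illustrative choices) compares Markov's bound with the exact tail:

```python
import math

# Markov's bound vs. the exact tail probability for X ~ Exp(lambda).
# lam and a are illustrative choices, not fixed by the example.
lam, a = 1.0, 3.0
markov_bound = 1.0 / (lam * a)   # E(X)/a = 1/(lambda * a)
exact = math.exp(-lam * a)       # P(X >= a) = e^{-lambda a}

assert exact <= markov_bound     # the bound holds, but is conservative
```

Here the exact tail is about 0.0498 against a bound of about 0.333, again showing how loose Markov's inequality can be.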


Chebyshev’s Inequality

Definition
  • Chebyshev’s Inequality:

    • If X is a random variable with

    • mean E(X) = \mu and

    • variance Var(X) = \sigma^2,

    • then for any a > 0:
      P(|X - \mu| \geq a) \leq \frac{\sigma^2}{a^2},

    • or equivalently:
      P(|X - \mu| < a) \geq 1 - \frac{\sigma^2}{a^2}.

Explanation
  • Chebyshev’s Inequality states that the probability that X deviates from its mean E(X) by at least a given amount is bounded above in terms of the variance of X.

Alternative Forms
  • If we let a = k\sigma where k > 0, then:

    • P(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}.

    • Equivalently:
      P(|X - \mu| < k\sigma) \geq 1 - \frac{1}{k^2}.

Example 3
  1. Show by Chebyshev’s inequality that, for 2000 throws of a fair coin, the probability that the number of heads lies between 900 and 1100 is at least \frac{19}{20}:

    • Let X denote the number of heads in 2000 throws of the coin. Then

    • E(X) = 2000 \times \frac{1}{2} = 1000,

    • Var(X) = 2000 \times \frac{1}{2} \times \frac{1}{2} = 500.

    • Therefore,
      P(900 < X < 1100) = P(-100 < X - 1000 < 100) = 1 - P(|X - 1000| \geq 100).

    • By Chebyshev's inequality:
      P(|X - 1000| \geq 100) \leq \frac{500}{100^2} = \frac{1}{20}.

    • Thus,
      P(900 < X < 1100) \geq 1 - \frac{1}{20} = \frac{19}{20}.
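The arithmetic in Example 3 can be restated as a few lines of Python:

```python
# Chebyshev bound for 2000 throws of a fair coin (Example 3).
n, p = 2000, 0.5
mean = n * p                # E(X) = 1000
var = n * p * (1 - p)       # Var(X) = 500
a = 100
tail_bound = var / a**2     # P(|X - 1000| >= 100) <= 1/20

assert tail_bound == 0.05
assert abs((1 - tail_bound) - 0.95) < 1e-12   # P(900 < X < 1100) >= 19/20
```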

Example 4
  • For a random variable X with density function: f(x) = \begin{cases} e^{-x}, & x \geq 0 \\ 0, & \text{elsewhere} \end{cases}.

    • Show that Chebyshev's inequality gives:

    • P(|X - 1| \geq 2) \leq \frac{1}{4}, while the actual probability is e^{-3}:

  1. Calculate the mean:

    • \mu = E(X) = \int_{0}^{\infty} xe^{-x}dx = 1.

  2. Calculate the second moment:

    • E(X^2) = \int_{0}^{\infty} x^2 e^{-x}dx = 2.

  3. Therefore,

    • Var(X) = E(X^2) - \mu^2 = 2 - 1 = 1.

  4. Using Chebyshev’s inequality for any a > 0:

    • P(|X - \mu| \geq a) \leq \frac{1}{a^2}.

  5. Taking a = 2:

    • P(|X - 1| \geq 2) \leq \frac{1}{4}.

  6. Calculating the actual probability:

  6. Calculating the actual probability:


    • P(|X - 1| \geq 2) = 1 - P(-1 < X < 3) = 1 - \int_{0}^{3} e^{-x}dx = 1 - (1 - e^{-3}) = e^{-3}.
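A quick numeric check of Example 4 (the integral is already evaluated in closed form above, so only the final comparison is coded):

```python
import math

# Example 4: X has density e^{-x} on [0, inf), so mu = 1, sigma^2 = 1.
chebyshev_bound = 1 / 2**2     # sigma^2 / a^2 with a = 2
exact = math.exp(-3)           # 1 - (1 - e^{-3}) = e^{-3}

assert chebyshev_bound == 0.25
assert exact < chebyshev_bound  # the inequality is far from tight here
```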


Additional Concepts

Laws of Large Numbers

Convergence in Probability

  • A sequence of random variables X_1, X_2, \ldots, X_n converges in probability to \alpha if:

    • for every \epsilon > 0,
      \lim_{n \to \infty} P(|X_n - \alpha| < \epsilon) = 1,

    • or, equivalently,
      \lim_{n \to \infty} P(|X_n - \alpha| \geq \epsilon) = 0.

Weak Law of Large Numbers (WLLN)

  • Let X_1, X_2, \ldots, X_n be a sequence of random variables with respective means \mu_1, \mu_2, \ldots, \mu_n.

  • Let \bar{X}_n = \frac{X_1 + X_2 + \ldots + X_n}{n} and \bar{\mu}_n = \frac{\mu_1 + \mu_2 + \ldots + \mu_n}{n}.

    • Then, for every \epsilon > 0:

    • P(|\bar{X}_n - \bar{\mu}_n| \geq \epsilon) \rightarrow 0 \text{ as } n \to \infty,

    • provided:

    • \lim_{n \to \infty} \frac{B_n}{n^2} = 0, where B_n = Var(X_1 + X_2 + \ldots + X_n) < \infty.

Conditions for WLLN

  • To verify WLLN holds:
    (i) E(X_i) exists for all i;
    (ii) B_n = Var(X_1 + \ldots + X_n) exists; and

    (iii) \lim_{n \to \infty} \frac{B_n}{n^2} = 0.

Example 9

  • Check if the WLLN holds for the sequence \{X_k\} defined by
    P(X_k = \pm 2^k) = 2^{-(2k+1)}; \quad P(X_k = 0) = 1 - 2^{-2k}.

  • Calculate:

    • E(X_k) = \sum x\,P(X_k = x) = 0,

    • Var(X_k) = E(X_k^2) = 1,

  • Thus,

  • B_n = Var(X_1 + … + X_n) = n < \infty .

  • Therefore,

    • \lim_{n \to \infty} \frac{B_n}{n^2} = \lim_{n \to \infty} \frac{1}{n} = 0, so the WLLN holds.
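The moment computations in Example 9 can be verified mechanically for several values of k:

```python
# Example 9: X_k takes the values +-2^k with probability 2^{-(2k+1)}
# each, and 0 with probability 1 - 2^{-2k}.
for k in range(1, 10):
    p = 2.0 ** -(2 * k + 1)
    dist = [(2**k, p), (-(2**k), p), (0, 1 - 2 * p)]
    mean = sum(v * q for v, q in dist)
    var = sum(v**2 * q for v, q in dist) - mean**2
    assert mean == 0    # E(X_k) = 0
    assert var == 1.0   # Var(X_k) = 1, so B_n = n and B_n/n^2 -> 0
```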


Central Limit Theorem (CLT)

Definition
  • The Central Limit Theorem states:

    • The normal distribution is the limiting distribution of the suitably standardized sum of independent random variables with finite variance.

  • For n \geq 1, let X_1, X_2, \ldots, X_n be independent, identically distributed random variables with finite mean \mu and variance \sigma^2, and set:

    • S_n = X_1 + X_2 + \ldots + X_n,

    • \bar{X}_n = \frac{S_n}{n}.

  • As n \to \infty:
    (a)
    P\left( \frac{S_n - n\mu}{\sqrt{n}\,\sigma} \leq x \right) \rightarrow \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{t^2}{2}} dt, \text{ for all } x \in \mathbb{R};
    (b)
    P\left( \frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma} \leq x \right) \rightarrow \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{t^2}{2}} dt, \text{ for all } x \in \mathbb{R}.

  • In summary:

    • For large n:

    • S_n \text{ approximately follows } N(n\mu, n\sigma^2),

    • \bar{X}_n \text{ approximately follows } N\left(\mu, \frac{\sigma^2}{n}\right).

Remark on CLT Usage
  1. For i.i.d. random variables, if n is large enough,

    • S_n approximately follows a normal distribution with mean n\mu and variance n\sigma^2;

    • \bar{X}_n approximately follows a normal distribution with mean \mu and variance \frac{\sigma^2}{n}.
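This remark can be illustrated with a short simulation. The sketch below uses Uniform(0, 1) summands (so μ = 1/2, σ² = 1/12) purely as an illustrative choice, and compares the empirical distribution of the standardized sum against the standard normal CDF:

```python
import math
import random

# CLT sketch: the standardized sum of n i.i.d. Uniform(0,1) variables
# should be approximately N(0, 1) for moderately large n.
random.seed(1)
n, trials = 50, 20_000
mu, sigma = 0.5, math.sqrt(1 / 12)

z = [(sum(random.random() for _ in range(n)) - n * mu) / (math.sqrt(n) * sigma)
     for _ in range(trials)]

# Compare the empirical CDF at x = 1 with Phi(1) ~= 0.8413.
frac = sum(v <= 1 for v in z) / trials
assert abs(frac - 0.8413) < 0.02
```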

Examples of CLT

Example 12

  • For a coin tossed 200 times, find the probability of number of heads between 80 and 120:

  1. Let X = number of heads. Then X \sim B(200, \frac{1}{2}).

  2. Thus,

    • Mean:
      \mu = np = 200 \times \frac{1}{2} = 100,

    • Variance:
      \sigma^2 = np(1-p) = 200 \times \frac{1}{2} \times \frac{1}{2} = 50.

  3. Apply the CLT:

    • Z = \frac{X - 100}{\sqrt{50}} approximately follows the standard normal distribution.

  4. Therefore, the probability is:
    P(80 < X < 120) = P\left(-2.82 < Z < 2.82\right) = 0.9952 .
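The normal-approximation step of Example 12 can be reproduced with `math.erf`, using Φ(x) = (1 + erf(x/√2))/2 for the standard normal CDF:

```python
import math

# Example 12: X ~ B(200, 1/2); Z = (X - 100)/sqrt(50) is roughly N(0, 1).
def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

z = (120 - 100) / math.sqrt(50)   # ~2.83
prob = phi(z) - phi(-z)           # P(80 < X < 120)

assert abs(prob - 0.9952) < 0.001
```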

Example 13

  • Given 75 Poisson variates with parameter 2, estimate P(120 < S_n < 160) :

  1. For X_i \sim \text{Poisson}(2),

    • E(X_i) = 2, Var(X_i) = 2.

  2. By the CLT:

    • S_n \text{ approximately follows } N(150, 150).

  3. Standardizing gives
    P(120 < S_n < 160) = P(-2.4495 \leq Z \leq 0.8165) = 0.7868.

Example 14
  • For i.i.d. random variables with
    \mu = 3 \text{ and } Var(X_i) = \frac{1}{2}, estimate P(340 \leq S_n \leq 370) for n = 120:

  1. By the CLT,

    • S_n \sim N(360, 60) approximately.

  2. Proceeding as before with the standard normal CDF:

    • P(340 \leq S_n \leq 370) = P(-2.582 \leq Z \leq 1.291) = 0.8966.

Example 15
  • For the lifetime of bulbs with

    • E(X_i) = 1200 \text{ hrs}, \sigma = 250 \text{ hrs}, find:

    • the probability that the average lifetime of 60 bulbs exceeds 1250 hrs:

  1. By the CLT, \bar{X} \sim N\left(1200, \frac{250^2}{60}\right) approximately.

  2. Required probability:

    • P(\bar{X} > 1250) = P\left( Z > \frac{1250 - 1200}{250/\sqrt{60}} \right) = P(Z > 1.55) = 0.0606.
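Example 15's calculation in Python, using `math.erf` for the standard normal CDF:

```python
import math

# Example 15: X-bar ~ N(1200, 250^2/60) approximately, for 60 bulbs.
def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

z = (1250 - 1200) / (250 / math.sqrt(60))   # ~1.55
prob = 1 - phi(z)                           # P(X-bar > 1250)

assert abs(prob - 0.0606) < 0.001
```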


Strong Law of Large Numbers

  • For independent, identically distributed random variables with a finite expected value E(X_i) = \mu < \infty,

    • it holds with probability 1 that \bar{X}_n \rightarrow \mu as n \rightarrow \infty,

    • i.e. P\left( \lim_{n \to \infty} \bar{X}_n = \mu \right) = 1.

One-sided Chebyshev’s Inequality

Definition
  • For a random variable X with mean 0 and finite variance \sigma^2,

    • for any c > 0:

    • P(X \geq c) \leq \frac{\sigma^2}{\sigma^2 + c^2}.

Example 16
  • Given items produced with mean 100 and variance 400, find an upper bound for P(X \geq 120):

  1. Applying the one-sided Chebyshev inequality to X - 100, which has mean 0 and variance 400:

    • P(X \geq 120) = P(X - 100 \geq 20) \leq \frac{400}{400 + 20^2} = \frac{1}{2}.

    • From Markov’s inequality, P(X \geq 120) \leq \frac{100}{120} = \frac{5}{6}.
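Example 16 as arithmetic, comparing the two bounds:

```python
# Example 16: mean 100, variance 400; bound P(X >= 120).
mean, var = 100, 400
c = 120 - mean                    # deviation above the mean
one_sided = var / (var + c**2)    # one-sided Chebyshev: 1/2
markov = mean / 120               # Markov: 5/6

assert one_sided == 0.5
assert abs(markov - 5 / 6) < 1e-12
assert one_sided < markov         # the one-sided bound is tighter here
```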


Cauchy-Schwarz Inequality

Definition
  • If X and Y are two random variables such that E(X^2), E(Y^2), and E(XY) exist, then:

    • \{E(XY)\}^2 \leq E(X^2) E(Y^2).

    • Equality holds if and only if:

    • E(X^2) = 0, or P(Y - aX = 0) = 1 for some constant a.

Example 17
  • Utilize the Cauchy-Schwarz inequality to prove:

    • |\rho(X, Y)| \leq 1, where \rho is the correlation coefficient.
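The claim |ρ| ≤ 1 can be spot-checked empirically; the sample data below (a noisy linear relation) is an arbitrary illustrative choice:

```python
import math
import random

# Empirical check that |rho(X, Y)| <= 1, as Cauchy-Schwarz guarantees.
random.seed(3)
xs = [random.gauss(0, 1) for _ in range(10_000)]
ys = [2 * x + random.gauss(0, 1) for x in xs]   # illustrative dependence

mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
vx = sum((x - mx) ** 2 for x in xs) / len(xs)
vy = sum((y - my) ** 2 for y in ys) / len(ys)
rho = cov / math.sqrt(vx * vy)

assert abs(rho) <= 1     # here rho is near 2/sqrt(5) ~ 0.894
```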

Integrable Random Variables

Definition
  • A random variable X is integrable if its expectation exists and is finite, i.e., E(|X|) < \infty.

Convex Function

Definition
  • A twice-differentiable function g(x)g(x) is convex if the second derivative exists and is non-negative within its domain, i.e., g(x)0g''(x) ≥ 0.

Jensen’s Inequality
  • Jensen’s Inequality:

    • Let X be an integrable random variable and g : \mathbb{R} \rightarrow \mathbb{R} a convex function such that g(X) is integrable.

    • Then:
      E\{g(X)\} \geq g(E(X)).

Example 18
  • Let X be a random variable taking values in (-1, \infty) with mean 10. Find a lower bound for:

    • E\left( \frac{1}{X + 1} \right).

  1. Let g(x) = \frac{1}{x + 1}.

  2. The second derivative g''(x) = \frac{2}{(x+1)^3} > 0 for x > -1, so g is convex there.

  3. Therefore, by Jensen's inequality:
    E\left( \frac{1}{X + 1} \right) \geq \frac{1}{E(X) + 1} = \frac{1}{11}.
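A Monte Carlo check of Example 18; an Exp(rate 1/10) variable is just one illustrative non-negative distribution with mean 10:

```python
import random

# Jensen's inequality: E[1/(X+1)] >= 1/(E[X]+1) = 1/11 for X >= 0
# with mean 10. X ~ Exp(rate 0.1) is one such distribution.
random.seed(2)
xs = [random.expovariate(0.1) for _ in range(200_000)]
lhs = sum(1 / (x + 1) for x in xs) / len(xs)

assert lhs >= 1 / 11   # the empirical mean sits well above the bound
```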


Moment Generating Function

Definition
  • The moment generating function (M.G.F.) of a random variable X is defined as:
    M_X(s) = E(e^{sX}).

Chernoff Bounds

Definition
  • For any random variable X and real number a,

    • We can express:
      P(X \geq a) = P(e^{tX} \geq e^{ta}), \text{ for } t > 0;
      P(X \leq a) = P(e^{tX} \geq e^{ta}), \text{ for } t < 0.

  • Apply Markov’s inequality:

    • For t > 0:
      P(X \geq a) \leq E(e^{tX}) e^{-ta}.

  • Similarly, for t < 0:
    P(X \leq a) \leq E(e^{tX}) e^{-ta}.

Conclusion
  • Thus, we establish:
    P(X \geq a) \leq \min_{t>0} \left\{ e^{-ta} M_X(t) \right\}, \quad P(X \leq a) \leq \min_{t<0} \left\{ e^{-ta} M_X(t) \right\}.

Example 19
  • For a standard normal variable Z, the M.G.F. is:
    M_Z(t) = e^{\frac{t^2}{2}}.

  • Estimate P(Z \geq 2) using Chernoff bounds:

  1. P(Z \geq 2) \leq \min_{t>0} \left\{ e^{-2t} M_Z(t) \right\}.

  2. Minimize the function:
    e^{\frac{t^2}{2} - 2t}.

  3. Setting the derivative of the exponent to zero gives t - 2 = 0, so t = 2.

  4. The best upper bound is P(Z \geq 2) \leq e^{-2} \approx 0.1353.

  • Comparing with one-sided Chebyshev's inequality:
    P(Z \geq 2) \leq \frac{1}{1 + 2^2} = \frac{1}{5}.
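The minimization in Example 19 can be done numerically on a grid to confirm that t = 2 attains the optimum e^{-2}:

```python
import math

# Chernoff bound for P(Z >= 2): minimize e^{t^2/2 - 2t} over t > 0.
# The exponent t^2/2 - 2t is minimized where t - 2 = 0, i.e. t = 2.
grid = [i / 100 for i in range(1, 500)]   # t in (0, 5)
bound = min(math.exp(t**2 / 2 - 2 * t) for t in grid)

assert abs(bound - math.exp(-2)) < 1e-4   # e^{-2} ~ 0.1353
assert bound < 1 / 5                      # beats one-sided Chebyshev
```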

Example 20
  • For a Poisson variate X with parameter \lambda = 2:

  1. Estimate P(X \geq 3) using Chernoff bounds:
    P(X \geq 3) \leq \min_{t>0} \left\{ e^{-3t} M_X(t) \right\}, \text{ where } M_X(t) = e^{\lambda(e^t - 1)} = e^{2(e^t - 1)}.

  2. Minimize e^{2(e^t - 1) - 3t}: setting the derivative of the exponent to zero gives 2e^t - 3 = 0, i.e. t = \ln\frac{3}{2} > 0, so
    P(X \geq 3) \leq e^{2(\frac{3}{2} - 1)} \left(\frac{3}{2}\right)^{-3} = \frac{8e}{27} \approx 0.8055.

  3. For comparison, Markov's inequality gives P(X \geq 3) \leq \frac{E(X)}{3} = \frac{2}{3}, which happens to be tighter here.


Example 5

  • Let X be a random variable with E(X) = 10.

  • Find an upper bound for P(X \geq 20):

    • Using Markov's inequality:
      P(X \geq 20) \leq \frac{E(X)}{20} = \frac{10}{20} = \frac{1}{2}.

Example 6

  • Let Y be a random variable that takes the values 0 and 3 with probabilities 0.5 and 0.5, respectively.

  • Calculate an upper bound for P(Y = 3):

    • Here, E(Y) = 0.5 \times 0 + 0.5 \times 3 = 1.5.

    • So, applying Markov's inequality:
      P(Y \geq 3) \leq \frac{1.5}{3} = 0.5.

Example 7

  • Assume Z follows a Uniform distribution on [0, 10].

  • Thus, E(Z) = \frac{0 + 10}{2} = 5.

  • Find an upper bound for P(Z > 8):

    • Using Markov's inequality,
      P(Z > 8) \leq \frac{5}{8}.

Example 8

  • For a random variable W with expected value E(W) = 4, find an upper bound for P(W > 5):

    • Applying Markov's inequality gives:
      P(W > 5) \leq \frac{E(W)}{5} = \frac{4}{5}.


Thank You
  • Session concluded with acknowledgments from

Department of Mathematics, SRM Institute of Science and Technology, Kattankulathur-603203.