Probability Bounds and Central Limit Theorems Study Notes

Probability and Stochastic Processes

Unit III: Probability Bounds and Central Limit Theorems

Department of Mathematics, SRM Institute of Science and Technology, Kattankulathur-603203

Introduction

  • This unit discusses probability bounds, which are inequalities applicable in various scenarios.

    • Purpose of using inequalities:

    • Insufficient information to calculate a desired quantity (e.g., probability of an event, expected value of a random variable).

    • Complicated problems where exact calculations are difficult.

    • Provide general results applicable to a wide range of problems.

  • In this section, µ and σ² represent the mean and variance of the random variable under consideration.


Markov’s Inequality

General Theme
  • Focus on upper bounding tail probabilities, i.e., probabilities of the forms

    • P(X \geq c\mu),

    • P(X \leq c\mu).

Definition
  • Markov’s Inequality:

    • Let X be a non-negative random variable.

    • Then, for any a > 0 :
      P(X \geq a) \leq \frac{E(X)}{a}.

Proof of Markov’s Inequality

Discrete Random Variable

  1. Assume X is a discrete random variable.

  2. The expected value is defined as: E[X] = \sum_x xP(X = x).

    • This can be split into two parts:
      \sum_{x<a} xP(X=x) + \sum_{x\geq a} xP(X=x).

  3. Since X is non-negative, the first sum is non-negative, and in the second sum x \geq a, so
    E[X] \geq \sum_{x\geq a} xP(X=x) \geq a \sum_{x\geq a} P(X=x) = a\,P(X \geq a).

  4. Dividing by a yields:
    P(X \geq a) \leq \frac{E(X)}{a}.

Continuous Random Variable

  1. Assume X is a continuous random variable.

  2. The expected value is defined as:
    E(X) = \int_{-\infty}^{\infty} xf(x)\,dx = \int_{0}^{\infty} xf(x)\,dx \quad (\text{since } X \geq 0).

  3. The inequality is derived similarly: dropping the integral over [0, a) and bounding x below by a gives
    E(X) \geq \int_{a}^{\infty} xf(x)\,dx \geq a \int_{a}^{\infty} f(x)\,dx = a\,P(X \geq a),
    so P(X \geq a) \leq \frac{E(X)}{a}.
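The proof can be sanity-checked numerically. A minimal sketch, assuming an exponential variable with rate 1 (so E(X) = 1) as an illustrative non-negative example; the threshold a = 3 is arbitrary:

```python
import random

# Empirical check of Markov's inequality for a non-negative variable.
# Illustrative choice: X ~ Exp(1), so E(X) = 1; a = 3 is arbitrary.
random.seed(0)
samples = [random.expovariate(1.0) for _ in range(100_000)]
a = 3.0
tail = sum(x >= a for x in samples) / len(samples)  # estimate of P(X >= a)
markov_bound = 1.0 / a                              # E(X)/a

assert tail <= markov_bound  # the empirical tail respects the bound
```

The gap between `tail` (about 0.05 here) and `markov_bound` (about 0.33) illustrates that Markov's inequality holds but can be quite loose.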


Examples of Markov’s Inequality

Example 1
  • Let X \sim B(n, p).

  • Apply Markov's inequality to find an upper bound on P(X \geq \alpha n), where p < \alpha < 1.

  • Solution:

    • E(X) = np.

    • P(X \geq \alpha n) \leq \frac{np}{\alpha n} = \frac{p}{\alpha}.

    • With p = \frac{1}{2} and \alpha = \frac{3}{4}, it follows that
      P\left(X \geq \frac{3n}{4}\right) \leq \frac{2}{3}.

Example 2
  • Let X \sim E(\lambda), the exponential distribution.

  • Determine an upper bound for P(X \geq a), where a > 0:

    • The p.d.f. of X is given by:
      f(x) = \lambda e^{-\lambda x} \text{ for } x \geq 0.

    • E(X) = \frac{1}{\lambda}.

    • Thus, using Markov's inequality:
      P(X \geq a) \leq \frac{1/\lambda}{a} = \frac{1}{\lambda a}.

  • The actual value is given by:
    P(X \geq a) = \int_{a}^{\infty} \lambda e^{-\lambda x}dx = e^{-\lambda a}.
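Example 2 can be checked directly. A short Python sketch (λ = 1 and a = 3 are illustrative choices) compares Markov's bound with the exact tail:

```python
import math

# Markov's bound vs. the exact tail probability for X ~ Exp(lambda).
# lam and a are illustrative choices, not fixed by the example.
lam, a = 1.0, 3.0
markov_bound = 1.0 / (lam * a)   # E(X)/a = 1/(lambda * a)
exact = math.exp(-lam * a)       # P(X >= a) = e^{-lambda a}

assert exact <= markov_bound     # the bound holds, but is conservative
```

Here the exact tail is about 0.0498 against a bound of about 0.333, again showing how loose Markov's inequality can be.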


Chebyshev’s Inequality

Definition
  • Chebyshev’s Inequality:

    • If X is a random variable with

    • mean E(X) = \mu and

    • variance Var(X) = \sigma^2,

    • then for any a > 0:
      P(|X - \mu| \geq a) \leq \frac{\sigma^2}{a^2},

    • or equivalently:
      P(|X - \mu| < a) \geq 1 - \frac{\sigma^2}{a^2}.

Explanation
  • Chebyshev’s Inequality states that the probability that X deviates from its mean E(X) by at least a given amount is bounded above in terms of the variance of X.

Alternative Forms
  • If we let a = k\sigma where k > 0, then:

    • P(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}.

    • Equivalently:
      P(|X - \mu| < k\sigma) \geq 1 - \frac{1}{k^2}.

Example 3
  1. Show by Chebyshev’s inequality that, for 2000 throws of a fair coin, the probability that the number of heads lies between 900 and 1100 is at least \frac{19}{20}:

    • Let X denote the number of heads in 2000 throws of the coin. Then

    • E(X) = 2000 \times \frac{1}{2} = 1000,

    • Var(X) = 2000 \times \frac{1}{2} \times \frac{1}{2} = 500.

    • Therefore,
      P(900 < X < 1100) = P(-100 < X - 1000 < 100) = 1 - P(|X - 1000| \geq 100).

    • By Chebyshev's inequality:
      P(|X - 1000| \geq 100) \leq \frac{500}{100^2} = \frac{1}{20}.

    • Thus,
      P(900 < X < 1100) \geq 1 - \frac{1}{20} = \frac{19}{20}.
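The arithmetic in Example 3 can be restated as a few lines of Python:

```python
# Chebyshev bound for 2000 throws of a fair coin (Example 3).
n, p = 2000, 0.5
mean = n * p                # E(X) = 1000
var = n * p * (1 - p)       # Var(X) = 500
a = 100
tail_bound = var / a**2     # P(|X - 1000| >= 100) <= 1/20

assert tail_bound == 0.05
assert abs((1 - tail_bound) - 0.95) < 1e-12   # P(900 < X < 1100) >= 19/20
```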

Example 4
  • For a random variable X with density function: f(x) = \begin{cases} e^{-x}, & x \geq 0 \\ 0, & \text{elsewhere} \end{cases}.

    • Show that Chebyshev's inequality gives:

    • P(|X - 1| \geq 2) \leq \frac{1}{4}, while the actual probability is e^{-3}:

  1. Calculate the mean:

    • \mu = E(X) = \int_{0}^{\infty} xe^{-x}dx = 1.

  2. Calculate the second moment:

    • E(X^2) = \int_{0}^{\infty} x^2 e^{-x}dx = 2.

  3. Therefore,

    • Var(X) = E(X^2) - \mu^2 = 2 - 1 = 1.

  4. Using Chebyshev’s inequality for any a > 0:

    • P(|X - \mu| \geq a) \leq \frac{1}{a^2}.

  5. Taking a = 2:

    • P(|X - 1| \geq 2) \leq \frac{1}{4}.

  6. Calculating the actual probability:

  6. Calculating the actual probability:


    • P(|X - 1| \geq 2) = 1 - P(-1 < X < 3) = 1 - \int_{0}^{3} e^{-x}dx = 1 - (1 - e^{-3}) = e^{-3}.
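A quick numeric check of Example 4 (the integral is already evaluated in closed form above, so only the final comparison is coded):

```python
import math

# Example 4: X has density e^{-x} on [0, inf), so mu = 1, sigma^2 = 1.
chebyshev_bound = 1 / 2**2     # sigma^2 / a^2 with a = 2
exact = math.exp(-3)           # 1 - (1 - e^{-3}) = e^{-3}

assert chebyshev_bound == 0.25
assert exact < chebyshev_bound  # the inequality is far from tight here
```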


Additional Concepts

Laws of Large Numbers

Convergence in Probability

  • A sequence of random variables X_1, X_2, \ldots, X_n converges in probability to \alpha if:

    • for every \epsilon > 0,
      \lim_{n \to \infty} P(|X_n - \alpha| < \epsilon) = 1,

    • or, equivalently,
      \lim_{n \to \infty} P(|X_n - \alpha| \geq \epsilon) = 0.

Weak Law of Large Numbers (WLLN)

  • Let X_1, X_2, \ldots, X_n be a sequence of random variables with respective means \mu_1, \mu_2, \ldots, \mu_n.

  • Let \bar{X}_n = \frac{X_1 + X_2 + \ldots + X_n}{n} and \bar{\mu}_n = \frac{\mu_1 + \mu_2 + \ldots + \mu_n}{n}.

    • Then, for every \epsilon > 0:

    • P(|\bar{X}_n - \bar{\mu}_n| \geq \epsilon) \rightarrow 0 \text{ as } n \to \infty,

    • provided:

    • \lim_{n \to \infty} \frac{B_n}{n^2} = 0, where B_n = Var(X_1 + X_2 + \ldots + X_n) < \infty.

Conditions for WLLN

  • To verify WLLN holds:
    (i) E(X_i) exists for all i;
    (ii) B_n = Var(X_1 + \ldots + X_n) exists; and

    (iii) \lim_{n \to \infty} \frac{B_n}{n^2} = 0.

Example 9

  • Check if the WLLN holds for the sequence \{X_k\} defined by
    P(X_k = \pm 2^k) = 2^{-(2k+1)}; \quad P(X_k = 0) = 1 - 2^{-2k}.

  • Calculate:

    • E(X_k) = \sum x\,P(X_k = x) = 0,

    • Var(X_k) = E(X_k^2) = 1,

  • Thus,

  • B_n = Var(X_1 + … + X_n) = n < \infty .

  • Therefore,

    • \lim_{n \to \infty} \frac{B_n}{n^2} = \lim_{n \to \infty} \frac{1}{n} = 0, so the WLLN holds.
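The moment computations in Example 9 can be verified mechanically for several values of k:

```python
# Example 9: X_k takes the values +-2^k with probability 2^{-(2k+1)}
# each, and 0 with probability 1 - 2^{-2k}.
for k in range(1, 10):
    p = 2.0 ** -(2 * k + 1)
    dist = [(2**k, p), (-(2**k), p), (0, 1 - 2 * p)]
    mean = sum(v * q for v, q in dist)
    var = sum(v**2 * q for v, q in dist) - mean**2
    assert mean == 0    # E(X_k) = 0
    assert var == 1.0   # Var(X_k) = 1, so B_n = n and B_n/n^2 -> 0
```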


Central Limit Theorem (CLT)

Definition
  • The Central Limit Theorem states:

    • The normal distribution is the limiting distribution of the suitably standardized sum of independent random variables with finite variance.

  • For n \geq 1, let X_1, X_2, \ldots, X_n be independent, identically distributed random variables with finite mean \mu and variance \sigma^2, and set:

    • S_n = X_1 + X_2 + \ldots + X_n,

    • \bar{X}_n = \frac{S_n}{n}.

  • As n \to \infty:
    (a)
    P\left( \frac{S_n - n\mu}{\sqrt{n}\,\sigma} \leq x \right) \rightarrow \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{t^2}{2}} dt, \text{ for all } x \in \mathbb{R};
    (b)
    P\left( \frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma} \leq x \right) \rightarrow \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{t^2}{2}} dt, \text{ for all } x \in \mathbb{R}.

  • In summary:

    • For large n:

    • S_n \text{ approximately follows } N(n\mu, n\sigma^2),

    • \bar{X}_n \text{ approximately follows } N\left(\mu, \frac{\sigma^2}{n}\right).

Remark on CLT Usage
  1. For i.i.d. random variables, if n is large enough,

    • S_n approximately follows a normal distribution with mean n\mu and variance n\sigma^2;

    • \bar{X}_n approximately follows a normal distribution with mean \mu and variance \frac{\sigma^2}{n}.
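This remark can be illustrated with a short simulation. The sketch below uses Uniform(0, 1) summands (so μ = 1/2, σ² = 1/12) purely as an illustrative choice, and compares the empirical distribution of the standardized sum against the standard normal CDF:

```python
import math
import random

# CLT sketch: the standardized sum of n i.i.d. Uniform(0,1) variables
# should be approximately N(0, 1) for moderately large n.
random.seed(1)
n, trials = 50, 20_000
mu, sigma = 0.5, math.sqrt(1 / 12)

z = [(sum(random.random() for _ in range(n)) - n * mu) / (math.sqrt(n) * sigma)
     for _ in range(trials)]

# Compare the empirical CDF at x = 1 with Phi(1) ~= 0.8413.
frac = sum(v <= 1 for v in z) / trials
assert abs(frac - 0.8413) < 0.02
```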

Examples of CLT

Example 12

  • For a coin tossed 200 times, find the probability of number of heads between 80 and 120:

  1. Let X = number of heads. Then X \sim B(200, \frac{1}{2}).

  2. Thus,

    • Mean:
      \mu = np = 200 \times \frac{1}{2} = 100,

    • Variance:
      \sigma^2 = np(1-p) = 200 \times \frac{1}{2} \times \frac{1}{2} = 50.

  3. Apply the CLT:

    • Z = \frac{X - 100}{\sqrt{50}} approximately follows the standard normal distribution.

  4. Therefore, the probability is:
    P(80 < X < 120) = P\left(-2.82 < Z < 2.82\right) = 0.9952 .
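The normal-approximation step of Example 12 can be reproduced with `math.erf`, using Φ(x) = (1 + erf(x/√2))/2 for the standard normal CDF:

```python
import math

# Example 12: X ~ B(200, 1/2); Z = (X - 100)/sqrt(50) is roughly N(0, 1).
def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

z = (120 - 100) / math.sqrt(50)   # ~2.83
prob = phi(z) - phi(-z)           # P(80 < X < 120)

assert abs(prob - 0.9952) < 0.001
```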

Example 13

  • Given 75 Poisson variates with parameter 2, estimate P(120 < S_n < 160) :

  1. For X_i \sim \text{Poisson}(2),

    • E(X_i) = 2, Var(X_i) = 2.

  2. By the CLT:

    • S_n \text{ approximately follows } N(150, 150).

  3. Standardizing gives
    P(120 < S_n < 160) = P(-2.4495 \leq Z \leq 0.8165) = 0.7868.

Example 14
  • For i.i.d. random variables with
    \mu = 3 \text{ and } Var(X_i) = \frac{1}{2}, estimate P(340 \leq S_n \leq 370) for n = 120:

  1. By the CLT,

    • S_n \sim N(360, 60) approximately.

  2. Proceeding as before with the standard normal CDF:

    • P(340 \leq S_n \leq 370) = P(-2.582 \leq Z \leq 1.291) = 0.8966.

Example 15
  • For the lifetime of bulbs with

    • E(X_i) = 1200 \text{ hrs}, \sigma = 250 \text{ hrs}, find:

    • the probability that the average lifetime of 60 bulbs exceeds 1250 hrs:

  1. By the CLT, \bar{X} \sim N\left(1200, \frac{250^2}{60}\right) approximately.

  2. Required probability:

    • P(\bar{X} > 1250) = P\left( Z > \frac{1250 - 1200}{250/\sqrt{60}} \right) = P(Z > 1.55) = 0.0606.
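Example 15's calculation in Python, using `math.erf` for the standard normal CDF:

```python
import math

# Example 15: X-bar ~ N(1200, 250^2/60) approximately, for 60 bulbs.
def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

z = (1250 - 1200) / (250 / math.sqrt(60))   # ~1.55
prob = 1 - phi(z)                           # P(X-bar > 1250)

assert abs(prob - 0.0606) < 0.001
```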


Strong Law of Large Numbers

  • For independent, identically distributed random variables with a finite expected value E(X_i) = \mu < \infty,

    • it holds with probability 1 that \bar{X}_n \rightarrow \mu as n \rightarrow \infty,

    • i.e. P\left( \lim_{n \to \infty} \bar{X}_n = \mu \right) = 1.

One-sided Chebyshev’s Inequality

Definition
  • For a random variable X with mean 0 and finite variance \sigma^2,

    • for any c > 0:

    • P(X \geq c) \leq \frac{\sigma^2}{\sigma^2 + c^2}.

Example 16
  • Given items produced with mean 100 and variance 400, find an upper bound for P(X \geq 120):

  1. Applying the one-sided Chebyshev inequality to X - 100, which has mean 0 and variance 400:

    • P(X \geq 120) = P(X - 100 \geq 20) \leq \frac{400}{400 + 20^2} = \frac{1}{2}.

    • From Markov’s inequality, P(X \geq 120) \leq \frac{100}{120} = \frac{5}{6}.
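Example 16 as arithmetic, comparing the two bounds:

```python
# Example 16: mean 100, variance 400; bound P(X >= 120).
mean, var = 100, 400
c = 120 - mean                    # deviation above the mean
one_sided = var / (var + c**2)    # one-sided Chebyshev: 1/2
markov = mean / 120               # Markov: 5/6

assert one_sided == 0.5
assert abs(markov - 5 / 6) < 1e-12
assert one_sided < markov         # the one-sided bound is tighter here
```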


Cauchy-Schwarz Inequality

Definition
  • If X and Y are two random variables such that E(X^2), E(Y^2), and E(XY) exist, then:

    • \{E(XY)\}^2 \leq E(X^2) E(Y^2).

    • Equality holds if and only if:

    • E(X^2) = 0, or P(Y - aX = 0) = 1 for some constant a.

Example 17
  • Utilize the Cauchy-Schwarz inequality to prove:

    • |\rho(X, Y)| \leq 1, where \rho is the correlation coefficient.
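The claim |ρ| ≤ 1 can be spot-checked empirically; the sample data below (a noisy linear relation) is an arbitrary illustrative choice:

```python
import math
import random

# Empirical check that |rho(X, Y)| <= 1, as Cauchy-Schwarz guarantees.
random.seed(3)
xs = [random.gauss(0, 1) for _ in range(10_000)]
ys = [2 * x + random.gauss(0, 1) for x in xs]   # illustrative dependence

mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
vx = sum((x - mx) ** 2 for x in xs) / len(xs)
vy = sum((y - my) ** 2 for y in ys) / len(ys)
rho = cov / math.sqrt(vx * vy)

assert abs(rho) <= 1     # here rho is near 2/sqrt(5) ~ 0.894
```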

Integrable Random Variables

Definition
  • A random variable X is integrable if its expectation exists and is finite, i.e., E(|X|) < \infty.

Convex Function

Definition
  • A twice-differentiable function g(x)g(x) is convex if the second derivative exists and is non-negative within its domain, i.e., g(x)0g''(x) ≥ 0.

Jensen’s Inequality
  • Jensen’s Inequality:

    • Let X be an integrable random variable and g : \mathbb{R} \rightarrow \mathbb{R} a convex function such that g(X) is integrable.

    • Then:
      E\{g(X)\} \geq g(E(X)).

Example 18
  • Let X be a random variable taking values in (-1, \infty) with mean 10. Find a lower bound for:

    • E\left( \frac{1}{X + 1} \right).

  1. Let g(x) = \frac{1}{x + 1}.

  2. The second derivative g''(x) = \frac{2}{(x+1)^3} > 0 for x > -1, so g is convex there.

  3. Therefore, by Jensen's inequality:
    E\left( \frac{1}{X + 1} \right) \geq \frac{1}{E(X) + 1} = \frac{1}{11}.
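A Monte Carlo check of Example 18; an Exp(rate 1/10) variable is just one illustrative non-negative distribution with mean 10:

```python
import random

# Jensen's inequality: E[1/(X+1)] >= 1/(E[X]+1) = 1/11 for X >= 0
# with mean 10. X ~ Exp(rate 0.1) is one such distribution.
random.seed(2)
xs = [random.expovariate(0.1) for _ in range(200_000)]
lhs = sum(1 / (x + 1) for x in xs) / len(xs)

assert lhs >= 1 / 11   # the empirical mean sits well above the bound
```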


Moment Generating Function

Definition
  • The moment generating function (M.G.F.) of a random variable X is defined as:
    M_X(s) = E(e^{sX}).

Chernoff Bounds

Definition
  • For any random variable X and real number a,

    • We can express:
      P(X \geq a) = P(e^{tX} \geq e^{ta}), \text{ for } t > 0;
      P(X \leq a) = P(e^{tX} \geq e^{ta}), \text{ for } t < 0.

  • Apply Markov’s inequality:

    • For t > 0:
      P(X \geq a) \leq E(e^{tX}) e^{-ta}.

  • Similarly, for t < 0:
    P(X \leq a) \leq E(e^{tX}) e^{-ta}.

Conclusion
  • Thus, we establish:
    P(X \geq a) \leq \min_{t>0} \left\{ e^{-ta} M_X(t) \right\}, \quad P(X \leq a) \leq \min_{t<0} \left\{ e^{-ta} M_X(t) \right\}.

Example 19
  • For a standard normal variable Z, the M.G.F. is:
    M_Z(t) = e^{\frac{t^2}{2}}.

  • Estimate P(Z \geq 2) using Chernoff bounds:

  1. P(Z \geq 2) \leq \min_{t>0} \left\{ e^{-2t} M_Z(t) \right\}.

  2. Minimize the function:
    e^{\frac{t^2}{2} - 2t}.

  3. Setting the derivative of the exponent to zero gives t - 2 = 0, so t = 2.

  4. The best upper bound is P(Z \geq 2) \leq e^{-2} \approx 0.1353.

  • Comparing with one-sided Chebyshev's inequality:
    P(Z \geq 2) \leq \frac{1}{1 + 2^2} = \frac{1}{5}.
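The minimization in Example 19 can be done numerically on a grid to confirm that t = 2 attains the optimum e^{-2}:

```python
import math

# Chernoff bound for P(Z >= 2): minimize e^{t^2/2 - 2t} over t > 0.
# The exponent t^2/2 - 2t is minimized where t - 2 = 0, i.e. t = 2.
grid = [i / 100 for i in range(1, 500)]   # t in (0, 5)
bound = min(math.exp(t**2 / 2 - 2 * t) for t in grid)

assert abs(bound - math.exp(-2)) < 1e-4   # e^{-2} ~ 0.1353
assert bound < 1 / 5                      # beats one-sided Chebyshev
```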

Example 20
  • For a Poisson variate X with parameter \lambda = 2:

  1. Estimate P(X \geq 3) using Chernoff bounds:
    P(X \geq 3) \leq \min_{t>0} \left\{ e^{-3t} M_X(t) \right\}, \text{ where } M_X(t) = e^{\lambda(e^t - 1)} = e^{2(e^t - 1)}.

  2. Minimize e^{2(e^t - 1) - 3t}: setting the derivative of the exponent to zero gives 2e^t - 3 = 0, i.e. t = \ln\frac{3}{2} > 0, so
    P(X \geq 3) \leq e^{2(\frac{3}{2} - 1)} \left(\frac{3}{2}\right)^{-3} = \frac{8e}{27} \approx 0.8055.

  3. For comparison, Markov's inequality gives P(X \geq 3) \leq \frac{E(X)}{3} = \frac{2}{3}, which happens to be tighter here.


Example 5

  • Let X be a random variable with E(X) = 10.

  • Find an upper bound for P(X \geq 20):

    • Using Markov's inequality:
      P(X \geq 20) \leq \frac{E(X)}{20} = \frac{10}{20} = \frac{1}{2}.

Example 6

  • Let Y be a random variable that takes the values 0 and 3 with probabilities 0.5 and 0.5, respectively.

  • Calculate an upper bound for P(Y = 3):

    • Here, E(Y) = 0.5 \times 0 + 0.5 \times 3 = 1.5.

    • So, applying Markov's inequality:
      P(Y \geq 3) \leq \frac{1.5}{3} = 0.5.

Example 7

  • Assume Z follows a Uniform distribution on [0, 10].

  • Thus, E(Z) = \frac{0 + 10}{2} = 5.

  • Find an upper bound for P(Z > 8):

    • Using Markov's inequality,
      P(Z > 8) \leq \frac{5}{8}.

Example 8

  • For a random variable W with expected value E(W) = 4, find an upper bound for P(W > 5):

    • Applying Markov's inequality gives:
      P(W > 5) \leq \frac{E(W)}{5} = \frac{4}{5}.


Thank You
  • Session concluded with acknowledgments from

Department of Mathematics, SRM Institute of Science and Technology, Kattankulathur-603203.