Comprehensive Study Notes: Stochastic Processes and the Poisson Process

Chapter 4: The Poisson Process

The Poisson process is considered one of the most fundamental models in probability theory and stochastic processes. It provides a mathematical description for the occurrence of random events over time. Common examples include:

  • The arrival of customers at a service desk.

  • The emission of particles from a radioactive source.

  • Phone calls incoming to a call centre.

Two central ingredients underlie the Poisson process:

  • The number of arrivals in a given time interval follows a Poisson distribution.

  • The waiting time between successive arrivals is exponentially distributed.

4.1 The Exponential Distribution

The exponential distribution is the canonical model for the waiting time until the first occurrence of a random event that happens at a constant average rate.

Definition 4.1.1

A random variable X:ΩRX : \Omega \rightarrow \mathbb{R} is said to follow an exponential distribution with parameter \lambda > 0 if:

P(X[a,b])=abλeλβdβP(X \in [a, b]) = \int_a^b \lambda e^{-\lambda\beta} \, d\beta

for all 0ab0 \leq a \leq b. Equivalently, XX has the probability density function:

fX(x)=λeλxf_X(x) = \lambda e^{-\lambda x}

for all x0x \geq 0. The notation used is Xexp(λ)X \sim \exp(\lambda).

Lemma 4.1.2: Properties of the Exponential Distribution

Let XX be exponentially distributed with parameter \lambda > 0. Then:

  • (a) Cumulative Distribution Function (CDF):   FX(β)={1eλβamp;if β00amp;elseF_X(\beta) = \begin{cases} 1 - e^{-\lambda\beta} & \text{if } \beta \geq 0 \\ 0 & \text{else} \end{cases}

  • (b) Expectation and Variance:   E[X]=1λE[X] = \frac{1}{\lambda}   Var(X)=1λ2\text{Var}(X) = \frac{1}{\lambda^2}

  • (c) Moment Generating Function (MGF):   GX(β):=E[eβX]=λλβG_X(\beta) := E[e^{\beta X}] = \frac{\lambda}{\lambda - \beta}   for all \beta < \lambda.

Proof of Lemma 4.1.2:

  • (a) For β0\beta \geq 0:   FX(β)=0βfX(α)dα=0βλeλαdα=1eλβF_X(\beta) = \int_0^\beta f_X(\alpha) \, d\alpha = \int_0^\beta \lambda e^{-\lambda\alpha} \, d\alpha = 1 - e^{-\lambda\beta}.

  • (b) Integrating by parts with u=βu = \beta, dv=λeλβdβdv = \lambda e^{-\lambda\beta} d\beta:   E[X]=0βλeλβdβ=[βeλβ]0+0eλβdβ=1λE[X] = \int_0^\infty \beta \lambda e^{-\lambda\beta} \, d\beta = [-\beta e^{-\lambda\beta}]_0^\infty + \int_0^\infty e^{-\lambda\beta} \, d\beta = \frac{1}{\lambda}.   Similarly, E[X2]=0β2λeλβdβ=2λ2E[X^2] = \int_0^\infty \beta^2 \lambda e^{-\lambda\beta} \, d\beta = \frac{2}{\lambda^2}. Thus, Var(X)=2λ2(1λ)2=1λ2\text{Var}(X) = \frac{2}{\lambda^2} - (\frac{1}{\lambda})^2 = \frac{1}{\lambda^2}.

  • (c) For \beta < \lambda:   GX(β)=0eβαλeλαdα=λ0e(λβ)αdα=λλβG_X(\beta) = \int_0^\infty e^{\beta \alpha} \lambda e^{-\lambda \alpha} \, d\alpha = \lambda \int_0^\infty e^{-(\lambda-\beta)\alpha} \, d\alpha = \frac{\lambda}{\lambda - \beta}.

Theorem 4.1.3: Lack of Memory Property

An exponential random variable has a unique feature known as the lack-of-memory (or memoryless) property. Informally: "If we have already waited tt units of time, the probability of having to wait at least ss further units is the same as if no time had elapsed."

If Xexp(λ)X \sim \exp(\lambda), then for all s,t0s, t \geq 0:

P(X > t + s | X > t) = P(X > s)

Proof: By Lemma 4.1.2, P(X > t) = 1 - F_X(t) = e^{-\lambda t}. Thus:

P(X > t + s | X > t) = \frac{P(X > t + s)}{P(X > t)} = \frac{e^{-\lambda(t+s)}}{e^{-\lambda t}} = e^{-\lambda s} = P(X > s).

Among all continuous probability distributions with a density, the exponential is the only one possessing this property. In the discrete setting, the geometric distribution is the unique distribution with this property.

Examples of the Exponential Distribution
  • Example 4.1.4 (Bank Session): Let XX denote the time (in minutes) a customer spends in a bank, assumed exponential with mean 10 minutes (implying λ=110\lambda = \frac{1}{10}).

    • Probability spending more than 15 minutes: P(X > 15) = e^{-\lambda(15)} = e^{-3/2} \approx 0.220.

    • Probability spending more than 15 minutes given the customer is still there after 10 minutes: P(X > 15 | X > 10) = P(X > 5) = e^{-\lambda(5)} = e^{-1/2} \approx 0.604.

  • Example 4.1.5 (Lightbulb Lifetime): Lifetime XX (in hours) is exponential with mean 10 (Xexp(1/10)X \sim \exp(1/10)). You find the bulb working. The probability it lasts another 5 hours is P(X > t + 5 | X > t) = P(X > 5) = e^{-5/10}. If XX were not exponential, the probability would depend on tt: 1FX(t+5)1FX(t)\frac{1 - F_X(t + 5)}{1 - F_X(t)}.

Mathematical Properties of Exponential Variables
  • Proposition 4.1.6 (Sum of Exponentials): Let X1,,XnX_1, \dots, X_n be independent exp(λ)\exp(\lambda) random variables. Then Sn=X1++XnS_n = X_1 + \dots + X_n follows a Gamma distribution with shape parameter nn and rate parameter λ\lambda. The density is:   fSn(β)=λn(n1)!βn1eλβf_{S_n}(\beta) = \frac{\lambda^n}{(n-1)!} \beta^{n-1} e^{-\lambda \beta}

  • Proposition 4.1.7 (Minimum of Exponentials): Let X1,,XnX_1, \dots, X_n be independent exp(λk)\exp(\lambda_k) variables. Then M=min(X1,,Xn)M = \min(X_1, \dots, X_n) is exponentially distributed with parameter k=1nλk\sum_{k=1}^n \lambda_k. Interpretation: If nn independent exponential clocks start with rates λk\lambda_k, the first clock to ring occurs after an exponential time with the sum of the rates.

  • Proposition 4.1.8 (Competition of Exponentials): Let Xexp(λ)X \sim \exp(\lambda) and Yexp(μ)Y \sim \exp(\mu) be independent. Then:   P(X < Y) = \frac{\lambda}{\lambda + \mu}   Proof: Using joint density fX,Y(α,β)=λeλαμeμβf_{X,Y}(\alpha, \beta) = \lambda e^{-\lambda \alpha} \mu e^{-\mu \beta}, we compute:   P(X < Y) = \int_0^\infty \int_\alpha^\infty \lambda e^{-\lambda \alpha} \mu e^{-\mu \beta} \, d\beta \, d\alpha = \int_0^\infty \lambda e^{-\lambda \alpha} e^{-\mu \alpha} \, d\alpha = \frac{\lambda}{\lambda + \mu}.

Application Examples
  • Example 4.1.9 (Post Office): Two clerks are busy, service times are exp(λ1)\exp(\lambda_1) and exp(λ2)\exp(\lambda_2). You are served when one becomes available. Total time T=min(R1,R2)+ST = \min(R_1, R_2) + S, where RiR_i is remaining service time and SS is your own service time. Due to memorylessness, Riexp(λi)R_i \sim \exp(\lambda_i). Expected time:   E[T]=E[min(R1,R2)]+E[S]=1λ1+λ2+2λ1+λ2=3λ1+λ2E[T] = E[\min(R_1, R_2)] + E[S] = \frac{1}{\lambda_1 + \lambda_2} + \frac{2}{\lambda_1 + \lambda_2} = \frac{3}{\lambda_1 + \lambda_2}.

  • Example 4.1.10 (Competing Tasks): Check-up (mean 20 mins, λ1=1/20\lambda_1 = 1/20) and Consultation (mean 30 mins, λ2=1/30\lambda_2 = 1/30) start at the same time.

    • Probability check-up finishes first: P(R_1 < R_2) = \frac{1/20}{1/20 + 1/30} = \frac{3}{5}.

    • Expected time until both finish (T=max(R1,R2)T = \max(R_1, R_2)): E[T] = E[\min(R_1, R_2)] + P(R_1 < R_2)E[R_2] + P(R_2 < R_1)E[R_1].   E[T]=12+35(30)+25(20)=38 minutesE[T] = 12 + \frac{3}{5}(30) + \frac{2}{5}(20) = 38 \text{ minutes}.

4.2 The Poisson Distribution

The Poisson distribution models the number of events in a fixed interval of time or space occurring independently at a constant average rate. It is often described as the "law of rare events."

Definition 4.2.1

A random variable XX is Poisson distributed, written XPoi(λ)X \sim \text{Poi}(\lambda), if:

P(X=k)=eλλkk!P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}

for all kZ0k \in \mathbb{Z}_{\geq 0}. The parameter λ\lambda represents the expected number of occurrences.

Proposition 4.2.2: Moments of Poisson

For XPoi(λ)X \sim \text{Poi}(\lambda), \lambda > 0:

  • (a) Factorial Moments: E[X(X1)(Xk+1)]=λkE[X(X - 1) \dots (X - k + 1)] = \lambda^k. Specifically, E[X]=λE[X] = \lambda.

  • (b) Variance: Var(X)=λ\text{Var}(X) = \lambda.

Proof of (b): Since E[X(X1)]=λ2E[X(X-1)] = \lambda^2, we have: Var(X)=E[X(X1)]+E[X](E[X])2=λ2+λλ2=λ\text{Var}(X) = E[X(X-1)] + E[X] - (E[X])^2 = \lambda^2 + \lambda - \lambda^2 = \lambda.

Proposition 4.2.3: Sum of Independent Poisson

If XPoi(λ)X \sim \text{Poi}(\lambda) and YPoi(ρ)Y \sim \text{Poi}(\rho) are independent, then X+YPoi(λ+ρ)X + Y \sim \text{Poi}(\lambda + \rho).

Proposition 4.2.4: Poisson as a Limit of Binomial

Let XnX_n be Binomially distributed with parameter nn and pn=λ/np_n = \lambda/n. As nn \rightarrow \infty:

limnP(Xn=k)=eλλkk!\lim_{n \rightarrow \infty} P(X_n = k) = \frac{e^{-\lambda} \lambda^k}{k!}

This reveals that the Poisson distribution is appropriate for modeling many independent trials with small success probabilities (pp) but a fixed expected number of successes (np=λnp = \lambda).

Example 4.2.5 (Approximation): For n=1000n = 1000, pn=0.002p_n = 0.002 (λ=2\lambda = 2). Comparing XnBin(1000,0.002)X_n \sim \text{Bin}(1000, 0.002) and YPoi(2)Y \sim \text{Poi}(2), the probabilities are nearly identical (absolute differences are 104\approx 10^{-4} to 00).

4.3 Poisson Process

A counting process (N(t):t0)(N(t) : t \geq 0) models the arrival of events over time.

Definition 4.3.2: Counting Process

A stochastic process is a counting process if:

  • (a) N(0)=0N(0) = 0

  • (b) N(t)0N(t) \geq 0 for all t0t \geq 0

  • (c) N(s)N(t)N(s) \leq N(t) whenever sts \leq t

Increment Properties
  • Independent Increments: Numbers of events in disjoint time intervals are independent random variables. (e.g., goals in the first half vs. second half).

  • Stationary Increments: The distribution of the number of events in an interval depends only on the length of the interval, not its position. (e.g., arrivals between 12-1 PM must be same as 3-4 PM).

Definition 4.3.4: Poisson Process

A counting process NN is a Poisson process with intensity \lambda > 0 if:

  • (a) N(0)=0N(0) = 0

  • (b) NN has independent and stationary increments.

  • (c) For every t0t \geq 0, N(t)Poi(λt)N(t) \sim \text{Poi}(\lambda t), specifically: P(N(t)=k)=eλt(λt)kk!P(N(t) = k) = \frac{e^{-\lambda t} (\lambda t)^k}{k!}.

The parameter λ\lambda represents the expected number of arrivals per unit time (E[N(t)]=λtE[N(t)] = \lambda t).

Arrival and Waiting Times
  • Definition 4.3.5:

    • Tk=inf{t0:N(t)=k}T_k = \inf\{t \geq 0 : N(t) = k\} is the kk-th arrival time.

    • τk=TkTk1\tau_k = T_k - T_{k-1} is the kk-th waiting time (T0:=0T_0 := 0).

  • Theorem 4.3.6:

    • (a) Waiting times τ1,τ2,\tau_1, \tau_2, \dots are i.i.d. exp(λ)\exp(\lambda).

    • (b) Arrival time TkT_k follows a Gamma distribution with parameters (k,λ)(k, \lambda), with density: fTk(β)=λk(k1)!βk1eλβf_{T_k}(\beta) = \frac{\lambda^k}{(k-1)!} \beta^{k-1} e^{-\lambda \beta}.

Theorem 4.3.9: Equivalent Characterizations

A counting process NN is a Poisson process with intensity λ\lambda if and only if:

  1. It has independent and stationary increments, and for small hh, P(N(h)=1)=λh+o(h)P(N(h) = 1) = \lambda h + o(h) and P(N(h)2)=o(h)P(N(h) \geq 2) = o(h).

  2. It results from independent exponentially distributed waiting times τk\tau_k.

Note on Little-o Notation (Definition 4.3.7): f=o(g)f = o(g) as β0\beta \rightarrow 0 if limβ0f(β)g(β)=0\lim_{\beta \rightarrow 0} \frac{f(\beta)}{g(\beta)} = 0.

  • Example: eh=1+h+o(h)e^h = 1 + h + o(h).

  • This allows defining the Poisson process by its infinitesimal behavior over very short intervals.

4.4 Compound Poisson Processes

A compound Poisson process attaches a random quantity (jump size) to each arrival of a Poisson process.

Definition 4.4.2

Let N(t)N(t) be a Poisson process with intensity λ\lambda and (Yk)k=1(Y_k)_{k=1}^\infty be an i.i.d. sequence of random variables independent of NN. The compound Poisson process S(t)S(t) is:

S(t)=k=1N(t)YkS(t) = \sum_{k=1}^{N(t)} Y_k

(If N(t)=0N(t)=0, then S(t)=0S(t)=0).

Theorem 4.4.4: Wald's Identity

If E[Y_1^2] < \infty, then for all t0t \geq 0:

  • E[S(t)]=λtE[Y1]E[S(t)] = \lambda t E[Y_1]

  • Var(S(t))=λtE[Y12]\text{Var}(S(t)) = \lambda t E[Y_1^2]

Example 4.4.5 (Marine Ecologist): Crabs emerge at scale λ=3\lambda = 3 per hour. Weights YkY_k have mean 4 lbs and standard deviation 2 lbs. Over 2 hours:

  • E[S(2)]=(3)(2)(4)=24 lbsE[S(2)] = (3)(2)(4) = 24 \text{ lbs}.

  • Var(S(2))=(3)(2)E[Y12]=(6)(Var(Y1)+E[Y1]2)=6(22+42)=6(20)=120 lbs2\text{Var}(S(2)) = (3)(2) E[Y_1^2] = (6)( \text{Var}(Y_1) + E[Y_1]^2 ) = 6(2^2 + 4^2) = 6(20) = 120 \text{ lbs}^2.

  • Standard deviation: 120\sqrt{120}.

4.5 Combining and Thinning

Theorem 4.5.1: Thinning (Splitting)

If each arrival of a Poisson process with rate λ\lambda is classified as type j{1,,M}j \in \{1, \dots, M\} with probability pjp_j, then the processes Nj(t)N_j(t) counting type-jj events are independent Poisson processes with intensities pjλp_j \lambda.

Example 4.5.2 (Fisherman): Fish bite at rate 2 per hour. 40% are salmon, 60% trout. Salmon process rate is (0.4)(2)=0.8(0.4)(2) = 0.8. Trout process rate is (0.6)(2)=1.2(0.6)(2) = 1.2. Probability of 1 salmon and 2 trout in 2.5 hours: P(Ns(2.5)=1)P(Nt(2.5)=2)=e2211!e3322!=9e5P(N_s(2.5)=1) \cdot P(N_t(2.5)=2) = \frac{e^{-2} 2^1}{1!} \cdot \frac{e^{-3} 3^2}{2!} = 9 e^{-5}.

Theorem 4.5.3: Superposition

The sum of mm independent Poisson processes with intensities λ1,,λm\lambda_1, \dots, \lambda_m is a Poisson process with intensity λ=k=1mλk\lambda = \sum_{k=1}^m \lambda_k.

Example 4.5.4 (Bus Stop): Routes A (λA\lambda_A) and B (λB\lambda_B).

  • Overall arrival rate: λA+λB\lambda_A + \lambda_B.

  • Expected wait for any bus: 1λA+λB\frac{1}{\lambda_A + \lambda_B}.

  • Probability next bus is A: λAλA+λB\frac{\lambda_A}{\lambda_A + \lambda_B}.

  • Expected A buses before the first B bus: Let p=λA/(λA+λB)p = \lambda_A/(\lambda_A + \lambda_B). The number of A labels before the first B is geometric with parameter (1p)(1-p), yielding mean p1p=λAλB\frac{p}{1-p} = \frac{\lambda_A}{\lambda_B}.