Poisson Processes

Poisson processes are used to model natural phenomena while remaining amenable to analytical results. They are mathematically simple and provide a useful foundation for understanding more general processes.

The Poisson Distribution

The Poisson distribution with parameter \mu > 0 is given by:
p_k = P[X = k] = e^{-\mu} \frac{\mu^k}{k!}, for k = 0, 1, 2, …

The mean and variance are:
E[X] = \mu
Var[X] = \mu

Sum of Poisson Variables

Theorem 4.1.1: If X and Y are independent random variables with Poisson distributions with parameters \mu and \xi respectively, then X + Y has a Poisson distribution with parameter \mu + \xi.

Proof:
P[X + Y = n] = \sum_{k=0}^{n} P[X = k, Y = n - k]

Since X and Y are independent:

= \sum_{k=0}^{n} P[X = k] P[Y = n - k]
= \sum_{k=0}^{n} \frac{\mu^k e^{-\mu}}{k!} \frac{\xi^{n-k} e^{-\xi}}{(n - k)!}
= e^{-(\mu+\xi)} \frac{1}{n!} \sum_{k=0}^{n} \frac{n!}{k!(n - k)!} \mu^k \xi^{n-k}
= e^{-(\mu+\xi)} \frac{(\mu + \xi)^n}{n!}, for n = 0, 1, …
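The convolution step of the proof can be checked numerically: summing the two pmfs term by term reproduces the Poisson pmf with parameter \mu + \xi. A short Python sketch with illustrative values of \mu and \xi:

```python
import math

def poisson_pmf(k, mu):
    return math.exp(-mu) * mu**k / math.factorial(k)

mu, xi = 1.3, 2.1
# Largest gap between the convolution sum from the proof and the
# direct Poisson(mu + xi) pmf, over the first 20 values of n:
gap = max(abs(sum(poisson_pmf(k, mu) * poisson_pmf(n - k, xi) for k in range(n + 1))
              - poisson_pmf(n, mu + xi))
          for n in range(20))
```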

Composition of Poisson with Binomial

Theorem 4.1.2: Let N be a Poisson random variable with parameter \mu, and conditional on N, let M have a binomial distribution with parameters N and p. Then the unconditional distribution of M is Poisson with parameter \mu p.

The Poisson Process

A Poisson process describes how the count X of rare events changes over time t.

Definition 1 (Poisson process): A Poisson process of intensity or rate \eta > 0 is an integer-valued stochastic process \{X(t) : t \geq 0\} for which:

  1. For any sequence of instants t_0 = 0 < t_1 < t_2 < … < t_n, the process increments X(t_1) - X(t_0), X(t_2) - X(t_1), …, X(t_n) - X(t_{n-1}) are independent and stationary random variables.

  2. For s \geq 0 and t > 0, the random variable X(s + t) - X(s) has a Poisson distribution with parameter \eta t:
    P[X(s + t) - X(s) = k] = \frac{(\eta t)^k e^{-\eta t}}{k!}, for k = 0, 1, 2…

  3. X(0) = 0.

If X(t) is a Poisson process of rate \eta > 0, then its mean and variance are given by:
E[X(t)] = \eta t
Var[X(t)] = \eta t

Exercise 4.2.1

Defects occur along an undersea cable according to a Poisson process of rate \eta = 0.1 per mile.

  • What is the probability that no defects appear in the first two miles of the cable?

    X(2) has a Poisson distribution with parameter \eta t = 0.1 \cdot 2 = 0.2, and so:
    P[X(2) = 0] = e^{-0.2} = 0.8187

  • Given that there are no defects in the first two miles of cable, what is the conditional probability of no defects between mile two and three?

    Here we use the independence of X(3) - X(2) and X(2) - X(0) = X(2). So it follows that the conditional probability is the same as the unconditional probability:
    P[X(3) - X(2) = 0] = P[X(1) = 0] = e^{-0.1} = 0.9048
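Both numbers can be reproduced with a couple of lines of Python:

```python
import math

eta = 0.1                              # defects per mile
p_first_two = math.exp(-eta * 2)       # P[X(2) = 0] = e^{-0.2}
# Independent increments: the conditional probability equals P[X(1) = 0]
p_third_mile = math.exp(-eta * 1)
```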

Exercise 4.2.2

Customers arrive in a certain store according to a Poisson process of rate \eta = 4 per hour.

  • Given that the store opens at 9:00 AM, what is the probability that exactly one customer has arrived by 9:30 AM and a total of five have arrived by 11:30 AM?

    We are asked to determine P[X(1/2) = 1, X(5/2) = 5]. Using the independence of X(5/2) - X(1/2) and X(1/2), we reformulate the request as:
    P[X(1/2) = 1, X(5/2) = 5] = P[X(1/2) = 1, X(5/2) - X(1/2) = 4]
    = e^{-4(1/2)} \frac{(4 \cdot 1/2)^1}{1!} \cdot e^{-4 \cdot 2} \frac{(4 \cdot 2)^4}{4!} = (2e^{-2}) \frac{512}{3} e^{-8} = 0.0155
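The same number, computed directly from the two independent Poisson increments:

```python
import math

def poisson_pmf(k, mu):
    return math.exp(-mu) * mu**k / math.factorial(k)

eta = 4.0                                  # customers per hour
# Independence of increments: X(1/2) and X(5/2) - X(1/2)
prob = poisson_pmf(1, eta * 0.5) * poisson_pmf(4, eta * 2.0)
```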

Non-homogeneous processes

A generalization of the Poisson process is obtained by relaxing the stationarity hypothesis, letting the rate \eta be a function of time: \eta(t). For a constant rate, the probability of a single event occurring in an infinitesimal interval h of time is proportional to \eta:
P[X(t + h) - X(t) = 1] = \frac{(\eta h) e^{-\eta h}}{1!} = (\eta h)(1 - \eta h + O(h^2)) = \eta h + o(h)
For the non-homogeneous process, the probability of k events happening in a time interval (s, s + t] is then given by:
P[X(t + s) - X(s) = k] = \frac{1}{k!} \left( \int_{s}^{s+t} \eta(u)\,du \right)^k e^{-\int_{s}^{s+t} \eta(u)\,du}

The Law of Rare Events

The Poisson distribution is the “discrete analog” of the normal distribution. The Law of Rare Events states that the total number of rare events that do happen follows (approximately) a Poisson distribution.

Law of Rare Events, fixed probability case

Consider an experiment with a low and fixed probability p << 1 of success, which is repeated a high number N of times. In this case, the total number of successes X_{N,p} after N trials follows a binomial distribution:

P[X_{N,p} = k] = \frac{N!}{k!(N - k)!} p^k (1 - p)^{N-k}, for k = 0, …, N

In the limit of rare events p \to 0 and infinite trials N \to +\infty, with a fixed success rate Np = \mu > 0, X_{N,p} will follow the Poisson distribution:

P[X_{\mu} = k] = e^{-\mu} \frac{\mu^k}{k!}, for k = 0, 1, 2, …

Law of Rare Events: general case

Theorem 4.3.1: Let \nu_1, \nu_2, … be independent Bernoulli random variables, where:

P[\nu_i = 1] = p_i
P[\nu_i = 0] = 1 - p_i

and let S_n = \nu_1 + ··· + \nu_n. The distribution of S_n is given by:

P[S_n = k] = \sum_{x_i \in \{0,1\},\; x_1+\cdots+x_n=k} \prod_{i=1}^{n} p_i^{x_i} (1 - p_i)^{1-x_i}

which differs from a Poisson distribution with rate \mu = p_1 + ··· + p_n by at most:

\left| P[S_n = k] - \frac{\mu^k e^{-\mu}}{k!} \right| \le \sum_{i=1}^{n} p_i^2

In particular, if all p_i = p and Np = \mu is kept fixed, then in the limit N \to +\infty we have p = \mu/N \to 0, and the RHS of the bound above vanishes as well.

An analog of the Law of Rare Events holds for stochastic processes, stating that the total counts of events generated by many independent processes can be approximately described by a single Poisson process.

Properties of Poisson Processes

Sum of Poisson processes

Theorem 4.4.1: Let X_1(t) and X_2(t) be two independent Poisson processes with rates \eta_1, \eta_2. Then, the variable that counts both of them, X(t) = X_1(t) + X_2(t), is a Poisson process itself, with rate \eta = \eta_1 + \eta_2.

Proof:

To prove that X(t) is a Poisson process, we verify the requirements in definition 1:

  1. At the starting time t = 0, the number of events counted for both processes is zero (X_1(0) = X_2(0) = 0), and so is their sum: X(0) = 0.

  2. Since X_1 and X_2 separately have stationary and independent increments, so does their sum X.

  3. Since the random variables X_1(t) and X_2(t) are Poisson distributed with parameters \eta_1 t and \eta_2 t and are independent of each other, their sum X(t) = X_1(t) + X_2(t) is Poisson distributed with parameter (\eta_1 + \eta_2) t, as a consequence of theorem 4.1.1.
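A simulation sketch of the theorem (parameter values are illustrative): merging two independent processes built from exponential inter-arrival times gives counts whose sample mean and variance both approach (\eta_1 + \eta_2)t, as a Poisson count requires.

```python
import random

def poisson_count(rate, t, rng):
    """Number of arrivals in (0, t] of a rate-`rate` Poisson process,
    built from exponential inter-arrival times."""
    count, clock = 0, rng.expovariate(rate)
    while clock <= t:
        count += 1
        clock += rng.expovariate(rate)
    return count

rng = random.Random(7)
eta1, eta2, t, trials = 2.0, 3.0, 1.0, 20000
totals = [poisson_count(eta1, t, rng) + poisson_count(eta2, t, rng)
          for _ in range(trials)]
mean = sum(totals) / trials
var = sum((x - mean)**2 for x in totals) / trials
# Both should be close to (eta1 + eta2) * t = 5
```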

Filtering a Poisson process

Theorem 4.4.2: Let X(t) be a Poisson process with rate \eta, and let each event be independently marked as either type 1 with probability p, or type 2 with probability 1 - p. Then, the events of type 1 and type 2 follow two independent Poisson processes with rates \eta p and \eta(1 - p).

Proof:

As before, we start by verifying the 3 requirements of definition 1:

  1. At t = 0 no events have been counted at all, so X_1(0) = X_2(0) = 0.

  2. Since X has stationary and independent increments and the marking of events occurs independently, X_1 and X_2 inherit from X the stationarity and independence of their respective increments.

  3. The joint distribution of the number of arrivals in the two sub-processes before a fixed time t is:
    P[X_1(t) = n, X_2(t) = m] = P[X_1(t) = n | X(t) = n + m] P[X(t) = n + m]
    = {n + m \choose n} p^n (1 - p)^m \frac{e^{-\eta t} (\eta t)^{n+m}}{(n + m)!}
    = \frac{(n + m)!}{n! m!} p^n (1 - p)^m \frac{e^{-\eta p t} e^{-\eta(1-p) t} (\eta t)^{n+m}}{(n + m)!}
    = \frac{(\eta p t)^n e^{-\eta p t}}{n!} \frac{(\eta (1 - p) t)^m e^{-\eta(1-p) t}}{m!}

From the last point we know that X_1(t) and X_2(t) are independent random variables when they are evaluated at the same instant t.
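The thinning result can be checked by simulation (illustrative parameters): marking each arrival of a rate-\eta process independently with probability p yields sub-counts whose means approach \eta p t and \eta(1-p)t.

```python
import random

rng = random.Random(123)
eta, p, t, trials = 5.0, 0.3, 2.0, 20000
n1_sum = n2_sum = 0
for _ in range(trials):
    clock = rng.expovariate(eta)
    n1 = n2 = 0
    while clock <= t:
        # Mark each arrival independently: type 1 w.p. p, else type 2
        if rng.random() < p:
            n1 += 1
        else:
            n2 += 1
        clock += rng.expovariate(eta)
    n1_sum += n1
    n2_sum += n2

mean1 = n1_sum / trials   # should approach eta * p * t = 3
mean2 = n2_sum / trials   # should approach eta * (1 - p) * t = 7
```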

Other distributions

Depending on which aspect of the process we focus on, different distributions emerge.

Inter-arrival times

Let’s consider the arrival times (or waiting times) W_i at which a new event occurs, and the count X goes up by 1. We define the inter-arrival times (or sojourn times) S_i as the difference between two consecutive arrival times:

S_i = W_{i+1} - W_i

Clearly:

W_i = \sum_{k=0}^{i-1} S_k

For the inter-arrival times, we already know the answer:

Theorem 4.5.1: Inter-arrival times S_i are i.i.d. exponential random variables with rate \eta.

Waiting times

Theorem 4.5.2: The waiting time W_n, i.e. the time needed for the n-th event to occur, has the gamma distribution whose probability density function is:

f_{W_n}(t) = \frac{\eta^n t^{n-1}}{(n - 1)!} e^{-\eta t}
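The gamma density is consistent with the identity \{W_n \le t\} = \{X(t) \ge n\}; a numerical check with illustrative values of \eta, n, t (midpoint integration of the density against the Poisson tail):

```python
import math

def waiting_time_pdf(x, n, eta):
    """Gamma density of W_n: eta^n x^(n-1) e^(-eta x) / (n-1)!"""
    return eta**n * x**(n - 1) * math.exp(-eta * x) / math.factorial(n - 1)

eta, n, t = 1.5, 3, 2.0
# P[W_n <= t] by midpoint integration of the density:
steps = 100000
h = t / steps
cdf = sum(waiting_time_pdf((i + 0.5) * h, n, eta) * h for i in range(steps))
# The same probability from the process: P[X(t) >= n]
tail = 1.0 - sum(math.exp(-eta * t) * (eta * t)**k / math.factorial(k)
                 for k in range(n))
```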

If we fix the number n of events occurring in the time interval (0,t), then the joint probability of the arrival times \{W_i\}_{i=1,…,n} is that of an ordered sequence of uniformly chosen points.

Uniform distribution

Suppose we choose independently n points U_i uniformly in the interval (0,t).

The joint distribution of \{U_i\}_{i=1,…,n} is given by the product of n uniform pdfs.

Let’s call \{W_i\} the sequence of ordered U_i, with 0 < W_1 < W_2 < ··· < W_n < t. Note that the W_i are not independent of each other, since they must satisfy the ordering.

Let us consider the probability of W_1 and W_2 respectively being inside two small intervals [w_1, w_1 + \Delta w_1] and [w_2, w_2 + \Delta w_2]:

f_{W_1,W_2}(w_1, w_2) \Delta w_1 \Delta w_2 = P[w_1 < W_1 < w_1 + \Delta w_1, w_2 < W_2 < w_2 + \Delta w_2]

= P[w_1 < U_1 < w_1 + \Delta w_1, w_2 < U_2 < w_2 + \Delta w_2] + P[w_1 < U_2 < w_1 + \Delta w_1, w_2 < U_1 < w_2 + \Delta w_2]

Each of the two terms factorizes:

P[w_1 < U_1 < w_1 + \Delta w_1, w_2 < U_2 < w_2 + \Delta w_2] = \frac{\Delta w_1}{t} \frac{\Delta w_2}{t}

so that:

f_{W_1,W_2}(w_1, w_2) \Delta w_1 \Delta w_2 = 2 \frac{\Delta w_1}{t} \frac{\Delta w_2}{t} = \frac{2}{t^2} \Delta w_1 \Delta w_2

f_{W_1,W_2}(w_1, w_2) = \frac{2}{t^2}

Generalizing this argument to n elements: the number of permutations of n elements U_1, U_2, …, U_n is n!, whereas the joint pdf is proportional to t^{-n}. So, the joint probability density function for W_1, W_2, …, W_n is given by:

f_{W_1,W_2,…,W_n}(w_1, w_2, …, w_n) = \frac{n!}{t^n}, for 0 < w_1 < w_2 < … < w_n < t

Theorem 4.5.3: Let W_1, W_2, … be the ordered occurrence times in a Poisson process of rate \eta > 0. Let us condition on X(t) = n, that is, on observing exactly n events in the interval (0,t). Given their number, the arrival times \{W_1, W_2, …, W_n\} of the n events have the joint probability density function:

f_{W_1,…,W_n|X(t)=n}(w_1, …, w_n) = \frac{n!}{t^n}, for 0 < w_1 < ··· < w_n < t

Proof:

Let’s first assume that all the w_i’s are distinct.

P[w_i < W_i < w_i + \Delta w_i, i = 1, …, n | X(t) = n] = f_{W_1,…,W_n|X(t)=n}(w_1, …, w_n) \Delta w_1 ··· \Delta w_n + o(\Delta w_1, …, \Delta w_n)

The conditional event requires exactly one arrival in each interval [w_i, w_i + \Delta w_i] and zero arrivals everywhere else in [0,t], so:

P[w_i < W_i < w_i + \Delta w_i, i = 1, …, n, and zero arrivals everywhere else in [0,t] | X(t) = n]

= \frac{\eta \Delta w_1 e^{-\eta \Delta w_1} \, \eta \Delta w_2 e^{-\eta \Delta w_2} \cdots \eta \Delta w_n e^{-\eta \Delta w_n} \, e^{-\eta (t - \sum_{i=1}^{n} \Delta w_i)}}{\frac{e^{-\eta t} (\eta t)^n}{n!}}

= \frac{n!}{t^n} \Delta w_1 \cdots \Delta w_n
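Theorem 4.5.3 can be checked by simulation: conditional on X(t) = n, the first arrival time should behave like the minimum of n i.i.d. uniforms on (0,t), whose mean is t/(n+1). A sketch with the illustrative values \eta = 2, t = 1, n = 2:

```python
import random

rng = random.Random(2024)
eta, t, n = 2.0, 1.0, 2
first_arrivals = []
while len(first_arrivals) < 5000:
    # Build one path from exponential inter-arrival times:
    times, clock = [], rng.expovariate(eta)
    while clock <= t:
        times.append(clock)
        clock += rng.expovariate(eta)
    if len(times) == n:            # condition on X(t) = n
        first_arrivals.append(times[0])

mean_w1 = sum(first_arrivals) / len(first_arrivals)
# Minimum of n i.i.d. uniforms on (0, t) has mean t / (n + 1) = 1/3
```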

Suppose now that n events have happened in (0,t). Then the probability that k \le n events happen in (0, u), with u < t, is given by a binomial distribution:

Theorem 4.5.4: Let X(t) be a Poisson process with rate \eta. Given that in the interval (0,t) there are n arrivals, that is X(t) = n, the probability that the number of arrivals in the sub-interval (0,u), with 0 < u < t, equals k, with 0 \le k \le n, is:

P[X(u) = k | X(t) = n] = {n \choose k} \left(\frac{u}{t}\right)^k \left(1 - \frac{u}{t}\right)^{n-k}
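A deterministic check of this formula (illustrative values of \eta, u, t, k, n): conditioning on independent increments gives the same number as the binomial expression, and \eta cancels out entirely.

```python
import math

def poisson_pmf(k, mu):
    return math.exp(-mu) * mu**k / math.factorial(k)

eta, u, t, k, n = 2.0, 1.0, 3.0, 2, 5
# Direct conditioning: P[X(u)=k, X(t)-X(u)=n-k] / P[X(t)=n]
lhs = (poisson_pmf(k, eta * u) * poisson_pmf(n - k, eta * (t - u))
       / poisson_pmf(n, eta * t))
# Binomial form: the rate eta does not appear
rhs = math.comb(n, k) * (u / t)**k * (1 - u / t)**(n - k)
```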

Theorem 4.5.5: Let X_1(t), X_2(t) be two concurrent independent Poisson processes with rates \eta_1, \eta_2. Given the total number of arrivals in the interval (0,t), i.e. X_1(t) + X_2(t) = n, the probability of having k arrivals in the first process is:

P[X_1(t) = k | X_1(t) + X_2(t) = n] = {n \choose k} \left(\frac{\eta_1}{\eta_1 + \eta_2}\right)^k \left(\frac{\eta_2}{\eta_1 + \eta_2}\right)^{n-k}

This probability is given by a binomial distribution. In fact, the probability p_1 of a generic event belonging to process 1 is the ratio between the rate \eta_1 of process 1 and the total rate \eta_1 + \eta_2 of both processes.

Theorem 4.5.6: Let X_1(t) and X_2(t) be two independent Poisson processes with rates \eta_1, \eta_2 in the interval (0,t), and let s be a sub-interval of (0,t).

Counting alpha particles

Suppose alpha particles are created according to a Poisson process \{X(t) : t > 0\} of rate \eta > 0, where each particle k, created at time W_k, exists for an independent random lifetime Y_k with common distribution function G. Let X(t) be the total number of particles created up to time t, and let M(t) count the number of alpha particles existing at time t.

We now want to find the number of particles present at time t: we compute M(t) given that at the beginning the count was zero, i.e. M(0) = 0. We moreover condition on the number n of particles emitted up to time t, that is X(t) = n, where W_1, …, W_n < t are the times when the particles were created.

Then, for each particle emitted, particle k still exists at time t if and only if W_k + Y_k > t: the sum of its arrival and service times must be greater than the current time t. Let us introduce the indicator function that tells whether a particle still exists at time t:

\mathbb{1}\{W_k + Y_k > t\} = \begin{cases} 1 & \text{if } W_k + Y_k > t \\ 0 & \text{if } W_k + Y_k \le t \end{cases}

Summing all the indicator functions corresponding to all particles, we then obtain the probability that the number of existing particles equals m, conditioned on the total number n of particles created up to time t:

P(M(t) = m | X(t) = n) = P\left(\sum_{k=1}^{n} \mathbb{1}\{W_k + Y_k > t\} = m \,\Big|\, X(t) = n\right)

Conditional on X(t) = n, the arrival times W_1, …, W_n have the same statistics as the ordered version of n i.i.d. random variables uniform in (0,t). Since the event \{W_k + Y_k > t\} does not depend on the order of the W_k, and we have the condition X(t) = n, theorem 4.5.3 allows us to replace the W_i’s with the same number of i.i.d. uniform random variables U_i in (0,t], without facing any issue:

P\left(\sum_{k=1}^{n} \mathbb{1}\{W_k + Y_k > t\} = m \,\Big|\, X(t) = n\right) = P\left(\sum_{k=1}^{n} \mathbb{1}\{U_k + Y_k > t\} = m\right)

The sum \sum_{k=1}^{n} \mathbb{1}\{U_k + Y_k > t\} is now independent of the total number of arrivals X(t) = n, since we already account for n as the number of terms in the sum. Moreover, both the U_k and the Y_k are i.i.d., so each indicator function is a binary random variable independent of all the others. The sum of these n indicator functions is thus binomial with parameters n and p, which is computed as:

p = P(U_k + Y_k > t) = \frac{1}{t} \int_{0}^{t} P(Y_k > t - u)\,du = \frac{1}{t} \int_{0}^{t} (1 - G(t - u))\,du = \frac{1}{t} \int_{0}^{t} (1 - G(z))\,dz

Where in the last step we just introduced a new variable z = t - u.

Now that we have obtained the success probability of the binomial distribution, we can rewrite the relation above as:

P(M(t) = m | X(t) = n) = {n \choose m} p^m (1-p)^{n-m}

In order to remove the condition X(t) = n we marginalize over the distribution of X, which we know is Poisson. Since M(t) is binomial with parameters (n, p) conditional on X(t) = n, and n is itself Poisson distributed, the unconditional distribution of M(t) is again Poisson, with the rate scaled by the factor p (as in theorem 4.1.2). Mathematically:

P(M(t) = m) = \sum_{n=m}^{\infty} P(M(t) = m | X(t) = n) P(X(t) = n)

= \sum_{n=m}^{\infty} {n \choose m} p^m (1-p)^{n-m} \frac{(\eta t)^n e^{-\eta t}}{n!} = \frac{e^{-\eta t} (\eta p t)^{m}}{m!} \sum_{n=m}^{\infty} \frac{((1-p)\eta t)^{n-m}}{(n - m)!}

\sum_{n=m}^{\infty} \frac{((1-p)\eta t)^{n-m}}{(n - m)!} = \sum_{j=0}^{\infty} \frac{(\eta t (1-p))^{j}}{j!} = e^{\eta t (1-p)}

P(M(t) = m) = \frac{e^{-\eta p t} (\eta p t)^{m}}{m!}, for m = 0, 1, …

\eta p t = \eta \int_{0}^{t} (1 - G(y))\,dy

As t \to \infty, the integral converges to the expected service time, since the integrand 1 - G(y) is the tail of the distribution G(y).

For a generic t, the value of the integral depends on the details of the specific distribution G(y), whereas in the long run it depends only on the mean.

This implies that in the long run the distributions of M(t) obtained from two different lifetime distributions will converge, even if the distributions are different, provided both have the same mean.

Let us re-write the equation above, denoting the mean service time by 1/\mu:

\int_{0}^{\infty} [1 - G(y)]\,dy = \frac{1}{\mu}

\eta p t \to \eta \int_{0}^{\infty} [1 - G(y)]\,dy = \frac{\eta}{\mu}
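The result P(M(t) = m) = Poisson(\eta p t) can be checked by simulation. The sketch below assumes exponential lifetimes with rate \nu as an illustrative choice of G (all parameter values are hypothetical), so that \eta \int_0^t (1 - G(y))\,dy = \eta (1 - e^{-\nu t})/\nu; the sample mean and variance of M(t) should both approach this value, as a Poisson distribution requires.

```python
import math
import random

rng = random.Random(99)
eta, t, nu, trials = 3.0, 2.0, 1.0, 20000   # nu: exponential lifetime rate
counts = []
for _ in range(trials):
    clock, alive = rng.expovariate(eta), 0
    while clock <= t:
        # Particle born at `clock` survives to t iff its lifetime exceeds t - clock
        if rng.expovariate(nu) > t - clock:
            alive += 1
        clock += rng.expovariate(eta)
    counts.append(alive)

mean_m = sum(counts) / trials
var_m = sum((x - mean_m)**2 for x in counts) / trials
# eta * integral_0^t (1 - G(y)) dy with G(y) = 1 - e^{-nu y}:
predicted = eta * (1.0 - math.exp(-nu * t)) / nu
```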

Shot Noise process

A Shot Noise process models electrical effects that are produced by the random arrival of electrons to an anode.

Let us assume electrons arrive at an anode according to a Poisson process \{X(t) : t > 0\} of constant rate \eta.

An arriving electron produces a current whose intensity per unit of time after arrival is given by the impulse response function h(x).

The intensity of the current I(t) is the superposition of the impulse response functions generated by the electrons arrived up to time t:

I(t) = \sum_{k=1}^{X(t)} h(t - W_k)

P(I(t) < x) = P\left(\sum_{k=1}^{X(t)} h(t - W_k) < x\right) = \sum_{n=0}^{\infty} P\left(\sum_{k=1}^{n} h(t - W_k) < x \,\Big|\, X(t) = n\right) P(X(t) = n)

Given their number n, we can replace the arrival times W_k with i.i.d. uniform random variables U_k on (0,t); moreover, t - U_k is itself uniform on (0,t), so:

\sum_{n=0}^{\infty} P\left(\sum_{k=1}^{n} h(t - U_k) < x\right) P(X(t) = n) = \sum_{n=0}^{\infty} P\left(\sum_{k=1}^{n} h(U_k) < x\right) P(X(t) = n) = P\left(\sum_{k=1}^{X(t)} h(U_k) < x\right)

The expected value of a random sum is the product of the expected value of the number of terms (Poisson distributed) and the common expected value of each term (a function of a uniform random variable):

E[I(t)] = E[X(t)] E[h(U_k)] = \eta t \frac{1}{t} \int_{0}^{t} h(u)\,du = \eta \int_{0}^{t} h(u)\,du

Var(I(t)) = \eta t \, Var(h(U_k)) + \eta t \, E[h(U_k)]^2 = \eta t \, E[h(U_k)^2] = \eta t \frac{1}{t} \int_{0}^{t} h^2(u)\,du = \eta \int_{0}^{t} h^2(u)\,du
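These two moments can be checked by simulating the shot noise directly. The sketch below assumes the illustrative impulse response h(x) = e^{-x} (all parameter values are hypothetical), for which \eta \int_0^t h = \eta(1 - e^{-t}) and \eta \int_0^t h^2 = \eta(1 - e^{-2t})/2.

```python
import math
import random

rng = random.Random(7)
eta, t, trials = 4.0, 3.0, 20000

def h(x):
    return math.exp(-x)   # illustrative impulse response

samples = []
for _ in range(trials):
    clock, current = rng.expovariate(eta), 0.0
    while clock <= t:
        current += h(t - clock)   # response of the arrival at time `clock`
        clock += rng.expovariate(eta)
    samples.append(current)

mean_i = sum(samples) / trials
var_i = sum((x - mean_i)**2 for x in samples) / trials
predicted_mean = eta * (1.0 - math.exp(-t))           # eta * integral h
predicted_var = eta * (1.0 - math.exp(-2 * t)) / 2    # eta * integral h^2
```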

Binomial theorem

Theorem 4.6.1: Let \{X(t)\} be a Poisson process of rate \eta > 0. Then for 0 < u < t and 0 \le k \le n,

P(X(u) = k | X(t) = n) = \frac{n!}{k!(n-k)!} (\frac{u}{t})^k (1 - \frac{u}{t})^{n-k}

Proof:

P(X(u) = k | X(t) = n) = \frac{ P(X(u) = k \text{ and } X(t) = n) }{ P(X(t) = n)}

= \frac{ P(X(u) = k \text{ and } X(t) - X(u) = n-k) }{ P(X(t) = n)}

= \frac{ \frac{e^{-\eta u} (\eta u)^k}{k!} \frac{e^{-\eta (t-u)} (\eta (t-u))^{n-k}}{(n-k)!} }{ \frac{e^{-\eta t} (\eta t)^n}{n!} }

= \frac{ n! }{ k! (n-k)! } \frac{ u^k (t-u)^{n-k} }{ t^n }

Dual version

P(X(s) = k | X(t) = n), for 0 < n \le k and 0 < t < s:

= P(X(s-t) = k-n) = \frac{ e^{-\eta (s-t)} (\eta (s-t))^{k-n} }{ (k-n)! }

A finite-state regular Markov chain has transition probability matrix P = \|P_{ij}\| and limiting distribution \sigma = \|\sigma_i\|. In the long run, what fraction of the transitions are from a prescribed state k to a prescribed state m?

\sigma_k P_{km} = \lim_{n \to \infty} P(X_n = k, X_{n+1} = m)

Exercise 4.6.5

Here the precise step isn’t important, being either n or n + 1; we basically need to compute:

\lim_{n \to \infty} P(X_{n+1} = j | X_0 = i) = \sigma_j

  • The third one, in the limit as n \to \infty, is the time-shifted version of the probability we had to find in point 2, namely:
    \lim_{n \to \infty} P(X_n = k, X_{n+1} = j | X_0 = i) = \lim_{n \to \infty} P(X_{n-1} = k, X_n = j | X_0 = i) = \sigma_k P_{kj}

  • Compute P(X_1(2) = 1 | X_1(3) = 2) and P(X_1(3) = 2 | X_1(2) = 1)

  • Compute P(X_1(1) = 1 | X_1(2) + X_2(2) = 3) and P(X_1(2) + X_2(2) = 3 | X_1(1) = 1)

Note that in both cases the chain is regular and the limit does not depend on the initial state.

P.A.S.T.A. property

How can we be sure that times sampled according to some specific rule, whether upon arrivals or upon departures, are representative of the long-run behaviour of our process?

Let us define the two probabilities p_n(t) and a_n(t):

p_n(t) = P\{N(t) = n\}
a_n(t) = P\{N(t) = n \mid \text{an arrival occurred just after time } t\}

p_n(t) - seen by an external observer
a_n(t) - seen by a given user

P.A.S.T.A. stands for Poisson Arrivals See Time Averages. Time averages are the statistics seen by an external observer, and in the long run we know that Poisson arrivals will see the same statistics; we thus conclude there is no bias.