A random variable is a function X: U → ℝ, where U is the sample space.
It assigns a real number to each outcome in the sample space.
The randomness comes from the initial random selection of an outcome from U.
Denoted by capital letters such as X, Y, or Z.
Probability Distributions
A probability distribution is a function that assigns a probability to each member of a set, such that the probabilities add up to 1.
For a random variable X, each possible value k has a probability, written as Pr(X=k).
The probabilities of the values of X constitute the probability distribution of X.
Events have probabilities, while random variables have probability distributions.
Pr(X+Y=k) = ∑_{(i,j): i+j=k} Pr((X=i) ∧ (Y=j))
Two random variables X and Y are independent if Pr((X=x)∧(Y=y))=Pr(X=x)⋅Pr(Y=y).
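The two facts above can be combined in a small sketch: for two independent random variables (assumed here to be fair six-sided dice), the distribution of X + Y is obtained by summing Pr(X=i)·Pr(Y=j) over all pairs with i + j = k, where independence justifies factoring the joint probability.

```python
from fractions import Fraction
from itertools import product

# Assumed example: X and Y are independent fair six-sided dice.
die = {v: Fraction(1, 6) for v in range(1, 7)}

def sum_distribution(px, py):
    # Pr(X+Y=k) = sum over (i,j) with i+j=k of Pr(X=i) * Pr(Y=j),
    # using independence to factor the joint probability.
    dist = {}
    for (i, pi), (j, pj) in product(px.items(), py.items()):
        dist[i + j] = dist.get(i + j, Fraction(0)) + pi * pj
    return dist

s = sum_distribution(die, die)
assert sum(s.values()) == 1     # a valid distribution sums to 1
assert s[7] == Fraction(1, 6)   # 7 is the most likely total for two dice
```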
Expectation
Expectation E(X) of a random variable X is a representative value, also called the expected value or mean.
E(X) = ∑_k k · Pr(X=k), where the sum is over all possible values k of X.
E(c)=c if c is a constant.
E(αX)=αE(X) if α is a constant.
Linearity of Expectation
For any random variables X and Y, E(X+Y)=E(X)+E(Y).
If X and Y are independent, then E(XY)=E(X)E(Y).
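Both properties can be checked with exact arithmetic on a small assumed example (two independent fair dice): linearity E(X+Y) = E(X) + E(Y) needs no independence, while E(XY) = E(X)E(Y) relies on it.

```python
from fractions import Fraction

# Assumed example: two independent fair dice, exact rational arithmetic.
values = range(1, 7)
p = Fraction(1, 6)

E_X = sum(v * p for v in values)                              # 7/2
E_sum = sum((i + j) * p * p for i in values for j in values)  # E(X+Y)
E_prod = sum((i * j) * p * p for i in values for j in values) # E(XY)

assert E_sum == E_X + E_X    # linearity of expectation
assert E_prod == E_X * E_X   # product rule, valid under independence
```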
Median
The median m of a random variable X is a real number such that Pr(X ≤ m) ≥ 1/2 and Pr(X ≥ m) ≥ 1/2.
Mode
The mode of a random variable X is the value g for which Pr(X=g) is greatest.
Variance
Variance Var(X) of a random variable X measures how far its values tend to be from its expected value μ=E(X).
Var(X)=E((X−μ)2)
The standard deviation of X is √Var(X).
Var(X) = E(X²) − μ² = E(X²) − E(X)²
If X and Y are independent, then Var(X+Y)=Var(X)+Var(Y).
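A quick sketch that both variance formulas agree, computed exactly for a fair die (an assumed example):

```python
from fractions import Fraction

# Assumed example: a fair six-sided die.
dist = {v: Fraction(1, 6) for v in range(1, 7)}

mu = sum(k * pr for k, pr in dist.items())
# Definition: Var(X) = E((X - mu)^2)
var_direct = sum((k - mu) ** 2 * pr for k, pr in dist.items())
# Shortcut: Var(X) = E(X^2) - E(X)^2
var_shortcut = sum(k * k * pr for k, pr in dist.items()) - mu ** 2

assert var_direct == var_shortcut == Fraction(35, 12)
```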
Chebyshev’s Inequality
For any random variable X with expectation μ and variance σ², and any t ∈ ℝ⁺, the probability that X is at least t standard deviations away from its mean is at most 1/t²: Pr(|X − μ| ≥ tσ) ≤ 1/t².
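The bound can be verified exactly for a small distribution (a fair die, assumed here): comparing (X − μ)² against t²σ² keeps everything in rational arithmetic, avoiding square roots.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die, exact rational arithmetic.
dist = {v: Fraction(1, 6) for v in range(1, 7)}
mu = sum(k * p for k, p in dist.items())
var = sum((k - mu) ** 2 * p for k, p in dist.items())

def tail(t2):
    # Pr(|X - mu| >= t*sigma), tested as (X - mu)^2 >= t^2 * sigma^2
    # with t2 = t squared, so no square roots are needed.
    return sum(p for k, p in dist.items() if (k - mu) ** 2 >= t2 * var)

for t2 in (Fraction(1), Fraction(9, 4), Fraction(4)):  # t = 1, 3/2, 2
    assert tail(t2) <= 1 / t2  # Chebyshev: tail is at most 1/t^2
```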
Uniform Distribution
Each integer between a and b inclusive has the same probability, and all other integers have zero probability.
Pr(X = x) = 1/(b − a + 1) if a ≤ x ≤ b, and 0 otherwise.
E(X) = (a + b)/2
Var(X) = ((b − a + 1)² − 1)/12
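Both closed forms can be checked against the definitions for assumed endpoints a = 3, b = 10:

```python
from fractions import Fraction

# Assumed example: discrete uniform distribution on {3, ..., 10}.
a, b = 3, 10
n = b - a + 1
p = Fraction(1, n)

mu = sum(k * p for k in range(a, b + 1))
var = sum((k - mu) ** 2 * p for k in range(a, b + 1))

assert mu == Fraction(a + b, 2)           # E(X) = (a+b)/2
assert var == Fraction(n * n - 1, 12)     # Var(X) = ((b-a+1)^2 - 1)/12
```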
Binomial Distribution
A Bernoulli trial is a random experiment with two possible outcomes: success (probability p) and failure (probability q=1−p).
X = 1 with probability p, and X = 0 with probability 1 − p.
E(X)=p
Var(X)=p(1−p)
The binomial distribution gives the probability of k successes in n independent Bernoulli trials: Pr(Z=k) = C(n,k) · p^k · (1−p)^(n−k), where C(n,k) is the binomial coefficient.
Z∼Bin(n,p)
E(Z)=np
Var(Z)=np(1−p)
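A sketch verifying the pmf, mean, and variance exactly for assumed parameters n = 10, p = 1/4:

```python
from fractions import Fraction
from math import comb

# Assumed parameters: Z ~ Bin(10, 1/4), exact rational arithmetic.
n, p = 10, Fraction(1, 4)

def pmf(k):
    # Pr(Z=k) = C(n,k) * p^k * (1-p)^(n-k)
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

mean = sum(k * pmf(k) for k in range(n + 1))
var = sum(k * k * pmf(k) for k in range(n + 1)) - mean ** 2

assert sum(pmf(k) for k in range(n + 1)) == 1  # pmf sums to 1
assert mean == n * p                           # E(Z) = np
assert var == n * p * (1 - p)                  # Var(Z) = np(1-p)
```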
Poisson Distribution
Pr(X=k) = e^(−μ) · μ^k / k!, for all k ∈ ℕ₀.
X∼Poisson(μ)
E(X)=μ
Var(X)=μ
The Poisson distribution with μ = np is often used as an approximation to the binomial distribution Bin(n, p) when n is large and p is small.
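The quality of the approximation can be seen numerically; n = 1000 and p = 0.002 are assumed illustration values, giving μ = 2.

```python
from math import comb, exp

# Assumed illustration: n large, p small, so Bin(n, p) ~ Poisson(n*p).
n, p = 1000, 0.002
mu = n * p  # 2.0

def binom_pmf(k):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k):
    fact = 1
    for i in range(2, k + 1):  # k! computed iteratively
        fact *= i
    return exp(-mu) * mu ** k / fact

# The two pmfs agree closely for small k.
assert all(abs(binom_pmf(k) - poisson_pmf(k)) < 1e-3 for k in range(10))
```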
Geometric Distribution
Pr(X=k) = (1 − p)^(k−1) · p, for every k ∈ ℕ.
X∼Geom(p)
E(X) = 1/p
Var(X) = (1 − p)/p²
The geometric distribution has the memoryless property.
If X ∼ Geom(p) and t ∈ ℕ, then the distribution of X − t, given that X > t, is again geometric with parameter p.
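Memorylessness can be checked exactly: conditioning on X > t (the first t trials all failed) and shifting by t recovers the original pmf. The parameter p = 1/3 and shift t = 4 are assumed example values.

```python
from fractions import Fraction

# Assumed example: X ~ Geom(1/3), shift t = 4.
p = Fraction(1, 3)

def pmf(k):
    # Pr(X = k) = (1-p)^(k-1) * p, for k = 1, 2, ...
    return (1 - p) ** (k - 1) * p

t = 4
pr_x_gt_t = (1 - p) ** t  # Pr(X > t): the first t trials all fail

# Pr(X = t + k | X > t) equals Pr(X = k) for every k: memorylessness.
for k in range(1, 8):
    assert pmf(t + k) / pr_x_gt_t == pmf(k)
```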
Coupon Collector's Problem
Let the random variable Z be the number of trials needed until each of the n equally likely outcomes has been seen at least once.
E(Z) = n·H_n, where H_n is the n-th harmonic number.
H_n ≈ ln n + γ, where γ is the Euler–Mascheroni constant.
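A simulation sketch (with an assumed n = 10 and a fixed seed for reproducibility) shows the average number of trials settling near n·H_n:

```python
import random

# Assumed example: n = 10 equally likely coupons, 20000 simulated runs.
n = 10
H_n = sum(Fraction := 1 / k for k in range(1, n + 1)) if False else sum(1 / k for k in range(1, n + 1))

def trials_to_collect(rng):
    # Draw coupons uniformly until all n distinct coupons have appeared.
    seen, count = set(), 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        count += 1
    return count

rng = random.Random(0)  # fixed seed for reproducibility
avg = sum(trials_to_collect(rng) for _ in range(20000)) / 20000
assert abs(avg - n * H_n) / (n * H_n) < 0.05  # within 5% of n*H_n ≈ 29.29
```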