A random variable is a function X: U → ℝ, where U is the sample space.
It assigns a real number to each outcome in the sample space.
The randomness comes from the initial random selection of an outcome from U.
Denoted by capital letters such as X, Y, or Z.
Probability Distributions
A probability distribution is a function that assigns a probability to each member of a set, such that the probabilities add up to 1.
For a random variable X, each possible value k has a probability, written as Pr(X=k).
The probabilities of the values of X constitute the probability distribution of X.
Events have probabilities, while random variables have probability distributions.
Pr(X+Y=k) = ∑_{(i,j): i+j=k} Pr((X=i) ∧ (Y=j))
Two random variables X and Y are independent if Pr((X=x)∧(Y=y))=Pr(X=x)⋅Pr(Y=y).
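The two facts above can be combined in a small sketch: for two independent random variables (assumed here to be fair six-sided dice), the distribution of X + Y is obtained by summing Pr(X=i)·Pr(Y=j) over all pairs with i + j = k, where independence justifies factoring the joint probability.

```python
from fractions import Fraction
from itertools import product

# Assumed example: X and Y are independent fair six-sided dice.
die = {v: Fraction(1, 6) for v in range(1, 7)}

def sum_distribution(px, py):
    # Pr(X+Y=k) = sum over (i,j) with i+j=k of Pr(X=i) * Pr(Y=j),
    # using independence to factor the joint probability.
    dist = {}
    for (i, pi), (j, pj) in product(px.items(), py.items()):
        dist[i + j] = dist.get(i + j, Fraction(0)) + pi * pj
    return dist

s = sum_distribution(die, die)
assert sum(s.values()) == 1     # a valid distribution sums to 1
assert s[7] == Fraction(1, 6)   # 7 is the most likely total for two dice
```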
Expectation
Expectation E(X) of a random variable X is a representative value, also called the expected value or mean.
E(X) = ∑_k k · Pr(X=k), where the sum is over all possible values k of X.
E(c)=c if c is a constant.
E(αX)=αE(X) if α is a constant.
Linearity of Expectation
For any random variables X and Y, E(X+Y)=E(X)+E(Y).
If X and Y are independent, then E(XY)=E(X)E(Y).
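Both properties can be checked with exact arithmetic on a small assumed example (two independent fair dice): linearity E(X+Y) = E(X) + E(Y) needs no independence, while E(XY) = E(X)E(Y) relies on it.

```python
from fractions import Fraction

# Assumed example: two independent fair dice, exact rational arithmetic.
values = range(1, 7)
p = Fraction(1, 6)

E_X = sum(v * p for v in values)                              # 7/2
E_sum = sum((i + j) * p * p for i in values for j in values)  # E(X+Y)
E_prod = sum((i * j) * p * p for i in values for j in values) # E(XY)

assert E_sum == E_X + E_X    # linearity of expectation
assert E_prod == E_X * E_X   # product rule, valid under independence
```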
Median
The median m of a random variable X is a real number such that Pr(X ≤ m) ≥ 1/2 and Pr(X ≥ m) ≥ 1/2.
Mode
The mode of a random variable X is the value g for which Pr(X=g) is greatest.
Variance
Variance Var(X) of a random variable X measures how far its values tend to be from its expected value μ=E(X).
Var(X)=E((X−μ)2)
The standard deviation of X is √Var(X).
Var(X) = E(X²) − μ² = E(X²) − E(X)²
If X and Y are independent, then Var(X+Y)=Var(X)+Var(Y).
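A quick sketch that both variance formulas agree, computed exactly for a fair die (an assumed example):

```python
from fractions import Fraction

# Assumed example: a fair six-sided die.
dist = {v: Fraction(1, 6) for v in range(1, 7)}

mu = sum(k * pr for k, pr in dist.items())
# Definition: Var(X) = E((X - mu)^2)
var_direct = sum((k - mu) ** 2 * pr for k, pr in dist.items())
# Shortcut: Var(X) = E(X^2) - E(X)^2
var_shortcut = sum(k * k * pr for k, pr in dist.items()) - mu ** 2

assert var_direct == var_shortcut == Fraction(35, 12)
```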
Chebyshev’s Inequality
For any random variable X with expectation μ and variance σ², and any t ∈ ℝ⁺, the probability that X is at least t standard deviations away from its mean is at most 1/t²: Pr(|X − μ| ≥ tσ) ≤ 1/t².
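The bound can be verified exactly for a small distribution (a fair die, assumed here): comparing (X − μ)² against t²σ² keeps everything in rational arithmetic, avoiding square roots.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die, exact rational arithmetic.
dist = {v: Fraction(1, 6) for v in range(1, 7)}
mu = sum(k * p for k, p in dist.items())
var = sum((k - mu) ** 2 * p for k, p in dist.items())

def tail(t2):
    # Pr(|X - mu| >= t*sigma), tested as (X - mu)^2 >= t^2 * sigma^2
    # with t2 = t squared, so no square roots are needed.
    return sum(p for k, p in dist.items() if (k - mu) ** 2 >= t2 * var)

for t2 in (Fraction(1), Fraction(9, 4), Fraction(4)):  # t = 1, 3/2, 2
    assert tail(t2) <= 1 / t2  # Chebyshev: tail is at most 1/t^2
```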
Uniform Distribution
Each integer between a and b inclusive has the same probability, and all other integers have zero probability.
Pr(X = x) = 1/(b − a + 1) if a ≤ x ≤ b, and 0 otherwise.
E(X) = (a + b)/2
Var(X) = ((b − a + 1)² − 1)/12
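Both closed forms can be checked against the definitions for assumed endpoints a = 3, b = 10:

```python
from fractions import Fraction

# Assumed example: discrete uniform distribution on {3, ..., 10}.
a, b = 3, 10
n = b - a + 1
p = Fraction(1, n)

mu = sum(k * p for k in range(a, b + 1))
var = sum((k - mu) ** 2 * p for k in range(a, b + 1))

assert mu == Fraction(a + b, 2)           # E(X) = (a+b)/2
assert var == Fraction(n * n - 1, 12)     # Var(X) = ((b-a+1)^2 - 1)/12
```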
Binomial Distribution
A Bernoulli trial is a random experiment with two possible outcomes: success (probability p) and failure (probability q=1−p).
X = 1 with probability p, and X = 0 with probability 1 − p.
E(X)=p
Var(X)=p(1−p)
The binomial distribution gives the probability of k successes in n independent Bernoulli trials: Pr(Z=k) = C(n,k) · p^k · (1−p)^(n−k), where C(n,k) is the binomial coefficient.
Z∼Bin(n,p)
E(Z)=np
Var(Z)=np(1−p)
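A sketch verifying the pmf, mean, and variance exactly for assumed parameters n = 10, p = 1/4:

```python
from fractions import Fraction
from math import comb

# Assumed parameters: Z ~ Bin(10, 1/4), exact rational arithmetic.
n, p = 10, Fraction(1, 4)

def pmf(k):
    # Pr(Z=k) = C(n,k) * p^k * (1-p)^(n-k)
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

mean = sum(k * pmf(k) for k in range(n + 1))
var = sum(k * k * pmf(k) for k in range(n + 1)) - mean ** 2

assert sum(pmf(k) for k in range(n + 1)) == 1  # pmf sums to 1
assert mean == n * p                           # E(Z) = np
assert var == n * p * (1 - p)                  # Var(Z) = np(1-p)
```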
Poisson Distribution
Pr(X=k) = e^(−μ) · μ^k / k!, for all k ∈ ℕ₀.
X∼Poisson(μ)
E(X)=μ
Var(X)=μ
The Poisson distribution with μ = np is often used as an approximation to the binomial distribution Bin(n, p) when n is large and p is small.
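The quality of the approximation can be seen numerically; n = 1000 and p = 0.002 are assumed illustration values, giving μ = 2.

```python
from math import comb, exp

# Assumed illustration: n large, p small, so Bin(n, p) ~ Poisson(n*p).
n, p = 1000, 0.002
mu = n * p  # 2.0

def binom_pmf(k):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k):
    fact = 1
    for i in range(2, k + 1):  # k! computed iteratively
        fact *= i
    return exp(-mu) * mu ** k / fact

# The two pmfs agree closely for small k.
assert all(abs(binom_pmf(k) - poisson_pmf(k)) < 1e-3 for k in range(10))
```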
Geometric Distribution
Pr(X=k) = (1 − p)^(k−1) · p, for every k ∈ ℕ.
X∼Geom(p)
E(X) = 1/p
Var(X) = (1 − p)/p²
The geometric distribution has the memoryless property.
If X ∼ Geom(p) and t ∈ ℕ, then the distribution of X − t, given that X > t, is again geometric with parameter p.
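Memorylessness can be checked exactly: conditioning on X > t (the first t trials all failed) and shifting by t recovers the original pmf. The parameter p = 1/3 and shift t = 4 are assumed example values.

```python
from fractions import Fraction

# Assumed example: X ~ Geom(1/3), shift t = 4.
p = Fraction(1, 3)

def pmf(k):
    # Pr(X = k) = (1-p)^(k-1) * p, for k = 1, 2, ...
    return (1 - p) ** (k - 1) * p

t = 4
pr_x_gt_t = (1 - p) ** t  # Pr(X > t): the first t trials all fail

# Pr(X = t + k | X > t) equals Pr(X = k) for every k: memorylessness.
for k in range(1, 8):
    assert pmf(t + k) / pr_x_gt_t == pmf(k)
```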
Coupon Collector's Problem
Let the random variable Z be the number of trials needed until each of the n equally likely outcomes has been seen at least once.
E(Z) = n·H_n, where H_n is the n-th harmonic number.
H_n ≈ ln n + γ, where γ is the Euler–Mascheroni constant.
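A simulation sketch (with an assumed n = 10 and a fixed seed for reproducibility) shows the average number of trials settling near n·H_n:

```python
import random

# Assumed example: n = 10 equally likely coupons, 20000 simulated runs.
n = 10
H_n = sum(Fraction := 1 / k for k in range(1, n + 1)) if False else sum(1 / k for k in range(1, n + 1))

def trials_to_collect(rng):
    # Draw coupons uniformly until all n distinct coupons have appeared.
    seen, count = set(), 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        count += 1
    return count

rng = random.Random(0)  # fixed seed for reproducibility
avg = sum(trials_to_collect(rng) for _ in range(20000)) / 20000
assert abs(avg - n * H_n) / (n * H_n) < 0.05  # within 5% of n*H_n ≈ 29.29
```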