Study Notes on Random Variables and Their Distributions

Chapter 4: Distributions of Random Variables

Author: Leo Zexian Wang


Random Variables

  • Definition: A random variable (r.v.) is a numeric quantity that takes different values with specified probabilities.

  • Types of Random Variables:

    • Discrete Random Variable: Takes values from a discrete set (countable values).

    • The set of values can be finite (like the number of students in a classroom) or countably infinite (such as the number of trials until the first success).

    • Continuous Random Variable: Takes values from a continuous range (e.g., any value in an interval).


Discrete Random Variables

  • Probability Mass Function (pmf): Denoted $f(x) = P(X = x)$; assigns a probability to each possible value $x$ in the support $X$.

    • The probabilities must sum to 1: $\sum_{x \in X} f(x) = 1$.

    • Example:

    • Let $X$ be the discrete r.v. denoting the number of heads in two successive tosses of a fair coin.

    • The sample space: $\Omega = \{TT, TH, HT, HH\}$

      • $P(X=0) = P(\{TT\}) = \frac{1}{4}$

      • $P(X=1) = P(\{TH, HT\}) = \frac{1}{2}$

      • $P(X=2) = P(\{HH\}) = \frac{1}{4}$

    • Support: $X = \{0, 1, 2\}$.

    • Cumulative Distribution Function (cdf): Denoted $F(x) = P(X \leq x)$; gives the probability that $X$ is less than or equal to $x$. $F(x)$ is non-decreasing and rises from 0 to 1.

    • Example cdf:

    • $F(x) = \begin{cases} 0 & x < 0 \\ 0.25 & 0 \leq x < 1 \\ 0.75 & 1 \leq x < 2 \\ 1 & x \geq 2 \end{cases}$
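
The coin-toss pmf and cdf can be reproduced by enumerating the sample space. The following is a minimal sketch in Python using only the standard library; the names `outcomes`, `pmf`, and `cdf` are illustrative choices, not part of the notes.

```python
from fractions import Fraction
from itertools import product

# Enumerate the sample space of two fair coin tosses: TT, TH, HT, HH.
outcomes = ["".join(t) for t in product("TH", repeat=2)]

# pmf: each outcome has probability 1/4; X counts the heads in the outcome.
pmf = {}
for outcome in outcomes:
    x = outcome.count("H")
    pmf[x] = pmf.get(x, Fraction(0)) + Fraction(1, len(outcomes))

def cdf(x):
    """F(x) = P(X <= x): add up the pmf over support values not exceeding x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))

for x in (-1, 0, 0.5, 1, 2, 3):
    print(x, cdf(x))
```

Evaluating `cdf` at points between support values (such as 0.5) shows the step shape of a discrete cdf: it only jumps at 0, 1, and 2.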


Expected Value and Variance of Discrete Random Variables

  • Expected Value $E(X)$:

    • $E(X) = \sum_{x \in X} x \, P(X = x)$

    • Example: For the coin toss, $E(X) = 0 \times \frac{1}{4} + 1 \times \frac{1}{2} + 2 \times \frac{1}{4} = 1$

  • Variance $\mathrm{Var}(X)$:

    • $\mathrm{Var}(X) = E((X - \mu)^2) = \sum_{x \in X} (x - \mu)^2 P(X = x)$

    • This can also be expressed as:

      • $\mathrm{Var}(X) = E(X^2) - (E(X))^2$

    • Example from previous scenario:

    • $E(X^2) = 0^2 \times \frac{1}{4} + 1^2 \times \frac{1}{2} + 2^2 \times \frac{1}{4} = \frac{3}{2}$

    • Thus, $\mathrm{Var}(X) = \frac{3}{2} - 1^2 = \frac{1}{2}$

  • Standard Deviation $\mathrm{SD}(X)$:

    • Given by $\mathrm{SD}(X) = \sqrt{\mathrm{Var}(X)} = \sqrt{\frac{1}{2}}$
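
As a check on the arithmetic, the two variance formulas can be evaluated side by side for the coin-toss example. This is a small Python sketch with exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from math import sqrt

# pmf of X = number of heads in two fair coin tosses.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

mean = sum(x * p for x, p in pmf.items())                    # E(X)
var_def = sum((x - mean) ** 2 * p for x, p in pmf.items())   # E((X - mu)^2)
second_moment = sum(x ** 2 * p for x, p in pmf.items())      # E(X^2)
var_shortcut = second_moment - mean ** 2                     # E(X^2) - (E(X))^2
sd = sqrt(var_def)                                           # SD(X)

print(mean, var_def, var_shortcut, sd)
```

Both routes give the same variance; the shortcut form is usually less work by hand because it avoids subtracting the mean inside the sum.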


Continuous Random Variables

  • Probability Density Function (pdf): Denoted $f(x)$; the area under the curve between any two points $a$ and $b$ equals the probability that $X$ falls between them:

    • $P(a \leq X \leq b) = \int_a^b f(x) \,dx$

    • Total area under the curve must equal 1: $P(\Omega) = \int_{x \in X} f(x) \,dx = 1$

  • Example:

    • Let $X$ be the time it takes for a bus to arrive, following a uniform distribution:

    • Support: $X = [10, 15]$

      • $f(x) = \begin{cases} \frac{1}{5} & 10 \leq x \leq 15 \\ 0 & \text{elsewhere} \end{cases}$

    • CDF:

      • $F(x) = \int_{10}^x f(y) \, dy$

      • $F(x) = \begin{cases} 0 & x < 10 \\ \frac{x - 10}{5} & 10 \leq x \leq 15 \\ 1 & x > 15 \end{cases}$
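
The uniform pdf and cdf translate directly into code. Below is a minimal Python sketch for the bus example; the function names and the query interval from 11 to 13 are assumptions made for illustration.

```python
A, B = 10.0, 15.0  # support endpoints from the bus example

def pdf(x):
    """Uniform density: 1/(B - A) on [A, B], zero elsewhere."""
    return 1.0 / (B - A) if A <= x <= B else 0.0

def cdf(x):
    """F(x) = P(X <= x) for the uniform distribution on [A, B]."""
    if x < A:
        return 0.0
    if x > B:
        return 1.0
    return (x - A) / (B - A)

def prob_between(a, b):
    """P(a <= X <= b) = F(b) - F(a)."""
    return cdf(b) - cdf(a)

print(prob_between(11, 13))  # probability that X lands between 11 and 13
```

Because the density is flat, the probability of an interval inside the support is just its length divided by the length of the support.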


Expected Value and Variance of Continuous Random Variables

  • Expected Value $E(X)$:

    • $E(X) = \int_{x \in X} x f(x) \,dx$

    • Substitute values for our example:

    • $E(X) = \int_{10}^{15} x \cdot \frac{1}{5} \,dx = \frac{1}{5} \left[\frac{x^2}{2}\right]_{10}^{15} = 12.5$

  • Variance $\mathrm{Var}(X)$:

    • $\mathrm{Var}(X) = E(X^2) - (E(X))^2$

    • Example calculation:

    • $E(X^2) = \int_{10}^{15} x^2 \cdot \frac{1}{5} \,dx = \frac{1}{5} \left[\frac{x^3}{3}\right]_{10}^{15} = \frac{475}{3} \approx 158.33$

    • Variance is thus calculated as: $\mathrm{Var}(X) = \frac{475}{3} - (12.5)^2 = \frac{25}{12}$

    • Standard deviation: $\mathrm{SD}(X) = \sqrt{\mathrm{Var}(X)} = \frac{5}{2\sqrt{3}} \approx 1.4434$
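
The moments of the uniform bus example can be computed exactly from the antiderivative of $x^k$. The Python sketch below does this with fractions; the helper `moment` is a hypothetical name introduced here.

```python
from fractions import Fraction
from math import sqrt

a, b = 10, 15  # support endpoints from the bus example

def moment(k):
    """E(X^k) = integral of x^k / (b - a) over [a, b], via the antiderivative."""
    return Fraction(b ** (k + 1) - a ** (k + 1), (k + 1) * (b - a))

mean = moment(1)                       # E(X)
second_moment = moment(2)              # E(X^2)
variance = second_moment - mean ** 2   # E(X^2) - (E(X))^2
sd = sqrt(variance)                    # SD(X)

print(mean, second_moment, variance, sd)
```

The variance agrees with the general uniform formula $(b - a)^2 / 12$, which gives $25/12$ for a support of length 5.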


Common Distributions of Discrete Random Variables

  • Discrete Uniform Distribution:

    • Each outcome is equally likely.

  • Binomial Distribution:

    • Models the number of successes in $n$ independent trials (e.g., flipping a coin).

    • Parameters: number of trials $n$ and probability of success $p$.

  • Geometric Distribution:

    • Models the number of trials until the first success occurs.

  • Poisson Distribution:

    • Models the number of events occurring in a fixed interval of time or space, given a known average rate of occurrence.


Common Distributions of Continuous Random Variables

  • Continuous Uniform Distribution:

    • All intervals of the same length within the support are equally probable.

  • Normal Distribution:

    • Bell-shaped curve symmetric about the mean.

    • Standard normal distribution: $N(0, 1)$ has mean 0 and variance 1.

  • Student’s t Distribution:

    • Similar to normal distribution but with heavier tails.

  • Chi-square Distribution:

    • Distribution of the sum of squares of independent standard normal variables.

  • Exponential Distribution:

    • Models time until an event occurs (like waiting for a bus).


Normal Distribution

  • Definition: A continuous distribution characterized by its mean $\mu$ and standard deviation $\sigma$.

  • Properties:

    • Mean = Median = Mode.

    • The density $f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}$ approaches but never touches the x-axis.

    • Changing $\mu$ shifts the curve left or right, and changing $\sigma$ affects the spread.

  • Standardization: For any normal random variable $X$, transform to a standard normal variable $Z$ using:

    • $Z = \frac{X - \mu}{\sigma}$


Calculating Probabilities with Standardization

  • Example: The absorption rate of cones in the eye follows a normal distribution.

  • Given a mean of 535 nm and a standard deviation of 65 nm, calculate the proportion absorbing wavelengths between 550 nm and 575 nm.

    • $P(550 < X < 575) = P(X < 575) - P(X < 550)$

    • Compute z-scores with $Z = \frac{X - \mu}{\sigma}$: $\frac{550 - 535}{65} \approx 0.23$ and $\frac{575 - 535}{65} \approx 0.62$

    • Result: $P(Z < 0.62) - P(Z < 0.23) = 0.7324 - 0.5910 = 0.1414$
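
The cone example can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); it computes the answer once with z-scores rounded to two decimals, as a z-table requires, and once without rounding.

```python
from statistics import NormalDist

mu, sigma = 535, 65          # mean and standard deviation in nm
lo, hi = 550, 575            # wavelength range in nm

# Standardize both endpoints: z = (x - mu) / sigma.
z_lo = (lo - mu) / sigma
z_hi = (hi - mu) / sigma

Z = NormalDist()             # standard normal, N(0, 1)

# Table-style answer: round the z-scores to two decimals first.
p_table = Z.cdf(round(z_hi, 2)) - Z.cdf(round(z_lo, 2))

# Same probability without rounding, computed directly on X.
X = NormalDist(mu, sigma)
p_direct = X.cdf(hi) - X.cdf(lo)

print(round(z_lo, 2), round(z_hi, 2), p_table, p_direct)
```

The two answers differ slightly in the third decimal place, which is the cost of rounding z-scores for table lookup.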


Z-Table and Finding Probabilities

  • Z-Table values represent cumulative probabilities for corresponding z-scores.

    • Example: To find $P(Z < 1.26)$, look up the area to the left of $z = 1.26$ in the table (0.8962).

    • To find $P(Z > 1.26)$, use the complement: $P(Z > 1.26) = 1 - P(Z < 1.26) = 1 - 0.8962 = 0.1038$.
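
A z-table lookup corresponds to evaluating the standard normal cdf. As a small Python sketch using the standard library's `statistics.NormalDist`:

```python
from statistics import NormalDist

Z = NormalDist()                 # standard normal, N(0, 1)

left_area = Z.cdf(1.26)          # P(Z < 1.26), the value a z-table lists
right_area = 1 - left_area       # P(Z > 1.26), by the complement rule

print(round(left_area, 4), round(right_area, 4))
```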


Percentiles and Their Calculation

  • Finding Percentiles:

    • To find the minimum height in the top 10% (e.g., for female heights with a mean of 64 inches and an sd of 2.5 inches), find the 90th percentile:

    • Determine the z-score whose cumulative probability is 0.90: $z = 1.28$

    • Height: $x = z\sigma + \mu = 1.28(2.5) + 64 = 67.2$ inches

  • Example for Hummingbirds:

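    • Given a weight distribution with a mean of 13 g and an SD of 3.4 g, find the weight that is less than that of 65% of all hummingbirds (the 35th percentile). Look up the z-score for a cumulative probability of 0.35 and substitute it into $x = z\sigma + \mu$:

    • $z = -0.39$ yields weight $x = -0.39(3.4) + 13 = 11.674$ g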
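
Percentiles reverse the table lookup: start from a cumulative probability, find $z$, then convert back with $x = z\sigma + \mu$. The Python sketch below uses `NormalDist.inv_cdf` from the standard library; the helper name `percentile` is an illustrative choice, and both examples are assumed to be normally distributed.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, N(0, 1)

def percentile(p, mu, sigma):
    """Value x with P(X <= x) = p for X ~ N(mu, sigma^2): x = z * sigma + mu."""
    z = Z.inv_cdf(p)
    return z * sigma + mu

height_90 = percentile(0.90, mu=64, sigma=2.5)   # cutoff for the top 10% of heights
weight_35 = percentile(0.35, mu=13, sigma=3.4)   # 35th percentile of weights

print(round(height_90, 2), round(weight_35, 2))
```

The results can differ from the table-based answers in the second decimal place because `inv_cdf` does not round $z$ to two decimals.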


The Empirical Rule

  • For normally distributed data:

    • Approximately 68% of values lie within one standard deviation of the mean.

    • Approximately 95% of values lie within two standard deviations.

    • Approximately 99.7% of values lie within three standard deviations.

  • Application to ITBS scores:

    • Questions about the proportion of scores within a given range can be answered using the Empirical Rule.
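
The three percentages in the Empirical Rule are rounded areas under the standard normal curve. A short Python sketch that recovers them, using the standard library's `statistics.NormalDist` and an illustrative helper `within`:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, N(0, 1)

def within(k):
    """P(-k < Z < k): share of a normal population within k standard deviations."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(within(k), 4))
```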


Binomial Distribution Assumptions

  • Parameterization: Binomial($n$, $p$), where $n$ is the number of independent trials and $p$ is the probability of success on each trial.

  • Example: If each egg in a dozen has a 5% chance of being broken, independently of the others, the number of broken eggs $X$ follows Binomial(12, 0.05), with pmf:

    • $P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$
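
The binomial pmf is straightforward to evaluate with `math.comb`. The Python sketch below assumes one reading of the egg example, namely 12 eggs that each break independently with probability 0.05; the function name `binom_pmf` is illustrative.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 12, 0.05   # assumed reading: 12 eggs, each broken with probability 0.05

p_none = binom_pmf(0, n, p)                             # no broken eggs
p_at_most_one = sum(binom_pmf(x, n, p) for x in (0, 1)) # zero or one broken egg
total = sum(binom_pmf(x, n, p) for x in range(n + 1))   # pmf sums to 1

print(p_none, p_at_most_one, total)
```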


Geometric and Poisson Distributions

  • Geometric Distribution: Models the number of trials up to and including the first success; every repeated attempt counts as a trial.

  • Poisson Distribution: Models the number of occurrences in a fixed time/space.

  • Typical Poisson examples involve rates (e.g., phone calls per hour) and counts (e.g., bacteria in a sample).
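
For reference, the standard pmfs of these two distributions are $P(X = k) = (1-p)^{k-1} p$ for the geometric (counting the trial on which the first success occurs) and $P(X = k) = e^{-\lambda} \lambda^k / k!$ for the Poisson with average count $\lambda$. The Python sketch below evaluates both; the numbers used are hypothetical and chosen only for illustration.

```python
from math import exp, factorial

def geom_pmf(k, p):
    """P(first success occurs on trial k), k = 1, 2, ..., success probability p."""
    return (1 - p) ** (k - 1) * p

def poisson_pmf(k, lam):
    """P(exactly k events in the interval) when the average count is lam."""
    return exp(-lam) * lam ** k / factorial(k)

# Hypothetical numbers, chosen only for illustration.
print(geom_pmf(3, 0.5))      # first head on the third toss of a fair coin
print(poisson_pmf(2, 4.0))   # exactly 2 calls in an hour when the average is 4 per hour
```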


Memoryless Property

  • A distribution is memoryless when the probability of waiting longer does not depend on how long one has already waited: $P(X > s + t \mid X > s) = P(X > t)$. Both the exponential and the geometric distributions have this property.
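
The memoryless property can be verified numerically from the tail probabilities $P(X > k) = (1-p)^k$ for the geometric and $P(T > t) = e^{-\text{rate} \cdot t}$ for the exponential. The Python sketch below uses hypothetical parameter values and illustrative function names.

```python
from math import exp, isclose

def geom_tail(k, p):
    """P(X > k) for a geometric r.v.: the first k trials all fail."""
    return (1 - p) ** k

def expo_tail(t, rate):
    """P(T > t) for an exponential r.v. with the given rate."""
    return exp(-rate * t)

p, rate = 0.3, 0.5   # hypothetical parameters
s, t = 4, 3          # already waited s; ask about waiting t more

# Conditional probability P(X > s + t | X > s) = P(X > s + t) / P(X > s).
geom_cond = geom_tail(s + t, p) / geom_tail(s, p)
expo_cond = expo_tail(s + t, rate) / expo_tail(s, rate)

print(isclose(geom_cond, geom_tail(t, p)), isclose(expo_cond, expo_tail(t, rate)))
```

In both cases the conditional tail equals the unconditional tail for $t$, so having already waited $s$ changes nothing.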


Student’s t Distribution

  • Definition: Characterized by degrees of freedom, $\nu$.

    • As $\nu$ decreases, it becomes more heavy-tailed; approaches normal distribution as $\nu \to \infty$.
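
The effect of $\nu$ on the tails can be seen by comparing densities at a point far from the center. The Python sketch below implements the standard t density, $f(x) = \frac{\Gamma((\nu+1)/2)}{\sqrt{\nu\pi}\,\Gamma(\nu/2)} \left(1 + \frac{x^2}{\nu}\right)^{-(\nu+1)/2}$, on the log scale to avoid overflow for large $\nu$; the function name `t_pdf` and the evaluation point $x = 3$ are illustrative choices.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist

def t_pdf(x, nu):
    """Density of Student's t with nu degrees of freedom, computed via logs."""
    log_norm = lgamma((nu + 1) / 2) - lgamma(nu / 2) - 0.5 * log(nu * pi)
    return exp(log_norm - (nu + 1) / 2 * log(1 + x * x / nu))

normal_at_3 = NormalDist().pdf(3)   # standard normal density at x = 3

# Density in the tail for a few degrees of freedom, then the normal value.
for nu in (2, 30, 1_000_000):
    print(nu, t_pdf(3, nu))
print("normal", normal_at_3)
```

Smaller $\nu$ puts more density at $x = 3$, and the value for very large $\nu$ is close to the normal density.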


Functions and R Commands

  • Practical use in R for generating random variables, calculating mean, probabilities, and densities of distributions. Examples include:

    • Generate normal random variables: rnorm(n = 10, mean = 0, sd = 1).

    • Find cumulative probabilities with pnorm and density values with dnorm.


Bivariate Distributions

  • Multi-variable distributions are described by joint probability mass and density functions, from which conditional probabilities can be calculated.

    • Covariance and correlation measure the relationship between random variables.
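
To make these quantities concrete, the Python sketch below works through a hypothetical joint pmf built from two fair coin tosses, with $X$ the number of heads and $Y$ an indicator that the first toss is heads. It uses the standard definitions $\mathrm{Cov}(X, Y) = E((X - \mu_X)(Y - \mu_Y))$ and $\mathrm{Corr}(X, Y) = \mathrm{Cov}(X, Y) / (\mathrm{SD}(X)\,\mathrm{SD}(Y))$; the example and all names are assumptions, not taken from the notes.

```python
from fractions import Fraction
from math import sqrt

# Hypothetical joint pmf for two fair coin tosses:
# X = number of heads, Y = 1 if the first toss is heads, else 0.
joint = {
    (0, 0): Fraction(1, 4),   # TT
    (1, 0): Fraction(1, 4),   # TH
    (1, 1): Fraction(1, 4),   # HT
    (2, 1): Fraction(1, 4),   # HH
}

def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mean_x, mean_y = expect(lambda x, y: x), expect(lambda x, y: y)
cov = expect(lambda x, y: (x - mean_x) * (y - mean_y))
var_x = expect(lambda x, y: (x - mean_x) ** 2)
var_y = expect(lambda x, y: (y - mean_y) ** 2)
corr = cov / sqrt(var_x * var_y)

# Conditional probability P(X = 2 | Y = 1) = P(X = 2, Y = 1) / P(Y = 1).
p_y1 = sum(p for (x, y), p in joint.items() if y == 1)
p_x2_given_y1 = joint[(2, 1)] / p_y1

print(cov, corr, p_x2_given_y1)
```

The positive covariance reflects that a head on the first toss raises the expected total number of heads.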


Bivariate Normal Distribution

  • Joint distributions of two continuous variables, characterized by specific means, variances, and correlations.