Comprehensive Notes on Discrete Random Variables and Probability Distributions

Overview of Random Variables and Probability Distributions

  • The study of probability is divided into discrete and continuous distributions, primarily covered in chapters 5 and 6 respectively.
  • Chapter 5: Discrete Probability Distributions introduces the concept of random variables, the classification into discrete and continuous, and specific discrete distributions including binomial, hypergeometric, and Poisson.
  • Chapter 6: The Normal Probability Distribution introduces probability distributions for continuous random variables, focusing on the exponential and normal distributions. The normal distribution is cited as the most important probability distribution used in statistics.

Concept and Definition of a Random Variable

  • Results of random experiments are represented by outcomes in a sample space $S$. These outcomes are not necessarily numbers (they may be letters or words).
  • It is convenient to represent these outcomes as numbers by associating each outcome from the sample space with a numerical value according to a clearly defined rule.
  • Formal Definition: When each outcome from the sample space is associated with a unique number according to a clearly defined rule, this rule is considered a function defined on the sample space. This function is called a random variable.
  • Variables in Context:
    • In algebra, variables are denoted by lowercase letters ($a, b, c, x, y, z$).
    • Random variables are denoted by uppercase letters (most often $X, Y, Z$).
    • Particular values assumed by random variables are denoted by lowercase letters ($x, y, z$).
    • The statement that a random variable $X$ takes a value $x$ is written as $X = x$.
  • The Range: The set of all possible numbers that a random variable can take is called the range of the random variable, denoted by $R$. When specified for $X$, it is written $R_X$.

Examples of Mapping Outcomes to Random Variables

  • Example 1: Tossing a Coin Three Times ($n = 3$)
    • Sample Space: $S = \{ttt, htt, tht, tth, hht, hth, thh, hhh\}$.
    • Variable $X$ (Heads on the first toss):
      • For outcomes $ttt, tht, tth, thh$: $X = 0$ (first toss is tails).
      • For outcomes $htt, hht, hth, hhh$: $X = 1$ (first toss is heads).
      • Range of $X$: $R_X = \{0, 1\}$.
    • Variable $Y$ (Number of heads in three tosses):
      • $ttt \rightarrow 0$
      • $htt, tht, tth \rightarrow 1$
      • $hht, hth, thh \rightarrow 2$
      • $hhh \rightarrow 3$
      • Range of $Y$: $R_Y = \{0, 1, 2, 3\}$.
  • Example 2: Tossing a Pair of Dice
    • Sample Space $S$ consists of 36 ordered pairs of numbers.
    • Variable $Z$ (Sum of numbers):
      • Outcome $(1, 1) \rightarrow 2$
      • Outcomes $(2, 1), (1, 2) \rightarrow 3$
      • …up to $(6, 6) \rightarrow 12$
      • Range of $Z$: $R_Z = \{2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12\}$.
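
The mappings in both examples can be checked by enumerating the sample spaces directly. The following is a minimal Python sketch; the function and variable names are illustrative choices, not part of the notes.

```python
from itertools import product

# Sample space for three coin tosses, written as strings such as "hth".
sample_space = ["".join(t) for t in product("ht", repeat=3)]

def first_toss_heads(outcome):
    """X: 1 if the first toss is heads, otherwise 0."""
    return 1 if outcome[0] == "h" else 0

def number_of_heads(outcome):
    """Y: number of heads in the three tosses."""
    return outcome.count("h")

# The range of a random variable is the set of values it can take.
range_x = sorted({first_toss_heads(o) for o in sample_space})
range_y = sorted({number_of_heads(o) for o in sample_space})

# Sample space for a pair of dice, and the range of the sum Z.
dice_space = list(product(range(1, 7), repeat=2))
range_z = sorted({a + b for a, b in dice_space})

print(range_x)                   # [0, 1]
print(range_y)                   # [0, 1, 2, 3]
print(len(dice_space), range_z)  # 36 outcomes, sums 2 through 12
```

Each random variable here is literally a function from outcomes to numbers, which matches the formal definition given earlier.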

Classification of Random Variables

  • Discrete Random Variable: A random variable that assumes a finite number of possible values or a countably infinite number of values. The values can be listed one by one even if the list never ends, such as the number of coin tosses needed until the first head appears.
    • Examples: Number of customers at a bank in an hour, number of defective light bulbs in a box of 100.
  • Continuous Random Variable: A random variable that can take any real value within an interval, so its possible values form a continuum (uncountably many values).
    • There are no gaps between the values.
    • Intervals can be finite (e.g., $[0, 1]$), semi-infinite, or infinite.
    • Examples: Height or weight of a person, length of an object, time required to perform a task. Values are typically obtained via measurement and can theoretically be represented with infinite decimal accuracy.

Discrete Probability Distributions

  • Definition: A discrete probability distribution consists of the list of all possible values of a discrete random variable together with their corresponding probabilities.
  • Probability Mass Function (PMF): The function $f(x)$ or $f_X(x)$ represents the probability that the random variable $X$ is equal to the value $x$.
    • Notation: $f(x) = P(X = x)$.
  • Properties of PMF:
    1. Every probability value must be between 0 and 1 inclusive: $0 \le f(x_i) \le 1$.
    2. The probabilities of all possible values must sum to 1: $\sum_{i=1}^{n} f(x_i) = 1$.
  • Interval Statements for Discrete Variables:
    • Inequalities can describe intervals, e.g., $P(X > 2)$, $P(X \le 2)$, or double inequalities like $P(5 \le X \le 10)$.
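
As an illustration of the two PMF properties and of interval statements, here is a small Python sketch. It assumes a fair coin, so the PMF of the number of heads in three tosses is 1/8, 3/8, 3/8, 1/8; the helper names `is_valid_pmf` and `prob` are hypothetical.

```python
from fractions import Fraction

# Assumed PMF: number of heads in three tosses of a fair coin.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def is_valid_pmf(pmf):
    """Check both PMF properties: each value in [0, 1] and total equal to 1."""
    in_unit_interval = all(0 <= p <= 1 for p in pmf.values())
    sums_to_one = sum(pmf.values()) == 1
    return in_unit_interval and sums_to_one

def prob(pmf, condition):
    """Probability of an event: add f(x) over every x satisfying the condition."""
    return sum(p for x, p in pmf.items() if condition(x))

print(is_valid_pmf(pmf))                 # True
print(prob(pmf, lambda x: x > 2))        # 1/8
print(prob(pmf, lambda x: x <= 2))       # 7/8
print(prob(pmf, lambda x: 1 <= x <= 2))  # 3/4
```

Exact fractions are used so that the sum-to-one check is not affected by floating-point rounding.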

Representation of Discrete Probability Distributions

  • Tabular Form: Outcomes listed in one column/row (usually in ascending order) with their corresponding probabilities $f(x)$ in the next.
  • Graphical Form: Values represented on the horizontal $X$-axis. Corresponding probabilities are shown as vertical segments or bars of unit width with height equal to $f(x_i)$.
  • Formula Form: A mathematical rule used to find $f(x)$ for each value in the range.

Numerical Descriptive Measures: Mean and Variance

  • Mean (Expected Value): A measure of the center of the distribution.
    • Notation: $\mu_X$ or $E(X)$.
    • Formula: $\mu = E(X) = \sum_{\text{all } x} x f(x)$.
  • Variance: A measure of variability or spread.
    • Notation: $\sigma_X^2$, $V(X)$, or $\text{Var}(X)$.
    • Definition Formula: $\sigma^2 = V(X) = \sum_{\text{all } x} (x - \mu)^2 f(x)$.
    • Calculation Formula: $\sigma^2 = \sum_{\text{all } x} x^2 f(x) - \mu^2$.
  • Standard Deviation: The positive square root of the variance.
    • Notation: $\sigma_X = \sqrt{V(X)}$.
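
The formulas for the mean, variance, and standard deviation can be applied to any tabulated PMF. The sketch below again assumes the fair-coin PMF for the number of heads in three tosses, and shows that the definition formula and the calculation formula give the same variance.

```python
from fractions import Fraction
from math import sqrt

# Assumed PMF: number of heads in three tosses of a fair coin.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

# Mean: sum of x * f(x) over all x.
mean = sum(x * p for x, p in pmf.items())

# Variance by the definition formula: sum of (x - mu)^2 * f(x).
var_definition = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Variance by the calculation formula: sum of x^2 * f(x), minus mu^2.
var_shortcut = sum(x ** 2 * p for x, p in pmf.items()) - mean ** 2

# Standard deviation: positive square root of the variance.
std_dev = sqrt(var_shortcut)

print(mean)            # 3/2
print(var_definition)  # 3/4
print(var_shortcut)    # 3/4
print(std_dev)
```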

Case Study: Expected Gain and Insurance

  • Lottery Gain Example: 8000 tickets are sold at \$10 each, and the prize is a \$24,000 car. If you buy two tickets, your net gain $X$ takes one of two values:
    • Value 1 (Loss): $-\$20 + 0 = -\$20$
    • Value 2 (Win): $-\$20 + \$24,000 = \$23,980$
    • $P(X = 23,980) = \frac{2}{8000}$; $P(X = -20) = \frac{7998}{8000}$.
    • $E(X) = (-20) \cdot \frac{7998}{8000} + (23,980) \cdot \frac{2}{8000} = -\$14$ (expected loss of \$14).
  • Insurance Policy Pricing:
    • Policy: \$100,000, paid out with probability $0.02$ (the cancellation probability). Let $C$ be the premium charged and $X$ the company's gain.
    • If expected gain for company is \$0:
      • $E(X) = C(0.98) + (C - 100,000)(0.02) = C - 2000 = 0 \rightarrow C = \$2000$.
    • If expected gain for company is \$500:
      • $E(X) = C - 2000 = 500 \rightarrow C = \$2500$.
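
Both calculations in this case study are expected values of two-point distributions, which the following Python sketch reproduces. The function names are hypothetical, and the premium formula is simply the equation $E(X) = C - (\text{payout}) \cdot p$ solved for $C$.

```python
from fractions import Fraction

def expected_value(pmf):
    """E(X): sum of x * f(x) over all values x."""
    return sum(x * p for x, p in pmf.items())

# Lottery: two tickets at $10 each, 8000 tickets sold, prize worth $24,000.
cost = 2 * 10
lottery_pmf = {
    -cost: Fraction(7998, 8000),         # neither ticket wins
    24_000 - cost: Fraction(2, 8000),    # one of the two tickets wins
}
lottery_gain = expected_value(lottery_pmf)

def premium_for_target(payout, p_payout, target_gain):
    """Premium C such that the company's expected gain equals target_gain."""
    return target_gain + payout * p_payout

break_even = premium_for_target(100_000, Fraction(2, 100), 0)
with_margin = premium_for_target(100_000, Fraction(2, 100), 500)

print(lottery_gain)  # -14
print(break_even)    # 2000
print(with_margin)   # 2500
```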

Discrete Uniform Distribution

  • Occurs when each of $n$ possible values has the same probability of occurring.
  • Probability Mass Function: $f(x_i) = \frac{1}{n}$ for $i = 1, 2, \ldots, n$.
  • Example: Rolling a six-sided balanced die. Range is $\{1, 2, 3, 4, 5, 6\}$, $n = 6$, and $f(x) = \frac{1}{6}$.
  • Descriptive Measures for Die Roll:
    • $\mu = (1+2+3+4+5+6) \cdot \frac{1}{6} = 3.5$.
    • $\sigma^2 = \sum x^2 f(x) - \mu^2 = \frac{91}{6} - 3.5^2 \approx 2.917$, so the standard deviation is $\sigma \approx 1.708$.
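
The descriptive measures for the die roll can be verified numerically. This is a short sketch using exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from math import sqrt

# Discrete uniform PMF for a balanced six-sided die: f(x) = 1/6 for each face.
faces = range(1, 7)
f = Fraction(1, len(faces))

mean = sum(x * f for x in faces)
variance = sum(x ** 2 * f for x in faces) - mean ** 2
std_dev = sqrt(variance)

print(mean)      # 7/2
print(variance)  # 35/12, about 2.917
print(std_dev)   # about 1.708
```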

Bernoulli and Binomial Distributions

  • Bernoulli Experiment: A random experiment with a single trial resulting in one of two outcomes: Success ($S$) or Failure ($F$).
    • $P(S) = p$, $P(F) = 1 - p = q$.
    • Bernoulli Random Variable $X$: $X = 1$ for success, $X = 0$ for failure.
    • $\mu = p$; $\sigma^2 = p(1 - p)$.
  • Binomial Experiment: A sequence of $n$ independent Bernoulli trials.
    • Trials are identical, outcomes are binary (S/F), and $p$ remains constant across trials.
    • Binomial Random Variable $X$: The number of successes in $n$ independent trials.
    • $X$ takes values from $0$ to $n$.
    • Probability Mass Function: $P(X = x) = {}_{n}C_{x} \cdot p^x (1 - p)^{n - x}$.
    • Binomial Coefficient: ${}_{n}C_{x} = \frac{n!}{x!(n - x)!}$.
  • Symmetry of Binomial Graph:
    • If $p = 0.5$, the distribution is symmetric.
    • If $p < 0.5$, it is skewed to the right.
    • If $p > 0.5$, it is skewed to the left.
  • Descriptive Measures:
    • Mean: $\mu = np$.
    • Variance: $\sigma^2 = np(1 - p)$.
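
The binomial PMF and its descriptive measures can be checked with a few lines of Python. The values $n = 10$ and $p = 0.3$ below are assumed purely for illustration, and `binomial_pmf` is a hypothetical helper built on the standard library function `math.comb`.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Assumed example parameters.
n, p = 10, 0.3
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(probs)
mean = sum(x * f for x, f in enumerate(probs))
variance = sum(x ** 2 * f for x, f in enumerate(probs)) - mean ** 2

print(round(binomial_pmf(3, n, p), 4))  # 0.2668
print(round(total, 6))                  # 1.0
print(round(mean, 6), n * p)            # both equal np = 3
print(round(variance, 6))               # np(1 - p) = 2.1
```

Because $p < 0.5$ in this example, the probabilities are concentrated on small values of $x$ with a longer tail to the right, consistent with the skewness rule above.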

Hypergeometric Probability Distribution

  • Used when sampling without replacement from a finite population of size $N$, where the conditions for the binomial distribution (independence and constant $p$) are not met.
  • The population contains $M$ successes and $N - M$ failures.
  • Random variable $X$: Number of successes in a sample of size $n$.
  • Probability Mass Function: $P(X = x) = \frac{{}_{M}C_{x} \cdot {}_{N-M}C_{n-x}}{{}_{N}C_{n}}$.
  • Mean: $\mu = n \cdot \frac{M}{N}$.
  • Variance: $\sigma^2 = n \cdot \frac{M}{N} \cdot \left(1 - \frac{M}{N}\right) \cdot \frac{N - n}{N - 1}$.
    • The term $\frac{N - n}{N - 1}$ is called the finite population correction factor.
  • Comparison to Binomial:
    • If $n / N \le 0.05$ and $n / (N - M) \le 0.05$, the binomial distribution can approximate sampling without replacement. If these conditions are not satisfied, the hypergeometric distribution must be used.
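
To see the hypergeometric formulas at work, the sketch below computes the PMF exactly and compares the resulting mean and variance with the closed-form expressions. The population values $N = 20$, $M = 5$, $n = 4$ are assumed for illustration only, and `hypergeom_pmf` is a hypothetical helper.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, N, M, n):
    """P(X = x): x successes in a sample of n drawn without replacement
    from a population of N items containing M successes."""
    return Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n))

# Assumed example: population of 20 with 5 successes, sample of 4.
N, M, n = 20, 5, 4
support = range(max(0, n - (N - M)), min(n, M) + 1)
pmf = {x: hypergeom_pmf(x, N, M, n) for x in support}

total = sum(pmf.values())
mean = sum(x * f for x, f in pmf.items())
variance = sum(x ** 2 * f for x, f in pmf.items()) - mean ** 2

# Closed-form mean and variance, including the correction factor (N - n)/(N - 1).
ratio = Fraction(M, N)
mean_formula = n * ratio
variance_formula = n * ratio * (1 - ratio) * Fraction(N - n, N - 1)

print(total)                       # 1
print(mean, mean_formula)          # 1 1
print(variance, variance_formula)  # 12/19 12/19
```

Here $n / N = 0.2$, well above $0.05$, so the binomial approximation would not be appropriate for this example.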

Poisson Probability Distribution

  • Used for experiments observing the number of occurrences of a specified event over a time interval or within a region of space.
  • Assumptions:
    1. Events occur randomly.
    2. Events occur independently.
    3. Events occur at a steady average rate ($\lambda$ or $\mu$).
  • The number of events $X$ is a random variable that follows the Poisson distribution.
  • Probability Mass Function: $P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}$ for $x = 0, 1, 2, \ldots$
    • $e \approx 2.71828$.
  • Parameters:
    • Mean: $\mu = \lambda$.
    • Variance: $\sigma^2 = \lambda$.
  • Relationship to Interval Size: The average rate $\lambda$ is proportional to the length of the time interval or size of the space region. For example, if the rate is 15 events per hour, the mean for a 15-minute interval is $\lambda = 15 \cdot \frac{15}{60} = 3.75$.
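
The Poisson PMF and the rescaling of the rate can be sketched in Python as follows. The helper names are hypothetical, and the sums are truncated at $x = 59$, which is an assumption that is harmless here because the remaining tail probability is negligible for $\lambda = 3.75$.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam ** x / factorial(x)

def scale_rate(rate_per_hour, minutes):
    """Mean number of events in an interval of the given length in minutes."""
    return rate_per_hour * minutes / 60

# 15 events per hour observed over a 15-minute interval.
lam = scale_rate(15, 15)

# Truncated sums over x = 0..59 (tail beyond this is negligible for lam = 3.75).
probs = [poisson_pmf(x, lam) for x in range(60)]
total = sum(probs)
mean = sum(x * f for x, f in enumerate(probs))
variance = sum(x ** 2 * f for x, f in enumerate(probs)) - mean ** 2

print(lam)                 # 3.75
print(round(total, 6))     # 1.0
print(round(mean, 6))      # 3.75
print(round(variance, 6))  # 3.75
```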

Poisson Approximation to the Binomial

  • The Poisson distribution provides a good approximation to the binomial distribution under specific conditions:
    1. The number of trials $n$ is large ($n \ge 100$).
    2. The probability of success $p$ is small ($p \le 0.01$).
    3. The mean satisfies $np < 5$.
  • In this case, $\lambda = np$.
  • Example: Bad reaction to a serum with $p = 0.001$ and $n = 2000$. Here $\lambda = 2000 \times 0.001 = 2$. Probability of exactly 3 reactions: $P(X = 3) = \frac{e^{-2} \cdot 2^3}{3!} \approx 0.1804$.
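
The quality of the approximation in the serum example can be checked by computing the exact binomial probability alongside the Poisson value. This is a minimal sketch with hypothetical helper names.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """Exact binomial probability of x successes in n trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """Poisson probability of x events with mean lam."""
    return exp(-lam) * lam ** x / factorial(x)

# Serum example: n = 2000 trials, p = 0.001, so lambda = np = 2.
n, p = 2000, 0.001
lam = n * p

exact = binomial_pmf(3, n, p)
approx = poisson_pmf(3, lam)

print(round(approx, 4))        # 0.1804
print(round(exact, 4))         # 0.1805
print(abs(exact - approx))     # difference is below 0.001
```

The two values agree to about three decimal places, which is what the conditions on $n$, $p$, and $np$ are meant to guarantee.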