4.3 Random Variables

  • A random variable is a real-valued function defined on the sample space of a random phenomenon.
  • Informally, a random variable is a numerical outcome of a random phenomenon.
  • A statistic is also a random variable, because its value varies from sample to sample.
  • Random variables are conventionally denoted by uppercase letters such as X.

Discrete Random Variable

  • A discrete random variable takes a finite (or countably infinite) number of possible values.
  • The probability distribution of a discrete random variable lists the values it can take and the probability of each value:
    • Values of X: $x_1, x_2, \ldots, x_k$
    • Probabilities: $p_1, p_2, \ldots, p_k$

Probability Conditions

  • The following conditions must hold true for the probabilities assigned to a discrete random variable's outcomes:
    • 0 ≤ $p_i$ ≤ 1 for all i (the probability of each individual outcome cannot be negative or greater than 1).
    • The total probability across all possible outcomes sums to 1:
      $p_1 + p_2 + \cdots + p_k = 1$
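
The two conditions above can be checked mechanically. The following is a minimal sketch, not a standard routine: the function name `is_valid_distribution` and the tolerance `tol` are assumptions introduced here for illustration. The tolerance is there because floating-point sums rarely equal 1 exactly.

```python
import math


def is_valid_distribution(probs, tol=1e-9):
    """Check both conditions for a discrete probability distribution.

    `probs` is a list of the probabilities p_1, ..., p_k.
    `tol` is an illustrative tolerance for floating-point rounding.
    """
    # Condition 1: every probability lies between 0 and 1.
    each_in_range = all(0.0 <= p <= 1.0 for p in probs)
    # Condition 2: the probabilities sum to 1 (up to rounding error).
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return each_in_range and sums_to_one


print(is_valid_distribution([0.1, 0.2, 0.3, 0.4]))  # both conditions hold
print(is_valid_distribution([0.5, 0.6]))            # sum exceeds 1
print(is_valid_distribution([1.2, -0.2]))           # sums to 1 but values out of range
```

The last call shows why both conditions are needed: the values sum to 1, yet neither is a legitimate probability.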

Example: Coin Tosses

  • Consider performing 4 independent tosses of a fair coin.
  • Let the random variable X represent the number of heads observed in the tosses.
  • The 16 equally likely outcomes of tossing a fair coin 4 times are:
    • HTTH
    • HTHT
    • HTTT
    • THTH
    • HHHT
    • THTT
    • HHTT
    • HHTH
    • TTHT
    • THHT
    • HTHH
    • TTTT
    • HHHH
    • TTTH
    • TTHH
    • THHH
  • The values of X corresponding to the number of heads are:
    • X = 0 (0 heads)
    • X = 1 (1 head)
    • X = 2 (2 heads)
    • X = 3 (3 heads)
    • X = 4 (4 heads)
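
To turn the list of outcomes into a probability distribution for X, one can count how many outcomes give each number of heads and divide by the total. The sketch below assumes that all outcomes of the fair coin are equally likely; the variable names (`outcomes`, `counts`, `distribution`) are chosen here for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate every sequence of 4 tosses, e.g. "HTTH".
outcomes = ["".join(tosses) for tosses in product("HT", repeat=4)]

# X is the number of heads in each sequence; count how often each value occurs.
counts = Counter(outcome.count("H") for outcome in outcomes)

# Each outcome is equally likely, so P(X = x) = (number of outcomes with x heads) / (total outcomes).
distribution = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

for x, p in distribution.items():
    print(f"P(X = {x}) = {p}")
```

Under the equal-likelihood assumption this gives probabilities 1/16, 4/16, 6/16, 4/16, and 1/16 for X = 0 through 4 (printed in lowest terms), which sum to 1.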

Continuous Random Variable

  • A continuous random variable can take any value in an interval of real numbers.
  • The probability distribution for a continuous random variable is described by a density curve.
  • The probability of an event for a continuous random variable X is the area under the density curve over the values of X that make up the event.
    • Represented as:
      $P(A) = \text{area under the density curve for values of } X \text{ in event } A$

Example: Normal Distribution

  • A first example of a continuous probability distribution is the normal distribution, which satisfies the 68-95-99.7 rule:
    • Approximately 68% of the data falls within 1 standard deviation of the mean.
    • About 95% of the data falls within 2 standard deviations of the mean.
    • Roughly 99.7% of the data falls within 3 standard deviations of the mean.
  • These characteristics are fundamental to understanding the behavior of data in normal distributions.
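
The three percentages can be reproduced numerically. The sketch below relies on the identity $P(|Z| \le k) = \operatorname{erf}(k/\sqrt{2})$ for a standard normal variable Z, which is not stated in these notes and is brought in here as an assumption; the helper name `prob_within` is likewise hypothetical.

```python
import math


def prob_within(k):
    """Probability that a normal variable falls within k standard deviations of its mean.

    Uses P(|Z| <= k) = erf(k / sqrt(2)) for a standard normal Z.
    """
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k):.4f}")
```

The computed values round to the 68%, 95%, and 99.7% figures quoted above, which is why those figures are described as approximate.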

Example: Uniform Distribution

  • Another example of a continuous probability distribution is the uniform distribution on the interval [0, 1], denoted U[0,1].
  • Because the density curve has height 1 across the interval, the area under the curve over any subinterval equals the subinterval's length, and that area is the probability of the corresponding event:
    • $P(0.3 \le X \le 0.7) = 0.4$
    • $P(X \le 0.5) = 0.5$
    • $P(X > 0.8) = 0.2$
    • $P(X \le 0.5 \text{ or } X > 0.8) = 0.5 + 0.2 = 0.7$, since the two events are disjoint.
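
For U[0,1], the area under the density curve over an interval is just the interval's length, so these probabilities are easy to compute directly. The following is a minimal sketch; the helper names `uniform_cdf` and `prob_interval` are introduced here for illustration. The second event is a union of two disjoint intervals, so its probability is the sum of the two areas.

```python
def uniform_cdf(x):
    """P(X <= x) for X uniform on [0, 1]: the area under the height-1 density up to x."""
    return min(max(x, 0.0), 1.0)


def prob_interval(a, b):
    """P(a <= X <= b) for X uniform on [0, 1], computed as a difference of areas."""
    return uniform_cdf(b) - uniform_cdf(a)


# P(0.3 <= X <= 0.7): a single interval.
p_middle = prob_interval(0.3, 0.7)

# P(X <= 0.5 or X > 0.8): two disjoint intervals, so the areas add.
p_union = prob_interval(0.0, 0.5) + prob_interval(0.8, 1.0)

print(f"P(0.3 <= X <= 0.7) = {p_middle:.2f}")
print(f"P(X <= 0.5 or X > 0.8) = {p_union:.2f}")
```

For a continuous random variable a single point has probability 0, so using ≤ or < at an endpoint does not change either result.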