Engineering Statistics - Joint Distributions

5.1. Motivation and Notation

  • Motivation

    • Previously, distributions of individual discrete and continuous random variables were studied.
    • It is crucial to understand the statistical relationships between multiple variables (e.g., dependence or independence of events).
    • Key calculations involve means, variances, and functions of multiple random variables (e.g., XY, X+Y).
  • Notation

    • Random events are written without curly braces.
    • For example, events are written as A_X = X=a and A_Y = Y=b.
    • Intersection is denoted with a comma instead of \cap (e.g., A_{XY} = X=a, Y=b).

5.2. Joint and Marginal CDF of Multivariate Random Variables

  • Joint CDF

    • Let Z be an n-dimensional random variable: Z = (Z_1, Z_2, …, Z_n).
    • For two variables, Z = (X, Y), the joint cumulative distribution function (CDF) is F_Z(z) = F_{X,Y}(x,y) = P(X \leq x, Y \leq y).
    • Geometrically, it is the probability that the random point Z falls in the quadrant below and to the left of (x, y), i.e., in (-\infty, x] \times (-\infty, y].
  • Properties of the Joint CDF

    • 0 \leq F_{X,Y}(x,y) \leq 1.
    • Limiting behavior: as one variable tends to +\infty, the joint CDF converges to the marginal CDF of the other variable:
    • \lim_{y \rightarrow +\infty} F_{X,Y}(x,y) = F_X(x) = P(X \leq x) and \lim_{x \rightarrow +\infty} F_{X,Y}(x,y) = F_Y(y) = P(Y \leq y).
  • Example Calculation of Joint Probability

    • Finding P(a < X < b, c < Y < d): Use disjoint events to apply the summation rule.
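
    As a worked sketch of this calculation, the event {X \leq b, Y \leq d} can be split into three disjoint pieces and the result solved for the rectangle. The formula is stated here for half-open intervals; for continuous X and Y it gives the same value as the strict inequalities above.

```latex
\begin{aligned}
P(X \leq b,\, Y \leq d)
  &= P(a < X \leq b,\, c < Y \leq d) + P(X \leq a,\, Y \leq d) + P(a < X \leq b,\, Y \leq c), \\
P(a < X \leq b,\, Y \leq c)
  &= F_{X,Y}(b,c) - F_{X,Y}(a,c), \\
P(a < X \leq b,\, c < Y \leq d)
  &= F_{X,Y}(b,d) - F_{X,Y}(a,d) - F_{X,Y}(b,c) + F_{X,Y}(a,c).
\end{aligned}
```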

5.3. Joint and Marginal PDFs of Multivariate Random Variables

  • Prerequisites: Understanding of double integrals is essential.

  • Joint PDF

    • For continuous X and Y, the joint probability density function (PDF) f_{X,Y}(x,y) gives the probability that (X,Y) falls in a two-dimensional region R as the integral of f_{X,Y} over R.
    • For a rectangular region, this is the double integral:
      P(a < X < b, c < Y < d) = \int_a^b \int_c^d f_{X,Y}(x,y) \, dy \, dx.
    • In general, the integration is performed over the region of interest in the xy-plane.
  • Normalization Condition

    • Any joint PDF must satisfy: \int_{\mathbb{R}} \int_{\mathbb{R}} f_{X,Y}(x,y) \, dy \, dx = 1.
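
    The rectangle probability and the normalization condition can be checked numerically. The sketch below is only an illustration: the density f(x,y) = x + y on the unit square is a hypothetical example (not from these notes), and the helper name rectangle_probability and the midpoint Riemann sum are assumptions chosen to keep the code dependency-free.

```python
def joint_pdf(x, y):
    """Hypothetical example density: f(x, y) = x + y on the unit square, 0 elsewhere."""
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return x + y
    return 0.0


def rectangle_probability(pdf, a, b, c, d, steps=200):
    """Approximate P(a < X < b, c < Y < d) with a midpoint Riemann sum."""
    dx = (b - a) / steps
    dy = (d - c) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * dx          # midpoint of the i-th x-cell
        for j in range(steps):
            y = c + (j + 0.5) * dy      # midpoint of the j-th y-cell
            total += pdf(x, y)
    return total * dx * dy


# Normalization: integrating over the whole support should give 1.
total_mass = rectangle_probability(joint_pdf, 0.0, 1.0, 0.0, 1.0)

# Probability of the lower-left quarter of the unit square.
quarter = rectangle_probability(joint_pdf, 0.0, 0.5, 0.0, 0.5)

print(f"total probability: {total_mass:.4f}")
print(f"P(0 < X < 0.5, 0 < Y < 0.5): {quarter:.4f}")
```

    For this example density the exact values are 1 and 1/8, so the numerical results should agree with those up to rounding.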

5.4. Conditional CDF and PDFs

  • Conditional CDF

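    • For a discrete Y with P(Y=y) > 0, the conditional cumulative distribution function is defined as: F_{X|Y}(x|y) = P(X \leq x | Y=y) = \frac{P(X \leq x, Y=y)}{P(Y=y)}.
    • For a continuous Y, P(Y=y) = 0, so the conditional CDF is obtained by integrating the conditional PDF instead.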
  • Conditional PDFs

    • For continuous variables, wherever f_Y(y) > 0,
      f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_Y(y)}.
    • If X and Y are independent, the conditional PDF equals the marginal PDF: f_{X|Y}(x|y) = f_X(x).
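
    A short worked example may help with the conditional PDF formula. The density f_{X,Y}(x,y) = x + y on 0 \leq x, y \leq 1 is a hypothetical choice made only for illustration; it is not taken from these notes.

```latex
\begin{aligned}
f_Y(y) &= \int_0^1 (x + y) \, dx = y + \tfrac{1}{2}, \qquad 0 \leq y \leq 1, \\
f_{X|Y}(x|y) &= \frac{f_{X,Y}(x,y)}{f_Y(y)} = \frac{x + y}{y + \tfrac{1}{2}}, \qquad 0 \leq x \leq 1, \\
\int_0^1 f_{X|Y}(x|y) \, dx &= \frac{\tfrac{1}{2} + y}{y + \tfrac{1}{2}} = 1.
\end{aligned}
```

    Here the conditional PDF depends on y, whereas the marginal is f_X(x) = x + 1/2, so X and Y are not independent in this example.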

5.5. Mean, Covariance, and Correlation Coefficient

  • Mean of a Random Vector

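    • The expected value of a function g(X,Y) of a random vector is E[g(X,Y)] = \int_{\mathbb{R}} \int_{\mathbb{R}} g(x,y) f_{X,Y}(x,y) \, dy \, dx for continuous variables, or the corresponding double sum over the joint probabilities for discrete variables.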
  • Covariance

    • Defined as:
      Cov(X,Y) = E[XY] - E[X]E[Y].
    • Properties: Cov(X,Y) = 0 if X and Y are independent; the converse does not hold in general.
  • Pearson’s Correlation Coefficient

    • Defined as:
      r_{X,Y} = \frac{Cov(X,Y)}{\sigma_X \sigma_Y}.
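
    The covariance and correlation formulas can be evaluated directly for a discrete pair. In the sketch below the joint probability table and the helper name expectation are hypothetical, introduced only for illustration.

```python
import math

# Hypothetical joint PMF P(X=x, Y=y), chosen only for illustration.
joint_pmf = {
    (0, 0): 0.4,
    (0, 1): 0.1,
    (1, 0): 0.1,
    (1, 1): 0.4,
}


def expectation(pmf, g):
    """E[g(X, Y)] as a double sum over the joint PMF."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())


mean_x = expectation(joint_pmf, lambda x, y: x)
mean_y = expectation(joint_pmf, lambda x, y: y)
mean_xy = expectation(joint_pmf, lambda x, y: x * y)

# Cov(X,Y) = E[XY] - E[X]E[Y]
cov_xy = mean_xy - mean_x * mean_y

# Standard deviations from Var(X) = E[X^2] - E[X]^2
sigma_x = math.sqrt(expectation(joint_pmf, lambda x, y: x**2) - mean_x**2)
sigma_y = math.sqrt(expectation(joint_pmf, lambda x, y: y**2) - mean_y**2)

# Pearson's correlation coefficient
r_xy = cov_xy / (sigma_x * sigma_y)

print(f"Cov(X,Y) = {cov_xy:.3f}, r = {r_xy:.3f}")
```

    For this table, E[X] = E[Y] = 0.5 and E[XY] = 0.4, so the covariance is 0.15 and the correlation coefficient is 0.6, a moderate positive dependence.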

5.6. Random Samples and Statistics

  • Definition of Random Samples
    • A random sample consists of independent and identically distributed (i.i.d.) random variables.
  • Statistics
    • A statistic is any function of the sample values (e.g., the sample mean or the total volume).

5.7. Central Limit Theorem (CLT)

  • CLT Statement
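    • Given independent random variables with the same distribution, with finite mean \mu and finite variance \sigma^2, the normalized sample mean (\bar{X} - \mu) / (\sigma / \sqrt{N}) approaches a standard normal distribution as the sample size N increases.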
  • Rule of Thumb
    • As a rule of thumb, for N \geq 30 the sample mean is approximately normal for many underlying distributions; heavily skewed or heavy-tailed distributions may need a larger N.
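
    The CLT can be illustrated with a small simulation. This is a sketch under stated assumptions, not a proof: the exponential distribution, the seed, and the number of trials are arbitrary choices made here for illustration.

```python
import random
import statistics

random.seed(0)    # fixed seed so the sketch is reproducible

N = 30            # sample size (the rule-of-thumb threshold)
TRIALS = 20000    # number of independent random samples drawn

# Exponential(rate=1) has mean 1 and standard deviation 1 and is clearly non-normal.
TRUE_MEAN = 1.0
TRUE_STD = 1.0

sample_means = []
for _ in range(TRIALS):
    sample = [random.expovariate(1.0) for _ in range(N)]
    sample_means.append(sum(sample) / N)

mean_of_means = sum(sample_means) / TRIALS
std_of_means = statistics.pstdev(sample_means)

# The CLT predicts a standard error of sigma / sqrt(N) for the sample mean.
std_error = TRUE_STD / N ** 0.5

# Under a normal approximation, roughly 95% of sample means should lie
# within 1.96 standard errors of the true mean.
coverage = sum(abs(m - TRUE_MEAN) <= 1.96 * std_error for m in sample_means) / TRIALS

print(f"mean of sample means: {mean_of_means:.4f} (theory: {TRUE_MEAN})")
print(f"std of sample means:  {std_of_means:.4f} (theory: {std_error:.4f})")
print(f"fraction within 1.96 standard errors: {coverage:.3f}")
```

    Repeating the experiment with a smaller N, or with a more heavily skewed distribution, is a simple way to see where the normal approximation starts to degrade.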