Sampling Distribution Notes
Sampling Distribution
Sampling distribution is the probability distribution of a sample statistic $t = t(x_1, x_2, \ldots, x_n)$, where $x_1, x_2, \ldots, x_n$ constitute a sample.
Suppose a sample of size $n$ is drawn from a finite population of size $N$.
The total number of possible samples (drawn without replacement) is $\binom{N}{n} = k$ (say).
For each sample, we can compute some statistic $t$, such as the sample mean $\bar{x}$ or variance $s^2$.
Example Table
| Sample Number | Statistic $t$ | Statistic $\bar{x}$ | Statistic $s^2$ |
|---|---|---|---|
| 1 | $t_1$ | $\bar{x}_1$ | $s_1^2$ |
| 2 | $t_2$ | $\bar{x}_2$ | $s_2^2$ |
| 3 | $t_3$ | $\bar{x}_3$ | $s_3^2$ |
| … | … | … | … |
| $k$ | $t_k$ | $\bar{x}_k$ | $s_k^2$ |
If these values of the statistic are arranged in a frequency table, we obtain the sampling distribution of the statistic.
The mean and variance of the sampling distribution are denoted by $E(t)$ and $\operatorname{Var}(t)$, respectively.
The distribution of sample means, sample variances, or any function of sample statistics is known as a sampling distribution.
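The idea above can be made concrete by brute force. The following sketch enumerates every possible sample of size $n = 2$ from a toy population of $N = 5$ values (the numbers are hypothetical, chosen only for illustration) and tabulates the sampling distribution of the mean:

```python
from collections import Counter
from itertools import combinations
from statistics import mean

# Hypothetical population of N = 5 values
population = [2, 4, 6, 8, 10]
n = 2  # sample size

# All C(N, n) = 10 possible samples drawn without replacement
samples = list(combinations(population, n))
sample_means = [mean(s) for s in samples]

# Sampling distribution of the mean: each possible value of x-bar
# together with its relative frequency
dist = {m: c / len(samples) for m, c in sorted(Counter(sample_means).items())}

print(dist)
# The mean of the sampling distribution equals the population mean
print(mean(sample_means), mean(population))
```

Note that the average of all the sample means coincides with the population mean, illustrating why $\bar{x}$ is an unbiased estimator of $\mu$.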
Purpose of Sampling Distribution
A sample is studied not for its own sake but to infer the characteristics of the population.
Finding exact population parameters is often costly, difficult, or impossible when $N$ is large.
Sampling is a practical tool to estimate population parameters easily and efficiently.
Rational Distribution and Sampling Distribution
Rational Distribution: When the standard deviation of a sampling distribution is very small, it is referred to as a rational distribution.
Why Sampling Distribution is called Rational: Because the statistic becomes a reliable (rational) estimator of the population parameter when variability is small.
Importance of Sampling Distribution in Statistics
To infer population parameters through point estimation.
To develop confidence intervals.
To perform hypothesis testing.
Sampling distribution helps determine critical values needed for statistical tests.
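The role of the sampling distribution in determining critical values can be sketched with Python's standard-library `statistics.NormalDist`. The numbers below ($\mu_0 = 50$, $\sigma = 10$, $n = 25$, $\bar{x} = 52$) are hypothetical, and the normal sampling distribution of $\bar{x}$ rests on the central limit theorem:

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical setup: population sigma = 10 known, sample of n = 25, x-bar = 52
mu0, sigma, n, xbar = 50.0, 10.0, 25, 52.0

# Under H0: mu = mu0, the sampling distribution of x-bar is approximately
# Normal(mu0, sigma / sqrt(n)) by the central limit theorem.
se = sigma / sqrt(n)                  # standard error = 2.0
z_crit = NormalDist().inv_cdf(0.975)  # two-sided 5% critical value ~ 1.96

z_stat = (xbar - mu0) / se            # observed test statistic = 1.0
reject = abs(z_stat) > z_crit         # here: fail to reject H0
print(z_crit, z_stat, reject)
```

The critical value comes entirely from the sampling distribution of the statistic, which is the point this list is making.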
Uses of Sampling Distribution
Helps estimate the characteristics of the universe (population) by examining only a small part of it.
Facilitates inference about population parameters using statistics computed from samples.
Enables construction of confidence intervals and hypothesis testing.
How to Obtain the Distribution of Random Variables
To find the distribution of a function of a random variable, we consider two cases:
Case I: Single random variable
Case II: Several random variables
Case I: Single Random Variable (Graphical Approach)
Let $X$ be a continuous random variable with density $f_X(x)$, and let $g$ be strictly monotonic.
Let $Z = g(X)$, where $g$ is invertible with inverse $x = g^{-1}(z)$.
Transformation: $z = g(x) \iff x = g^{-1}(z)$.
Thus, the density is: $f_Z(z) = f_X\big(g^{-1}(z)\big)\,|J|$
where $J = \dfrac{dx}{dz} = \dfrac{d}{dz}\,g^{-1}(z)$ is the Jacobian.
Case I: Single Random Variable (Algebraic Approach)
CDF Approach (for increasing $g$): $F_Z(z) = P(Z \le z) = P\big(g(X) \le z\big) = P\big(X \le g^{-1}(z)\big) = F_X\big(g^{-1}(z)\big)$
Differentiating: $f_Z(z) = \dfrac{d}{dz}\,F_X\big(g^{-1}(z)\big)$
By the chain rule: $f_Z(z) = f_X\big(g^{-1}(z)\big)\left|\dfrac{d}{dz}\,g^{-1}(z)\right|$
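The change-of-variables formula can be checked numerically. A minimal sketch, assuming $X \sim \text{Uniform}(0,1)$ and $Z = g(X) = -\ln X$ (an example of my choosing, not from the notes): the inverse is $x = e^{-z}$, so $f_Z(z) = f_X(e^{-z}) \cdot e^{-z} = e^{-z}$ for $z > 0$, the Exponential(1) density.

```python
import math
import random

def f_Z(z):
    """Density of Z = -ln(X), X ~ Uniform(0,1), via the transformation formula."""
    x = math.exp(-z)                     # g^{-1}(z)
    f_X = 1.0 if 0 < x < 1 else 0.0      # Uniform(0,1) density at x
    jac = math.exp(-z)                   # |d g^{-1}/dz|
    return f_X * jac

# Monte Carlo check: the empirical CDF of Z should match the integral of f_Z.
random.seed(0)
zs = [-math.log(1.0 - random.random()) for _ in range(100_000)]
empirical = sum(z <= 1.0 for z in zs) / len(zs)
analytic = 1 - math.exp(-1.0)            # integral of e^{-z} over (0, 1)
print(empirical, analytic)
```

The empirical and analytic probabilities agree to within sampling noise, confirming the formula for this case.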
Case II: Several Random Variables
Suppose $X_1, X_2, \ldots, X_k$ are continuous random variables with joint density $f(x_1, x_2, \ldots, x_k)$.
Define $k$ functions ($i = 1, 2, \ldots, k$): $z_i = g_i(x_1, x_2, \ldots, x_k)$
If the transformation is one-to-one, solve for the $x_i$ in terms of $z_1, \ldots, z_k$: $x_i = h_i(z_1, z_2, \ldots, z_k)$ for $i = 1, 2, \ldots, k$.
Jacobian and Joint Density
Jacobian Matrix:
J = \begin{bmatrix}
\frac{\partial x_1}{\partial z_1} & \frac{\partial x_1}{\partial z_2} & \cdots & \frac{\partial x_1}{\partial z_k} \\
\frac{\partial x_2}{\partial z_1} & \frac{\partial x_2}{\partial z_2} & \cdots & \frac{\partial x_2}{\partial z_k} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial x_k}{\partial z_1} & \frac{\partial x_k}{\partial z_2} & \cdots & \frac{\partial x_k}{\partial z_k}
\end{bmatrix}
Joint Density: $f_Z(z_1, z_2, \ldots, z_k) = f(x_1, x_2, \ldots, x_k)\,|J|$, where each $x_i$ is expressed in terms of $z_1, \ldots, z_k$.
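A small numerical sketch of the multivariate case, using an example of my choosing (not from the notes): $Z_1 = X + Y$, $Z_2 = X - Y$ with $X, Y$ iid Uniform(0,1). The inverse map is $x = (z_1 + z_2)/2$, $y = (z_1 - z_2)/2$, so $|{\det J}| = 1/2$ and the joint density is $f_Z = 1 \cdot \tfrac{1}{2}$ on the transformed support:

```python
def h(z1, z2):
    """Inverse map: recover (x, y) from (z1, z2) for z1 = x+y, z2 = x-y."""
    return ((z1 + z2) / 2.0, (z1 - z2) / 2.0)

def jacobian_det(z1, z2, eps=1e-6):
    """|det J| of the inverse map, by central finite differences."""
    dx_dz1 = (h(z1 + eps, z2)[0] - h(z1 - eps, z2)[0]) / (2 * eps)
    dx_dz2 = (h(z1, z2 + eps)[0] - h(z1, z2 - eps)[0]) / (2 * eps)
    dy_dz1 = (h(z1 + eps, z2)[1] - h(z1 - eps, z2)[1]) / (2 * eps)
    dy_dz2 = (h(z1, z2 + eps)[1] - h(z1, z2 - eps)[1]) / (2 * eps)
    return abs(dx_dz1 * dy_dz2 - dx_dz2 * dy_dz1)

def f_Z(z1, z2):
    """Joint density of (Z1, Z2) from f_Z = f(x, y) * |J|."""
    x, y = h(z1, z2)
    f_xy = 1.0 if (0 < x < 1 and 0 < y < 1) else 0.0  # Uniform(0,1)^2 density
    return f_xy * jacobian_det(z1, z2)

print(jacobian_det(1.0, 0.2))  # ~ 0.5
print(f_Z(1.0, 0.2))           # ~ 0.5 inside the support, 0 outside
```

The finite-difference Jacobian recovers the analytic value $1/2$ (the map is linear, so the approximation is essentially exact), showing concretely how $|J|$ rescales the density.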
Multiple Solutions
If the inverse equations have multiple solutions $x^{(1)}, x^{(2)}, \ldots, x^{(m)}$: $f_Z(z_1, \ldots, z_k) = \sum_{j=1}^{m} f\big(x^{(j)}\big)\,|J_j|$
Marginal Distribution: Integrating out the remaining variables, $f_{Z_1}(z_1) = \int \cdots \int f_Z(z_1, z_2, \ldots, z_k)\, dz_2 \cdots dz_k$.
Example Question and Solution
Question: If $X$ has density $f(x)$ for $0 < x < 2$, find the distribution of $Z = 2X + 3$.
Step 1: Find the range of $z$
When $x = 0$: $z = 2(0) + 3 = 3$
When $x = 2$: $z = 2(2) + 3 = 7$
Thus, $3 < z < 7$.
Step 2: Find the relationship between $x$ and $z$
From $z = 2x + 3$, solve for $x$: $x = \dfrac{z - 3}{2}$
Step 3: Find the new PDF
We use the formula for transforming variables: $f_Z(z) = f_X(x)\left|\dfrac{dx}{dz}\right|$ where $x = \dfrac{z - 3}{2}$.
First, compute $\dfrac{dx}{dz} = \dfrac{1}{2}$.
Now, plug into $f_Z(z)$: $f_Z(z) = \dfrac{1}{2}\, f\!\left(\dfrac{z - 3}{2}\right)$
Thus, $f_Z(z) = \dfrac{1}{2}\, f\!\left(\dfrac{z - 3}{2}\right)$ for $3 < z < 7$.
Final Answer: The distribution of $Z$ is $f_Z(z) = \dfrac{1}{2}\, f\!\left(\dfrac{z - 3}{2}\right)$, $3 < z < 7$.
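The result can be verified numerically. Since the notes leave $f(x)$ unspecified, the sketch below assumes $f_X(x) = x/2$ on $(0, 2)$ for concreteness (it integrates to 1); the transformation formula then gives $f_Z(z) = \frac{1}{2} \cdot \frac{(z-3)/2}{2} = \frac{z-3}{8}$ on $3 < z < 7$:

```python
import math
import random

def f_Z(z):
    """Density of Z = 2X + 3 under the assumed f_X(x) = x/2 on (0, 2)."""
    return (z - 3) / 8.0 if 3 < z < 7 else 0.0

# Check 1: f_Z integrates to 1 over (3, 7), by the midpoint rule.
m = 10_000
width = 4.0 / m
total = sum(f_Z(3 + (i + 0.5) * width) * width for i in range(m))

# Check 2: simulate X by inverse CDF (F_X(x) = x^2/4  =>  x = 2*sqrt(u)),
# set Z = 2X + 3, and compare P(Z <= 5) with the analytic value
# of the integral of (z-3)/8 from 3 to 5, which is 1/4.
random.seed(1)
zs = [2 * (2 * math.sqrt(random.random())) + 3 for _ in range(100_000)]
p_emp = sum(z <= 5 for z in zs) / len(zs)
print(total, p_emp)  # ~ 1.0 and ~ 0.25
```

Both checks agree with the formula derived in Step 3, under the assumed choice of $f_X$.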
Another Example Problem
Given a probability density function $f(x)$, $0 < x < 2$, find the distribution of a transformed random variable $Z = g(X)$. (The specific density and transformation were omitted in the original notes.)
Let's clarify Jacobian and Joint Density:
Jacobian: In the context of transforming random variables, the Jacobian (often denoted as ) is a determinant of a matrix of partial derivatives. It's used when you change variables in multiple integrals. Essentially, it accounts for how the 'volume' changes during the transformation. In simpler terms, it helps to correct for the stretching or compressing of the space that occurs when you switch from one set of variables to another.
Joint Density: The joint density function (or joint distribution) describes how multiple random variables behave together. If you have two random variables, and , their joint density tells you the probability density at each point . It's a way of understanding how these variables are related and how they vary in conjunction with each other.
In the equations provided:
The Jacobian Matrix is a matrix of partial derivatives:
J = \begin{bmatrix}
\frac{\partial x_1}{\partial z_1} & \frac{\partial x_1}{\partial z_2} & \cdots & \frac{\partial x_1}{\partial z_k} \\
\frac{\partial x_2}{\partial z_1} & \frac{\partial x_2}{\partial z_2} & \cdots & \frac{\partial x_2}{\partial z_k} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial x_k}{\partial z_1} & \frac{\partial x_k}{\partial z_2} & \cdots & \frac{\partial x_k}{\partial z_k}
\end{bmatrix}
And the Joint Density is calculated as: $f_Z(z_1, \ldots, z_k) = f(x_1, \ldots, x_k)\,|J|$
where $f(x_1, \ldots, x_k)$ is the original joint density in terms of the original variables, and $|J|$ is the absolute value of the determinant of the Jacobian matrix.