Rao-Blackwell Theorem and Minimum-Variance Unbiased Estimator (MVUE)

Introduction to Estimators

  • Let $\hat{\theta}$ be an estimator of a parameter $\theta$.

  • Desirable properties of $\hat{\theta}$ include:

    1. Unbiasedness: The estimator should satisfy
      $E(\hat{\theta}) = \theta$.

    2. Consistency: As the sample size $n$ approaches infinity, the probability that the estimator deviates from the true parameter by more than any fixed amount should vanish. Formally,
      $\lim_{n \to \infty} P(|\hat{\theta} - \theta| > \epsilon) = 0, \; \forall \epsilon > 0$.

    3. Efficiency: The efficiency of $\hat{\theta}$ relative to another unbiased estimator $\hat{\theta}_2$ is defined as $\text{eff}(\hat{\theta}, \hat{\theta}_2) = \frac{V(\hat{\theta}_2)}{V(\hat{\theta})}$, and it should satisfy $\text{eff}(\hat{\theta}, \hat{\theta}_2) \geq 1$ for every unbiased estimator $\hat{\theta}_2$.

  • An unbiased estimator whose variance is no larger than that of any other unbiased estimator (properties 1 and 3) is known as the Minimum Variance Unbiased Estimator (MVUE).
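
The unbiasedness and efficiency properties can be checked empirically. The following is a minimal simulation sketch, not part of the original notes: it assumes normal data and compares two unbiased estimators of the mean, the sample mean and the sample median. The function name `simulate` and all parameter values are illustrative choices.

```python
import random
import statistics

def simulate(estimator, mu=5.0, sigma=2.0, n=25, reps=4000, seed=0):
    """Return (mean, variance) of an estimator over repeated normal samples."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        values.append(estimator(sample))
    return statistics.fmean(values), statistics.variance(values)

# Same seed, so both estimators see exactly the same samples.
mean_avg, var_avg = simulate(statistics.fmean)
mean_med, var_med = simulate(statistics.median)

# Estimated bias of each estimator for the assumed true value mu = 5.0.
bias_avg = mean_avg - 5.0
bias_med = mean_med - 5.0

# Efficiency of the sample mean relative to the sample median:
# V(median) / V(mean), which should come out above 1 for normal data.
eff = var_med / var_avg
print(f"bias(mean)={bias_avg:.4f}, bias(median)={bias_med:.4f}, eff={eff:.2f}")
```

Both estimated biases should be close to zero, while the ratio `eff` should exceed 1, which is what property 3 asks of the better estimator.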

Finding an MVUE of $\theta$

  • Question: How to find an MVUE of $\theta$?

  • Answer: Let $X_1, \ldots, X_n$ be independent and identically distributed (i.i.d.) random variables from a probability density function (pdf) of the exponential form
    $f(x;\theta) = \exp\{T(x)\psi(\theta) + g(\theta) + h(x)\}$. If $\hat{\gamma} = G(\hat{\theta})$ is an unbiased estimator of $\gamma(\theta)$ that depends on the sample only through a sufficient statistic, then $\hat{\gamma}$ is the MVUE of $\gamma$ by Theorem 9.5 (the Rao-Blackwell Theorem).

Rao-Blackwell Theorem

  • Let $\hat{\theta}$ be an unbiased estimator of $\theta$ such that the variance $V(\hat{\theta}) < \infty$.

  • If $U$ is a sufficient statistic for $\theta$, then define:
    $\hat{\theta}^{*} = E(\hat{\theta} \mid U)$.

  • The properties of this estimator are:

    • For all $\theta$,
      $E(\hat{\theta}^{*}) = \theta$.

    • The variance is guaranteed to be less than or equal to that of the original estimator $\hat{\theta}$:
      $V(\hat{\theta}^{*}) \leq V(\hat{\theta})$.
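
Both properties follow from the tower property of conditional expectation and the law of total variance. The derivation below is a short sketch of that argument. Sufficiency of $U$ is what guarantees that $E(\hat{\theta} \mid U)$ does not depend on $\theta$, so it can be computed from the data alone.

```latex
\begin{align*}
E(\hat{\theta}^{*}) &= E\big[E(\hat{\theta} \mid U)\big] = E(\hat{\theta}) = \theta, \\
V(\hat{\theta}) &= V\big[E(\hat{\theta} \mid U)\big] + E\big[V(\hat{\theta} \mid U)\big]
                 = V(\hat{\theta}^{*}) + E\big[V(\hat{\theta} \mid U)\big] \\
                &\geq V(\hat{\theta}^{*}),
\end{align*}
```

since $E\big[V(\hat{\theta} \mid U)\big] \geq 0$.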

Definition of Sufficient Statistic

  • Definition 9.3: A statistic $U = g(Y_1, \ldots, Y_n)$ is sufficient for $\theta$ if the conditional distribution of $Y_1, \ldots, Y_n$ given $U$ does not depend on $\theta$. That is,
    $f_{Y_1, \ldots, Y_n \mid U}(y_1, \ldots, y_n \mid u)$ does not depend on $\theta$.

  • Advantages of Sufficient Statistics:

    1. Simplifies data for making inferences about $\theta$.

    2. Leads to the MVUE of $\theta$ or a function $W(\theta)$.
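
Definition 9.3 can be verified by direct enumeration in a small case. The sketch below is an illustration added here, assuming i.i.d. Bernoulli($p$) observations with $U = \sum Y_i$. It computes the conditional distribution of the sample given $U = u$ for two different values of $p$. The function name `conditional_given_sum` and the values of `n`, `u`, and `p` are hypothetical choices.

```python
from itertools import product

def conditional_given_sum(p, n=4, u=2):
    """P(Y_1..Y_n = y | sum(Y) = u) for i.i.d. Bernoulli(p), by enumeration."""
    joint = {}
    for y in product((0, 1), repeat=n):
        if sum(y) == u:
            # Joint probability of this particular 0/1 arrangement.
            joint[y] = p ** u * (1 - p) ** (n - u)
    total = sum(joint.values())  # this equals P(U = u)
    return {y: prob / total for y, prob in joint.items()}

cond_a = conditional_given_sum(p=0.2)
cond_b = conditional_given_sum(p=0.7)

# Print each arrangement with its conditional probability under both values of p.
for y in cond_a:
    print(y, round(cond_a[y], 4), round(cond_b[y], 4))
```

The two columns agree: each of the six arrangements with two successes has conditional probability $1/6$ regardless of $p$, which is exactly what sufficiency of the sum requires.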

Likelihood Function and Factorization Theorem

  • Definition 9.4: For sample observations $y_1, \ldots, y_n$ taken on corresponding random variables $Y_1, \ldots, Y_n$ whose distribution depends on a parameter $\theta$, the likelihood of the sample is defined as:
    $L(y_1, \ldots, y_n \mid \theta) \equiv \begin{cases} \text{the joint probability of } y_1, \ldots, y_n & \text{if } Y \text{ is discrete,} \\ \text{the joint density of } y_1, \ldots, y_n & \text{if } Y \text{ is continuous.} \end{cases}$

  • For simplicity, we may write:
    $L(\theta) = L(y_1, \ldots, y_n \mid \theta) = L(\mathbf{y} \mid \theta)$, where $\mathbf{y} = (y_1, \ldots, y_n)$.

  • Theorem 9.4: A statistic $U$ based on the random sample $Y_1, \ldots, Y_n$ is sufficient for estimating $\theta$ if and only if the likelihood function can be expressed in the factored form
    $L(\theta) = g(u, \theta)\, h(y_1, \ldots, y_n)$, where $g(u, \theta)$ is a function only of $u$ and $\theta$ and $h(y_1, \ldots, y_n)$ does not depend on $\theta$.
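
As a minimal illustration of Theorem 9.4 (an example chosen here, not one of the numbered examples), take $Y_1, \ldots, Y_n$ i.i.d. Bernoulli($p$). The likelihood factors with $u = \sum y_i$, so $U = \sum Y_i$ is sufficient for $p$:

```latex
\begin{align*}
L(p) &= \prod_{i=1}^{n} p^{y_i}(1-p)^{1-y_i}
      = \underbrace{p^{u}(1-p)^{n-u}}_{g(u,\,p)} \cdot \underbrace{1}_{h(y_1,\ldots,y_n)},
\qquad u = \sum_{i=1}^{n} y_i .
\end{align*}
```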

Examples

Example 9.6: Binomial Distribution
  • Let $Y \sim \text{Bin}(m, p)$. We check whether $\hat{p} = \frac{Y}{m}$ is an MVUE of $p$:

    1. $Y$ is sufficient for $p$.

    2. Compute:

    • $E(\hat{p}) = E\left(\frac{Y}{m}\right) = \frac{1}{m}E(Y) = p$.

  • Thus $\hat{p}$ is an unbiased function of the sufficient statistic $Y$, so $\hat{p}$ is the MVUE of $p$.
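
Example 9.6 can also be read as an instance of the Rao-Blackwell construction. The sketch below assumes $Y$ is the sum of $m$ independent Bernoulli($p$) trials and starts from a crude unbiased estimator, the outcome of the first trial alone. By symmetry, conditioning that estimator on $Y$ gives $Y/m$. The function name `run` and all parameter values are illustrative assumptions.

```python
import random
import statistics

def run(p=0.3, m=20, reps=5000, seed=1):
    """Compare a crude unbiased estimator of p with its Rao-Blackwellized version."""
    rng = random.Random(seed)
    crude, improved = [], []
    for _ in range(reps):
        trials = [1 if rng.random() < p else 0 for _ in range(m)]
        crude.append(trials[0])            # unbiased but noisy: uses one trial only
        improved.append(sum(trials) / m)   # E(first trial | Y) = Y / m
    return (statistics.fmean(crude), statistics.variance(crude),
            statistics.fmean(improved), statistics.variance(improved))

mean_crude, var_crude, mean_rb, var_rb = run()
print(f"crude:    mean={mean_crude:.3f}, var={var_crude:.4f}")
print(f"improved: mean={mean_rb:.3f}, var={var_rb:.4f}")
```

Both averages should sit near $p = 0.3$, while the variance of $Y/m$ should be roughly $p(1-p)/m$, far below the $p(1-p)$ of the single-trial estimator.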

Example 9.7: Rayleigh Distribution
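  • Suppose $Y_1, \ldots, Y_n$ are i.i.d. from the distribution with density
    $f(y \mid \theta) = \frac{2y}{\theta} e^{-y^2 / \theta}, \; y > 0$.

  • To find the MVUE of $\theta$:

    • Use the same approach as in the previous example: first find a sufficient statistic via the factorization theorem. For instance, let $n = 1$ and show that $Y^2$ is a sufficient statistic.

  • Then compute the expectation of the sufficient statistic and rescale it, if necessary, to obtain an unbiased estimator.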
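
The steps of this example can be worked out as follows. This is a sketch that assumes the density is $f(y \mid \theta) = \frac{2y}{\theta} e^{-y^2/\theta}$ for $y > 0$, and it uses the substitution $w = y^2$ in the expectation:

```latex
\begin{align*}
L(\theta) &= \prod_{i=1}^{n} \frac{2y_i}{\theta} e^{-y_i^2/\theta}
           = \underbrace{\theta^{-n} \exp\!\Big(-\tfrac{1}{\theta}\sum_{i=1}^{n} y_i^2\Big)}_{g(u,\,\theta),\; u = \sum y_i^2}
             \cdot \underbrace{2^{n} \prod_{i=1}^{n} y_i}_{h(y_1,\ldots,y_n)}, \\
E(Y^2) &= \int_0^\infty y^2 \, \frac{2y}{\theta} e^{-y^2/\theta} \, dy
        = \int_0^\infty \frac{w}{\theta} e^{-w/\theta} \, dw = \theta, \\
\hat{\theta} &= \frac{1}{n} \sum_{i=1}^{n} Y_i^2, \qquad E(\hat{\theta}) = \theta .
\end{align*}
```

So $U = \sum Y_i^2$ is sufficient, and $\hat{\theta} = U/n$ is an unbiased function of it, which makes $\hat{\theta}$ the MVUE candidate for $\theta$.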

Example 9.8: Normal Distribution
  • Suppose $X_1, \ldots, X_n$ are i.i.d. from $N(\mu, \sigma^2)$:

    • Write the density in the exponential form
      $f(x) = \exp\left(T(x)\psi(p) + g(p) + h(x)\right)$, where $p$ denotes the parameter of interest, starting from
      $f(x) = \frac{1}{\sqrt{2\pi \sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$.

    • Identify the sufficient statistics and check for MVUEs by computing the expectations of candidate estimators built from them.
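
One way to carry out the first step is sketched below. It makes a simplifying assumption that the notes do not state: $\sigma^2$ is treated as known, so the parameter is $p = \mu$. Expanding the square in the exponent gives

```latex
\begin{align*}
f(x) &= \exp\!\Big( \underbrace{x}_{T(x)} \cdot \underbrace{\frac{\mu}{\sigma^2}}_{\psi(\mu)}
        \; \underbrace{- \frac{\mu^2}{2\sigma^2}}_{g(\mu)}
        \; \underbrace{- \frac{x^2}{2\sigma^2} - \tfrac{1}{2}\log(2\pi\sigma^2)}_{h(x)} \Big).
\end{align*}
```

Under this assumption the joint density of the sample depends on $\mu$ only through $\sum X_i$, so $\sum X_i$ is sufficient for $\mu$. Since $E(\bar{X}) = \mu$, the sample mean $\bar{X}$ is an unbiased function of the sufficient statistic.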

MVUE of the Variance $\sigma^2$

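  • The MVUE of $\sigma^2$ is a function of the sufficient statistics. For a normal sample with both parameters unknown, $\sum X_i$ and $\sum X_i^2$ are jointly sufficient for $(\mu, \sigma^2)$.

    • The sample variance $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$ is an unbiased estimator of $\sigma^2$ and depends on the data only through these statistics, whereas the version that divides by $n$ is biased.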
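
The link between the sample variance and unbiasedness can be seen in a small simulation. The sketch below assumes normal data and compares the average of the estimator that divides by $n - 1$ with the average of the one that divides by $n$. The function name `average_estimates` and the parameter values are illustrative choices.

```python
import random
import statistics

def average_estimates(mu=0.0, sigma=3.0, n=8, reps=10000, seed=2):
    """Average two variance estimators over many simulated normal samples."""
    rng = random.Random(seed)
    s2_values, divide_by_n_values = [], []
    for _ in range(reps):
        x = [rng.gauss(mu, sigma) for _ in range(n)]
        s2_values.append(statistics.variance(x))             # divides by n - 1
        divide_by_n_values.append(statistics.pvariance(x))   # divides by n
    return statistics.fmean(s2_values), statistics.fmean(divide_by_n_values)

avg_s2, avg_div_n = average_estimates()
print(f"true sigma^2 = 9.0, mean S^2 = {avg_s2:.3f}, mean (divide by n) = {avg_div_n:.3f}")
```

The average of $S^2$ should land near the true value $\sigma^2 = 9$, while the divide-by-$n$ version should land near $\frac{n-1}{n}\sigma^2 = 7.875$, showing its downward bias.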

Conclusion

  • The Rao-Blackwell theorem gives a practical recipe for identifying MVUEs across many statistical models: find a sufficient statistic, typically through the factorization theorem, and then look for an unbiased estimator that is a function of it. Sufficient statistics are therefore central both to constructing an estimator and to verifying that it has minimum variance.