Hardy-Weinberg Proportions Testing Notes

Testing for Hardy-Weinberg Proportions (HWP)

  • Background: The purpose is to test whether the genotype frequencies observed in a sample are consistent with those expected under Hardy-Weinberg proportions.
  • Key idea: Use observed genotypes to infer allele frequencies, then compute expected genotype counts under HW equilibrium, and compare to observed counts with a chi-square test.

Steps for Testing Hardy-Weinberg Proportions

  • (1) determine allele frequencies from observed genotypes
  • (2) calculate expected genotype numbers under the model
  • (3) compare observed numbers to expected numbers using the chi-square test

Allele Frequencies

  • In the example, allele frequencies are given as:
    • p = 0.478 (frequency of the A allele)
    • q = 0.522 (frequency of the a allele)
  • Total sample size used in the example: N = 23
  • Under Hardy-Weinberg expectations, the genotype frequencies are determined by p and q with the relation p + q = 1.
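Step (1) can be sketched in a few lines of Python. The notes give p, q, and N but not the observed genotype counts, so the counts used here (5 AA, 12 Aa, 6 aa) are hypothetical values chosen only because they reproduce p = 0.478 and N = 23. The function name `allele_frequencies` is likewise an illustrative choice.

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Estimate (p, q) at a two-allele locus by counting alleles.

    Each diploid individual carries two alleles: AA contributes two
    copies of A, Aa contributes one, and aa contributes none.
    """
    n_individuals = n_AA + n_Aa + n_aa
    total_alleles = 2 * n_individuals
    p = (2 * n_AA + n_Aa) / total_alleles  # frequency of A
    q = 1 - p                              # frequency of a, since p + q = 1
    return p, q


# Hypothetical observed counts (assumed, not taken from the notes).
p, q = allele_frequencies(n_AA=5, n_Aa=12, n_aa=6)
print(round(p, 3), round(q, 3))
```

Counting alleles directly, rather than taking the square root of a homozygote frequency, avoids assuming Hardy-Weinberg proportions before they have been tested.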

Expected Genotype Numbers under HW Proportions

  • Under HW assumptions, the expected counts are:
    • N_{AA}^{\text{exp}} = p^2 \cdot N
    • N_{Aa}^{\text{exp}} = 2pq \cdot N
    • N_{aa}^{\text{exp}} = q^2 \cdot N
  • In the worked example with p = 0.478, q = 0.522, and N = 23:
    • N_{AA}^{\text{exp}} = (0.478)^2 \times 23 \approx 5.3
    • N_{Aa}^{\text{exp}} = (2\cdot 0.478 \cdot 0.522) \times 23 \approx 11.5
    • N_{aa}^{\text{exp}} = (0.522)^2 \times 23 \approx 6.3
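The arithmetic in the worked example can be checked with a short sketch. It uses only the values given in the notes (p = 0.478, N = 23); the function name `expected_counts` and the dictionary layout are illustrative choices.

```python
def expected_counts(p, n):
    """Expected genotype counts under Hardy-Weinberg proportions."""
    q = 1 - p
    return {
        "AA": p * p * n,      # p^2 * N
        "Aa": 2 * p * q * n,  # 2pq * N
        "aa": q * q * n,      # q^2 * N
    }


expected = expected_counts(p=0.478, n=23)
for genotype, count in expected.items():
    print(genotype, round(count, 1))
```

Because p^2 + 2pq + q^2 = (p + q)^2 = 1, the three expected counts always sum to N, which is a quick sanity check on any hand calculation.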

Chi-Square Test and Null Hypothesis

  • Chi-square test measures "goodness of fit" between observed and expected values.
  • Null hypothesis: observed values are no different from expected values.
  • The chi-square test can reject the null hypothesis, but it cannot prove a hypothesis.
  • Calculation:
    • \chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i} where the sum is over the genotypes (AA, Aa, aa).
  • Degrees of freedom (df) for the chi-square test in this context:
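    • \text{df} = \text{(# possible genotypes)} - \text{(# alleles)}
    • Here, #genotypes = 3 and #alleles = 2, so \text{df} = 1. One degree of freedom is lost because the counts must sum to N, and one more because p is estimated from the same data.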
  • Determine a p-value using the chi-square value and df:
    • The p-value is the probability, assuming the null hypothesis is true, of obtaining a deviation between observed and expected counts at least as large as the one observed.
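The statistic and the degrees-of-freedom rule above can be sketched as follows. The observed counts (5, 12, 6) are hypothetical, since the notes do not list them; they were chosen to match p = 0.478 and N = 23. The function names are illustrative, and the k(k + 1)/2 genotype count for k alleles is a generalization that goes beyond the two-allele case in the notes.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all genotype classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def hw_degrees_of_freedom(n_alleles):
    """df = (# possible genotypes) - (# alleles) for a Hardy-Weinberg test."""
    n_genotypes = n_alleles * (n_alleles + 1) // 2  # k(k + 1)/2 diploid genotypes
    return n_genotypes - n_alleles


# Expected counts from p = 0.478 and N = 23, in the order AA, Aa, aa.
p, n = 0.478, 23
q = 1 - p
expected = [p * p * n, 2 * p * q * n, q * q * n]

observed = [5, 12, 6]  # hypothetical counts, assumed for illustration

chi2 = chi_square_statistic(observed, expected)
df = hw_degrees_of_freedom(n_alleles=2)
print(round(chi2, 2), df)
```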

Degrees of Freedom and Critical Value

  • For the Hardy-Weinberg chi-square test with df = 1, the critical value at P = 0.05 is:
    • \chi^2_{0.05,\,df=1} = 3.84
  • Interpretation: If \chi^2_{\text{obs}} > 3.84, reject the null hypothesis at the 5% significance level.
  • The accompanying P-value corresponds to the tail probability beyond the observed value for df = 1.
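The 3.84 critical value does not have to be looked up in a table. For df = 1 only, a chi-square variable is the square of a standard normal variable, so the critical value is the square of the two-sided normal quantile. The sketch below relies on that identity and on the standard library; the function names are illustrative.

```python
from statistics import NormalDist


def chi2_critical_df1(alpha):
    """Chi-square critical value for df = 1 at significance level alpha.

    Valid for df = 1 only: chi-square(1) is the square of a standard
    normal, so the cutoff is the squared two-sided normal quantile.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * z


def reject_null(chi2_obs, alpha=0.05):
    """True if the observed statistic exceeds the critical value."""
    return chi2_obs > chi2_critical_df1(alpha)


critical = chi2_critical_df1(0.05)
print(round(critical, 2))  # 3.84
```

For more than one degree of freedom this shortcut no longer applies, and a chi-square table or a statistics library would be needed.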

Rhino (Rhino Locus) Example

  • Observed chi-square value: \chi^2 = 0.05
  • Critical value at df = 1: 3.84
  • Since \chi^2 = 0.05 \ll 3.84, we do not reject the null hypothesis.
  • Conclusion: The rhino population at this locus is consistent with Hardy-Weinberg proportions (HWP).
  • Reported P-value: P = 0.82 (the probability, if the null hypothesis is true, of a deviation at least as large as the one observed).
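The reported P-value can be reproduced from the chi-square value alone. For df = 1 the upper-tail probability has a closed form, erfc(sqrt(chi2 / 2)), so no statistics package is needed. The function name is an illustrative choice, and the formula holds only for df = 1.

```python
import math


def chi2_p_value_df1(chi2_obs):
    """Upper-tail probability of a chi-square distribution with df = 1.

    Uses P(X > x) = erfc(sqrt(x / 2)), which is valid for df = 1 only.
    """
    return math.erfc(math.sqrt(chi2_obs / 2))


p_rhino = chi2_p_value_df1(0.05)  # chi-square value from the rhino example
print(round(p_rhino, 2))  # 0.82
```

Feeding the critical value 3.84 into the same function returns approximately 0.05, which ties the critical-value rule and the P-value rule together: the statistic exceeds 3.84 exactly when P falls below 0.05.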

Key Takeaways and Concepts

  • The testing procedure involves three steps: estimate allele frequencies, compute HW expectation, and test with a chi-square approximation.
  • The null hypothesis states there is no difference between observed and expected values under HW equilibrium.
  • A non-rejection of the null (e.g., Rhinos: P = 0.82) means the sample shows no significant departure from Hardy-Weinberg proportions at the locus; it does not prove that the population is in Hardy-Weinberg proportions.
  • A rejection of the null would indicate deviation from HW proportions, suggesting possible factors such as non-random mating, selection, drift, mutation, or migration (not detailed in the transcript, but commonly discussed in HW contexts).
  • The degrees of freedom for the chi-square test in this context are determined by the number of genotype classes minus the number of alleles: \text{df} = 3 - 2 = 1.
  • The chi-square statistic assesses goodness of fit by comparing observed counts to their HW-based expected counts; the p-value is the probability of a deviation at least this large if the null hypothesis is true.

Formulas to Remember

  • Expected genotype counts under HW:
    • N_{AA}^{\text{exp}} = p^2 \cdot N
    • N_{Aa}^{\text{exp}} = 2 p q \cdot N
    • N_{aa}^{\text{exp}} = q^2 \cdot N
  • Chi-square statistic:
    • \chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}
  • Degrees of freedom:
    • \text{df} = \text{(# genotypes)} - \text{(# alleles)} = 3 - 2 = 1
  • Critical value at alpha = 0.05 for df = 1:
    • \chi^2_{0.05,1} = 3.84
  • Interpretation of p-value:
    • The p-value is the probability, given the null hypothesis, of a deviation from expectation at least as large as the one observed.
  • Example values from the transcript:
    • In the HW example: N_{AA}^{\text{exp}} = 5.3,\; N_{Aa}^{\text{exp}} = 11.5,\; N_{aa}^{\text{exp}} = 6.3
    • Rhino example: \chi^2 = 0.05,\; P = 0.82, do not reject the null.