Hardy-Weinberg Proportions Testing Notes
Testing for Hardy-Weinberg Proportions (HWP)
- Background: Purpose is to test whether observed genotype frequencies in a sample are consistent with expectations under Hardy-Weinberg proportions.
- Key idea: Use observed genotypes to infer allele frequencies, then compute expected genotype counts under HW equilibrium, and compare to observed counts with a chi-square test.
Steps for Testing Hardy-Weinberg Proportions
- (1) determine allele frequencies from observed genotypes
- (2) calculate expected genotype numbers under the model
- (3) compare observed numbers to expected numbers using the chi-square test
Allele Frequencies
- In the example, allele frequencies are given as:
- p = 0.478 (frequency of the A allele)
- q = 0.522 (frequency of the a allele)
- Total sample size used in the example: N = 23
- Under Hardy-Weinberg expectations, the genotype frequencies are determined by p and q with the relation p + q = 1.
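Step (1) can be sketched in code by counting allele copies. The notes give p and q directly and do not list the observed genotype counts, so the counts below are hypothetical: they were chosen only because they total N = 23 and give p close to 0.478. The function name `allele_frequencies` is also just illustrative.

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (p, q), the frequencies of alleles A and a, from genotype counts."""
    n_individuals = n_AA + n_Aa + n_aa
    if n_individuals == 0:
        raise ValueError("need at least one genotyped individual")
    total_alleles = 2 * n_individuals  # diploid: each individual carries two copies
    # Each AA homozygote contributes two A copies, each heterozygote one.
    p = (2 * n_AA + n_Aa) / total_alleles
    q = 1.0 - p
    return p, q


# Hypothetical genotype counts (not taken from the notes), N = 23
p, q = allele_frequencies(n_AA=5, n_Aa=12, n_aa=6)
print(round(p, 3), round(q, 3))  # prints: 0.478 0.522
```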
Expected Genotype Numbers under HW Proportions
- Under HW assumptions, the expected counts are:
- N_{AA}^{\text{exp}} = p^2 \cdot N
- N_{Aa}^{\text{exp}} = 2pq \cdot N
- N_{aa}^{\text{exp}} = q^2 \cdot N
- In the worked example with p = 0.478, q = 0.522, and N = 23:
- N_{AA}^{\text{exp}} = (0.478)^2 \times 23 = 5.3
- N_{Aa}^{\text{exp}} = (2\cdot 0.478 \cdot 0.522) \times 23 = 11.5
- N_{aa}^{\text{exp}} = (0.522)^2 \times 23 = 6.3
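The arithmetic above can be checked with a few lines of Python. This is a minimal sketch using the p and N values from the worked example; the function name `expected_counts` is an assumption made for illustration.

```python
def expected_counts(p, n):
    """Expected genotype counts under Hardy-Weinberg proportions."""
    q = 1.0 - p
    return {
        "AA": p * p * n,      # p^2 * N
        "Aa": 2 * p * q * n,  # 2pq * N
        "aa": q * q * n,      # q^2 * N
    }


expected = expected_counts(p=0.478, n=23)
for genotype, count in expected.items():
    print(genotype, round(count, 1))
# prints:
# AA 5.3
# Aa 11.5
# aa 6.3
```

Because p^2 + 2pq + q^2 = (p + q)^2 = 1, the three expected counts always sum to N, which is a quick sanity check on the calculation.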
Chi-Square Test and Null Hypothesis
- Chi-square test measures "goodness of fit" between observed and expected values.
- Null hypothesis: observed values are no different from expected values.
- The chi-square test can reject the null hypothesis, but it cannot prove a hypothesis.
- Calculation:
- \chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}, where O_i and E_i are the observed and expected counts and the sum is over the genotypes (AA, Aa, aa).
- Degrees of freedom (df) for the chi-square test in this context:
- \text{df} = \text{(# possible genotypes)} - \text{(# alleles)}
- Here, #genotypes = 3 and #alleles = 2, so \text{df} = 1.
- Determine a p-value using the chi-square value and df:
- The p-value is the probability of obtaining a deviation at least as large as the one observed, by chance alone, if the null hypothesis is true.
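The statistic itself is a short sum, sketched below. The expected counts are the rounded values from the worked example; the observed counts are hypothetical, since the notes do not list them, and `chi_square` is an illustrative name.

```python
def chi_square(observed, expected):
    """Goodness-of-fit statistic: sum over classes of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Order of classes: AA, Aa, aa
observed = [5, 12, 6]         # hypothetical counts, not from the notes
expected = [5.3, 11.5, 6.3]   # expected counts from the worked example
stat = chi_square(observed, expected)
print(f"chi-square = {stat:.3f}")
```

Each term is scaled by its expected count, so the same absolute deviation contributes more when the expected count is small.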
Degrees of Freedom and Critical Value
- For the Hardy-Weinberg chi-square test with df = 1, the critical value at P = 0.05 is:
- \chi^2_{0.05,\,df=1} = 3.84
- Interpretation: If \chi^2_{\text{obs}} > 3.84, reject the null hypothesis at the 5% significance level.
- The accompanying P-value corresponds to the tail probability beyond the observed value for df = 1.
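As a brief aside on where 3.84 comes from: a chi-square variable with df = 1 is the square of a standard normal variable Z, so the critical value is the square of the familiar two-sided 5% cutoff for Z. A short derivation under that standard result:

```latex
\begin{align*}
P\left(\chi^2_{1} > c\right) &= P\left(Z^2 > c\right) = P\left(|Z| > \sqrt{c}\right) = 0.05 \\
\sqrt{c} &= 1.96 \\
c &= 1.96^2 \approx 3.84
\end{align*}
```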
Rhino (Rhino Locus) Example
- Observed chi-square value: \chi^2 = 0.05
- Critical value at df = 1: 3.84
- Since \chi^2 = 0.05 \ll 3.84, we do not reject the null hypothesis.
- Conclusion: The rhino population at this locus is consistent with Hardy-Weinberg proportions (HWP).
- Reported P-value: P = 0.82 (the probability of a deviation at least this large arising by chance if the null hypothesis is true).
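The reported P-value can be reproduced from the chi-square value with the standard library alone. This sketch relies on the fact that, for df = 1 only, the upper-tail probability of a chi-square variable equals erfc(sqrt(x / 2)); the function names are illustrative, and other degrees of freedom would need a general chi-square routine.

```python
import math

CRITICAL_VALUE_DF1 = 3.84  # chi-square critical value at alpha = 0.05, df = 1


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    if chi2 < 0:
        raise ValueError("chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(chi2 / 2.0))


def reject_null(chi2, critical=CRITICAL_VALUE_DF1):
    """True if the statistic exceeds the critical value."""
    return chi2 > critical


chi2_rhino = 0.05  # observed value from the rhino example
print(round(p_value_df1(chi2_rhino), 2))  # prints: 0.82
print(reject_null(chi2_rhino))            # prints: False
```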
Key Takeaways and Concepts
- The testing procedure involves three steps: estimate allele frequencies, compute HW expectation, and test with a chi-square approximation.
- The null hypothesis states there is no difference between observed and expected values under HW equilibrium.
- A non-rejection of the null (e.g., Rhinos: P = 0.82) means the sample shows no significant deviation from Hardy-Weinberg proportions at the locus; it does not prove that the population is in Hardy-Weinberg proportions.
- A rejection of the null would indicate deviation from HW proportions, suggesting possible factors such as non-random mating, selection, drift, mutation, or migration (not detailed in the transcript, but commonly discussed in HW contexts).
- The degrees of freedom for the chi-square test in this context are determined by the number of genotype classes minus the number of alleles: \text{df} = 3 - 2 = 1.
- The chi-square statistic assesses goodness of fit by comparing observed counts to their HW-based expected counts; the p-value is the probability of a deviation at least as large as the observed one arising by chance if the null hypothesis is true.
- Expected genotype counts under HW:
- N_{AA}^{\text{exp}} = p^2 \cdot N
- N_{Aa}^{\text{exp}} = 2 p q \cdot N
- N_{aa}^{\text{exp}} = q^2 \cdot N
- Chi-square statistic:
- \chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}
- Degrees of freedom:
- \text{df} = \text{(# genotypes)} - \text{(# alleles)} = 3 - 2 = 1
- Critical value at alpha = 0.05 for df = 1:
- \chi^2_{0.05,\,df=1} = 3.84
- Interpretation of p-value:
- The p-value is the probability, given that the null hypothesis is true, of a deviation from expectation at least as large as the one observed.
- Example values from the transcript:
- In the HW example: N_{AA}^{\text{exp}} = 5.3,\; N_{Aa}^{\text{exp}} = 11.5,\; N_{aa}^{\text{exp}} = 6.3
- Rhino example: \chi^2 = 0.05,\; P = 0.82, do not reject the null.