Hardy-Weinberg Equilibrium: Study Notes

Hardy-Weinberg Equilibrium: Concept and Implications

  • Core idea: In a large population with random mating and no evolutionary forces, allele frequencies stay constant across generations.
  • If there are two alleles at a locus, say A and a, and their frequencies are p and q respectively, then under random mating the genotype frequencies in the next generation follow:
    • P(AA) = p^2\,,\quad P(Aa) = 2pq\,,\quad P(aa) = q^2
  • The sum of allele frequencies is 1: p + q = 1\,, so q = 1 - p\,.
  • If you mix alleles at random (random union of gametes), you would expect those genotype proportions, and that scenario is described by Hardy-Weinberg equilibrium.
  • When observed genotype frequencies match these expected frequencies, the population is said to be in Hardy-Weinberg equilibrium; if they differ, evolution or other factors may be at play.
  • The transcript’s framing: the idea of "no selection" or no evolution leads to random sampling of alleles and the resulting genotype distribution following the p^2, 2pq, q^2 pattern.

Assumptions of the Hardy-Weinberg Model

  • Very large population size (to avoid genetic drift).
  • Random mating (no mating bias or sexual selection).
  • No mutation introducing new alleles.
  • No migration (no gene flow between populations).
  • No natural selection (genotypes have equal viability and fertility).
  • Usually non-overlapping generations (for simple modeling), though the core results can apply more broadly.

Allele and Genotype Frequencies

  • Define:
    • p = \text{frequency of the A allele}
    • q = \text{frequency of the a allele}
  • Relationship:
    • p + q = 1\,, hence q = 1 - p\,.
  • Under random mating, genotype frequencies are:
    • P(AA) = p^2\,
    • P(Aa) = 2pq\,
    • P(aa) = q^2\,
  • Example with numbers: if p = 0.6, q = 0.4 then
    • P(AA) = 0.36\,
    • P(Aa) = 0.48\,
    • P(aa) = 0.16\,
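
A minimal Python sketch of the numeric example above. The function name genotype_frequencies is an illustrative assumption, not something from the lecture; it only applies the p^2, 2pq, q^2 formulas.

```python
def genotype_frequencies(p):
    """Return expected (AA, Aa, aa) frequencies given p, the frequency of allele A."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    q = 1.0 - p  # two alleles only, so q = 1 - p
    return p * p, 2.0 * p * q, q * q


f_AA, f_Aa, f_aa = genotype_frequencies(0.6)
print(round(f_AA, 2), round(f_Aa, 2), round(f_aa, 2))  # 0.36 0.48 0.16
```

The three values always sum to 1 because they are the terms of (p + q)^2.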

Derivation and Intuition

  • From the allele frequencies in gametes, the formation of zygotes is like drawing two alleles independently:
    • Probability of AA = p\times p = p^2
    • Probability of Aa = p\times q + q\times p = 2pq
    • Probability of aa = q\times q = q^2
  • This is equivalent to expanding the binomial model of two alleles: (p + q)^2 = p^2 + 2pq + q^2 , and using p + q = 1.
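  • The key takeaway: genotype frequencies in equilibrium are determined solely by current allele frequencies; they do not depend on initial genotype frequencies, provided assumptions hold. For an autosomal locus, a single generation of random mating is enough to reach the p^2, 2pq, q^2 proportions.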
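
To make the independence from initial genotype frequencies concrete, here is a small sketch that starts from a hypothetical population with no heterozygotes and applies random union of gametes twice. The starting frequencies and function names are assumptions chosen for illustration.

```python
def allele_freq_A(f_AA, f_Aa, f_aa):
    """Frequency of A: every allele in AA individuals plus half of those in Aa."""
    return f_AA + 0.5 * f_Aa


def random_mating(f_AA, f_Aa, f_aa):
    """Genotype frequencies after one generation of random union of gametes."""
    p = allele_freq_A(f_AA, f_Aa, f_aa)
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q


# Hypothetical starting point: only homozygotes, far from HWE proportions.
gen0 = (0.6, 0.0, 0.4)
gen1 = random_mating(*gen0)
gen2 = random_mating(*gen1)

print(tuple(round(x, 4) for x in gen1))  # (0.36, 0.48, 0.16)
print(tuple(round(x, 4) for x in gen2))  # (0.36, 0.48, 0.16)
```

The allele frequency stays at p = 0.6 throughout, and the genotype frequencies stop changing after the first generation.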

Testing for Hardy-Weinberg Equilibrium

  • You can test whether a population is in HWE by comparing observed genotype counts to expected counts under HWE.
  • Steps:
    1. Collect observed counts: O_{AA}, O_{Aa}, O_{aa} for a sample size N = O_{AA} + O_{Aa} + O_{aa}.
    2. Estimate allele frequencies from the observed counts:
    • \hat{p} = \frac{2\,O_{AA} + O_{Aa}}{2N}
    • \hat{q} = 1 - \hat{p}
    3. Compute expected counts under HWE:
    • E_{AA} = N \hat{p}^2\,
    • E_{Aa} = N \cdot 2\hat{p}\hat{q}\,
    • E_{aa} = N \hat{q}^2\,
    4. Use a chi-squared test to compare observed vs expected:
    • \chi^2 = \sum_{i \in \{AA,Aa,aa\}} \frac{(O_i - E_i)^2}{E_i}
    5. Degrees of freedom (df): typically 1 for a biallelic locus when p is estimated from data (3 genotype classes, minus 1, minus 1 estimated parameter); df = 2 if p and q are fixed a priori.
    6. Decision: compare \chi^2 to the critical value at your chosen significance level (e.g., 0.05). If \chi^2 is larger, reject HWE; if not, fail to reject HWE.
  • Important caveats:
    • Deviations can indicate selection, drift, mutation, migration, nonrandom mating, or sampling/genotyping errors.
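    • Small sample sizes make the chi-squared approximation unreliable (an exact test is often preferred when expected counts are small), and pooling genetically distinct subpopulations (Wahlund effect) produces a heterozygote deficit even when each subpopulation is itself in HWE.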
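
The testing steps above can be sketched in Python using only the standard library. The function name hwe_chi_square and the sample counts are hypothetical. The p-value shortcut relies on the fact that, for df = 1, the chi-squared survival function equals erfc(sqrt(x / 2)); it would not apply for other degrees of freedom.

```python
import math


def hwe_chi_square(o_AA, o_Aa, o_aa):
    """Chi-squared goodness-of-fit test for HWE at a biallelic locus.

    Returns (chi2, p_value), using df = 1 because p is estimated from the data.
    """
    n = o_AA + o_Aa + o_aa
    if n == 0:
        raise ValueError("need at least one genotyped individual")
    p_hat = (2 * o_AA + o_Aa) / (2 * n)  # each AA carries two A alleles, each Aa one
    q_hat = 1.0 - p_hat
    if p_hat in (0.0, 1.0):
        raise ValueError("locus is monomorphic; the test is undefined")
    observed = (o_AA, o_Aa, o_aa)
    expected = (n * p_hat**2, n * 2 * p_hat * q_hat, n * q_hat**2)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of chi-squared with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical sample: 30 AA, 40 Aa, 30 aa (expected under HWE: 25, 50, 25).
chi2, p_value = hwe_chi_square(30, 40, 30)
print(round(chi2, 3), round(p_value, 4))  # 4.0 0.0455
```

Here chi2 = 4.0 exceeds the df = 1 critical value of 3.841 at the 0.05 level, so HWE would be rejected for this made-up sample.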

Worked Example and Interpretation (Link to Transcript)

  • Concept: compare observed vs expected frequencies to determine if evolutionary forces are acting.
  • If observed frequencies match the HWE expectations, the population is consistent with Hardy-Weinberg equilibrium at that locus: the test finds no evidence that evolutionary forces are acting on it, though a match does not prove their absence.
  • The transcript describes this as: “these two numbers are the same, so this is Hardy-Weinberg.”
  • Example scaffold (numbers can vary):
    • Suppose observed counts: O_{AA} = 36,\; O_{Aa} = 48,\; O_{aa} = 16 in a sample of N = 100 individuals.
    • Estimate allele frequencies: \hat{p} = \frac{2\cdot 36 + 48}{200} = \frac{120}{200} = 0.6\,, which implies \hat{q} = 0.4\,.
    • Expected counts: E_{AA} = 100 \cdot 0.6^2 = 36,\; E_{Aa} = 100 \cdot 2 \cdot 0.6 \cdot 0.4 = 48,\; E_{aa} = 100 \cdot 0.4^2 = 16.
    • Compare with observed and compute \chi^2: here every observed count equals its expected count, so \chi^2 = 0 and HWE is not rejected.
  • Practical note: In many classroom or real-world datasets, you first estimate p from the data, then compute expected counts, then apply the chi-square test as above.
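
As a check on the arithmetic, the sketch below recomputes the example directly from its counts and contrasts it with a second, hypothetical sample that has the same allele frequency but too few heterozygotes. The second sample, the helper name hwe_table, and the labels are assumptions for illustration; 3.841 is the standard chi-squared critical value for df = 1 at the 0.05 level.

```python
CRITICAL_DF1_ALPHA_05 = 3.841  # chi-squared critical value, df = 1, alpha = 0.05


def hwe_table(counts):
    """Return (p_hat, expected counts, chi2) for counts given as (AA, Aa, aa)."""
    n = sum(counts)
    p_hat = (2 * counts[0] + counts[1]) / (2 * n)
    q_hat = 1.0 - p_hat
    expected = (n * p_hat**2, n * 2 * p_hat * q_hat, n * q_hat**2)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(counts, expected))
    return p_hat, expected, chi2


samples = {
    "example from the notes": (36, 48, 16),
    "hypothetical heterozygote deficit": (46, 28, 26),
}
for label, counts in samples.items():
    p_hat, expected, chi2 = hwe_table(counts)
    verdict = "reject HWE" if chi2 > CRITICAL_DF1_ALPHA_05 else "fail to reject HWE"
    print(f"{label}: p_hat={p_hat:.2f}, chi2={chi2:.2f}, {verdict}")
```

Both samples give p_hat = 0.60, but only the second departs from the expected 36 / 48 / 16 split (chi2 is about 17.36), which shows that the test responds to how alleles are packaged into genotypes, not to the allele frequency itself.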

Connections to Foundations and Real-World Relevance

  • Relationship to Mendelian genetics: HWE provides a bridge from allele-level frequencies to genotype-level expectations in populations.
  • Foundational principle: Under no-evolutionary-forces assumptions, allele frequencies do not change across generations, and genotype frequencies stabilize as a function of p and q.
  • Real-world relevance:
    • Used in population genetics, evolutionary biology, and forensic genetics.
    • In human genetics, HWE checks are a standard quality-control step in genotyping data; deviations can flag errors or population structure.
    • Helps infer whether part of the population is experiencing selection, inbreeding, or substructure.

Implications and Practical Considerations

  • If deviations from HWE are detected, consider possible causes:
    • Natural selection: some genotypes confer higher viability/fertility.
    • Nonrandom mating: assortative mating or inbreeding increases homozygosity.
    • Population structure/genetic subdivision: Wahlund effect can reduce heterozygosity.
    • Mutation or migration introducing new alleles.
    • Genotyping errors or sampling biases.
  • Ethical/philosophical/real-world angle:
    • In human studies, deviations can reflect social or historical population structures; careful interpretation is needed to avoid misattributing social factors to biology.
    • The model assumes idealized conditions; real populations often violate one or more assumptions, so HWE serves as a diagnostic baseline rather than a universal law.
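
The Wahlund effect listed among the causes above can be illustrated with a short sketch: two subpopulations, each exactly in HWE but with different allele frequencies, are pooled in equal proportions. The allele frequencies 0.9 and 0.1 and the function names are hypothetical values chosen to make the effect obvious.

```python
def hwe_freqs(p):
    """HWE genotype frequencies (AA, Aa, aa) for allele frequency p of A."""
    q = 1.0 - p
    return (p * p, 2.0 * p * q, q * q)


def pooled(p1, p2, w1=0.5):
    """Pool two subpopulations, each in HWE, with weights w1 and 1 - w1."""
    w2 = 1.0 - w1
    g1, g2 = hwe_freqs(p1), hwe_freqs(p2)
    return tuple(w1 * a + w2 * b for a, b in zip(g1, g2))


# Hypothetical subpopulations with A-allele frequencies 0.9 and 0.1.
f_AA, f_Aa, f_aa = pooled(0.9, 0.1)
p_pooled = f_AA + 0.5 * f_Aa                      # allele frequency in the pooled sample
expected_het = 2.0 * p_pooled * (1.0 - p_pooled)  # HWE expectation for the pool

print(round(f_Aa, 2), round(expected_het, 2))  # 0.18 0.5
```

The pooled sample shows 18% heterozygotes where HWE predicts 50%, even though neither subpopulation violates any assumption on its own.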

Quick Reference: Key Formulas

  • Allele frequencies and sum rule:
    • p + q = 1\,.
  • Genotype frequencies under random mating:
    • P(AA) = p^2\, , \quad P(Aa) = 2pq\, , \quad P(aa) = q^2\,
  • Observed and estimated allele frequency from data:
    • \hat{p} = \dfrac{2\,O_{AA} + O_{Aa}}{2N}\,, \quad \hat{q} = 1 - \hat{p}\,
  • Expected genotype counts under HWE:
    • E_{AA} = N \hat{p}^2\, , \quad E_{Aa} = N \cdot 2\hat{p}\hat{q}\, , \quad E_{aa} = N \hat{q}^2\,
  • Chi-squared test statistic:
    • \chi^2 = \dfrac{(O_{AA} - E_{AA})^2}{E_{AA}} + \dfrac{(O_{Aa} - E_{Aa})^2}{E_{Aa}} + \dfrac{(O_{aa} - E_{aa})^2}{E_{aa}}\,
  • Degrees of freedom: typically df = 1 for a single biallelic locus with allele frequencies estimated from data.

Summary Takeaway

  • Hardy-Weinberg equilibrium provides a baseline expectation for genotype frequencies given allele frequencies, assuming no evolutionary forces.
  • The key test involves comparing observed genotype counts to HW-predicted counts using the chi-square test and interpreting deviations in the context of possible evolutionary forces or data issues.
  • The fundamental formulas are p+q=1\,,\quad P(AA)=p^2,\; P(Aa)=2pq,\; P(aa)=q^2\,, with practical calculation of p from observed data and chi-square testing to assess equilibrium.