
Genomes and Population Genetics Notes

Learning Outcomes

  • Design strategies to identify genomic variations between individuals that influence a particular trait.

  • Explain how changes in genome composition can lead to evolution and even speciation.

  • Demonstrate how population allele frequencies can be changed by natural selection, genetic drift, or gene flow.

  • Calculate the distribution of gene alleles within a population using the Hardy-Weinberg equation.

Population Genetics: Wilson Disease

  • Hardy-Weinberg Equilibrium (HWE) equations can be used to:

    • Calculate allele frequencies.

    • Predict the frequency of carriers in a population.

    • Predict the number of affected children.

  • The frequency of Wilson disease is commonly reported as 1 in 30,000 people (though some recent estimates are as high as 1 in 7,000).

Allele and Genotype Frequencies

  • Assume a single mutation in ATP7B is responsible for all disease cases.

  • Wild type (dominant) allele: P, with frequency p.

  • Mutant (recessive) allele: Q, with frequency q.

  • Because there are only two alleles:

    • Allele frequencies: p + q = 1

    • Genotype frequencies: p^2 (PP homs) + 2pq (PQ hets) + q^2 (QQ homs) = 1

Hardy-Weinberg and Genotype Frequencies

  • The frequency of the aa genotype under Hardy-Weinberg is the probability of having both an a sperm (probability q) and an a egg (also q): q^2.

  • If Hardy-Weinberg conditions are met, we can compute the frequencies of the three possible genotypes:

    • Genotypes: AA, Aa, aa

    • Genotype frequencies: p^2, 2pq, q^2

  • This simple relationship allows us to translate between allele frequencies and genotype frequencies.
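The translation from allele frequencies to genotype frequencies can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `genotype_frequencies` is an assumption (it is not part of the unit materials), and it assumes one gene with exactly two alleles under Hardy-Weinberg conditions.

```python
def genotype_frequencies(q):
    """Return Hardy-Weinberg genotype frequencies (AA, Aa, aa).

    q is the frequency of the recessive allele a.
    Hypothetical helper for illustration; assumes exactly two alleles.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    p = 1.0 - q                      # because p + q = 1
    return p * p, 2 * p * q, q * q   # p^2, 2pq, q^2


freq_AA, freq_Aa, freq_aa = genotype_frequencies(0.3)
print(freq_AA, freq_Aa, freq_aa)
```

Whatever value of q is supplied, the three returned frequencies should sum to 1.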

Questions about Wilson Disease

  • Q1. What is q^2 for Wilson disease? (Given the frequency is 1 in 30,000)

  • Q2. What does q equal?

  • Q3. What does p equal?

  • Q4. How many Wilson disease heterozygotes (carriers) would be predicted to be present in a unit (e.g. BIO1011) with 1450 students?

  • Q5. What is the probability that a couple from BIO1011 (they met at a workshop) are both carriers of Wilson disease?
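One possible way to work through Q1 to Q5 is sketched below in Python. It uses the 1 in 30,000 disease frequency and the class size of 1450 given in the questions. The variable names are assumptions, and Q5 is treated here as two independent draws from the population, which is a simplification.

```python
import math

disease_frequency = 1 / 30_000       # Q1: q^2, the frequency of affected (QQ) individuals
q = math.sqrt(disease_frequency)     # Q2: mutant allele frequency
p = 1 - q                            # Q3: wild type allele frequency
carrier_frequency = 2 * p * q        # frequency of PQ heterozygotes

class_size = 1450                    # Q4: BIO1011 enrolment from the question
expected_carriers = carrier_frequency * class_size

# Q5: assumes the two partners' genotypes are independent of each other
both_carriers = carrier_frequency ** 2

print(f"q^2 = {disease_frequency:.6f}")
print(f"q   = {q:.5f}")
print(f"p   = {p:.5f}")
print(f"expected carriers in class = {expected_carriers:.1f}")
print(f"P(both partners are carriers) = {both_carriers:.6f}")
```

Under these assumptions the sketch gives q of about 0.0058 and roughly 17 expected carriers in the class.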

Common Mendelian Diseases

  • Carrier frequencies for various diseases (1 in N):

    • α-1-Antitrypsin deficiency: 13.1

    • Cystic fibrosis: 27.8

    • DFNB1: 42.6

    • Spinal muscular atrophy: 57.1

    • Familial Mediterranean fever: 64.2

    • Smith-Lemli-Opitz syndrome: 68.2

    • Sickle cell disease/β-thalassemia: 69.6

    • Gaucher disease: 76.7

    • Factor XI deficiency: 92.0

    • Achromatopsia: 97.5

ATP7B Mutations

  • Geographic distribution of ATP7B mutations shows variability across different populations, including Europe, Asia, and the Americas.

  • Examples of mutations include p.H1069Q, p.G710S, p.M769fs, and others.

High Population Frequencies of Mutant Alleles

  • High population frequencies of one particular mutant allele can be due to:

    • Founder effects: a small population migrating to a new geographical region.

    • Bottleneck effects: a drastic reduction in population size due to disease or geographical isolation.

    • Heterozygote advantage: while the rare homozygotes may be sick and at a severe disadvantage, the more common heterozygotes may have a selective advantage that maintains the mutant alleles in a population.

  • Question: Given what we know about ATP7B / Wilson disease / copper metabolism, what is one theoretical advantage Wilson disease carriers might have over genotypically wild type individuals?

Evolution in Action

  • Simulate evolutionary processes by enacting natural selection, random mating, and gene flow.

Hardy-Weinberg Equilibrium Assumptions

  • Five assumptions underlie Hardy-Weinberg equilibrium. Genotypes will stay in H-W equilibrium only if:

    • The population is very large.

    • There is no gene flow.

    • There is no natural selection.

    • There is no mutation.

    • There is random mating.

  • If any of these assumptions is violated, allele and genotype frequencies can change; this is microevolution.

  • The mechanisms that most commonly alter allele frequencies are violations of conditions 1-3: genetic drift (small population size), gene flow, and natural selection.

Allele and Genotype Frequencies (Example)

  • 1 gene, 2 alleles: D and d

    • Allele frequency: p = freq(D) = 0.7, q = freq(d) = 0.3

    • p + q = 1 (if 3 alleles, then p + q + r = 1)

  • Random mating in a large population, 1000 (diploid) individuals = 2000 alleles

    • # D alleles = 0.7 x 2000 = 1400

    • # d alleles = 0.3 x 2000 = 600

  • Expected Genotype frequency: DD + Dd + dd which is p^2 + 2pq + q^2 = 1

    • (0.7 x 0.7 = 0.49) + (2 x 0.7 x 0.3 = 0.42) + (0.3 x 0.3 = 0.09) = 1

  • Expected Genotype numbers = frequency x total (e.g. 1000 progeny)

    • (0.49 x 1000 = 490 DD) + (0.42 x 1000 = 420 Dd) + (0.09 x 1000 = 90 dd) = 1000
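The arithmetic in this example can be checked with a short Python sketch. The function name `expected_genotype_counts` is an assumption introduced for illustration; the inputs (p = 0.7, 1000 progeny) come from the example above.

```python
def expected_genotype_counts(p, total):
    """Expected DD, Dd and dd counts under Hardy-Weinberg for allele frequency p = freq(D)."""
    q = 1 - p
    return {
        "DD": p * p * total,        # p^2 x total
        "Dd": 2 * p * q * total,    # 2pq x total
        "dd": q * q * total,        # q^2 x total
    }


counts = expected_genotype_counts(0.7, 1000)
for genotype, number in counts.items():
    # round() hides tiny floating-point errors in the products
    print(genotype, round(number))
```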

Rock Pocket Mouse

  • Rock Pocket Mouse example demonstrating evolution in action.

  • Genotypes: dd and D_ (DD or Dd)

Rock Pocket Mouse Questions

  • Q6A. On the light soil background there should be selection against mice with the genotype(s) ___ and therefore the ___ allele should increase in frequency.

  • Q6B. On the dark soil background there should be selection against mice with the genotype(s) ___ and therefore the ___ allele should increase in frequency.

  • Q6C (adv) Will selection be more efficient on the light or dark soil habitat? Why?

Class Simulations Overview

  • Previously, the class performed three sequential simulations, which demonstrated environmental factors that can affect the evolution of a species.

  • In this activity, one side of the room had a light, sandy soil environment and the other side of the room had a dark lava soil.

Student Roles in Simulations

  • The students played various roles, either a mouse or a predatory owl.

  • Owls hunted the mice

Simulation 1: Natural Selection

  • Starting with equal numbers of D and d alleles in our ‘population’, Generation 0 mice were given two alleles.

  • p (freq of D allele) = 0.5

  • q (freq of d allele) = 0.5

  • Random distribution of alleles gives a 1 DD : 2 Dd : 1 dd genotype ratio

Chi-Square Test

  • Expected DD = p^2 x total

  • Expected Dd = 2pq x total

  • Expected dd = q^2 x total

  • Chi-square statistic = Σ[(Obs - Exp)^2 / Exp]

  • The difference between expected and observed counts is not statistically significant if the chi-square p-value is > 0.05 (this p-value is a probability from the test, not the allele frequency p).
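The chi-square calculation can be sketched in Python as follows. The genotype counts are made up for illustration and are not class results. The sketch assumes p is estimated from the observed counts, so the test has 1 degree of freedom, and it compares the statistic against the standard 0.05 critical value for 1 degree of freedom (3.841) instead of computing an exact p-value. The function name `hwe_chi_square` is an assumption.

```python
def hwe_chi_square(observed):
    """Chi-square statistic comparing observed genotype counts with HWE expectations.

    observed: dict of counts with keys "DD", "Dd", "dd".
    The allele frequency p is estimated from the observed counts.
    """
    total = sum(observed.values())
    p = (2 * observed["DD"] + observed["Dd"]) / (2 * total)
    q = 1 - p
    expected = {
        "DD": p * p * total,
        "Dd": 2 * p * q * total,
        "dd": q * q * total,
    }
    return sum((observed[g] - expected[g]) ** 2 / expected[g] for g in expected)


CRITICAL_VALUE_DF1 = 3.841   # chi-square critical value for alpha = 0.05, 1 degree of freedom

observed = {"DD": 5, "Dd": 10, "dd": 25}   # made-up counts, for illustration only
chi_square = hwe_chi_square(observed)

if chi_square > CRITICAL_VALUE_DF1:
    print(f"chi-square = {chi_square:.2f}: significant deviation from HWE (p-value < 0.05)")
else:
    print(f"chi-square = {chi_square:.2f}: consistent with HWE (p-value > 0.05)")
```

With these made-up counts the statistic exceeds the critical value, so the sketch reports a significant deviation from HWE.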

Simulation 1: Key Questions

  • Which allele will increase? (D or d)

  • Will there be a deviation from HWE?

  • If yes to deviation, which genotype(s) would be overrepresented? Why?

  • Focus on the population living on light coloured soil

Simulation 1: Owl Predation Rules

  • Gen 0 mice held their alleles visible and remained standing unless they were “killed” by an owl

  • Owls “killed” mice according to the hunting rules:

    • 4 mice per owl (taking their allele cards)

    • Preferentially killed non-camouflaged mice (i.e. DD or Dd in Sandy environment; dd in Lava environment)

Simulation 1: Natural Selection - Results

  • Sandy soil: kill off 20 mice, predominantly non-camouflaged DD and Dd.

  • p = (2 x DD + Dd) ÷ (2 x total), where DD, Dd and dd are genotype counts

  • q = (2 x dd + Dd) ÷ (2 x total)

  • HWE assumes NO SELECTION

  • ALLOWING SELECTIVE PREDATION DISRUPTS HWE

  • Chi-square p-value < 0.05: NOT consistent with HWE
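The allele-frequency formulas in this section can be sketched in Python. The survivor counts below are made up for illustration and are not the class results; the function name `allele_frequencies` is an assumption.

```python
def allele_frequencies(n_DD, n_Dd, n_dd):
    """Return (p, q) from genotype counts, where p = freq(D) and q = freq(d)."""
    total = n_DD + n_Dd + n_dd
    p = (2 * n_DD + n_Dd) / (2 * total)   # each DD mouse carries two D alleles
    q = (2 * n_dd + n_Dd) / (2 * total)   # each dd mouse carries two d alleles
    return p, q


# Hypothetical survivors on sandy soil after predation removed mostly DD and Dd mice
p, q = allele_frequencies(n_DD=5, n_Dd=10, n_dd=25)
print(f"p = {p:.2f}, q = {q:.2f}")
```

Starting from p = q = 0.5, counts like these would show q rising, in line with selection favouring the d allele on light soil.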

Simulation 1 conclusions

  • Natural selection disrupts HWE

  • Natural selection shifts allele frequencies

Simulation 1: Expected Outcomes

  • Which allele will increase? (D or d) --> d

  • Will there be a deviation from HWE? Yes likely (might not be significant)

  • If yes to deviation, which genotype(s) would be overrepresented? Why? --> dd

Simulation 2: Random Mating

  • G0 mice turned their cards to their chest so the alleles were obscured

  • Generation 1 mice randomly picked two alleles from two different surviving generation 0 mice.

Simulation 2: Results

  • Allele frequencies stay the same

  • Populations return to Hardy-Weinberg equilibrium

  • Random mating redistributes alleles between DD, Dd and dd genotypes
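Why one round of random mating restores Hardy-Weinberg proportions without changing allele frequencies can be sketched in Python. The survivor counts are made up, the function name `random_mating_expectation` is an assumption, and the calculation gives the expectation for a very large population, so a small class simulation would show sampling noise around these values.

```python
def random_mating_expectation(n_DD, n_Dd, n_dd):
    """Expected next-generation genotype frequencies after random mating.

    Returns (p_parents, p_offspring, expected_frequencies).
    """
    total = n_DD + n_Dd + n_dd
    p = (2 * n_DD + n_Dd) / (2 * total)      # allele frequency among surviving parents
    q = 1 - p
    expected = {"DD": p * p, "Dd": 2 * p * q, "dd": q * q}
    # Allele frequency recomputed from the offspring genotype frequencies
    p_offspring = expected["DD"] + expected["Dd"] / 2
    return p, p_offspring, expected


# Hypothetical survivors that are not in Hardy-Weinberg proportions
p_parents, p_offspring, expected = random_mating_expectation(5, 10, 25)
print(f"p before = {p_parents:.3f}, p after = {p_offspring:.3f}")
print(expected)
```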

Simulation 3: Gene Flow

  • Generation 1 mice held their alleles and randomly moved around the room to the left or right.

  • They could cross the midline of the room

  • This will be done as a worked example in next week’s Muddiest Point session

Simulation 3: Expected Outcomes

  • Which allele will increase? (D or d) --> No change

  • Will there be a deviation from HWE? No – should reset proportions

  • Which genotype(s) would be overrepresented? Why? --> n/a

Simulation 3: Results

  • Chi-square p-value < 0.05: NOT consistent with HWE

Simulation 3: Gene Flow - Conclusions

  • Allele frequencies shift back towards p=0.5

  • Gene flow counteracts skewing of allele frequencies caused by natural selection
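The effect of gene flow on allele frequencies can be sketched with a simple mixing model in Python. This is an assumption introduced for illustration, not the method used in class: the new allele frequency is treated as a weighted average of the resident and migrant frequencies, and both the starting frequencies and the migrant fraction `m` are made up.

```python
def after_gene_flow(p_resident, p_migrant, m):
    """Allele frequency after a fraction m of the population is replaced by migrants."""
    if not 0.0 <= m <= 1.0:
        raise ValueError("m must be between 0 and 1")
    return (1 - m) * p_resident + m * p_migrant


# Hypothetical frequencies of D after selection on each side of the room
p_sandy = 0.25   # D selected against on light, sandy soil
p_lava = 0.75    # D favoured on dark lava soil

for m in (0.0, 0.2, 0.5):
    print(f"m = {m}: sandy p = {after_gene_flow(p_sandy, p_lava, m):.2f}")
```

In this sketch, the more mice cross the midline, the closer p on the sandy side moves back towards 0.5.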