Population Genetics Notes

Chapter 18: Population Genetics

18.1 Detecting Genetic Variation

  • Population genetics reveals geographic structure and likely origins of plant and animal species based on levels of genetic variation within and among populations.
  • It forms the foundation for studying evolutionary change in populations.
  • Applications include DNA forensics, paternity testing, and human disease mapping, all based on probabilities of individuals possessing a particular genotype relative to others in a population.
  • These applications rely on associations of specific molecular variants with disease states.

Revised Definitions for Population Genetics

  • Locus: A designated location on a chromosome; can be a single nucleotide or a stretch of many nucleotides.
  • Allele: One of the alternative DNA sequence variants found at a locus, i.e., at a site where the sequence differs among genomes; can be coding or non-coding.
  • Most nucleotide changes do not affect phenotype.

Variation Among Homologous DNA Sequences

  • Single Nucleotide Polymorphisms (SNPs): Detected by sequencing, PCR, and microarrays.
    • Common SNPs have a frequency of ≥ 5%, while rare SNPs have a frequency of < 5%.
    • SNPs can be in coding or non-coding regions (ncSNP).
    • If in a coding region, they can be synonymous, nonsynonymous, or nonsense.
  • Microsatellites: Repeats of short 2-6 bp motifs.
    • Different microsatellite alleles have different physical lengths (e.g., [AG]3 and [AG]5).

Haplotypes

  • Haplotype: A unique combination of alleles at multiple loci on the same chromosome or chromosome region.
  • Example: A prevalent Y-chromosome haplotype among Asian men may trace back to Genghis Khan.
  • Haplotype networks illustrate human Y chromosome variation and geographical distribution.

Gene Pool Characterization

  • The gene pool can be characterized by genotype and allele frequencies.
  • Genotype frequencies: What are the frequencies of AA, Aa, and aa?
  • Allele frequencies: What are the frequencies of A and a?

Calculating Genotypic Frequencies

  • Example: In a population of 16 individuals: 5 AA, 8 Aa, 3 aa.
  • Frequency of AA (fAA) = 5/16 = 0.3125
  • Frequency of Aa (fAa) = 8/16 = 0.5
  • Frequency of aa (faa) = 3/16 = 0.1875

Calculating Allele Frequencies

Method 1: From Population Sample
  • In 16 diploid individuals, there are 32 alleles.
  • Homozygotes carry two copies of the same allele.
    • AA individuals carry two A alleles.
    • aa individuals carry two a alleles.
  • Heterozygotes carry one copy of each allele.
    • Aa individuals carry one A and one a allele.
  • Formula:
    • $p = f_A = \frac{2(\text{number of AA}) + (\text{number of Aa})}{\text{total number of alleles}}$
    • $q = f_a = \frac{2(\text{number of aa}) + (\text{number of Aa})}{\text{total number of alleles}}$
  • Example:
    • $p = \frac{2(5) + 8}{32} = 0.5625$
    • $q = \frac{2(3) + 8}{32} = 0.4375$
  • Note: Allele and genotypic frequencies sum to 1.
    • $p + q = 1$
    • $f_{AA} + f_{Aa} + f_{aa} = 1$

Method 2: From Genotypic Frequencies
  • Formula:
    • $p = f_{AA} + (0.5 \times f_{Aa})$
    • $q = f_{aa} + (0.5 \times f_{Aa})$
  • Example:
    • $p = 0.3125 + (0.5 \times 0.5) = 0.5625$
    • $q = 0.1875 + (0.5 \times 0.5) = 0.4375$
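
The two methods above can be sketched in a few lines of Python. The function names are illustrative choices, not part of the notes; the counts are the 5 AA, 8 Aa, 3 aa example.

```python
def genotype_frequencies(n_AA, n_Aa, n_aa):
    """Return (f_AA, f_Aa, f_aa) from genotype counts."""
    total = n_AA + n_Aa + n_aa
    return n_AA / total, n_Aa / total, n_aa / total


def allele_freqs_from_counts(n_AA, n_Aa, n_aa):
    """Method 1: count alleles directly (each diploid individual carries two)."""
    total_alleles = 2 * (n_AA + n_Aa + n_aa)
    p = (2 * n_AA + n_Aa) / total_alleles
    q = (2 * n_aa + n_Aa) / total_alleles
    return p, q


def allele_freqs_from_genotype_freqs(f_AA, f_Aa, f_aa):
    """Method 2: homozygote frequency plus half the heterozygote frequency."""
    return f_AA + 0.5 * f_Aa, f_aa + 0.5 * f_Aa


f_AA, f_Aa, f_aa = genotype_frequencies(5, 8, 3)
p1, q1 = allele_freqs_from_counts(5, 8, 3)
p2, q2 = allele_freqs_from_genotype_freqs(f_AA, f_Aa, f_aa)
print(f_AA, f_Aa, f_aa)  # 0.3125 0.5 0.1875
print(p1, q1)            # 0.5625 0.4375
print(p2, q2)            # 0.5625 0.4375
```

Both methods give the same answer because Method 2 is Method 1 with numerator and denominator divided by the number of individuals.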

Hardy-Weinberg Equilibrium (HWE)

  • In a randomly mating, sexually reproducing population, allele and genotype frequencies remain unchanged from one generation to the next.
  • Constant frequencies form the equilibrium distribution, also known as Hardy-Weinberg Equilibrium (HWE).
  • Genotype frequencies in terms of p and q:
    • AA: $p^2$
    • Aa: $2pq$
    • aa: $q^2$
  • Where:
    • p is the frequency of the A allele
    • q is the frequency of the a allele
  • HWE is only true for a population if several assumptions are met.
  • Effect of sexual reproduction on variation: random mating and Mendelian segregation alone do not change allele frequencies.

Predicting Frequencies in the Next Generation

  • Given specific genotype and/or allele frequencies in generation t, can we predict these frequencies in generation t + 1?
  • Assumptions:
    • $f_{AA} = 0.25$, $p = 0.5$
    • $f_{Aa} = 0.50$, $q = 0.5$
    • $f_{aa} = 0.25$
  • Gametes are produced in proportion to the relative abundances of A and a alleles in generation t. Assume random mating.
  • Formulas for next generation (t+1):
    • $f'_{AA} = p \times p = p^2$
    • $f'_{Aa} = (p \times q) + (q \times p) = 2pq$
    • $f'_{aa} = q \times q = q^2$
  • Example:
    • $f'_{AA} = (0.5)^2 = 0.25$
    • $f'_{Aa} = 2(0.5)(0.5) = 0.5$
    • $f'_{aa} = (0.5)^2 = 0.25$
  • If the frequencies remain the same from generation t to t+1, the population is in Hardy-Weinberg Equilibrium.
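
The prediction step can be written as a short function. This is a minimal sketch assuming random mating and none of the other forces discussed in these notes; `next_generation` is a name chosen here for illustration.

```python
def next_generation(f_AA, f_Aa, f_aa):
    """Genotype frequencies in generation t+1 under random mating."""
    p = f_AA + 0.5 * f_Aa  # frequency of A among gametes
    q = f_aa + 0.5 * f_Aa  # frequency of a among gametes
    return p * p, 2 * p * q, q * q


# The example from the notes: frequencies do not change.
print(next_generation(0.25, 0.50, 0.25))  # (0.25, 0.5, 0.25)

# A population made up only of homozygotes (p = 0.5) is not in HWE,
# but one generation of random mating brings it there.
print(next_generation(0.5, 0.0, 0.5))     # (0.25, 0.5, 0.25)
```

The second call illustrates that a single generation of random mating is enough to reach Hardy-Weinberg proportions, while the allele frequencies themselves stay at 0.5.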

Hardy-Weinberg Equilibrium Explained

  • Describes a special relationship between allele frequencies and genotype frequencies.
    • $p^2$ = frequency of AA homozygotes
    • $2pq$ = frequency of Aa heterozygotes
    • $q^2$ = frequency of aa homozygotes
  • Requires that several assumptions are met.

Applying HWE: Example Calculation

  • Consider a population: 90 AA, 420 Aa, 490 aa (Total = 1000)
  • Calculate genotype frequencies:
    • $f_{AA} = 90/1000 = 0.09$
    • $f_{Aa} = 420/1000 = 0.42$
    • $f_{aa} = 490/1000 = 0.49$
  • Assuming Hardy-Weinberg equilibrium:
    • $p = \sqrt{f_{AA}} = \sqrt{0.09} = 0.3$
    • $q = \sqrt{f_{aa}} = \sqrt{0.49} = 0.7$
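
The square-root shortcut is only valid under HWE, so it can be useful to compare it with allele counting, which works regardless. The sketch below does that for the 90/420/490 example; the function name and the dictionary keys are hypothetical.

```python
import math


def hwe_estimates(n_AA, n_Aa, n_aa):
    """Compare square-root and allele-counting estimates of p and q."""
    total = n_AA + n_Aa + n_aa
    f_AA, f_Aa, f_aa = n_AA / total, n_Aa / total, n_aa / total
    # Square-root estimates: valid only if the population is in HWE.
    p_sqrt, q_sqrt = math.sqrt(f_AA), math.sqrt(f_aa)
    # Allele-counting estimates: valid whether or not HWE holds.
    p_count = f_AA + 0.5 * f_Aa
    q_count = f_aa + 0.5 * f_Aa
    return {
        "p_sqrt": p_sqrt, "q_sqrt": q_sqrt,
        "p_count": p_count, "q_count": q_count,
        "observed_Aa": f_Aa, "expected_Aa": 2 * p_count * q_count,
    }


for name, value in hwe_estimates(90, 420, 490).items():
    print(name, round(value, 4))
```

For this population the two estimates agree (p = 0.3, q = 0.7) and the expected heterozygote frequency 2pq = 0.42 equals the observed one, which is what HWE predicts.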

Assumptions of Hardy-Weinberg Equilibrium

  • Under assumptions of Hardy-Weinberg equilibrium, allele and genotypic frequencies remain constant across generations.
  • Assumptions:
    • Mating is random
    • No natural selection
    • No subpopulation structure
    • Population is large (no genetic drift)
    • No mutation
    • No gene flow (i.e., no migration)

Hardy-Weinberg Equilibrium and Rare Alleles

  • Most copies of rare alleles are found in heterozygous condition.
  • Example:
    • If $q = 0.001$, then
    • $f_{aa} = q^2 = (0.001)^2 = 0.000001$
    • $f_{Aa} = 2pq = 2(0.999)(0.001) = 0.001998$
    • Rare alleles are more likely to be found in heterozygous form.
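
To make "most copies" concrete, the sketch below counts copies of the rare allele a in heterozygotes versus homozygotes under HWE. The function name is illustrative.

```python
def fraction_of_copies_in_heterozygotes(q):
    """Share of all a alleles that sit in Aa individuals, assuming HWE."""
    p = 1 - q
    het_copies = 2 * p * q  # Aa frequency times one copy of a each
    hom_copies = 2 * q * q  # aa frequency times two copies of a each
    return het_copies / (het_copies + hom_copies)


q = 0.001
p = 1 - q
print(round(fraction_of_copies_in_heterozygotes(q), 6))  # 0.999
print(round((2 * p * q) / (q * q)))  # 1998 carriers per affected homozygote
```

Algebraically the fraction simplifies to p, so the rarer the allele, the larger the share of its copies hidden in heterozygotes.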

Departures from Hardy-Weinberg Equilibrium

  • Departures from HWE indicate that at least one of the required assumptions is NOT met, for example because evolutionary forces are acting on the population.
  • Example: For p = 0.4 and q = 0.6
    • In HWE:
      • $f_{AA} = p^2 = 0.16$
      • $f_{Aa} = 2pq = 0.48$
      • $f_{aa} = q^2 = 0.36$
    • Not in HWE:
      • $f_{AA} = 0.28$
      • $f_{Aa} = 0.24$
      • $f_{aa} = 0.48$
  • Allele frequencies can always be calculated from genotype frequencies, but the HWE relationship does not hold if assumptions of HWE are violated.
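
A simple way to quantify a departure is to compare observed genotype frequencies with the HWE expectations computed from the same allele frequencies. The sketch below uses the "not in HWE" example; `hwe_departure` is a hypothetical helper, and the heterozygote deficit it reports is only one possible summary.

```python
def hwe_departure(f_AA, f_Aa, f_aa):
    """Return allele frequencies, HWE expectations, and heterozygote deficit."""
    p = f_AA + 0.5 * f_Aa
    q = f_aa + 0.5 * f_Aa
    expected = (p * p, 2 * p * q, q * q)
    # Deficit of heterozygotes relative to the HWE expectation:
    # 0 means no deficit, positive values mean fewer heterozygotes than expected.
    het_deficit = 1 - f_Aa / expected[1]
    return p, q, expected, het_deficit


p, q, expected, deficit = hwe_departure(0.28, 0.24, 0.48)
print(round(p, 4), round(q, 4))           # 0.4 0.6
print([round(x, 4) for x in expected])    # [0.16, 0.48, 0.36]
print(round(deficit, 4))                  # 0.5
```

Here the allele frequencies are still p = 0.4 and q = 0.6, but the population has only half the heterozygotes that HWE predicts.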

Non-Random Mating

  • Non-random mating is a violation of HWE assumptions.
    1. Assortative mating: Based on phenotypic resemblance.
      • Positive assortative mating: like mates with like; increases homozygosity.
      • Negative assortative mating: like mates with unlike; increases heterozygosity.
    2. Isolation by distance: Mate only with neighbors.
      • Leads to population structure: allele frequencies vary across the landscape.
    3. Inbreeding: Related individuals mate more often than would occur by chance.
      • Increases homozygosity.
      • Enforced outbreeding also possible: related individuals mate less often than would occur by chance.

Isolation by Distance: Example

  • Allele frequency may vary along a gradient.
  • Example data from Kansas City, Hutchinson, and Elkhart showing varying allele frequencies and HWE status.

Inbreeding

  • Inbreeding increases homozygosity and increases the possibility that an individual will possess alleles that are identical by descent.
  • Identical by descent (IBD): Two alleles are IBD if both are copies of the same SINGLE allele that exists (existed) in an earlier generation.
  • The probability that an individual's two alleles at a locus are IBD is measured by the inbreeding coefficient (F).

Calculating Inbreeding Coefficient (F)

  • Identify 'inbreeding loops'
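  • Formula: $F = \sum (1/2)^n$, summed over all inbreeding loops, where n is the number of individuals in a loop (the loop doesn’t include the individual for whom F is calculated). This form assumes the common ancestors are not themselves inbred.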
  • Examples:
    • Half-sib mating: $F = (1/2)^3 = 1/8$
    • Brother-sister mating: $F = (1/2)^3 + (1/2)^3 = 1/4$
    • Parent-offspring mating: $F = (1/2)^2 = 1/4$
    • First-cousin mating: $F = (1/2)^5 + (1/2)^5 = 1/16$
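
The loop rule can be checked with exact fractions. In this sketch the caller supplies the number of individuals in each loop; identifying the loops from a pedigree is not automated, and the function name is illustrative. It assumes the common ancestors are not inbred.

```python
from fractions import Fraction


def inbreeding_coefficient(loop_sizes):
    """Sum (1/2)^n over all inbreeding loops (n = individuals in the loop)."""
    return sum(Fraction(1, 2) ** n for n in loop_sizes)


print(inbreeding_coefficient([3]))     # half-sib mating: 1/8
print(inbreeding_coefficient([3, 3]))  # brother-sister mating: 1/4
print(inbreeding_coefficient([2]))     # parent-offspring mating: 1/4
print(inbreeding_coefficient([5, 5]))  # first-cousin mating: 1/16
```

Using `Fraction` keeps the results in the same form as the hand calculations instead of as rounded decimals.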

Inbreeding and Recessive Diseases

  • Inbreeding drastically increases the probability that offspring will inherit recessive diseases (that rare alleles will be found in homozygous condition).
  • Modified genotype frequencies to account for inbreeding:
    • $f_{AA} = p^2 + pqF$
    • $f_{Aa} = 2pq - 2pqF$
    • $f_{aa} = q^2 + pqF$
  • Example: If q = 0.01 and F = 0.25
    • $f_{aa} = (0.01)^2 + (0.99)(0.01)(0.25) = 0.002575$
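
The modified frequencies are easy to tabulate. The sketch below reproduces the q = 0.01, F = 0.25 example and compares it with random mating (F = 0); the function name is an illustrative choice.

```python
def genotype_freqs_with_inbreeding(q, F):
    """Genotype frequencies for allele frequency q and inbreeding coefficient F."""
    p = 1 - q
    f_AA = p * p + p * q * F
    f_Aa = 2 * p * q - 2 * p * q * F
    f_aa = q * q + p * q * F
    return f_AA, f_Aa, f_aa


inbred_aa = genotype_freqs_with_inbreeding(0.01, 0.25)[2]
random_aa = genotype_freqs_with_inbreeding(0.01, 0.0)[2]
print(round(inbred_aa, 6))              # 0.002575
print(round(random_aa, 6))              # 0.0001
print(round(inbred_aa / random_aa, 2))  # 25.75
```

With F = 0.25 the recessive homozygote is roughly 26 times more frequent than under random mating, while the three frequencies still sum to 1.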

Genetic Drift

  • Genetic drift = random fluctuation in allele frequencies due to ‘chance’ events.
  • Effect is inversely proportional to population size
    • Genetic drift has weaker effects in larger populations
    • Genetic drift has stronger effects in smaller populations
  • Can result in loss or fixation of alleles
  • Similar to effects of inbreeding
    • Increases homozygosity, decreases heterozygosity
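
Drift can be illustrated with a small simulation. The notes do not name a model; the sketch below assumes a simple scheme in which each of the 2N gene copies in the next generation is drawn at random from the current allele frequency (commonly called the Wright-Fisher model). All names and parameter values are illustrative.

```python
import random


def simulate_drift(p0, n_individuals, generations, seed=0):
    """Return the trajectory of the A allele frequency under drift alone."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    n_alleles = 2 * n_individuals
    p = p0
    trajectory = [p]
    for _generation in range(generations):
        # Each gene copy in the next generation is an A with probability p.
        count_A = sum(1 for _copy in range(n_alleles) if rng.random() < p)
        p = count_A / n_alleles
        trajectory.append(p)
    return trajectory


small = simulate_drift(0.5, n_individuals=10, generations=50)
large = simulate_drift(0.5, n_individuals=1000, generations=50)
print("N = 10:  ", [round(x, 2) for x in small[::10]])
print("N = 1000:", [round(x, 2) for x in large[::10]])
```

Across repeated runs with different seeds, the small population typically wanders much further from 0.5 and often reaches 0 (loss) or 1 (fixation), after which the frequency can no longer change.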

Founder Effect

  • The founder effect: genetic drift arising from sampling of a larger population during the ‘founding’ of a new population.
  • Genetic diversity is reduced in human populations that have experienced founder events.

Genetic Drift and Neutral Theory

  • Probability of fixation of a neutral allele is equal to its starting frequency.
  • Initial frequency of a new mutation is $1/(2N)$, where N is the (diploid) population size.
  • Probability of loss is thus $1 - 1/(2N)$; most mutations are lost!
  • Substitution rate (k) = (new mutations per generation) × (probability each is fixed) = $2N\mu \times \frac{1}{2N} = \mu$
    • Where μ is the mutation rate.
  • Substitution rate serves as a molecular clock if μ is constant over time.
  • Alleles that become fixed are called substitutions.
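
The cancellation of N in the substitution rate can be checked numerically. This is a sketch for neutral mutations in a diploid population; the function names and the mutation rate used are illustrative values, not data from the notes.

```python
def fixation_probability(n_individuals):
    """Fixation probability of a new neutral mutation: its frequency 1/(2N)."""
    return 1 / (2 * n_individuals)


def substitution_rate(n_individuals, mu):
    """k = (new mutations per generation) x (probability each is fixed)."""
    new_mutations = 2 * n_individuals * mu
    return new_mutations * fixation_probability(n_individuals)


mu = 1e-8  # illustrative per-generation mutation rate
for n in (100, 10_000, 1_000_000):
    print(n, fixation_probability(n), substitution_rate(n, mu))
```

Larger populations produce more new mutations, but each one is proportionally less likely to fix, so k stays equal to μ whatever the value of N.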

Natural Selection

  • Natural selection: differential rates of survival and reproduction among different genotypes.
  • Fitness:
    • Consequence of relationship between phenotype and environment.
    • Same genotype may have different fitness in different environments.
    • Absolute fitness (W): Number of offspring produced per individual or genotype.
    • Relative fitness (w): Fitness of an individual or genotype relative to the individual or genotype with the highest fitness; typically bounded by 0 and 1.

How Selection Alters Allele Frequencies

  • Differential survival of genotypes (e.g., a/a is lethal).
  • Corrected frequencies are calculated by dividing viable genotype frequencies by the sum of those frequencies.
  • Instead of being lethal, genotype a/a might have reduced fitness relative to genotypes A/A and A/a.
  • We can assign a relative fitness value (w) to each genotype:
    • w = 0, lowest fitness
    • w = 1, highest fitness
  • After selection, allele frequencies change.
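
The correction step can be sketched as follows: weight each HWE genotype frequency by its relative fitness, then divide by the sum of the weighted values. The function name and the starting frequency p = 0.5 are illustrative.

```python
def select(p, w_AA, w_Aa, w_aa):
    """Apply one round of selection to HWE genotype frequencies.

    Returns the corrected genotype frequencies and the new frequency of A.
    """
    q = 1 - p
    weighted = (p * p * w_AA, 2 * p * q * w_Aa, q * q * w_aa)
    total = sum(weighted)  # sum of the surviving (weighted) frequencies
    f_AA, f_Aa, f_aa = (x / total for x in weighted)
    p_next = f_AA + 0.5 * f_Aa
    return (f_AA, f_Aa, f_aa), p_next


# a/a is lethal: w_aa = 0
freqs, p_next = select(0.5, 1.0, 1.0, 0.0)
print([round(x, 4) for x in freqs])  # [0.3333, 0.6667, 0.0]
print(round(1 - p_next, 4))          # 0.3333

# a/a has reduced, but non-zero, fitness
freqs, p_next = select(0.5, 1.0, 1.0, 0.5)
print(round(1 - p_next, 4))          # 0.4286
```

With a lethal a/a genotype, q falls from 0.5 to 1/3 in one generation; with w = 0.5 for a/a the drop is smaller, to about 0.43.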

Selection on Dominant Versus Recessive Allele

  • Selection can act differently on dominant versus recessive alleles, influencing how quickly allele frequencies change.
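
One way to see the difference is to count how many generations selection needs to push a deleterious allele below some frequency. The sketch below assumes random mating each generation and uses the extreme case of a lethal allele (s = 1); the allele labels D (deleterious) and N (normal), the threshold, and the function names are all hypothetical.

```python
def next_freq(d, w_DD, w_DN, w_NN):
    """Frequency of allele D after one generation of selection."""
    n = 1 - d
    mean_fitness = d * d * w_DD + 2 * d * n * w_DN + n * n * w_NN
    return (d * d * w_DD + d * n * w_DN) / mean_fitness


def generations_until_below(d0, fitnesses, threshold, max_generations=10_000):
    """Generations needed for D to fall below threshold (None if never)."""
    d = d0
    for generation in range(1, max_generations + 1):
        d = next_freq(d, *fitnesses)
        if d < threshold:
            return generation
    return None


s = 1.0  # selection coefficient; 1.0 means the affected genotypes are lethal
# D recessive: only DD individuals are affected.
recessive = generations_until_below(0.5, (1 - s, 1.0, 1.0), threshold=0.015)
# D dominant: DD and DN individuals are both affected.
dominant = generations_until_below(0.5, (1 - s, 1 - s, 1.0), threshold=0.015)
print(dominant, recessive)  # 1 65
```

A dominant lethal is exposed in every carrier and disappears in a single generation, whereas a recessive lethal declines ever more slowly because, as it becomes rare, most copies are hidden in heterozygotes.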

Forms of Selection

1. Directional Selection:
  • Moves an allele frequency in one direction until it’s fixed or lost.
    • A. Positive selection: Brings a new, favorable allele to a higher frequency.
      • Selective sweep: when a favorable allele reaches fixation in a population.
    • B. Purifying selection: Removes deleterious mutations from the population and prevents degradation of existing adaptive traits.
2. Balancing Selection:
  • If the heterozygous genotype has higher fitness than either homozygote, both (or multiple) alleles are maintained in the population.
  • Produces increased genetic variation in and around the site of selection.