Human Genetics 31

Overview of Population Genetics and Frequency Calculations

  • The primary goal of this unit is to predict allele and genotype frequencies in populations and understand how they change over time.

  • There are three main types of calculations used to understand population genetics:
    - Allele Frequencies: The frequency of a specific allele, such as the probability of selecting the A allele versus the a allele within a population.
    - Genotype Frequencies: The frequency of a specific pair of alleles, such as the frequency of the homozygous dominant (AA), heterozygous (Aa), or homozygous recessive (aa) combinations.
    - Phenotype Frequencies: The frequency of observable traits, such as the incidence of PKU (phenylketonuria) in a population.

  • Data Sources for Analysis:
    - Genetic analysis can look at single nucleotide polymorphisms (SNPs) to distinguish alleles.
    - Analysis can also involve non-gene regions, such as repeated sequences (used in DNA fingerprinting).
    - Gene vs. Non-gene regions: Researchers may choose non-gene regions because genes are subject to evolutionary selective pressures (for example, in immune-related regions such as antibody genes), which can distort frequency data. Non-gene regions are often closer to neutral.

  • The complexity of these calculations increases significantly when looking at multiple genes, multiple alleles at a single locus, or multiple phenotypes at once. For this introductory course, calculations are limited to two-allele, three-genotype systems.

Population Variation and Ancestry

  • Genetic frequencies vary significantly between different populations based on geography and historical interbreeding.

  • Tay-Sachs Disease: The disease allele is much more frequent in the Ashkenazi Jewish population than in the African American population.

  • Sickle Cell Allele: More frequent in African American and African populations because of heterozygote advantage. Heterozygous (Aa) individuals are protected from malaria, so they are selected for in regions where malaria is or was prevalent.

  • Ancestry Data and Biases:
    - Accurate ancestry data relies on reference data. Currently, most reference data is derived from Europe and North America.
    - Current data is not representative of the Earth’s full population, though it is improving as more people are sequenced (for example, through 23andMe).

  • Phenotype Frequency Example (PKU):
    - PKU is a metabolic condition in which an individual cannot break down phenylalanine.
    - Turkish populations: Highest incidence of PKU in the world.
    - Japanese individuals: Lowest incidence of PKU in the world.

Microevolution and Macroevolution

  • Microevolution: Involves small, incremental steps of genetic change over time. An example is the gradual enlargement of a phenotypic feature through many small stages.

  • Macroevolution: Refers to large-scale evolutionary changes that occur more rapidly or represent the accumulation of sufficient microevolutionary changes to cause significant phenotype differences.

  • Speciation: Occurs when genetic changes are great enough that two groups can no longer interbreed to produce fertile offspring. If two parents can produce offspring but those offspring are not fertile, the parents are considered to belong to different species.

  • Human vs. Ape Evolution Case Study:
    - Significant differences exist in the FOXP2 gene, which relates to communication and speech.
    - Jaw Muscle Attachment: In apes, the jaw muscle attachment to the skull is very large and strong, requiring a thick, strong skull bone. Humans have a mutation that caused a smaller jaw muscle attachment. While this made chewing less efficient, it allowed the human skull to be thinner, providing space for brain expansion.

Calculating Allele Frequencies ($p$ and $q$)

  • Allele frequencies are denoted by variables:
    - $p$ = Frequency of the dominant allele (e.g., A).
    - $q$ = Frequency of the recessive allele (e.g., a).

  • The sum of all allele frequencies in a population must equal 1:
    - $p + q = 1.0$.

  • Calculation Formula:
    - $p = \frac{2 \times (\text{number of homozygous dominant individuals}) + 1 \times (\text{number of heterozygotes})}{\text{total number of alleles in the population}}$.
    - Note: The total number of alleles is twice the number of individuals in a diploid population.

  • Example calculation:
    - Population: 20 AA, 30 Aa, 50 aa (Total = 100 people, 200 alleles).
    - Number of A alleles in homozygous dominant individuals: $20 \times 2 = 40$.
    - Number of A alleles in heterozygotes: $30 \times 1 = 30$.
    - Total A alleles: $40 + 30 = 70$.
    - Total a alleles: $50 \times 2 + 30 \times 1 = 130$.
    - Allele frequency $p = \frac{70}{200} = 0.35$.
    - Allele frequency $q = \frac{130}{200} = 0.65$.
    - Verification: $0.35 + 0.65 = 1.0$.
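
The counting rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes; the function name `allele_frequencies` and its argument names are hypothetical, and it assumes a diploid population with exactly two alleles at the locus.

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (p, q) for a two-allele locus from diploid genotype counts.

    Hypothetical helper for illustration: n_AA, n_Aa, n_aa are the numbers
    of homozygous dominant, heterozygous, and homozygous recessive people.
    """
    total_alleles = 2 * (n_AA + n_Aa + n_aa)  # each person carries two alleles
    if total_alleles == 0:
        raise ValueError("population is empty")
    p = (2 * n_AA + n_Aa) / total_alleles  # copies of A divided by all alleles
    q = (2 * n_aa + n_Aa) / total_alleles  # copies of a divided by all alleles
    return p, q


# The example population from the notes: 20 AA, 30 Aa, 50 aa.
p, q = allele_frequencies(20, 30, 50)
print(f"p = {p:.2f}, q = {q:.2f}, p + q = {p + q:.2f}")
```

Computing $q$ by counting a alleles directly, rather than as $1 - p$, gives an independent check that the two frequencies sum to 1.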

Genotype Frequencies and the Hardy-Weinberg Equation

  • Genotype frequencies are expressed as decimals.

  • Using the population from the previous example (100 people total):
    - Frequency of AA $= \frac{20}{100} = 0.2$.
    - Frequency of Aa $= \frac{30}{100} = 0.3$.
    - Frequency of aa $= \frac{50}{100} = 0.5$.

  • The Hardy-Weinberg Equation:
    - Derived from the binomial expansion: $(p + q)^2 = p^2 + 2pq + q^2 = 1.0$.
    - $p^2$: Frequency of homozygous dominant (AA).
    - $2pq$: Frequency of heterozygotes (Aa).
    - $q^2$: Frequency of homozygous recessive (aa).
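
As a rough illustration of how the equation is used, the sketch below computes the genotype frequencies expected from $p$ alone and sets them beside the observed frequencies of the 20/30/50 example population ($p = 0.35$). The function name `hardy_weinberg_genotypes` is hypothetical, and the comparison is an added illustration rather than something the notes work through.

```python
def hardy_weinberg_genotypes(p):
    """Expected genotype frequencies for a two-allele locus given p.

    Hypothetical helper: returns p^2, 2pq, and q^2 keyed by genotype.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


# Observed frequencies from the 100-person example; p = 0.35 from its allele count.
observed = {"AA": 0.2, "Aa": 0.3, "aa": 0.5}
expected = hardy_weinberg_genotypes(0.35)

for genotype in ("AA", "Aa", "aa"):
    print(f"{genotype}: observed {observed[genotype]:.4f}, "
          f"expected {expected[genotype]:.4f}")
```

The expected values ($0.1225$, $0.455$, $0.4225$) differ from the observed ones, which suggests that this particular example population is not in Hardy-Weinberg proportions even though its allele frequencies are well defined.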

Hardy-Weinberg Equilibrium (HWE)

  • A population in HWE maintains the same allele frequencies generation after generation.

  • Five Assumptions for HWE:
    1. Random mating.
    2. No migration (gene flow).
    3. No natural selection.
    4. No genetic drift (requires a large population).
    5. No mutation.

  • In practice, these assumptions are rarely if ever all met in nature (selection and mutation in particular are almost always present). Therefore, allele frequencies usually change over time.

  • Non-gene regions: Parts of the genome that do not affect phenotype (like repeated DNA segments) often follow Hardy-Weinberg equilibrium because they are not subject to natural selection.
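
The claim that allele frequencies stay constant under HWE can be checked with a short derivation. This is a sketch that assumes genotypes occur in the proportions $p^2$, $2pq$, $q^2$ and that every genotype contributes gametes equally; the primed symbols $p'$ and $q'$ (next-generation frequencies) are notation introduced here, not in the notes.

```latex
\begin{aligned}
p' &= p^2 + \tfrac{1}{2}(2pq) = p^2 + pq = p(p + q) = p, \\
q' &= q^2 + \tfrac{1}{2}(2pq) = q^2 + pq = q(p + q) = q.
\end{aligned}
```

Homozygotes pass on only their own allele and heterozygotes pass on each allele half the time, and the last step uses $p + q = 1$.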

Practical Applications and Problem Solving

Generation-to-Generation Example (Fingers)

  • Trait: D (normal fingers), d (short middle finger).

  • Initial Allele Frequencies: $p = 0.7$, $q = 0.3$.

  • Calculation of Genotype Frequencies:
    - DD ($p^2$) $= (0.7)^2 = 0.49$.
    - Dd ($2pq$) $= 2 \times 0.7 \times 0.3 = 0.42$.
    - dd ($q^2$) $= (0.3)^2 = 0.09$.
    - Verification: $0.49 + 0.42 + 0.09 = 1.0$.

  • Gamete Frequency Calculation (to find the next generation's allele frequencies, $p$ and $q$):
    - Dominant D contribution from DD: $0.49$.
    - Dominant D contribution from Dd: $\frac{1}{2} \times 0.42 = 0.21$.
    - Total next-generation $p = 0.49 + 0.21 = 0.70$.
    - Recessive d contribution from Dd: $\frac{1}{2} \times 0.42 = 0.21$.
    - Recessive d contribution from dd: $0.09$.
    - Total next-generation $q = 0.21 + 0.09 = 0.30$.
    - Both allele frequencies are unchanged from the starting values.
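
The same gamete bookkeeping can be repeated over several generations in code. The sketch below is illustrative only: `next_generation` is a hypothetical name, and it assumes all five Hardy-Weinberg assumptions hold, so every genotype contributes gametes in proportion to its frequency.

```python
def next_generation(p):
    """Return the D allele frequency among gametes for the next generation.

    Hypothetical helper: genotypes are assumed to be in Hardy-Weinberg
    proportions and to contribute gametes equally (no selection).
    """
    q = 1.0 - p
    freq_DD = p * p
    freq_Dd = 2 * p * q
    # DD parents pass on only D; Dd parents pass on D half of the time.
    return freq_DD + 0.5 * freq_Dd


p = 0.7  # starting frequency of D from the finger example
for generation in range(1, 6):
    p = next_generation(p)
    print(f"generation {generation}: p = {p:.4f}, q = {1 - p:.4f}")
```

Replacing the body of `next_generation` with unequal genotype contributions would be one way to explore what happens when the no-selection assumption is dropped.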

Cystic Fibrosis (Carrier Frequency)

  • Cystic Fibrosis is autosomal recessive. The incidence of the phenotype is easier to track than genotype frequencies, because carriers do not show the disease.

  • Example: Incidence of phenotype ($q^2$) is $0.0005$.

  • Step 1: Find allele frequency $q = \sqrt{0.0005} \approx 0.022$.

  • Step 2: Find $p = 1 - 0.022 = 0.978$.

  • Step 3: Find carrier frequency ($2pq$): $2 \times 0.978 \times 0.022 \approx 0.043$, which is approximately 1 in 23.

  • Health Care Implications: Insurance companies and healthcare systems use these statistics to decide whether genetic testing or screening (like IVF with embryo selection) is cost-effective. For example, the recommended starting age for colonoscopy screening was lowered to 45 not out of kindness, but because earlier screening was financially beneficial for insurance companies.
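
The three carrier-frequency steps for cystic fibrosis can be written as one small function. This is a minimal sketch assuming the population is in Hardy-Weinberg equilibrium, so that the disease incidence equals $q^2$; the name `carrier_frequency` is hypothetical.

```python
import math


def carrier_frequency(incidence):
    """Return (q, p, 2pq) for an autosomal recessive disease.

    Hypothetical helper: assumes Hardy-Weinberg equilibrium, so the
    incidence of affected individuals is taken to equal q^2.
    """
    if not 0.0 <= incidence <= 1.0:
        raise ValueError("incidence must be between 0 and 1")
    q = math.sqrt(incidence)  # Step 1: recessive allele frequency
    p = 1.0 - q               # Step 2: dominant allele frequency
    return q, p, 2 * p * q    # Step 3: heterozygous carrier frequency


q, p, carriers = carrier_frequency(0.0005)
print(f"q = {q:.4f}, p = {p:.4f}, carriers = {carriers:.4f} "
      f"(about 1 in {1 / carriers:.0f})")
```

Because the code does not round $q$ before multiplying, the carrier frequency comes out near 0.044 rather than the hand-rounded 0.043. Both values correspond to roughly 1 in 23.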

X-Linked Traits (Hemophilia)

  • In males, genotype equals phenotype because they only have one X chromosome.

  • If 1 in 10,000 males has hemophilia, then the allele frequency $q = 0.0001$.

  • This data allows the calculation of the carrier frequency in females using $2pq$.

  • This also allows for the prediction of affected females ($q^2$); for hemophilia, this is roughly 1 in 100,000,000.
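
A short sketch of the X-linked reasoning follows. It assumes that female genotypes are in Hardy-Weinberg proportions and that the fraction of affected males can be read directly as $q$; the function name `x_linked_frequencies` and the dictionary keys are hypothetical.

```python
def x_linked_frequencies(affected_male_fraction):
    """Estimate female carrier and affected frequencies for an X-linked
    recessive trait from the fraction of affected males.

    Hypothetical helper: males have one X, so the affected-male fraction
    is used directly as q; females are assumed to follow p^2, 2pq, q^2.
    """
    q = affected_male_fraction
    p = 1.0 - q
    return {
        "q": q,
        "carrier_females": 2 * p * q,  # heterozygous females
        "affected_females": q * q,     # homozygous recessive females
    }


result = x_linked_frequencies(1 / 10_000)
print(f"carrier females:  {result['carrier_females']:.6f} "
      f"(about 1 in {1 / result['carrier_females']:,.0f})")
print(f"affected females: {result['affected_females']:.2e} "
      f"(about 1 in {1 / result['affected_females']:,.0f})")
```

With $q = 0.0001$, the carrier frequency is just under $2q$, or about 1 in 5,000 females.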

Rare Gene Shortcut (Tay-Sachs)

  • For rare autosomal recessive diseases, a shortcut for carrier frequency can be used: once you find $q$ (the square root of the incidence), the carrier frequency ($2pq$) is approximately $2q$ because $p$ is so close to $1.0$.

  • Example: Tay-Sachs incidence is 1 in 3,600, so $q = \sqrt{\frac{1}{3600}} = \frac{1}{60} \approx 0.017$. Carrier frequency is roughly $2q = \frac{2}{60} \approx 0.033$, or about 1 in 30.
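
To see how close the shortcut is, the sketch below compares $2q$ with the full $2pq$ for the two incidences used in these notes. It is an illustration under the Hardy-Weinberg assumption; `carrier_estimates` is a hypothetical name.

```python
import math


def carrier_estimates(incidence):
    """Return (exact, shortcut) carrier frequencies for a recessive disease.

    Hypothetical helper: exact is 2pq with p = 1 - q, shortcut is 2q,
    and the incidence is assumed to equal q^2.
    """
    q = math.sqrt(incidence)
    exact = 2 * (1.0 - q) * q
    shortcut = 2 * q
    return exact, shortcut


examples = [("Tay-Sachs (1 in 3,600)", 1 / 3600),
            ("Cystic fibrosis (0.0005)", 0.0005)]
for label, incidence in examples:
    exact, shortcut = carrier_estimates(incidence)
    overestimate = (shortcut - exact) / exact  # equals q / p
    print(f"{label}: 2pq = {exact:.4f}, 2q = {shortcut:.4f}, "
          f"shortcut is {overestimate:.1%} too high")
```

The shortcut always overestimates by a factor of $1/p$, so the error shrinks as the disease allele becomes rarer.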