Chapter 8

Alternative copies of a gene exist side by side within populations, and they have a lineage that traces their history back through time

  • can be morphological (like the presence or absence of feathers) or molecular (presence of adenine at a certain position of a certain gene) characters

Gene trees reconstruct the historical relationships among alleles within and between populations

BRCA1 gene

  • tumor-suppressor gene

  • mutations can increase a woman’s risk of developing breast or ovarian cancer

  • a mutation to a single nucleotide is enough to create the risk

  • may have arisen in the egg or sperm that created the zygote or it could’ve been inherited from one of the parents

  • if the mutated variant was in the egg and the sperm carried the unmutated variant, then the zygote carries one benign and one pathogenic variant.

    • if the zygote develops into a woman, she would be at risk of breast or ovarian cancer

    • approximately half of her kids would inherit the mutated variant and might then pass it on to their own kids

  • some alleles get replicated while others fail to be transmitted

  • synapomorphy: a derived character state that individuals share because they inherited it from a common ancestor

Gene tree is the branched genealogical lineage of homologous alleles that traces their evolution back to an ancestral allele

Coalescence tells us if the population has experienced any major changes in size or if it has undergone natural selection

  • to examine the events, read the trees in the reverse direction, beginning with the tips

Nodes are splitting events

  • the points where two lineages converge or coalesce into a single ancestral lineage

  • coalescence events occur at the most recent common ancestor of any two alleles

Positive selection can accelerate the rise in the frequency of an allele

  • shortening time to fixation → short coalescence
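To give a feel for what a "short" or "long" coalescence time means, the sketch below computes the expected time back to the most recent common ancestor under the standard neutral coalescent. That model is not spelled out in these notes, so treat it as an assumption: with k lineages in a diploid population of constant size N, the expected wait until the next coalescence is 4N / (k(k - 1)) generations. The function name and example values are hypothetical.

```python
def expected_tmrca(n_alleles: int, pop_size: float) -> float:
    """Expected generations back to the MRCA of n sampled alleles.

    Assumes a neutral, constant-size diploid population (standard coalescent).
    """
    if n_alleles < 2:
        return 0.0
    total = 0.0
    # Add the expected waiting time for each step from k lineages to k - 1.
    for k in range(2, n_alleles + 1):
        total += 4 * pop_size / (k * (k - 1))
    return total


if __name__ == "__main__":
    for n in (2, 10, 100):
        print(n, expected_tmrca(n, pop_size=10_000))
```

A gene tree whose alleles coalesce much more recently than this neutral expectation is the kind of pattern that hints at positive selection or a past reduction in population size.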

It is possible to trace the genealogies of genes back through time, reconstructing when mutations generated new alleles and how these alleles subsequently spread

Incomplete lineage sorting - because alternative alleles persist side by side for a very long time, they may be passed down to daughter species in a fashion that does not reflect the actual branching history of the species

  • Initially, a population will have many alleles of a gene

  • when the population splits, several alleles might be carried together into both of the resulting species

  • if the lineages split again, some of the alleles will be carried once more

  • eventually, some alleles will be lost due to drift

  • so a sample of alleles taken from a daughter species may differ from the alleles of the original ancestral species

    • might not reflect the actual branching history of species

  • Normally, we expect the alleles of species A and B to be more closely related to each other than to those of species C if A and B share a more recent common ancestor.

  • But if an ancestral population had multiple alleles, some alleles in A and C may be more similar to each other than to those in B.

  • This creates a gene tree that conflicts with the expected species tree.
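How often incomplete lineage sorting produces a conflicting gene tree can be sketched for three species with species tree ((A,B),C). The formula is the standard coalescent result, which is an assumption here because the notes do not state it: if the internal branch lasts t generations in a diploid population of size N, the A and B lineages fail to coalesce on that branch with probability exp(-t / (2N)), after which all three pairings are equally likely. Function and parameter names are hypothetical.

```python
import math


def gene_tree_probabilities(internal_generations: float, pop_size: float) -> dict:
    """Probability of each rooted gene tree when the species tree is ((A,B),C)."""
    # Branch length in coalescent units (assumes a diploid, constant-size population).
    branch = internal_generations / (2 * pop_size)
    # Chance the A and B lineages do NOT coalesce before the older split.
    p_unsorted = math.exp(-branch)
    return {
        "((A,B),C)": 1 - (2 / 3) * p_unsorted,  # matches the species tree
        "((A,C),B)": p_unsorted / 3,            # discordant
        "((B,C),A)": p_unsorted / 3,            # discordant
    }


if __name__ == "__main__":
    for t in (1_000, 20_000, 200_000):
        print(t, gene_tree_probabilities(t, pop_size=10_000))
```

Short internal branches or large ancestral populations push the discordant trees toward one third each, which is why closely spaced speciation events are where gene trees and species trees disagree most.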

Paralogs are homologous genes arising from gene duplication.

  • form a gene family

  • within the same species

Introgression - occasionally, gene copies from one species will be introduced into the genome of a second species

  • genes through hybridization

  • if the genes carry beneficial variation, they may be favored by selection and retained within the genomes of the recipient species

  • If we construct a gene tree using that introgressed gene, it may show that Species A and B are more closely related than they actually are (since they share the gene).

  • However, if we look at the species tree based on multiple genes, we may see that A and B are not true sister species.

Both incomplete lineage sorting and introgression result in gene trees that differ from true phylogeny of the species.

Studies of some genes in humans, chimpanzees, gorillas, and orangutans found that humans and chimpanzees are most closely related. However, studies of other genes pointed to gorillas as our closest relatives

Phylogenetic trees are hypotheses about the relationships among species or groups of individuals

Analytical approaches to select the phylogeny that best approximates the actual history of a group

  • maximum parsimony: the simplest solution is the most reasonable one

    • the tree with the fewest number of character state changes

    • can be misleading when homoplasy is present

    • to reduce the effects of homoplasy, scientists focus on informative portions of genomes (exons)

    • introns and other intervening regions have more variable sequences, so they are more prone to homoplasy from random convergence on the same base

  • distance-matrix methods: closely related species will have more similarities than distantly related species

    • convert DNA or protein sequences from different taxa into a pairwise matrix of the evolutionary distances (dissimilarities) between them

    • used to estimate the lengths of the branches in the tree by equating the genetic distance between nodes with the length of the branch

    • neighbor-joining method: scientists pair together the two least-distant species by joining their branches at a node and then join this node to the next closest sequence

  • maximum likelihood methods: require a substitution model, which describes how DNA, RNA, or protein sequences change over time

    • For each tree, it calculates the likelihood (probability) of observing the given genetic data, given the chosen substitution model.

    • the better trees are those for which the data are most probable

  • Bayesian methods: use statistical models to determine the probability of a tree given a particular data set

    • integrate over multiple possible trees, rather than selecting a single "best" tree like maximum likelihood does.
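The distance-matrix idea above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses the simple proportion of differing sites (p-distance) as the evolutionary distance, the four sequences are made up, and all names are hypothetical. Full neighbor joining picks pairs with a criterion that also corrects for unequal rates among lineages; the sketch shows only the simplified "join the least-distant pair" step described in these notes.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def distance_matrix(alignment: dict) -> dict:
    """Pairwise distances keyed by (name1, name2), with names in sorted order."""
    return {
        (a, b): p_distance(alignment[a], alignment[b])
        for a, b in combinations(sorted(alignment), 2)
    }


def least_distant_pair(matrix: dict):
    """Return the pair of taxa with the smallest distance, and that distance."""
    pair = min(matrix, key=matrix.get)
    return pair, matrix[pair]


# Toy, made-up sequences for four hypothetical taxa.
TOY_ALIGNMENT = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "ACGAACCTAT",
    "D": "TCGAACCTGT",
}

if __name__ == "__main__":
    matrix = distance_matrix(TOY_ALIGNMENT)
    for pair, dist in sorted(matrix.items()):
        print(pair, dist)
    print("join first:", least_distant_pair(matrix))
```

In a real analysis the raw distances would usually be corrected with a substitution model before building the tree, since multiple changes at the same site make p-distance underestimate divergence.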

Bootstrapping:

  • randomly sample characters, with replacement, from the full data set

  • create a new data matrix and use it to generate a potential phylogeny

  • repeat the process, randomly selecting characters and creating a potential phylogeny

  • after doing this many times, compare phylogenies

  • if the trees are very different, the data provide poor support for the original tree

  • if the trees are similar, it indicates stronger support
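The bootstrap steps above can be sketched as follows. To keep the example short, "building a phylogeny" is replaced by a stand-in: finding which two taxa are least distant in each resampled data set. That substitution, the toy alignment, and every function name are assumptions for illustration, not how a real phylogenetics package works.

```python
import random


def resample_columns(alignment: dict, rng: random.Random) -> dict:
    """Draw alignment columns with replacement to build a same-sized data set."""
    n_sites = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


def closest_pair(alignment: dict) -> tuple:
    """Stand-in for tree building: the two taxa with the fewest differences."""
    names = sorted(alignment)
    best_pair, best_diffs = None, None
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            diffs = sum(x != y for x, y in zip(alignment[a], alignment[b]))
            # Ties keep the first pair found, a simplification of this sketch.
            if best_diffs is None or diffs < best_diffs:
                best_pair, best_diffs = (a, b), diffs
    return best_pair


def bootstrap_support(alignment: dict, pair: tuple, replicates: int = 1000, seed: int = 0) -> float:
    """Fraction of bootstrap replicates in which `pair` is the closest pair."""
    rng = random.Random(seed)
    hits = sum(
        closest_pair(resample_columns(alignment, rng)) == pair
        for _ in range(replicates)
    )
    return hits / replicates


if __name__ == "__main__":
    toy = {"A": "ACGTACGTAC", "B": "ACGTACGTAT", "C": "ACGAACCTAT"}  # made-up data
    print(bootstrap_support(toy, ("A", "B")))
```

A support value near 1 means the grouping shows up in almost every resampled data set; a low value means the grouping depends on only a few characters.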

Purifying selection: removes deleterious alleles from a population

  • negative selection

Two hypotheses for how Homo sapiens evolved:

  • multiregional model: evolved gradually across the entire Old World from an older hominin species over the past 1 million years

  • out-of-Africa model: all major ethnic groups of humans are derived from recent African ancestry

    • earliest fossils are found in Africa

Analysis of DNA from Africans compared with DNA from people in other parts of the world

  • identified nuclear microsatellites, sections of repeated DNA that have a very high mutation rate

  • used the neighbor-joining method, and constructed a tree that revealed where most human genetic diversity can be found — in Africa

  • all non-Africans form a monophyletic group suggesting that they diversified after migrating out of Africa

Lentiviruses infect mammals by invading certain types of white blood cells

  • SIV infects monkeys and apes and is closely related to HIV

  • HIV is not a monophyletic group as different strains have different origins

Neutral mutations accumulate in a clocklike fashion in genomes

  • scientists can use molecular clocks to estimate the origin of diseases and major clades

Neutral mutations can spread to fixation due solely to processes such as genetic drift

Much non-coding DNA (including pseudogenes) has no known function

  • mutations to these sequences are not likely to affect the phenotypes of the individuals that carry them, so they are not likely to be exposed to selection

Protein-coding genes could also escape selection

  • synonymous (silent) substitution: several codons may encode the same amino acid

    • does not mean they are completely immune to selection’s effects

    • may affect how efficiently a particular protein is translated even if it does not alter the resulting structure of the protein

  • nonsynonymous substitution: replaces one amino acid with another

    • a mutation that does change an amino acid in a protein may still fail to change the function of the protein

Motoo Kimura:

  • although natural selection could change phenotypic adaptations, much of the variation in genomes was the result of drift

  • predicted: neutral mutations would become fixed in populations at a roughly regular rate

  • the more time that passed after the lineages diverged, the more different mutations would be fixed in each one

  • cytochrome c: the more distantly related two species were, the more mutations had accumulated in each lineage since they split from a common ancestor

    • by counting the number of base pair substitutions in a species’ cytochrome c, it is possible to estimate how long ago its ancestors branched off from our own

Molecular clock:

  • since most mutations in non-coding regions (or synonymous mutations in coding regions) are neutral, their rate of accumulation is proportional to time
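As a rough sketch of how a molecular clock turns sequence differences into a date: if neutral substitutions accumulate at a steady rate r per site per year along each lineage, two lineages that split t years ago differ by about 2rt substitutions per site, so t = d / (2r). The factor of 2 reflects that both lineages accumulate changes. The function name and the example numbers are hypothetical, and the calibration rate would have to come from fossils or known events.

```python
def divergence_time(distance_per_site: float, rate_per_site_per_year: float) -> float:
    """Estimate years since two lineages split, assuming a constant neutral rate.

    distance_per_site: substitutions per site separating the two sequences.
    rate_per_site_per_year: substitution rate along a single lineage.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    # Both lineages have been accumulating substitutions since the split.
    return distance_per_site / (2 * rate_per_site_per_year)


if __name__ == "__main__":
    # Illustrative values only: 2% divergence at 1e-9 substitutions/site/year.
    print(divergence_time(0.02, 1e-9))
```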

Neutral Theory of Molecular Evolution

  • describes the pattern of nucleotide sequence evolution under the forces of mutation and random genetic drift in the absence of selection

  • predicts that neutral mutations will yield nucleotide substitutions in a population at a rate equivalent to the rate of mutation, regardless of the size of the population

  • as long as mutation rates remain fairly constant through time, neutral variation should accumulate at a steady rate, generating a molecular signature that can be used to date events in the distant past
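The claim that the neutral substitution rate equals the mutation rate regardless of population size follows from a short calculation, sketched here under the usual textbook assumptions (diploid population of size N, new neutral mutations arising at rate mu per gene copy per generation, each with fixation probability 1/(2N)). The names are hypothetical.

```python
def neutral_substitution_rate(mu: float, pop_size: float) -> float:
    """Expected neutral substitutions per generation at a locus."""
    # A diploid population has 2N gene copies, each mutating at rate mu.
    new_mutations_per_generation = 2 * pop_size * mu
    # A new neutral mutation starts at frequency 1/(2N), which is also
    # its probability of eventually drifting to fixation.
    fixation_probability = 1 / (2 * pop_size)
    return new_mutations_per_generation * fixation_probability


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, neutral_substitution_rate(1e-8, n))
```

Larger populations produce more new mutations, but each one is proportionally less likely to fix, so the two effects cancel and the rate stays at mu.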

Positive selection and purifying selection both leave distinctive signatures in nucleotide or amino acid sequences that can be detected using statistical tests

When a neutral mutation arises in a large population, it may take a very long time for it to reach a high frequency through drift

When an allele experiences strong natural selection, it can spread quickly through a population

  • selective sweep: strong selection “sweeps” a favorable allele to fixation within a population so quickly that there is little opportunity for recombination

  • genetic hitchhiking: alleles that sit on the same chromosome when the mutation occurred get pulled along for the ride

    • so as the mutation becomes more common, so do these alleles
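How quickly selection can carry an allele through a population can be sketched with a deterministic model. This is a simplification not taken from the notes: it assumes a haploid population, a constant selective advantage s, and no drift, so the allele's frequency p becomes p(1 + s) / (1 + ps) each generation. The function name and the frequencies used are hypothetical.

```python
def generations_to_sweep(s: float, start_freq: float, end_freq: float = 0.99) -> int:
    """Generations for a favored allele to rise from start_freq to end_freq.

    Deterministic haploid selection; ignores drift and recombination.
    """
    if s <= 0:
        raise ValueError("s must be positive for the allele to spread")
    if not 0 < start_freq < end_freq < 1:
        raise ValueError("need 0 < start_freq < end_freq < 1")
    p = start_freq
    generations = 0
    while p < end_freq:
        # Carriers leave (1 + s) offspring for every 1 left by non-carriers.
        p = p * (1 + s) / (1 + p * s)
        generations += 1
    return generations


if __name__ == "__main__":
    for s in (0.01, 0.05, 0.2):
        print(s, generations_to_sweep(s, start_freq=0.001))
```

The stronger the selection, the fewer generations the sweep takes, and the less time recombination has to separate the favored mutation from the alleles that hitchhike with it.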

Linkage Disequilibrium: Digesting Milk as Adults

  • many people stop producing lactase when they stop drinking milk

    • natural selection should favor this as it means mammals don’t waste energy on making an enzyme with no advantage

  • 30% of people still produce lactase, so they can still consume milk and dairy products as adults

    • mutations gave rise to alleles conferring lactose tolerance in adults

Gene flow among populations works to homogenize their allele frequencies, while drift and selection act within populations to make them diverge.

FST measures the extent of subdivision among populations

  • ranges from 0 (fully homogenized) to 1 (fully segregated)

  • originally used to measure gene flow between populations, now used to measure how natural selection acts on populations
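One common way to compute FST for a single locus is FST = (HT - HS) / HT, where HS is the average expected heterozygosity within subpopulations and HT is the expected heterozygosity of the pooled population. The sketch below assumes a locus with two alleles and equally sized subpopulations; those assumptions, and the function name, are not from the notes.

```python
def fst(allele_freqs: list) -> float:
    """FST at a two-allele locus from the frequency of one allele in each subpopulation."""
    if not allele_freqs:
        raise ValueError("need at least one subpopulation")
    # Pooled allele frequency (assumes equally sized subpopulations).
    p_bar = sum(allele_freqs) / len(allele_freqs)
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # no variation anywhere, so no subdivision to measure
    # Average within-subpopulation expected heterozygosity.
    h_within = sum(2 * p * (1 - p) for p in allele_freqs) / len(allele_freqs)
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    print(fst([0.5, 0.5]))   # identical frequencies
    print(fst([0.2, 0.8]))   # partly diverged
    print(fst([0.0, 1.0]))   # fixed for different alleles
```

Computing this value locus by locus across the genome gives the background distribution against which unusually high values stand out.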

FST outlier method:

  • used to detect loci (specific regions of the genome) that show unexpected genetic divergence, which may indicate selection acting on it

  • Tibetan Plateau:

    • partial pressure of oxygen in air drops as elevation increases

    • two strong outliers located next to EPAS1 and EGLN1 genes, known to affect oxygen physiology

Under purifying selection, harmful nonsynonymous mutations are purged from the population, so they accumulate more slowly than synonymous mutations. This leads to a lower dN compared to dS, resulting in a dN/dS ratio of less than 1.

  • nonsynonymous mutations are under negative (purifying) selection because they are being removed from the population faster than synonymous mutations, which do not affect fitness

When positive selection is acting, beneficial nonsynonymous mutations spread more quickly, leading to an increase in the rate of nonsynonymous substitutions compared to synonymous substitutions. This results in a dN/dS ratio greater than 1

  • When dN/dS > 1, it suggests that positive selection is acting on the gene, favoring the spread of beneficial nonsynonymous mutations, which accumulate faster than the synonymous mutations

  • because the mutation is beneficial, its frequency increases at faster rate

When dN/dS = 1, it suggests neutral evolution. This is because neutral mutations (those that do not affect fitness) accumulate in the genome at equal rates for both nonsynonymous and synonymous mutations, resulting in a dN/dS ratio of 1

  • null hypothesis
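The three dN/dS cases above can be summarized in a small helper. This is only a reading aid: dN and dS are assumed to be already estimated as substitutions per nonsynonymous site and per synonymous site, the tolerance around 1 is an arbitrary hypothetical value, and in practice the comparison with the null hypothesis is made with a statistical test rather than a fixed cutoff.

```python
def classify_selection(dn: float, ds: float, tolerance: float = 0.05):
    """Return (dN/dS, label) following the three cases in these notes."""
    if ds <= 0:
        raise ValueError("dS must be positive to form the ratio")
    ratio = dn / ds
    if abs(ratio - 1) <= tolerance:
        label = "neutral"      # dN/dS about 1: the null hypothesis
    elif ratio < 1:
        label = "purifying"    # nonsynonymous changes are being removed
    else:
        label = "positive"     # nonsynonymous changes are being favored
    return ratio, label


if __name__ == "__main__":
    for dn, ds in ((0.01, 0.10), (0.10, 0.10), (0.30, 0.10)):
        print(dn, ds, classify_selection(dn, ds))
```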

BRCA1

  • gene associated with breast cancer

  • when it is not cancer-causing, it is associated with several vital functions

    • including overseeing repairs to damaged DNA

  • researchers compared orthologs in many species

  • on some branches, dN/dS < 1

    • negative selection eliminated nonsynonymous mutations that disrupted the gene’s function

  • on a few branches, dN/dS > 1

    • positive selection

  • humans have 22 nonsynonymous substitutions to 3 synonymous substitutions

    • many of these changes result in breast cancer risk

    • researchers speculate that viruses drove these changes: when cells divide and make new copies of their DNA, some viruses slip their own genetic material into our cellular machinery and make copies of themselves

    • mutations may allow the gene to shut viruses out, but viruses may evolve new adaptations to evade it

The size of the bacterial genome is proportional to the number of genes in each species

  • genomes grow by gaining new genes

  • an accidental duplication can create an extra copy of genes

  • or horizontal gene transfer can give bacteria new genes

  • deletions can cause them to shrink
