Molecular Evolution Study Notes

Module 03: Molecular Evolution

Learning Objectives

  • Describe the organization of a typical eukaryotic genome in terms of coding and non-coding DNA.

  • Define homologous, orthologous, and paralogous genes and explain which one is most useful for inferring evolutionary history.

  • Define synonymous substitutions, non-synonymous substitutions, and non-coding DNA, and explain how each can be used to infer evolutionary processes.

  • Explain the rationale for the nearly-neutral theory and how it helps us detect selection for (or against) DNA variants.

  • Compute Ka/Ks and interpret whether a region of DNA sequence is neutral, beneficial, or deleterious.

  • Give examples of the three possible fates of duplicated genes and explain which is the most likely/common fate.

  • Describe the contribution of transposable elements to genome variation.

  • Describe the DNA "footprint" that each evolutionary force leaves on genomic variation.

What is Molecular Evolution?

  • Definition: Molecular evolution refers to the evolution occurring at the level of nucleic acids and proteins.

  • Key Processes: Evolution at the molecular level occurs primarily through changes in the genome sequence, which in turn influence protein structure and function.

Understanding the Genome

  • Definition of a Genome: The genome is the complete set of genetic material in an organism. In eukaryotes it is organized into chromosomes, whose number and appearance make up the karyotype.

    • Human Genome Composition: Includes both autosomes (non-sex chromosomes) and sex chromosomes (X, Y).

Genome Structure Differences Among Domains
  • Eukaryotes: Nuclear chromosomes are linear.

  • Bacteria and Archaea: The chromosome is typically circular.

  • Viruses: The genome can be DNA or RNA, and either circular or linear.

Visualizing the Genome
  • Zoom Levels:

    • Whole genome at ~290X zoom allows distinction of individual genes (exons) but not detailed visualization of DNA sequences.

    • At a further ~10,000X zoom, DNA sequences can be visualized in depth.

Eukaryotic Genome Organization

  • Components of the Eukaryotic Genome:

    • Exons: Sequences that code for proteins.

    • Introns: Non-coding sequences between exons.

    • Intergenic Spaces: Non-coding sequences found between genes.

Contribution of Transposable Elements
  • Transposable Elements Include:

    • LTR retrotransposons

    • DNA transposons

    • SINEs (Short Interspersed Nuclear Elements)

    • LINEs (Long Interspersed Nuclear Elements)

    • Also present in the genome, but not transposable elements: miscellaneous unique sequences and heterochromatin.

  • Contribution to Variation: Each class of element makes up a different share of the genome, and the movement and copying of these elements contribute to genetic diversity.

Gene Counts and Genome Sizes of Various Organisms

Domain      Organism                    Haploid Genome Size (Mb)   Number of Genes   Genes per Mb
---------   -------------------------   ------------------------   ---------------   ------------
Bacteria    Haemophilus influenzae      1.8                        1,700             940
Bacteria    Escherichia coli            4.6                        4,400             950
Archaea     Archaeoglobus fulgidus      2.2                        2,500             1,130
Archaea     Methanosarcina barkeri      4.8                        3,600             750
Eukaryotes  Saccharomyces cerevisiae    12                         6,300             525
Eukaryotes  Utricularia gibba           82                         28,500            348
Eukaryotes  Caenorhabditis elegans      100                        20,100            200
Eukaryotes  Arabidopsis thaliana        120                        27,000            225
Eukaryotes  Drosophila melanogaster     165                        14,000            85
Eukaryotes  Daphnia pulex               200                        31,000            155
Eukaryotes  Zea mays                    2,300                      32,000            14
Eukaryotes  Ailuropoda melanoleuca      2,400                      21,000            9
Eukaryotes  Homo sapiens                3,000                      21,300            7
Eukaryotes  Paris japonica              149,000                    ND                ND
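
The "Genes per Mb" column is simply the gene count divided by the haploid genome size. A minimal sketch of that arithmetic, using four rows copied from the table (the function and variable names are illustrative, not part of the notes):

```python
# Genome size (Mb) and gene count for a few organisms, copied from the table.
genomes = {
    "Haemophilus influenzae": (1.8, 1_700),
    "Saccharomyces cerevisiae": (12, 6_300),
    "Drosophila melanogaster": (165, 14_000),
    "Homo sapiens": (3_000, 21_300),
}


def genes_per_mb(size_mb, gene_count):
    """Gene density: number of genes divided by haploid genome size in Mb."""
    return gene_count / size_mb


densities = {
    name: genes_per_mb(size, genes) for name, (size, genes) in genomes.items()
}

for name, density in densities.items():
    print(f"{name}: {density:.0f} genes per Mb")
```

The table rounds some densities (for example to the nearest 10 for the prokaryotes), so small differences from the computed values are expected. The trend is the point: gene density falls sharply as genome size grows, because large eukaryotic genomes are mostly non-coding DNA.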

Evolutionary History in Genomes

  • Gene Sequence Similarity: Related organisms share similar gene sequences, which can be utilized to determine evolutionary relationships.

  • DNA Change Rates: The type of DNA influences its rate of change, affecting evolutionary interpretations:

    • Ribosomal RNA: changes slowly and is useful for detecting ancient relationships.

    • Mitochondrial DNA: evolves relatively quickly and is used for more recent evolutionary analyses.

Molecular Homology

  • Process: DNA sequences are aligned to identify similar sequences between different species, where closely related species differ only slightly, while distant species may have significantly different sequences.

  • Challenges: Insertions and deletions can complicate the sequence alignment of closely related species.
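
Once sequences are aligned, the simplest measure of divergence is the proportion of aligned positions that differ. The sketch below assumes the alignment has already been done; the three sequences and species names are hypothetical, and columns containing a gap ("-") are skipped, which is one common but not universal convention.

```python
from itertools import combinations

# Hypothetical, already-aligned sequences ("-" marks an alignment gap).
alignment = {
    "species_A": "ATGACCGTTA",
    "species_B": "ATGACCGTCA",
    "species_C": "ATGTCC-TCA",
}


def proportion_different(seq1, seq2):
    """Fraction of aligned, gap-free columns at which two sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    compared = [(a, b) for a, b in zip(seq1, seq2) if a != "-" and b != "-"]
    if not compared:
        raise ValueError("no gap-free columns to compare")
    mismatches = sum(a != b for a, b in compared)
    return mismatches / len(compared)


for name1, name2 in combinations(alignment, 2):
    p = proportion_different(alignment[name1], alignment[name2])
    print(f"{name1} vs {name2}: {p:.3f}")
```

Smaller values suggest a closer relationship. The gap in species_C shows why indels complicate matters: a misplaced gap would shift every downstream column and inflate the apparent number of differences.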

Gene Types

Homologous Genes
  • Orthologous Genes: Homologous genes in different species that diverged through a speciation event. Because they track the species' own history, they are the most useful for inferring evolutionary relationships and building phylogenetic trees.

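  • Paralogous Genes: Homologous genes that diverged through a gene duplication event. The copies usually sit within the same genome, but genes in different species are also paralogs if their last common ancestor was a duplication.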

  • Two Main Types of Homologous Genes: Orthologous and paralogous genes must be correctly identified to avoid misinterpretations in molecular evolution studies.
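
One way to make the distinction concrete is to label each branching point of a gene tree as a speciation or a duplication, then look at the event where two genes last shared an ancestor. The sketch below assumes such a labeled tree is already available; the gene names and the nested-tuple representation are illustrative choices, not taken from the notes.

```python
# Gene tree as nested tuples: (event, left_subtree, right_subtree).
# Leaves are gene names. All names here are illustrative labels.
GENE_TREE = (
    "duplication",
    ("speciation", "human_alpha", "mouse_alpha"),
    ("speciation", "human_beta", "mouse_beta"),
)


def leaves(tree):
    """Return the set of gene names found under this node."""
    if isinstance(tree, str):
        return {tree}
    _event, left, right = tree
    return leaves(left) | leaves(right)


def lca_event(tree, gene_a, gene_b):
    """Return the event at the most recent common ancestor of two distinct genes."""
    if isinstance(tree, str) or not {gene_a, gene_b} <= leaves(tree):
        raise ValueError("both genes must be distinct leaves of the tree")
    event, left, right = tree
    for child in (left, right):
        if {gene_a, gene_b} <= leaves(child):
            return lca_event(child, gene_a, gene_b)
    return event


def relationship(tree, gene_a, gene_b):
    """Orthologs diverged at a speciation; paralogs diverged at a duplication."""
    if lca_event(tree, gene_a, gene_b) == "speciation":
        return "orthologs"
    return "paralogs"


print(relationship(GENE_TREE, "human_alpha", "mouse_alpha"))
print(relationship(GENE_TREE, "human_alpha", "human_beta"))
print(relationship(GENE_TREE, "human_alpha", "mouse_beta"))
```

The third comparison is the one that causes misinterpretation: the two genes come from different species, yet they are paralogs, so comparing them would date the duplication rather than the split between the species.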

Phylogenetic Trees

  • Ortholog and Paralog Tree Construction: Illustrates the relationships between different species and their gene sequences based on evolutionary history.

Central Dogma of Molecular Biology

  1. Process Overview: DNA is transcribed into RNA, which is then translated into protein.

  2. Example:

    • Sequence: 5′ - A T G A C A C T - 3′ (Coding Strand)

    • Complement: 3′ - T A C T G T G A - 5′ (Template Strand)
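
The example strands can be checked with a short script. This is a minimal sketch that assumes the standard genetic code and reads codons from the first base of the coding strand; the function names are illustrative.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def template_strand(coding):
    """Base-by-base complement, written 3'->5' so it lines up under the coding strand."""
    return coding.translate(COMPLEMENT)


def transcribe(coding):
    """The mRNA matches the coding strand, with U in place of T."""
    return coding.replace("T", "U")


def translate(coding):
    """Translate complete codons from the first base, stopping at a stop codon."""
    protein = []
    for i in range(0, len(coding) - len(coding) % 3, 3):
        amino_acid = CODON_TABLE[coding[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


coding = "ATGACACT"  # coding strand from the example, 5'->3'
print("template:", template_strand(coding))
print("mRNA:    ", transcribe(coding))
print("protein: ", translate(coding))
```

The eight-base example contains two complete codons, ATG and ACA, so the translation is Met-Thr (one-letter code MT); the trailing CT is an incomplete codon and is ignored.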

Alterations of Chromosome Structure

  • Types of Changes:

    • Deletion: Removal of a chromosomal fragment.

    • Duplication: Repetition of a chromosomal segment.

    • Inversion: Reversal of segment orientation within a chromosome.

    • Translocation: Movement of a segment from one chromosome to another.
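
The four rearrangements can be illustrated by treating a chromosome as a string of lettered segments. This is a toy sketch: the segment letters and index arguments are hypothetical, and the translocation shown is a one-way transfer, matching the definition above.

```python
def deletion(chrom, start, end):
    """Remove the segment chrom[start:end]."""
    return chrom[:start] + chrom[end:]


def duplication(chrom, start, end):
    """Repeat the segment chrom[start:end] immediately after itself."""
    return chrom[:end] + chrom[start:end] + chrom[end:]


def inversion(chrom, start, end):
    """Reverse the orientation of the segment chrom[start:end]."""
    return chrom[:start] + chrom[start:end][::-1] + chrom[end:]


def translocation(donor, recipient, start, end, insert_at):
    """Move donor[start:end] into the recipient chromosome at insert_at."""
    segment = donor[start:end]
    new_donor = donor[:start] + donor[end:]
    new_recipient = recipient[:insert_at] + segment + recipient[insert_at:]
    return new_donor, new_recipient


chromosome_1 = "ABCDEFGH"
chromosome_2 = "MNOPQR"

print(deletion(chromosome_1, 2, 4))                         # ABEFGH
print(duplication(chromosome_1, 1, 3))                      # ABCBCDEFGH
print(inversion(chromosome_1, 1, 4))                        # ADCBEFGH
print(translocation(chromosome_1, chromosome_2, 5, 8, 2))   # ('ABCDE', 'MNFGHOPQR')
```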

Mutations

  • Definition: Mutations are modifications in the genetic information of a cell, which can occur at various scales.

  • Point Mutations: Changes in a single nucleotide pair, which may lead to abnormal protein production or genetic disorders.

Insertions and Deletions (Indels)
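  • Effects on Protein: Indels often affect the resulting protein more drastically than substitutions. When the number of bases inserted or deleted is not a multiple of three, the reading frame shifts (a frameshift mutation) and every downstream codon is read differently.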
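
The effect of an indel on the reading frame can be seen by splitting a sequence into codons before and after an insertion. A minimal sketch, using a hypothetical 12-base coding sequence and illustrative helper names:

```python
def codons(seq):
    """Split a sequence into consecutive triplets, reading from the first base."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def insert_bases(seq, position, bases):
    """Return seq with bases inserted before the given 0-based position."""
    return seq[:position] + bases + seq[position:]


original = "ATGGCATTCGGA"  # hypothetical coding sequence
one_base = insert_bases(original, 3, "C")       # length not a multiple of 3
three_bases = insert_bases(original, 3, "CCC")  # a whole extra codon

print("original:      ", codons(original))
print("1-base insert: ", codons(one_base))
print("3-base insert: ", codons(three_bases))
```

After the single-base insertion, every codon downstream of the insertion point changes and the final triplet is incomplete. After the three-base insertion, one codon is added and the downstream codons are read exactly as before.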

Alleles and Their Variation

  • Heterozygous vs. Homozygous: An individual is heterozygous at a locus when it carries two different alleles there, and homozygous when both alleles are identical.

Point Mutation Types

  • Silent Mutation: No change to the amino acid. The change is synonymous, which is possible because the genetic code is redundant.

  • Missense Mutation: Leads to one amino acid change; may be harmful or occasionally beneficial.

  • Nonsense Mutation: Converts an amino acid codon into a premature stop codon, producing a truncated and usually non-functional protein.
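
The three categories can be assigned mechanically by translating a codon before and after a single-base change. The sketch below assumes the standard genetic code and an original codon that encodes an amino acid; the function name and example codons are illustrative.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_point_mutation(codon, position, new_base):
    """Classify a single-base change in a sense codon (position is 0, 1, or 2)."""
    old_amino_acid = CODON_TABLE[codon]
    if old_amino_acid == "*":
        raise ValueError("this sketch only handles codons that encode an amino acid")
    mutant = codon[:position] + new_base + codon[position + 1:]
    new_amino_acid = CODON_TABLE[mutant]
    if new_amino_acid == old_amino_acid:
        return "silent"
    if new_amino_acid == "*":
        return "nonsense"
    return "missense"


print(classify_point_mutation("GAG", 2, "A"))  # GAG -> GAA, Glu -> Glu
print(classify_point_mutation("GAG", 1, "T"))  # GAG -> GTG, Glu -> Val
print(classify_point_mutation("TGG", 2, "A"))  # TGG -> TGA, Trp -> stop
```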

Likelihood of Silent Mutation
  • Changes to the third base of a codon are the most likely to be silent.

  • Changes to the start codon (AUG) are never silent, because AUG is the only codon for methionine.
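
Both statements can be checked by enumerating every possible single-base change under the standard genetic code. This sketch considers only the 61 codons that encode an amino acid and counts a change as silent when the amino acid is unchanged; the helper names are illustrative.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def single_base_mutants(codon):
    """Yield (position, mutant_codon) for all nine single-base changes."""
    for position in range(3):
        for base in BASES:
            if base != codon[position]:
                yield position, codon[:position] + base + codon[position + 1:]


def silent_fraction_by_position():
    """Fraction of single-base changes that are silent, for codon positions 1-3."""
    silent = [0, 0, 0]
    total = [0, 0, 0]
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid == "*":
            continue  # skip stop codons
        for position, mutant in single_base_mutants(codon):
            total[position] += 1
            silent[position] += CODON_TABLE[mutant] == amino_acid
    return [s / t for s, t in zip(silent, total)]


fractions = silent_fraction_by_position()
for position, fraction in enumerate(fractions, start=1):
    print(f"codon position {position}: {fraction:.1%} of changes are silent")

# Number of silent single-base changes available to the start codon.
atg_silent = sum(CODON_TABLE[m] == "M" for _, m in single_base_mutants("ATG"))
print("silent changes to ATG:", atg_silent)
```

The third position has by far the highest silent fraction, the first position allows only a handful of silent changes, and the second position allows none among amino acid codons.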

Nucleotide Substitutions and Typology

  • Synonymous Substitutions: Do not alter the amino acid sequence and are generally treated as neutral (or nearly so).

  • Non-synonymous Substitutions: Change the amino acid; may be adaptive, deleterious, or neutral.

  • Non-coding DNA Effects: Substitutions in non-coding DNA can change gene expression levels and, through them, the phenotype.

Causes of Genomic Diversity Among Species

  • Genetic Variation: Mutations change base pair sequences; lethal mutations are typically removed by natural selection.

  • Hypotheses of Variation:

    • Selectionist Hypothesis: Most genetic variation is shaped primarily by natural selection.

    • Neutral Hypothesis: Most genetic variation is selectively neutral and changes in frequency by random chance (genetic drift).

    • Nearly Neutral Hypothesis: Many mutations have only slight fitness effects, so both natural selection and random chance shape genetic variation.

Selectionist vs. Nearly Neutral vs. Neutral Theories

  • Selectionist Theory:

    • Most mutations affect fitness, so natural selection determines which variants persist.

  • Neutral Theory:

    • Most mutations that persist as molecular variation are selectively neutral, and their frequencies change through random genetic drift.

    • Implications: Population size sets the balance between selection and drift. Drift has a stronger influence in small populations, while selection is more effective in large ones.
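
The effect of population size can be illustrated with a standard population-genetics result that is not given in these notes: Kimura's diffusion approximation for the probability that a single new mutation eventually becomes fixed. The sketch below assumes a diploid population whose effective size equals its census size N, and a selection coefficient s (negative for deleterious mutations). The function name and the chosen values are illustrative.

```python
import math


def fixation_probability(s, n):
    """Approximate fixation probability of one new mutation (Kimura's formula).

    s: selection coefficient (0 = neutral, negative = deleterious).
    n: diploid population size, assumed equal to the effective size.
    Very large values of 4 * n * |s| with negative s would overflow math.expm1.
    """
    if s == 0:
        return 1 / (2 * n)  # a neutral mutation fixes with its starting frequency
    return math.expm1(-2 * s) / math.expm1(-4 * n * s)


s = -0.001  # hypothetical slightly deleterious mutation
for n in (100, 10_000, 100_000):
    neutral = fixation_probability(0, n)
    deleterious = fixation_probability(s, n)
    print(f"N = {n:>7}: relative to neutral = {deleterious / neutral:.3g}")
```

With these values, the slightly deleterious mutation fixes almost as often as a neutral one when N is 100, but has essentially no chance when N is 100,000. The same mutation therefore behaves as effectively neutral in a small population and as clearly selected against in a large one.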