GENETICS 2581 - MODULE 4

MODULE 4 - Origin of Genome Sequence Variation 

— Learning Outcomes

  1. Classify changes to genome sequence

  2. Identify the origin of mutations

  3. Identify the consequences of the mobilization of transposable elements and their effects on genome variation

  4. Distinguish between different mechanisms of transposition

  5. Use genome-wide association studies to map genes that are important for phenotypic variation 


— 4.1 Types of Mutations

Mutations, four classes of DNA sequence changes, mutation rates, consequences of mutation rates 

  • Mutation 

  • Stable genome sequence variations; not transient

  • A small proportion of genome sequence variations result in change of phenotype

  • (mutation ≠ new phenotype)

  • Classes of DNA sequence variations 

  1. Substitutions

  2. Indels 

  3. Inversions

  4. Translocations 

  • Substitutions 

  • Single base substitutions result in SNPs; a base substitution results in a new allele (1-bp change) 

  • Substitutions can be subclassed into transitions and transversions

    • Purine → purine OR pyrimidine → pyrimidine = transitions 

    • Pyrimidine → purine OR purine → pyrimidine = transversions 
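A minimal Python sketch of the transition/transversion rule above. The function name `classify_substitution` is made up for illustration, and it only handles single A/C/G/T bases.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not {ref, alt} <= PURINES | PYRIMIDINES:
        raise ValueError("expected two different bases from A, C, G, T")
    # Same chemical class on both sides (purine/purine or pyrimidine/pyrimidine)
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Tally all 12 possible single-base substitutions by class.
counts = {"transition": 0, "transversion": 0}
for ref in "ACGT":
    for alt in "ACGT":
        if ref != alt:
            counts[classify_substitution(ref, alt)] += 1
```

Of the 12 possible substitutions, `counts` ends up with 4 transitions and 8 transversions.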

  • Indels (insertions/deletions) 

  • Smallest: insertions and deletions of a single base pair 

  • Larger: insertion or deletion of many bases (ex. with transposons) 

    • Triangle: large insertion 

    • Gap: large deletion 







  • Inversions

  • Smallest: inversion of two bases ( complement + flipped ) 

  • Larger inversion: Can be Mb in length 

  • Sequences identical at the beginning and end; reverse complement of the reference
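The "reverse complement of the reference" idea can be sketched in Python as follows. The helper names `reverse_complement` and `invert` and the example sequence are illustrative assumptions, and only the top strand is modelled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement every base, then read the result backwards."""
    return seq.translate(COMPLEMENT)[::-1]


def invert(seq: str, start: int, end: int) -> str:
    """Invert seq[start:end] (0-based, end exclusive); flanks stay unchanged."""
    return seq[:start] + reverse_complement(seq[start:end]) + seq[end:]


reference = "AAACGTGGTTT"
inverted = invert(reference, 3, 8)   # the segment CGTGG is inverted
restored = invert(inverted, 3, 8)    # inverting the same span again undoes it
```

The sequence before and after the inverted span is identical to the reference; only the middle segment reads as its reverse complement.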

  • Translocations 

  • Joining of two regions from 2 different chromosomes together 

  • Movement of DNA between  non-homologous chromosomes; ds-break (if homologous = regular crossing over) 

  • Reciprocal translocation = movement/exchange in both directions of both chromosomes

  • Non-reciprocal translocation = movement in one direction 

  • Mutation rate (how frequent are changes occurring) 

  • Rate = d/t 

  • Mutations over time: per n number of hours, days, years, lifetime…. 

  • Relative to biologically relevant events: per cell division/replication round, per gamete, per generation… 

  • 2 main mutation rates to consider 

  • Gene mutation rate: 

    • how often a wild-type allele changes to a mutant allele 

    • Impacted by the size of the genes identified & how they were assessed

  • Sequence mutation rate: 

    • how often a stable change in sequence (per base pair) within the whole genome 

    • Bacterial rate: 1-10 x 10^-10 per bp per division 

      • 1-10 DNA sequence changes per base pair per 10 billion divisions 

      • 1-10 DNA sequence changes per 10 billion base pairs per division 
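A short worked calculation of what this rate means per genome. The genome size of 4.6 million bp is an assumption (roughly the size of an E. coli genome) and is not taken from these notes; `expected_mutations` is an illustrative helper name.

```python
def expected_mutations(rate_per_bp_per_division: float,
                       genome_size_bp: int,
                       divisions: int = 1) -> float:
    """Expected number of new sequence changes = rate x base pairs x divisions."""
    return rate_per_bp_per_division * genome_size_bp * divisions


GENOME_SIZE_BP = 4_600_000  # assumed genome size, for illustration only

low = expected_mutations(1e-10, GENOME_SIZE_BP)    # lower end of the 1-10 x 10^-10 range
high = expected_mutations(10e-10, GENOME_SIZE_BP)  # upper end of the range

# The two readings of the rate in the notes give the same expectation (about 1):
per_bp_over_many_divisions = expected_mutations(1e-10, 1, 10**10)
per_division_over_many_bp = expected_mutations(1e-10, 10**10, 1)
```

Under that assumed genome size, `low` and `high` come out to roughly 0.00046 and 0.0046 changes per genome per division, so most divisions introduce no new change at all.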

  • Consequences of mutation rate (change = generation of variation) 

  1. Evolutionary change

  2. Mosaicism 

  3. Disease 

  4. Animal cloning

*Somatic sequence variation: ~1 mutation per division in the genome

  • For cloned mammals: somatic cell accumulates mutations so the cloned individual differs from the donor


— 4.2 Origin of Change 

How spontaneous and mutagen-induced mutations occur, compare chemical structure of a mutagenic base analog with nucleic acids & impact on base pairing, origin of mutations based on DNA sequence changes 

  • Intro

  • Transitions are about twice as common as transversions, even though there are more possible transversions

  • The mechanisms that give rise to mutations affect how common each type is 

  • Spontaneous mutations (don’t require external catalyst)

  1. Replication errors

  • Tautomeric shifts

  • Wobble 

  • Strand slippage 

  2. Unequal crossing over during meiosis 

  3. Chemical changes

  • Deamination 

  • Depurination

  • Tautomeric shifts (replication error)

  • Rare tautomeric forms can occur for each of the bases (ATCG) 

  • Shift in proton (H+) from one molecular pairing to another; changes base pairing with respective partners

    • Standard base pairing: T=A, C≡G 

    • When a base is in its tautomeric form, it introduces a mismatch if present during replication as an ‘incorporated error’ (C=A, T≡G) 


















  • Hydrogen bonding basics 

  • H-bonding results from imbalances of electrons in polar molecules without altering overall neutral charge (but relative electronegative forces differ per side)

  • Causes alterations in different attractive forces between slightly positive/negative sides of different polar molecules

  • Hydrogen forms H-bonds with N, O, F (highly electronegative) regardless of other attachments in the molecule

  • Leads to anomalous base pairing arrangements → introduces mismatch if present during replication

  • Wobble (replication error)

  • ‘Flexibility’ in the bases; ‘flipping’ can cause incorrect base pairing

  • Wobble form can introduce mismatch during replication → incorporated error

  • More rare than tautomers




















  • Strand slippage (replication error) 

  • For low-complexity DNA strands (i.e., with many repeats), one strand may loop out into a ‘bubble’ as the polymerase moves along the template strand 

    • Introduces short insertions or deletions → strand slippage; common in DNA replication
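A toy Python sketch of strand slippage in a repeat tract. It assumes the usual textbook picture, which these notes do not spell out: a loop in the newly synthesized strand adds one repeat unit, and a loop in the template strand drops one. The function name and arguments are illustrative.

```python
from typing import Optional


def replicate_repeat_tract(unit: str, copies: int,
                           looped_strand: Optional[str] = None) -> str:
    """Return the repeat tract carried by the new strand after one replication."""
    if looped_strand is None:
        new_copies = copies          # faithful replication
    elif looped_strand == "new":
        new_copies = copies + 1      # one unit is copied twice: short insertion
    elif looped_strand == "template":
        new_copies = copies - 1      # one template unit is skipped: short deletion
    else:
        raise ValueError("looped_strand must be None, 'new' or 'template'")
    return unit * max(new_copies, 0)


faithful = replicate_repeat_tract("CA", 4)
expanded = replicate_repeat_tract("CA", 4, looped_strand="new")
contracted = replicate_repeat_tract("CA", 4, looped_strand="template")
```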

  • Unequal crossing over (spontaneous)

  • For low-complexity repetitive DNA regions; occurs during crossing over in meiosis or recombination during repair 

  • When homology lines up resulting in exchange of genetic material, there may be section of misalignment in low-complexity regions

    • Crossing over can introduce short insertions/deletions when material is exchanged


















  • Deamination (chemical change) 

  • Spontaneous removal of an amine (NH2) group by hydrolysis (H2O is consumed in the process)

  • Can occur naturally although DNA is relatively stable 

    • Cytosine → cytosine deamination → uracil 

    • Uracil is readily recognized by repair machinery (a specific DNA glycosylase) and rarely results in mutation; it is removed through base-excision repair 

  • DNA methylation also in mammals… 

    • DNA methylation occurs on certain cytosines throughout the genome: CH3 is added onto the cytosine base, so deamination of a methylated cytosine produces thymine instead of uracil 

    • Thymine less recognizable by repair machinery; pairs with A during replication → incorporated error
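The contrast between deaminating a plain cytosine and a methylated cytosine can be sketched as a small Python lookup. The symbol `"M"` for 5-methylcytosine is a notation chosen here, and the sketch assumes, as a simplification, that every uracil is caught by repair and every thymine escapes it.

```python
# Deamination products: cytosine gives uracil, methylated cytosine ("M") gives thymine.
DEAMINATION_PRODUCT = {"C": "U", "M": "T"}


def deaminate(base: str) -> str:
    """Return the base produced by deamination (other bases are left unchanged here)."""
    return DEAMINATION_PRODUCT.get(base, base)


def repair_uracil(base: str) -> str:
    """Uracil does not belong in DNA, so it is recognized and restored to C; T is not."""
    return "C" if base == "U" else base


def base_after_damage_and_repair(base: str) -> str:
    return repair_uracil(deaminate(base))


plain = base_after_damage_and_repair("C")       # damage is reversed
methylated = base_after_damage_and_repair("M")  # stays T, pairs with A next round
```

In this simplified picture the plain cytosine is restored, while the methylated cytosine ends up as a C-to-T transition.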

  • Depurination (chemical change) 

  • Loss of purine (guanine, adenine) due to a break in the bond between a deoxyribose and purine base

  • Backbone remains intact → results in a apurinic or abasic site 

  • During replication, gap at that location causes DNA polymerase to incorporate a random base leading to errors and mutation

    • Note: in bacteria, incorporation is commonly +adenine resulting in transversion 

  • Mutagens → initiate changes in the DNA 

  1. Base analogs

  2. Alkylating agents

  3. Deaminating chemicals

  4. Hydroxylamine 

  5. Oxidative radicals

  6. Intercalating agents

  7. Ionizing radiation and UV light 

  • Base analogs (mutagens) 

  • Molecules which mimic a normal base and can be incorporated during replication instead of the original base 

    • (ex.) 5-Bromouracil mimics thymine; normally can replace thymine and pair with adenine unless it enters an ionized form (loss of hydrogen), which can cause mispairing and error during replication 

    • Ionized form (NH → N-) can cause mispairing error in replication, resulting in transition 

  • Alkylating agents (mutagens) 

  • Add alkyl groups (methyl, ethyl, etc.) to bases, which can cause incorporation error by mispairing

    • (ex.) EMS adds an ethyl group to G or T; when ethyl is added to guanine it frees 2 available H-bonding sites which can pair with thymine

    • If thymine were ethylated, it would lead to transition to cytosine instead 

  • Deaminating chemicals (mutagens) 

  • Nitrous acid (HNO2) causes deamination of C or A 

  • Results in incorporation error by mispairing, resulting in transition 

  • Hydroxylamine (mutagens)

  • Adds hydroxyl group (OH) to cytosine and loses NH2; the change in electronegativity causes incorporation error by mispairing (transition) 

  • Oxidative radicals/ROS (mutagens) 

  • Molecules which contain oxygen with an odd number of electrons; highly reactive and can alter bases

  • Addition of oxygen ends up ‘twisting’ the molecule and the wrong hydrogen ends up base pairing → transversion results 

  • Intercalating agents (mutagens)

  • Intercalating agents fit between adjacent bases and distort the helix structure; causes single-nucleotide insertions and deletions in replication 

  • Ionizing radiation and UV light (non-ionizing) - mutagens 

  • Ionizing radiation has high energy and can penetrate tissues, dislodge electrons, and cause double-stranded breaks 

    • Double-stranded breaks must be repaired for the cell to survive and replicate

    • Can introduce various mutations due to double-stranded break repair 

  • UV light has less energy and causes adjacent pyrimidine bases to form covalent bonds; distorts the helix

    • Often is T-T (thymine dimers) 

    • Must be repaired, or they block replication/transcription

  • DNA repair 

  • Plays a large role in how mutagens and spontaneous DNA changes are managed in the cell

  • 4 major repair mechanisms: mismatch repair, direct repair, base-excision repair, nucleotide-excision repair

  • Replication proofreading and various repair mechanisms have essentially evolved to minimize these mutations

    • (ex.) mismatch repair: after replication; deals with template strand DNA differences with newly synthesized strand

      • In bacteria: utilizes DNA methylation of adenine on template strand to differentiate between strands and presume the new strand is falsely incorporated

      • In humans: strands are differentiated by ‘nicks’ present in the newly synthesized strand but not in the old template strand 


— 4.2 — DNA Damage and Repair Review

  • DNA damage: any unintended physical or chemical changes to DNA

  • Changes to DNA molecules are permanent, unlike changes to transient RNA

  • Can cause impaired cell function, cell death, cancer (more commonly harmful) 

  • Goal of a living organism’s cells: preserve the existing DNA sequence

    • Cells have evolved repair mechanisms to correct and repair damaged DNA

  • Copying mistakes, depurination, deamination, pyrimidine dimers, strand breaks (spontaneous)

  • Copying mistakes: DNA polymerase makes one error every ~10^6 to 10^7 nucleotides in humans 

  • DNA polymerase may incorporate the wrong base into strand (mismatch), extra bases (insertion), skip base (deletion) 

  • Results in change in helix structure; may also have functional consequences 

  • Depurination: loss of A/G base

  • Spontaneous hydrolysis of adenine or guanine base, resulting in abasic site with no base but sugar-phosphate backbone remains intact

  • Blocks DNA replication; overcome by translesion DNA polymerases recruited past the site of damage → make mistakes more frequently and will delete base/ add random base to abasic site

  • Deamination: conversion of amine (NH2) to carbonyl

  • Most common at C to produce uracil

  • Change does not hinder DNA replication but may affect protein/gene regulation 

  • Pyrimidine dimers: UV light can cause formation of 4-membered carbon ring between adjacent pyrimidines

  • Most common at adjacent T-T residues 

  • Block DNA replication and overcome by translesion polymerases past damage site 

  • Other base modifications (ionizing radiation, chemical mutagens)

  • Large numbers of reactive compounds can modify structure and function of DNA bases 

  • Strand breaks 

  • Ionizing radiation, mechanical stress, single/double stranded

  • Can result in incomplete replication or chromosomal rearrangements


  • DNA repair systems 

  1. Proofreading by DNA polymerase during DNA replication

  2. Mismatch repair → repairs replication mistakes 

  • In bacteria: most bacteria methylate certain DNA bases; mature DNA is methylated on both strands, but addition of the methyl group takes time, so the new strand is briefly missing its methyl group → repair machinery scans the DNA in either direction from the mismatch to find a site methylated on only one strand 

    • Endonuclease nicks the unmethylated (new) strand at that site

    • Exonuclease digests DNA from newly synthesized end from site of methylation to site of mismatch 

    • DNA polymerase fills gap using intact strand as synthesis template; ligase seals 

  • In eukaryotes: mismatch repair mechanism not fully understood but exists to reduce error rate 

  3. Direct repair → repair of specific modified bases 

  • Ex. methyltransferase only repairs one type of damage 

  4. Base excision repair (BER) → fixes ‘localized’ damage 

  • Usually affecting only a single base

  • Repairs cytosine deamination, modified bases, abasic sites, single stranded breaks 

  • Can be SHORT or LONG-patch 

    1. Short patch → default pathway (SINGLE base)

    • DNA glycosylase breaks the N-glycosidic bond joining the base to deoxyribose, removing the damaged base to form an abasic site 

    • Endonuclease cuts the sugar-phosphate backbone at the abasic site

    • Free 3’ hydroxyl allows repair DNA polymerase to use the undamaged strand as template; the nick is sealed by ligase

    2. Long patch → longer stretch of DNA replaced if necessary 

    • Replaces up to 10 bases on damaged strand 

  5. Nucleotide excision repair (NER) → repairs pyrimidine dimers and double-helix disrupting damage 

  • Helicase unwinds strands around damage site

  • Endonuclease cuts the damaged strand at two sites about 30 bases apart and removes the nucleotides in between all at once 

  • Repair DNA polymerase uses 3’ end on one side of the damage to synthesize replacement DNA and sealed by ligase 


— 4.3 Transposons 

Distinguish Type 1 and 2 transposons, identify mutations caused by transposons, describe general structure of transposon, explain general mechanism used by transposons 

  • Dr. Barbara McClintock 

  • Transposons and corn kernels → observed kernels may have variable phenotypes where kernels can be both pigmented and unpigmented

  • Transposons are responsible for the variable pigmentation → genomes are not static 

  • Transposons 

  • Noncoding DNA present in many genomes 

  • Can induce a DNA break and copy themselves into a new locus 

  • “Selfish” or “parasitic” DNA ⇒ analogous to copying action of retroviruses 

  • Transposable elements are one of the most prevalent forms of non-coding DNA in human genome

  • Consequences of transposition

  1. Massive increase in genome size

    1. Mass mobilization of transposons is suppressed in most organisms to mitigate detrimental effects 

    2. ‘Selfish’ nature of transposons has co-evolved with transposition-suppression mechanisms 

  2. Splice into genes, disrupting function 

    1. Transposons are scattered in the genome; if transposon inserted into coding sequence → disruption → inactive gene 

  3. Altered gene expression rather than turning off

    1. Can insert into upstream regulatory sequences to decrease expression of the gene 

    2. Transposons may contain their own regulatory sequence, affecting expression of adjacent genes 

  4. Large-scale genome rearrangements 

    1. Transposons are relatively large and often contain the exact same DNA sequence 

    2. If sequences are copied, they become homologous and can pair to recombine and rearrange the genome 

  • Orientations of transposons 

  • DNA is double-stranded and run antiparallel; direct or inverted orientation possible 

  • Orientations have an impact on the type of potential genome rearrangements 

  • Direct orientation: sequence is present in the same strand for a transposon 

  • Inverted orientation: sequence of transposon is present in 2 different DNA strands and run antiparallel to one another 

  • Genome rearrangements 

  1. Deletion 

    1. Direct orientation transposons present on the same chromosome; ‘flanking’ gene regions 

    2. If pairing and recombination occurs ⇒ splicing out of everything in between

    3. Resulting chromosome has 1 transposon and ‘exterior’ genes left

  2. Inversion 

    1. Inverted orientation transposons pair with one another to obtain an inversion of genes in between (ie. ‘folding over’ of the same chromosome segment)

    2. Results in chromosome inversion where the genes in between now are ‘flipped’ in the opposite orientation 

  3. Intermolecular interactions between different chromosomes 

    1. Two copies of the same chromosome where pairing is misaligned 

    2. Transposon 1 pairs with transposon 2 on 2 separate chromosomes

    3. Will obtain ‘swapping’ of chromosomal fragments where both chromosomes contain fragments of one another ( ⇒ deletions, duplication) 

  4. Direct orientation of DIFFERENT chromosomes (translocations) 

    1. Homologous pairing of directly oriented transposons on two different chromosomes leads to translocations

    2. Segments of both chromosomes are swapped, leading to ‘chromosomal translocations’
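The same-chromosome cases above (deletion with direct repeats, inversion with inverted repeats) can be sketched in Python. The chromosome is modelled as a list of labels, with `"TE>"` and `"TE<"` as made-up tokens for the two transposon orientations; in a real inversion each gene in the flipped segment also switches strand, which this toy model does not track.

```python
TE_TOKENS = {"TE>", "TE<"}


def recombine(chrom: list, i: int, j: int) -> list:
    """Recombine the transposon copies at positions i < j on one chromosome."""
    if not (0 <= i < j < len(chrom)
            and chrom[i] in TE_TOKENS and chrom[j] in TE_TOKENS):
        raise ValueError("i and j must index two transposon copies with i < j")
    if chrom[i] == chrom[j]:
        # Direct orientation: everything in between is spliced out with one
        # transposon copy, leaving one copy plus the 'exterior' genes.
        return chrom[:i + 1] + chrom[j + 1:]
    # Inverted orientation: the segment in between is flipped.
    return chrom[:i + 1] + chrom[i + 1:j][::-1] + chrom[j:]


direct = ["A", "TE>", "B", "C", "TE>", "D"]
inverted = ["A", "TE>", "B", "C", "TE<", "D"]

after_deletion = recombine(direct, 1, 4)
after_inversion = recombine(inverted, 1, 4)
```

Here `after_deletion` keeps genes A and D with a single transposon, and `after_inversion` keeps every gene but with B and C in swapped order.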



  • Mechanisms of transposition

  • Broken into 2 types: Type I, Type II transposons 

  • Involves duplication of target sequence 

  • Type II transposons (DNA transposons)

  • Copied and replicated as DNA and remain as DNA throughout transposition mechanism 

  • Can be either ‘copy-and-paste’ or ‘cut-and-paste’ (ie. replicative vs non replicative transposons)

  • Type I (involve RNA intermediate) 

  • ‘Retrotransposons’; mechanism involves an RNA intermediate using reverse transcriptase enzyme 









| Mechanisms of transposition | 

  1. Type 2: 

    1. Transposon ends with terminal inverted repeats (TIR) on both sides 

    2. Transposase enzyme delivers staggered cuts to DNA where the transposon will insert into the target region  (transposase enzyme unique to Type 2)

    3. Staggered end gaps are filled in by DNA polymerase 

    4. Filling in results in ‘flanking direct repeats’ involving duplication of the target sequence 

    5. FDRs are not a part of transposable element and don’t travel with them (are a  consequence of insertion) 

    6. For excision, a transposase recognizes the terminal inverted repeats and excises the TE from its locus 

    7. The ‘flanking direct repeats’ are left behind at the old locus 

      1. Type 2 (replicative) 

        1. Transposon is copied and pasted into a new locus within a chromosome 

        2. Original transposon replicated into a new insertion site 

      2. Type 2 (cut-and-paste)

        1. Transposon is excised and reinserted at a different point in the genome

        2. Leaves a double-stranded break which is repaired, allowing reinsertion at a new locus 
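A rough Python sketch of how a staggered cut plus gap filling produces flanking direct repeats, and what a cut-and-paste excision leaves behind. Only the top strand is modelled; the sequences, the 5-bp stagger, and the function names are illustrative assumptions, and repair of the excision site is simplified to joining the two ends.

```python
def insert_transposon(target: str, transposon: str,
                      cut_pos: int, stagger: int) -> str:
    """Insert a transposon at a staggered cut spanning target[cut_pos:cut_pos+stagger].

    Gap filling copies the staggered stretch on both sides of the element,
    so the target sequence is duplicated as flanking direct repeats.
    """
    if not 0 <= cut_pos <= len(target) - stagger:
        raise ValueError("staggered cut must lie inside the target")
    duplicated = target[cut_pos:cut_pos + stagger]
    return (target[:cut_pos] + duplicated + transposon
            + duplicated + target[cut_pos + stagger:])


def excise_transposon(seq: str, transposon: str) -> str:
    """Cut the element out; the flanking direct repeats are not part of it and stay."""
    start = seq.find(transposon)
    if start == -1:
        raise ValueError("transposon not found")
    return seq[:start] + seq[start + len(transposon):]


TARGET = "GGGATCGACCC"
TRANSPOSON = "CAGTNNNNACTG"   # CAGT ... ACTG are terminal inverted repeats

inserted = insert_transposon(TARGET, TRANSPOSON, cut_pos=3, stagger=5)
footprint = excise_transposon(inserted, TRANSPOSON)
```

After insertion, the stretch `ATCGA` sits on both sides of the element; after excision, both copies remain, so the old locus is 5 bp longer than the original target.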




















  1. Type 1: 

    1. Retrotransposons utilize an RNA intermediate; eukaryote-specific 

    2. Can be long-terminal direct repeats (LTR) or non-LTR (recall: type 2 contains inverted repeats in contrast to LTR of type 1)

    3. Transposon at the DNA level when inserted ⇒ (1) copied into mRNA intermediate ⇒ (2) reverse transcriptase used to make a DNA copy of mRNA ⇒ (3) DNA copy can be inserted into a new cut site via staggered cut 

      1. DNA → RNA → RT DNA → new locus insertion 

    4. Will always occur via copy-and-paste mechanism 

      1. We know this because of similarity to retroviral insertion mechanisms (common ancestry) 

      2. Retroviruses can insert their RNA genome + reverse transcriptase into a host cell ⇒ used to synthesize a cDNA strand ⇒ cDNA inserts into host nuclear DNA for replication 

  • Retrotransposons (T1) and yeast experiment 

  • An intron was inserted into a yeast retrotransposon to determine whether transposition occurs through an RNA or a DNA intermediate 

    • If transposed via DNA intermediate, intron would remain following transposition

    • If transposed via RNA intermediate, intron would be spliced out following transcription 

  • Deduced that intronic sequence was spliced out and new retrotransposon insertions did not contain intron
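The logic of the intron experiment can be laid out as a tiny Python sketch. The element is modelled as a list of labelled segments, and the segment names are placeholders rather than the actual construct used in the experiment.

```python
def transpose_via_dna(element: list) -> list:
    """A DNA intermediate is copied as is, so the intron is carried along."""
    return list(element)


def transpose_via_rna(element: list) -> list:
    """An RNA intermediate is spliced before reverse transcription,
    so the new DNA copy lacks the intron."""
    return [segment for segment in element if segment != "intron"]


marked_element = ["LTR", "part_1", "intron", "part_2", "LTR"]

dna_prediction = transpose_via_dna(marked_element)
rna_prediction = transpose_via_rna(marked_element)
```

The observed result (new insertions without the intron) matches `rna_prediction`, not `dna_prediction`.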


  • Summary of transposition mechanisms: 

  • FDRs arise from transposon insertion; present at both T1 and T2 insertion sites 

  • TIRs are unique to type 2 

  • LTRs are unique to type 1 

  • “Copy and paste” mechanism of T1 and T2 increases genome size and the amount of noncoding DNA 

  • “Cut and paste” mechanism of T2 has no effect on genome size 



— 4.4 Genome-wide Association Studies 

Application of GWAS to horses, dogs, and humans and degree of variation present in any species

  • Recall: genomes are not stable! 

  • In a growing population of genomes with no selection pressure, total alleles increase with each generation

  • Single nucleotide polymorphisms (SNPs) 

  • Are genetic markers and can be different across homologous chromosomes

  • SNPs that are close to one another are rarely separated by a recombination event, so they exist in linkage disequilibrium 

    • Linkage disequilibrium: tends to segregate together; low chance of exchange between the homologous chromosomes

  • Diploid organisms have 2 haplotypes; one from either homologous chromosome
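Linkage disequilibrium between two SNPs can be put into numbers. The sketch below uses the standard D and r² measures, which these notes do not name, computed from counts of the four possible two-SNP haplotypes; the counts are invented for illustration.

```python
def linkage_disequilibrium(n_AB: int, n_Ab: int, n_aB: int, n_ab: int):
    """Return (D, r_squared) from counts of the four two-SNP haplotypes."""
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("need at least one haplotype")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total      # frequency of allele A at SNP 1
    p_B = (n_AB + n_aB) / total      # frequency of allele B at SNP 2
    denominator = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denominator == 0:
        raise ValueError("both SNPs must have two alleles present")
    d = p_AB - p_A * p_B             # excess of the AB haplotype over random pairing
    return d, d * d / denominator


# Alleles always travel together (never separated by recombination):
tight = linkage_disequilibrium(50, 0, 0, 50)
# Alleles pair at random (fully shuffled by recombination):
shuffled = linkage_disequilibrium(25, 25, 25, 25)
```

With these counts, `tight` gives r² = 1 (complete linkage disequilibrium) and `shuffled` gives r² = 0.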

  • Mutation in SNP

  • Given a specific phenotype-differentiating mutation in a SNP, we would assume the mutational change would be found within a group of affected individuals

    • The allele need not be found only within the affected group; it is only expected to be associated with it 

    • Would expect the haplotype of the affected group to be associated with the phenotype caused by mutation 

    • ⇒ because SNPs are close to one another resulting in linkage disequilibrium, the cause of the phenotype must be near SNPs 1-4 or one of the four. 

  • Genome-wide association study 

  • Aims to identify the SNPs associated with the phenotype

  • For large datasets, the Manhattan Plot is a method of visual representation 

  • Y-axis: strength of evidence that an association is NOT random (typically plotted as -log10 of the p-value); higher y-value corresponds to less likely due to random chance

  • X-axis: chromosome positions of individual SNPs, ordered from largest chromosome ⇒ smallest ⇒ X chromosome 

  • Strength of association represented by chance of randomness

  • Patterns: 

    • SNPs not associated tend to NOT be in linkage disequilibrium and instead randomly associate with the phenotype; low y-value

    • SNPs associated with the phenotype tend to show a stronger association the closer they are to the gene 
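A minimal Python sketch of how one point on a Manhattan plot could be computed for a single SNP. It assumes a simple 2x2 chi-square test on allele counts in affected vs unaffected groups, which is a choice made here for illustration (real studies often use regression models), and the counts are invented.

```python
import math


def allelic_association(case_alt: int, case_ref: int,
                        control_alt: int, control_ref: int):
    """Return (chi2, p_value, -log10 p) for a 2x2 allele-count table (1 d.f.)."""
    observed = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in observed]
    col_totals = [case_alt + control_alt, case_ref + control_ref]
    n = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one allele")
    chi2 = 0.0
    for r in range(2):
        for c in range(2):
            expected = row_totals[r] * col_totals[c] / n
            chi2 += (observed[r][c] - expected) ** 2 / expected
    # For 1 degree of freedom, the chi-square tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    score = math.inf if p_value == 0 else -math.log10(p_value)
    return chi2, p_value, score


# SNP near the causal gene: the alternate allele is enriched in affected individuals.
linked_snp = allelic_association(60, 40, 40, 60)
# SNP with no association: same allele frequencies in both groups.
unlinked_snp = allelic_association(50, 50, 50, 50)
```

The third value of each result is the height the SNP would get on the y-axis; the linked SNP scores above 2, while the unlinked SNP sits at 0.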


| Selective breeding - horses and dogs | 

  • Horse breeds and SNP association

  • Through generations of selective breeding following domestication, we obtain various horse breeds with specific traits 

  • Original horse gene pool had enough variation to provide the framework for 300+ distinct horse breeds today 

  • More SNP differences = more distantly related, fewer SNP differences = more closely related

  • GWAS has disrupted horse breeding 

    • traditional breeding relies on luck and ‘dilution’ of good genes will consequently occur with each generation’s progeny 

    • Identifying major gene effects through genomic analyses allows for direct identification of potential winners

  • The thoroughbred horse 

  • We can trace back the origins of the thoroughbred horse to the original 3 horse ancestors of the breed that were then interbred

  • One contributing factor to the speed of thoroughbred horses is a sequenced region of the horse genome

    • The gene important for speed encodes myostatin

    • Myostatin protein suppresses muscle development; slow horses have high levels of myostatin expression whereas fast horses have lower levels of expression and more muscle 

    • Insertion of the SINE transposon into the promoter region of the gene reduces the expression of myostatin ⇒ faster horse and more jacked 

  • Dog breeds and variation 

  • High variation in dog breeds exist within the same species 

  • Ancient and spitz breeds are progenitors of modern dog breeds; phylogeny can be determined by identifying SNPs and grouping them 

  • Conducting association studies can characterize individual phenotypes

    • (ex. ‘Furnishing’ phenotype is RSPO2, fur length is FGF5) 

  • Many genes responsible for genetics of height and size are also responsible for longevity 



  • Human evolution 

  • Homo erectus ⇒ Homo heidelbergensis ⇒ Neanderthals, Denisovans, Homo sapiens 

  • Today many individuals of European and Asian descent still possess Neanderthal and Denisovan haplotypes 

  1. Believed that Homo erectus/heidelbergensis migrated out of Africa ~300 000 years ago 

  2. Homo heidelbergensis descended into Neanderthals (Europe) and Denisovans (Asia)

  3. Rise of Denisovan and Neanderthal haplotypes occurred after they migrated, accumulating mutations through generations 

  4. When Homo sapiens later migrated out of Africa 50 000 years ago, they interbred with Neanderthals and Denisovans (they are not distinct species)

  • Phenotypic classification of ‘species’ is not an adequate method of classification; lots of variation is permitted within us 

  • COVID-19 study 

  • Identified 2 groups who were respectively severely affected by infection and those who were not 

  • When Manhattan Plot was constructed, a strong association could be identified between a region of chromosome 3 and severity of infection 

  • Severe COVID-19 infection was associated with the Neanderthal haplotype; interbreeding between Homo sapiens and Neanderthals can still present its effects in the modern day

