Sanger Sequencing Notes

Sanger Sequencing: The Chain Termination Method

Introduction to Sanger Sequencing

  • In 1977, Frederick Sanger introduced a DNA sequencing method using chain-terminating inhibitors.

  • The method aimed to determine the nucleotide sequence of a DNA fragment and became known as Sanger sequencing.

  • Chain-terminating inhibitors are also referred to as dideoxynucleoside triphosphates (ddNTPs).

Understanding DNA and Nucleotides (dNTPs)

  • DNA consists of a chain made up of four different nucleotides, which are supplied to the polymerase as dNTPs.

  • DNA polymerase adds complementary nucleotides to copy and extend the DNA double strand.

  • dNTP: Deoxyribonucleoside triphosphate, composed of deoxyribose, a base, and a triphosphate group.

    • A nucleoside is a sugar (ribose or deoxyribose) combined with a base.

    • The base can be guanine (G), cytosine (C), thymine (T), or adenine (A).

    • Deoxyribose is similar to ribose but has one fewer oxygen atom: the 2' carbon carries a hydrogen instead of a hydroxyl group.

Dideoxynucleoside Triphosphate (ddNTP)

  • ddNTP: Dideoxyribonucleoside triphosphate. Its sugar has two fewer oxygen atoms than ribose: the hydroxyl groups on both the 2' and 3' carbons are missing.

  • DNA polymerase function: Catalyzes the addition of new bases to a growing DNA strand.

    • The innermost phosphate group of an incoming dNTP reacts with the 3' hydroxyl oxygen on the deoxyribose of the last nucleotide in the strand.

    • This reaction releases two phosphate groups (as pyrophosphate) and adds the new nucleotide to the strand.

  • If a ddNTP is incorporated, the absence of the 3' oxygen prevents further dNTP addition, thus terminating the chain.

Naming Conventions: 5' and 3' Ends

  • 5' (five prime) and 3' (three prime) refer to the carbon atom positions on deoxyribose in dNTP.

  • Carbons are numbered 1' to 5', starting at the carbon linked to the base (1') and ending at the carbon linked to the phosphate (5').

  • The oxygen required for adding new dNTPs is bound to the 3' carbon, so DNA extends from the 3' end.

  • Triphosphate is bound to the 5' carbon, marking the start, while the 3' end is the finish.

  • DNA sequence is always written in the 5' to 3' direction.

  • DNA polymerase only adds a complementary base to the template DNA: C pairs with G, and A pairs with T.
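
The pairing and direction rules above can be sketched in a few lines of Python. This is only an illustration: the function name `new_strand` and the sample template are made up for this example. Because the new strand runs antiparallel to the template, the paired bases are reversed before being written 5' to 3'.

```python
# Base pairing as stated above: C pairs with G, and A pairs with T.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def new_strand(template_5to3: str) -> str:
    """Return the strand a polymerase would build on the template, written 5' to 3'."""
    # The new strand is antiparallel to the template, so walk the template
    # from its 3' end and pair each base in turn.
    return "".join(PAIRS[base] for base in reversed(template_5to3))


# Hypothetical template, written 5' to 3'.
print(new_strand("TGGCATGCA"))  # TGCATGCCA
```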

Original Sanger Sequencing Method

  • The original Sanger sequencing began as a manual method using radioactive labels.

  • Components Needed:

    • Primer

    • DNA polymerase

    • dNTPs

    • DNA template

    • ddNTPs

  • One of the dNTPs, dATP, was labeled with a radioactive tag.

  • Four tubes were used, one for each ddNTP.

Procedure
  1. DNA, primer, and buffer are heated to 100 °C to separate the DNA into single strands (before PCR).

  2. The mixture is cooled to 67 °C to allow primers to bind.

  3. DNA polymerase, all four dNTPs, and one ddNTP are added to each tube.

  4. DNA polymerase extends the primer along the DNA template.

  5. The strand terminates upon incorporating a ddNTP.

  6. ddNTPs are at lower concentrations than dNTPs, making incorporation random.

  7. Across the many molecules in a tube, termination occurs at every position where that tube's ddNTP can be incorporated, creating fragments of varying lengths.

  8. All fragments in a tube start with the same primer sequence and end in the same nucleotide.

  9. Low ddNTP concentration allows sequencing of up to 200 nucleotides.
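
The random termination described in the procedure can be imitated with a small simulation. This is a rough sketch under stated assumptions, not a model of real reaction kinetics: the names `extend_once` and `run_tube` are hypothetical, and `dd_fraction` (the chance that a ddNTP rather than a dNTP is incorporated at a matching position) is an arbitrary illustrative value. Fragment lengths are counted in bases added after the primer.

```python
import random


def extend_once(strand, dd_base, dd_fraction, rng):
    """Extend one primer; return the fragment length, or None if never terminated."""
    for position, base in enumerate(strand, start=1):
        # Only the ddNTP present in this tube can terminate the chain,
        # and only where the base being added matches it.
        if base == dd_base and rng.random() < dd_fraction:
            return position
    return None


def run_tube(strand, dd_base, molecules=2000, dd_fraction=0.05, seed=0):
    """Return the sorted set of fragment lengths produced in one ddNTP tube."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    lengths = set()
    for _ in range(molecules):
        length = extend_once(strand, dd_base, dd_fraction, rng)
        if length is not None:
            lengths.add(length)
    return sorted(lengths)


# Strand being synthesized (5' to 3'), one tube per ddNTP.
for dd_base in "ACGT":
    print(dd_base, run_tube("TGCATGCCA", dd_base))
```

Because the assumed `dd_fraction` is small, most molecules pass any given position without terminating. With enough molecules, every position that matches the tube's ddNTP ends up represented by some fragment.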

Gel Electrophoresis
  • The four sequencing reactions are mixed with a loading dye.

  • Each reaction is loaded into a separate lane of a polyacrylamide gel.

  • Fragments migrate through the gel based on size (smaller fragments move faster).

  • This gel type can differentiate fragments differing by a single nucleotide.

  • A loading dye indicates when fragments reach the end of the gel.

  • The gel is dried onto a paper support, and X-ray film detects radiation from the radiolabeled dATP incorporated in the fragments.

  • Bands appear, showing each fragment's position.

Base Calling
  • Base calling is reading the DNA sequence.

  • DNA is read from 5' to 3', starting with the shortest fragment.

  • The position of each fragment indicates the terminating nucleotide, which reveals the sequence.

  • Example: If the shortest fragment is in the ddTTP lane, the first nucleotide is T; if the next shortest is in the ddGTP lane, it's G, and so on.

  • Example sequence reads: TGCATGCCA.
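
Reading a gel this way amounts to sorting the bands by fragment length and noting which lane each one sits in. The sketch below shows that idea only; the function name `call_bases` and the band positions are assumptions, chosen so the result matches the example sequence above.

```python
def call_bases(lanes):
    """Read a sequence from four gel lanes.

    `lanes` maps each terminating base to the fragment lengths seen in its lane.
    """
    base_at_length = {}
    for base, lengths in lanes.items():
        for length in lengths:
            base_at_length[length] = base
    # Shortest fragment first, which reads the new strand from 5' to 3'.
    return "".join(base_at_length[length] for length in sorted(base_at_length))


# Hypothetical band positions, given as fragment lengths per ddNTP lane.
lanes = {"T": [1, 5], "G": [2, 6], "C": [3, 7, 8], "A": [4, 9]}
print(call_bases(lanes))  # TGCATGCCA
```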

Limitations of the Original Method
  • Labor-intensive.

  • It took four days to sequence 200 nucleotides from a few samples.

Advancements & Automation

  • 1987: Applied Biosystems introduced the AB370A, the first commercial sequencing instrument.

  • Fluorescent dyes replaced radioactive labels, improving safety and removing the time needed for X-ray film detection.

    • Fluorescent sequencing primers were used. Each of the four ddNTP reactions was labeled with a different colored fluorescent dye.

    • After sequencing, four reactions could be mixed and loaded in one gel lane.

    • A laser scanned the gel bottom, detecting fragments as they passed and feeding the data into a computer to automate base calling.

    • The AB370A could run 16 samples per gel, with a read length of 450 nucleotides.

Human Genome Project

  • 1990: The U.S. government announced the Human Genome Project to map and sequence all genes in the human genome.

  • By 1990, less than 2% of the human genome had been sequenced.

  • Sequencing the genome promised to identify disease-causing and associated genes for treating genetic diseases.

PCR & Cycle Sequencing

  • 1983: Kary Mullis invented PCR.

  • 1989: Vincent Murray used Taq polymerase for Sanger sequencing.

  • In Sanger sequencing, the primer binds to the DNA, and DNA polymerase extends the fragment.

  • Taq polymerase survives high heat, so the DNA can be melted again after the first extension without destroying the enzyme.

  • Cooling allows another sequencing primer to anneal, repeating melting, annealing, and extension in cycles (linear PCR).

  • More primers incorporated into fragments enhance the fluorescent signal.

  • Only forward strands are made, increasing fragments linearly over cycles.

  • This method was termed cycle sequencing.

  • The higher fluorescent signal required less DNA for each reaction.
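
The difference between linear growth in cycle sequencing and the doubling seen in standard two-primer PCR can be made concrete with idealized arithmetic. The sketch assumes every template is copied exactly once per cycle, which real reactions do not achieve. The function names and the starting count of 1,000 templates are illustrative only.

```python
def cycle_sequencing_products(templates: int, cycles: int) -> int:
    """One primer: each cycle copies only the original templates."""
    # New fragments cannot serve as templates, so the total grows linearly.
    return templates * cycles


def pcr_products(templates: int, cycles: int) -> int:
    """Two primers: every product is also a template, so copies double each cycle."""
    return templates * 2 ** cycles


for cycles in (1, 10, 25):
    print(cycles, cycle_sequencing_products(1000, cycles), pcr_products(1000, cycles))
```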

Capillary Electrophoresis

  • A small amount of gel is contained in a fine tube.

  • DNA is drawn in one end, runs through the gel via electric current, and is detected by a laser at the other end.

  • The fine tube allows heat to escape, meaning higher currents can be used without overheating the gel.

  • Higher currents accelerate runtime and improve resolution.

  • 1989: Beckman Coulter launched the first commercial capillary electrophoresis instrument.

  • This paved the way for capillary-based Sanger sequencing systems like the ABI PRISM 310.

ABI PRISM 310

  • 1995: Applied Biosystems launched the ABI PRISM 310, marking the birth of modern Sanger sequencing.

  • This system used one capillary for electrophoresis instead of a PAGE gel.

  • One sample could be analyzed in under three hours, compared to 14 hours.

  • Sequencing read length improved to 600 base pairs.

  • Automated sample loading enabled up to 96 samples to be loaded and run unattended.

  • Electrokinetic injection used electrical current to concentrate and load DNA into the capillary.

  • Fragments separated by size through the gel and passed by a laser for size and color detection.

  • Software detected and called the bases.

  • Sequencing was initially carried out with fluorescent primers because they provided even peak heights.

  • Labeled ddNTPs couldn't achieve this even peak height until the introduction of BigDye terminators in 1997.

  • Fluorescent primers required four reactions, while fluorescent ddNTPs allowed sequencing reactions to occur in one tube.

ABI PRISM 3700 & Celera Genomics

  • Applied Biosystems continued improving its systems, driven by growing demand for automation.

  • 1998: Applied Biosystems launched the ABI PRISM 3700, with 96 capillaries.

  • Applied Biosystems partnered with The Institute for Genomic Research (TIGR), headed by Craig Venter.

  • They formed Celera Genomics, aiming to sequence the human genome faster than the public Human Genome Project.

  • Celera planned to profit by selling access to its sequenced data and patenting useful genes, sparking controversy.

Impact of ABI PRISM 3700
  • The ABI PRISM 3700 played a significant role in sequencing the human genome.

  • Each 96-sample run took less than 2.5 hours and generated 800 base pairs of sequence per sample.

  • A technician sequenced 1536 samples daily with only 15 minutes of hands-on time.

  • The instrument reduced the cost per base of sequencing.

  • Celera produced a draft human genome sequence in three years, publishing results in February 2001.

  • The Human Genome Project, also aided by ABI PRISM 3700, published its draft genome simultaneously.
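
As a quick check on the throughput figures above, the daily output of one instrument can be estimated by taking 1,536 samples per day at 800 bases each. The result is an upper bound, since it assumes every read reaches the full 800 bases. The function name `daily_output` is made up for this sketch.

```python
def daily_output(samples_per_day: int, read_length_bp: int) -> int:
    """Upper bound on bases read per instrument per day."""
    return samples_per_day * read_length_bp


bases = daily_output(1536, 800)
print(f"{bases:,} bases per instrument per day")  # 1,228,800 bases per instrument per day
```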

Sanger Sequencing Today

  • This modified Sanger sequencing method is still used despite newer technologies like next-generation sequencing (NGS).

Comparison with Next-Generation Sequencing (NGS)
  • Sanger sequencing remains the gold standard due to its high (99.9%) base-calling accuracy.

  • NGS accuracy ranges from 99% to 99.9%, depending on sequencing depth.

  • Sanger sequencing is more cost-effective and faster for fewer than 20 samples.

  • NGS is more cost-effective for a large number of samples.

  • Sanger sequencing can detect a variant base only when it makes up roughly 15% to 20% of the DNA in a sample, whereas NGS can detect variants present at about 1%.

  • Sanger coverage is low: one read of 300-850 base pairs per sample.

  • NGS can generate billions of reads per run, up to 16 terabases of sequence, enabling 128 human genomes to be sequenced in one run.

Conclusion
  • If you have fewer than 20 samples or genes to sequence, Sanger sequencing is the preferred method.