DNA Cloning and Sequencing Notes
Cloning DNA Fragments/Genes & Recombinant DNA Technology
- Recombinant DNA technology uses in vitro molecular techniques to isolate and manipulate DNA fragments.
- This technology is used to find genes in genomes.
Cloning
- Cloning is the generation or production of identical copies of molecules (e.g., DNA), cells, or organisms.
- Gene cloning is the technique of isolating and making many copies of a gene.
Cloning of DNA
- Involves:
- Restriction endonucleases
- Vectors
Restriction Endonucleases
- Enzymes that recognize specific sequences in double-stranded DNA (dsDNA) and cleave the DNA.
- In the 1960s, it was discovered that certain E. coli hosts cleave phage DNA, restricting phage growth.
- Example: HindII from Haemophilus influenzae cleaves T7 phage into 40 specific fragments.
Restriction Enzymes and Genome Fragmentation
- Each restriction enzyme recognizes a specific sequence of bases anywhere within the genome.
- They cut the sugar-phosphate backbones of both strands.
- Restriction fragments are generated by digestion of DNA with restriction enzymes.
- Hundreds of restriction enzymes are now available.
- Recognition sites are usually 4-8 base pairs (bp) of double-stranded DNA.
- Often palindromic: base sequences of each strand are identical when read 5'-to-3'.
- Example: 5'-GAATTC-3' and 3'-CTTAAG-5'
- Each enzyme cuts at the same place relative to its specific recognition sequence.
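The palindrome property above can be checked in a few lines. This is a minimal sketch; the helper names `reverse_complement` and `is_palindromic` are illustrative and not part of the notes.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5'-to-3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """True if both strands read the same in the 5'-to-3' direction."""
    return site.upper() == reverse_complement(site)


# The first three sites appear in these notes; GATTACA is a made-up
# non-palindromic control.
for site in ["GAATTC", "GGATCC", "GCGGCCGC", "GATTACA"]:
    print(site, is_palindromic(site))
```

A site is palindromic in this sense exactly when it equals its own reverse complement, which is why the EcoRI site GAATTC pairs with 3'-CTTAAG-5'.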
Commonly Used Restriction Enzymes
| Enzyme | Sequence of Recognition Site | Microbial Origin |
|---|---|---|
| TaqI | 5'-TCGA-3' | Thermus aquaticus YT1 |
| RsaI | 5'-GTAC-3' | Rhodopseudomonas sphaeroides |
| Sau3AI | 5'-GATC-3' | Staphylococcus aureus 3A |
| EcoRI | 5'-GAATTC-3' | Escherichia coli |
| BamHI | 5'-GGATCC-3' | Bacillus amyloliquefaciens H. |
| HindIII | 5'-AAGCTT-3' | Haemophilus influenzae |
| KpnI | 5'-GGTACC-3' | Klebsiella pneumoniae OK8 |
| ClaI | 5'-ATCGAT-3' | Caryophanon latum |
| BssHII | 5'-GCGCGC-3' | Bacillus stearothermophilus |
| NotI | 5'-GCGGCCGC-3' | Nocardia otitidiscaviarum |
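As a rough illustration of how the recognition sequences in the table locate sites in a DNA molecule, the sketch below searches one strand for each site. The `example` sequence is made up for the demonstration, and the code reports only where sites occur, not where within the site each enzyme cuts (the table does not give cut positions).

```python
# Recognition sequences copied from the table above (5'-to-3').
RECOGNITION_SITES = {
    "TaqI": "TCGA", "RsaI": "GTAC", "Sau3AI": "GATC",
    "EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT",
    "KpnI": "GGTACC", "ClaI": "ATCGAT", "BssHII": "GCGCGC",
    "NotI": "GCGGCCGC",
}


def find_sites(seq: str, site: str) -> list[int]:
    """Return the 0-based start of every occurrence, including overlaps."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


# Hypothetical sequence containing one EcoRI, one BamHI and one HindIII site.
example = "TTGAATTCAAGGATCCTTAAGCTTGG"

for enzyme, site in RECOGNITION_SITES.items():
    hits = find_sites(example, site)
    if hits:
        print(enzyme, site, hits)
```

Because every site in the table is palindromic, searching one strand is enough: a match on one strand implies a match at the same position on the other. Note that the BamHI site GGATCC also contains the shorter Sau3AI site GATC.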
Restriction Enzyme Cuts: Blunt vs. Sticky Ends
- Blunt ends: Cuts are straight through both DNA strands at the line of symmetry.
- Sticky ends: Cuts are displaced equally on either side of the line of symmetry.
- Ends have either 5' overhangs or 3' overhangs.
- Arber, Nathans, and Smith were awarded the 1978 Nobel Prize for their discovery of restriction enzymes.
Restriction Fragment Length Variation
- Different restriction enzymes produce fragments of different lengths.
- Average fragment length is $4^n$ bp, where $n$ is the number of bases in the recognition site (assuming the four bases are equally frequent and randomly distributed).
- A 4-base recognition site occurs on average every $4^4$ bp (256 bp).
- With a 3 billion bp genome, this yields ~12 million fragments ($3 \times 10^9 / 256 \approx 1.2 \times 10^7$).
- A 6-base recognition site occurs on average every $4^6$ bp (4096 bp, or about 4.1 kb).
- With a 3 billion bp genome, this yields ~700,000 fragments ($3 \times 10^9 / 4100 \approx 7 \times 10^5$).
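The arithmetic above can be reproduced with a short calculation. This sketch assumes a random sequence with equal base frequencies, which real genomes only approximate; the function names are illustrative.

```python
def expected_fragment_length(site_length: int) -> int:
    """Average spacing (bp) between sites of the given length: 4**n."""
    return 4 ** site_length


def expected_fragment_count(genome_size: float, site_length: int) -> float:
    """Approximate number of fragments from a complete digest."""
    return genome_size / expected_fragment_length(site_length)


GENOME_SIZE = 3_000_000_000  # 3 billion bp, as in the notes

for n in (4, 6, 8):
    length = expected_fragment_length(n)
    count = expected_fragment_count(GENOME_SIZE, n)
    print(f"{n}-base site: every {length:,} bp, about {count:,.0f} fragments")
```

The 8-base case is included because the table lists NotI, which has an 8 bp site and therefore cuts far less often than the 4- and 6-base cutters.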
Cloning Vector
- A plasmid or virus used to carry inserted foreign DNA and replicate it, producing more copies of the foreign DNA or its protein product.
Creating Recombinant DNA with Plasmid Vectors
- Plasmid cloning vectors have three main features:
- Origin of replication
- Restriction site(s) for cloning insert DNA
- A selectable marker (e.g., antibiotic resistance)
- Note: Bacterial artificial chromosomes (BACs) and yeast artificial chromosomes (YACs) are alternate cloning vectors that can carry large inserts.
Cloning DNA Fragments: Basic Steps
- Two basic steps:
- Insert DNA fragments into cloning vectors to make a recombinant DNA molecule.
- Transport recombinant DNA into a living cell to be copied.
Creating Recombinant DNA Molecules with Plasmid Vectors (cont.)
- Digestion of the vector and human genomic DNA with a restriction enzyme results in complementary sticky ends.
- Ligase is used to seal the phosphodiester backbones between vector and insert.
- Paul Berg was awarded the 1980 Nobel Prize for his work on gene splicing.
Molecular Cloning: Host Cells and Amplification of Recombinant DNA
- In E. coli, typically only about 0.1% of cells are transformed with the plasmid.
- Selection: Only cells with the plasmid will grow on media with ampicillin.
- Each cell multiplies to produce millions of genetically identical cells, each with a recombinant plasmid.
- Transformation: The process by which a cell or organism takes up foreign DNA.
Plasmid Screening for Insert-Containing DNA
- Plasmid carries the lacZ gene, which encodes β-galactosidase (β-gal).
- Insertional inactivation: Restriction site for cloning insert DNA is located in the middle of lacZ.
- Screen for β-gal activity:
- Plasmid without insert will have an intact lacZ.
- Plasmid with insert will have a disrupted lacZ.
Molecular Cloning: Distinguishing Recombinant Molecules
- Using a screen to distinguish cells carrying recombinant molecules from cells carrying non-recombinant molecules.
- X-gal is a substrate for β-gal and is converted to a blue pigment.
- Intact lacZ → blue colony
- Disrupted lacZ → white colony
- IPTG (isopropyl-β-D-thiogalactopyranoside) is a lactose analog that can induce lacZ gene expression.
- X-gal = 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside
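The selection and screening logic described above can be summarized as a small decision function. This is a sketch under stated assumptions: the plate contains ampicillin, X-gal, and IPTG, the plasmid carries the ampicillin-resistance marker, and the host cell has no functional lacZ of its own. The function name is hypothetical.

```python
def colony_phenotype(has_plasmid: bool, has_insert: bool) -> str:
    """Predict what is seen on an ampicillin + X-gal + IPTG plate."""
    if not has_plasmid:
        # Selection: no resistance marker, so the cell cannot grow.
        return "no growth"
    if has_insert:
        # Screen: the insert disrupts lacZ, so X-gal is not converted.
        return "white"
    # Intact lacZ makes beta-gal, which converts X-gal to a blue pigment.
    return "blue"


for plasmid, insert in [(False, False), (True, False), (True, True)]:
    print(plasmid, insert, "->", colony_phenotype(plasmid, insert))
```

White colonies are therefore the ones to pick when looking for recombinant plasmids.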
Expression Vectors
- Allow transcription and translation of introduced genes.
Polymerase Chain Reaction (PCR)
- Developed in 1983 by Kary Mullis.
- Allows making billions of copies of DNA from a few copies.
- Mullis shared the Nobel Prize in 1993.
PCR for DNA Amplification
- PCR generates copies of target DNA.
- It is a faster, less expensive, and more flexible way to amplify specific DNA fragments than molecular cloning.
- Extremely efficient: can amplify DNA from a single cell or from some archaeological samples.
- Oligonucleotides are designed from previously known DNA sequences and serve as primers for DNA synthesis.
- The target sequence located between primer sequences is exponentially amplified by 25-30 cycles of DNA synthesis.
PCR Primers
- Two oligonucleotide primers (16-26 nucleotides) are needed for PCR reactions.
- The region between the two primers will be synthesized.
- One primer is complementary to one strand of DNA at one end of the target region.
- The other primer is complementary to the other strand of DNA at the other end of the target region.
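To make the primer geometry concrete, the sketch below finds the region a primer pair would amplify on a given template strand. The template and primers are invented, and the primers are deliberately shorter than the 16-26 nucleotides used in practice so the example stays readable. Only exact matches are considered.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5'-to-3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplicon(template: str, forward: str, reverse: str):
    """Return the amplified region of the template strand, or None.

    `forward` matches the given strand (so it pairs with the other strand);
    `reverse` is complementary to the given strand at the far end.
    """
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Hypothetical template: flank + forward site + target + reverse site + flank.
template = "AAAC" + "GGCTAGCT" + "ACGTACGTTT" + "CCATGGAT" + "TTTG"
forward_primer = "GGCTAGCT"
reverse_primer = reverse_complement("CCATGGAT")

print(reverse_primer)
print(amplicon(template, forward_primer, reverse_primer))
```

The product includes both primer sequences and everything between them, while the flanking bases outside the primers are not amplified.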
PCR Cycle Steps
- Denaturation:
- Initial: 94 °C for 5 minutes (first round)
- Subsequent: 94 °C for 20 seconds
- Annealing:
- 50-60 °C for 2 minutes
- Primers base pair at sites flanking target sequence of genomic DNA
- Extension/Polymerization:
- 72 °C for 2-5 minutes
- Polymerization from primers along templates
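The step times listed above give a rough estimate of how long a run takes. The sketch below reads the 5-minute denaturation as applying only to the first cycle, and it ignores the time the thermocycler spends ramping between temperatures; both are assumptions, and the function name is illustrative.

```python
def pcr_run_seconds(cycles: int, extension_s: int = 120, anneal_s: int = 120,
                    initial_denature_s: int = 300, denature_s: int = 20) -> int:
    """Total hold time in seconds for a run, excluding temperature ramps."""
    if cycles < 1:
        raise ValueError("cycles must be at least 1")
    first_cycle = initial_denature_s + anneal_s + extension_s
    later_cycles = (cycles - 1) * (denature_s + anneal_s + extension_s)
    return first_cycle + later_cycles


# 30 cycles with the shortest (2 min) and longest (5 min) extension times.
for extension in (120, 300):
    total = pcr_run_seconds(30, extension_s=extension)
    print(f"extension {extension} s: {total} s ({total / 3600:.1f} h)")
```

With these settings, 30 cycles take roughly 2.2 to 3.7 hours of hold time.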
PCR Cycle Steps (Described)
- (1) Denature strands
- (2) Base pairing of primers
- (3) Polymerization from primers along templates
Exponential Amplification in PCR
- Illustrates exponential increase in target DNA: 1 → 2 → 4 → 8 → 16 → 32 copies, etc.
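The doubling series can be written as a formula: after n cycles, one starting molecule yields 2^n copies if every cycle is perfectly efficient. The sketch below adds an `efficiency` parameter as an assumption (1.0 means perfect doubling); it is not taken from the notes.

```python
def copies_after(cycles: int, start: int = 1, efficiency: float = 1.0) -> float:
    """Copies after `cycles` rounds; each round multiplies by (1 + efficiency)."""
    return start * (1 + efficiency) ** cycles


# Reproduces the 1, 2, 4, 8, 16, 32 series for cycles 0 through 5.
print([int(copies_after(n)) for n in range(6)])

# 30 ideal cycles from a single molecule give about a billion copies.
print(f"{copies_after(30):,.0f}")
```

This is why 25-30 cycles are enough to go from a few molecules to billions of copies.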
Uses for PCR
- Genotype detection
- Analysis of traces of partially degraded DNA
- Evolutionary studies:
- Compare homologous sequences from related organisms.
- Compare sequences from a variety of sources.
- Studies of gene diversity
- Diagnosis of infectious diseases (detect bacterial and viral infection)
- Forensics - Amplify DNA for analysis
- Sex determination
DNA Sequence Analysis
- Two methods originally developed in 1977:
- Maxam-Gilbert method: Chemical cleavage of DNA at specific nucleotides.
- Sanger method: Enzymatic extension of DNA strands to a defined terminating base.
- Both methods can determine the sequence of 500-700 bp per reaction with 99.9% accuracy.
- The Sanger method is more amenable to automation.
- Paul Berg, Walter Gilbert, and Frederick Sanger shared the 1980 Nobel Prize for recombinant DNA and DNA sequencing.
Sanger Sequencing: Nested Fragments
- Sanger sequencing generates sets of nested fragments separated by size.
- Two steps:
- Generate a complete series of complementary single-stranded subfragments from a template DNA.
- Each subfragment differs in length by a single nucleotide.
- Identify the terminal nucleotide in each subfragment.
- Polyacrylamide gel electrophoresis:
- Separates DNA molecules that differ in length by one nucleotide.
Sanger Sequencing Requirements
- DNA polymerase requires:
- Template: A single strand of DNA to copy.
- Deoxyribonucleotide triphosphates (dATP, dCTP, dGTP, dTTP).
- Primer: Short single-stranded DNA molecule that is complementary to the template.
Sanger Sequencing Template
- A recombinant plasmid is a good template for Sanger sequencing.
- The primer is designed to be complementary to the plasmid sequence adjacent to the unknown insert sequence.
- Template and primer interact through hybridization.
Sanger Sequencing: Chain Termination
- Incorporation of a dideoxynucleotide (ddNTP) terminates DNA synthesis.
- dNTPs and ddNTPs are used during DNA synthesis.
- ddNTPs lack a 3' -OH group, preventing further extension.
Sanger Sequencing: Fragment Generation
- Sanger sequencing generates a series of single-stranded DNA fragments.
- DNA fragments include the primer and nucleotides complementary to the unknown DNA.
- The DNA fragments form a nested array: each fragment differs in length from the next by one nucleotide.
Sanger Sequencing: Electrophoretic Separation
- Nested fragments are separated by size using electrophoresis.
- A special gel can separate DNA fragments that differ in size by only one nucleotide.
- Smaller DNA fragments migrate quickly and appear at the bottom of the gel.
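The chain-termination and size-separation steps can be simulated together. In this sketch the template and primer are invented, every possible terminated product is assumed to be present, and the terminal base of each fragment stands in for its ddNTP label. The function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5'-to-3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def nested_fragments(template: str, primer: str) -> list[str]:
    """All terminated products: the primer plus 1, 2, 3, ... added bases."""
    full_product = reverse_complement(template)
    if not full_product.startswith(primer):
        raise ValueError("primer must anneal at the 3' end of the template")
    return [full_product[:end]
            for end in range(len(primer) + 1, len(full_product) + 1)]


def read_sequence(fragments) -> str:
    """Sort shortest first (as on a gel) and read each terminal base."""
    return "".join(frag[-1] for frag in sorted(fragments, key=len))


# Hypothetical template: unknown insert followed by a known primer-binding site.
template = "ATGCCGTT" + "GGATCCAA"
primer = "TTGGATCC"

fragments = nested_fragments(template, primer)
# Reverse the list to show that the read depends only on fragment size.
print(read_sequence(reversed(fragments)))
```

The read is complementary to the unknown part of the template, which matches the point that the computer reads the sequence complementary to the template strand.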
Automated DNA Sequencing
- After electrophoresis, fragments flow through a detector, and the color of the fragment is digitally recorded.
Automated Sanger Sequencing: Detection
- Each ddNTP is labeled with a different fluorescent dye for detection of the sequence.
- Each lane displays the sequence obtained from a separate DNA sample and primer.
- Each fragment has terminated with a specific ddNTP labeled with a specific fluorescent dye.
DNA Sequence Trace
- Computer reads the sequence complementary to the template strand.
- Sequence is read from left to right (5'-to-3' synthesis from primer).
Next-Generation Sequencing (NGS) Technologies
- Illumina
- Pacific Biosciences
- Oxford Nanopore sequencing
Illumina Sequencing
- Sequencing by synthesis (SBS).
- Sequence billions of fragments simultaneously (massively parallel sequencing).
- No need to clone DNA fragments into a vector.
- Very small amounts of reagents (enzyme and nucleotides) are used.
Illumina Patterned Flow Cell
- Advantages:
- Higher data output
- More sequencing reads
- Faster run times
Human Genome Library Construction for Illumina
- Shear high molecular weight DNA with sonication.
- Enzymatic treatments to blunt ends.
- Phosphorylation.
- Addition of A-overhang.
- Ligation to adapters (each with a DNA barcode).
- PCR amplify.
- Quantitate library.
Illumina Sequencing Process
- First chemistry cycle: Add all four labeled reversible terminators, primers, and DNA polymerase enzyme to the flow cell to initiate the first sequencing cycle.
- After laser excitation, capture the image of emitted fluorescence from each cluster on the flow cell. Record the identity of the first base for each cluster.
- Second chemistry cycle: Add all four labeled reversible terminators and enzyme to the flow cell to initiate the next sequencing cycle.
Illumina Sequencing Chemistry
- Incorporate labeled nucleotide.
- Detect signal.
- Deblock 3' end.
- Cleave fluorophore.
Illumina Sequencing - Multiple Cycles
- After laser excitation, collect the image data as before. Record the identity of the second base for each cluster.
- Repeat cycles of sequencing to determine the sequence of bases in a given fragment, a single base at a time.
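The cycle-by-cycle logic can be sketched as a loop in which every cluster gains exactly one base per cycle. This toy model assumes error-free incorporation and writes each template 3'-to-5' so that the read comes out 5'-to-3'; the cluster names and sequences are invented.

```python
PAIR = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_synthesis(clusters: dict[str, str], n_cycles: int) -> dict[str, str]:
    """Return one read per cluster, extended by one base per chemistry cycle."""
    reads = {cluster_id: "" for cluster_id in clusters}
    for cycle in range(n_cycles):
        # One chemistry cycle: every cluster is extended and imaged in parallel.
        for cluster_id, template in clusters.items():
            if cycle >= len(template):
                continue  # this fragment has already been read to its end
            # The reversible terminator complementary to the template base is
            # incorporated, imaged, then its dye is cleaved and 3' end deblocked.
            reads[cluster_id] += PAIR[template[cycle]]
    return reads


# Hypothetical clusters; templates are written 3'-to-5'.
clusters = {"cluster_1": "TACGGA", "cluster_2": "GGCATT"}
print(sequence_by_synthesis(clusters, n_cycles=4))
```

The outer loop runs over cycles rather than over fragments, which reflects how all clusters on the flow cell are read in parallel, one base at a time.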
Ultrahigh-Throughput DNA Sequencing
- 2008 - New generation DNA sequencing methods.
- 1,800 gigabase pairs (Gb) of sequence can be determined in a single experiment.
Long-Read Sequencing
- Long reads – Over 2 Mb
- Direct RNA sequencing
- High error rate
- MinION – 500 pores
- PromethION – 3000 pores, up to 4 Tb in 2 days.
- Bioinformatics provides tools for visualizing functional features of genomes.
- Bioinformatics is the science of using computational tools to decipher biological information.
- 1988 – National Center for Biotechnology Information (NCBI) established.
- Oversees GenBank
- Created additional public databases of biological information
- Developed bioinformatic tools for analyzing, systemizing, and disseminating the data
- RNA-sequencing (RNA-seq)
- Determines whether a gene is transcribed and how much RNA is present corresponding to each gene.
- Used for basic research to understand gene regulation and applied biomedical research to identify genes that are misregulated in tissues/individuals with diseases and in specific mutants.
- Transcriptome: The set of all RNA molecules that are transcribed in a cell, tissue, or organ.
RNA Sequencing (RNA-seq)
- RNA-Seq is a newer method to identify expressed genes and their level of expression.
- It has several important applications in comparing transcriptomes. A transcriptome is the set of all RNA molecules, including mRNAs and non-coding RNAs, that are transcribed in one cell or a population of cells.
- RNA-Seq is used to compare transcription in:
- Different cell types
- Healthy vs. diseased cells
- Different stages of development
- Response to different environmental agents such as hormones or toxic chemicals
RNA Sequencing (RNA-seq) Process
- Isolate RNA from a sample of cells. May focus on a subpopulation of RNA, such as mRNAs or short non-coding RNAs.
- Break the RNAs into small fragments.
- Attach short oligonucleotide linkers to the ends of the RNAs.
- Synthesize cDNAs via reverse transcriptase PCR, using the RNAs as templates. The PCR primers are complementary to the linkers.
- Sequence the cDNAs using a next-generation sequencing technology.
- Using computer technology, align the cDNA sequences along the genomic sequence.
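The final alignment and counting step can be illustrated with a toy example. Real pipelines use dedicated aligners that tolerate mismatches and handle reads spanning exon boundaries; the sketch below swaps in plain exact substring matching against two invented gene regions, purely to show how read counts per gene arise.

```python
def count_reads(reads: list[str], gene_regions: dict[str, str]):
    """Assign each read to the first gene region containing it exactly.

    Returns (counts per gene, number of reads that matched nothing).
    """
    counts = {name: 0 for name in gene_regions}
    unaligned = 0
    for read in reads:
        for name, sequence in gene_regions.items():
            if read in sequence:
                counts[name] += 1
                break
        else:
            unaligned += 1  # no region contained this read
    return counts, unaligned


# Hypothetical reference regions and cDNA reads.
gene_regions = {
    "geneA": "ATGGCCATTGTAATGGGCCGC",
    "geneB": "ATGTTTCACGGGAAACTGTAA",
}
reads = ["GCCATTG", "TAATGGG", "CACGGGA", "GGGGGGG"]

counts, unaligned = count_reads(reads, gene_regions)
print(counts, "unaligned:", unaligned)
```

Comparing such counts between samples, after normalizing for sequencing depth and gene length, is the basis for estimating how much RNA corresponds to each gene.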
RNA-Seq Advantages
- More accurate at quantifying the amount of each RNA transcript.
- Superior at detecting RNA transcripts that are in low abundance.
- Identifies the exact boundaries between exons and introns; identifies new splice variants.
- Identifies the 5' and 3' ends of RNA transcripts.