Lecture 10 - DNA Sequencing, Genome Assembly, & DNA Diagnostics

28 Terms

1

What is Sanger (ddNTP Terminator) Sequencing?

A DNA sequencing technique that determines nucleotide order by using fluorescently labelled dideoxynucleotides (ddNTPs) to randomly terminate chain elongation during DNA synthesis. This creates DNA fragments of different lengths, which are separated by size (electrophoresis); the sequence is read from the colour of each fragment's terminating base

2

What is Automated Sanger Sequencing?

  • a modern method that uses fluorescently labelled dideoxynucleotides (ddNTPs) and capillary electrophoresis to determine a DNA sequence

  • Unlike the original method, all four fluorescently tagged ddNTPs are combined in a single reaction tube, and the resulting fragments are separated by size in a capillary tube

  • A laser excites the fluorescent dyes, and a computer translates the emitted light into a sequence, displayed as a chromatogram
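
The read-out logic described above can be sketched as a toy model. This is only an illustration, not how any instrument's software works: the function names are hypothetical, the template is assumed to be written 3'→5' so the new strand reads 5'→3', and every possible termination position is assumed to be represented.

```python
# Toy model of dye-terminator Sanger read-out (hypothetical helper names).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def terminated_fragments(template):
    """Return one (length, terminating_base) pair per possible stop position.

    The new strand is complementary to the template. A ddNTP can end
    synthesis after any base, so every prefix length is represented.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template)
    return [(i + 1, new_strand[i]) for i in range(len(new_strand))]

def read_sequence(fragments):
    """Sort fragments by size (as electrophoresis does) and read the end labels."""
    return "".join(base for _, base in sorted(fragments))

# The fragments leave the reaction tube unordered; size separation restores order.
fragments = list(reversed(terminated_fragments("TACGGA")))
print(read_sequence(fragments))  # ATGCCT
```

The key point is that the colour of the last base plus the fragment length is enough to reconstruct the sequence.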

3

What is the Human Genome Sequence Project?

  • Public: directed during the final push by Francis Collins (published in Nature); took 13 years

  • Private: led by Celera Genomics, run by J. Craig Venter (published in Science); took 3 years

  • Sequenced only the euchromatic regions (not including sequences around centromeres and telomeres)

  • Sequencing to finish the genome continued over the next 5 years; it was declared complete in 2003, with the last chromosome published in 2006

4

What is Genome Assembly - Hierarchical Shotgun Sequencing?

  • Cloned genomes

  • The genome is divided into large segments of known order

  • Ordered genome segments

  • Multiple genome portions are sheared into variable-sized segments

  • Unordered sequenced segments

  • Computational automated assembly

  • Resulting overlapping sequenced segments

  • Overlapping sequence segments are combined to construct the genome consensus
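
The final step, combining overlapping segments into a consensus, can be sketched with a greedy overlap merge. This is a minimal illustration under strong assumptions (error-free reads, no repeats, a hypothetical minimum overlap of 3 bases); real assemblers use far more robust methods.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # no overlaps left: the remaining reads stay separate contigs
        merged = reads[i] + reads[j][n:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads

print(greedy_assemble(["ATGCGT", "CGTTAC", "TACGGA"]))  # ['ATGCGTTACGGA']
```

In the hierarchical approach the large segments are already ordered, so a merge like this only has to be solved within each segment rather than across the whole genome.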

5

What are the 3 improvements of Next-Generation (NGS) Sequencing?

  1. Libraries are prepared in a cell-free system instead of being cloned into vectors that are transformed into bacteria

  2. Millions-to-billions of sequencing reactions are carried out in parallel instead of running hundreds at a time as in automated Sanger sequencing

  3. The sequencing output is ‘read’ directly without the need to be separated by capillary electrophoresis

  • The interrogation of bases is performed cyclically and in parallel - this produces enormous numbers of reads, allowing genomes to be sequenced at great speed

  • A drawback of NGS technologies has been their relatively short reads - this made genome assembly more difficult, but it also stimulated the development of new alignment algorithms for genome assembly

6

How are tags added to DNA fragments?

By ligating known sequences (the tags) to the ends of the amplified DNA

7

What is Clonal Amplification in HTS?

  • The process of generating millions of identical copies (a "clonal cluster" or "colony") of a single DNA template molecule in a localized area

  • Emulsion PCR: a method to amplify DNA by creating thousands of tiny, separate water-in-oil droplets, each acting as a miniature PCR tube, ideally containing a single DNA template

  • Bridge Amplification PCR: a technique used in next-generation sequencing (NGS) to create clonal clusters of DNA on a solid surface (flow cell) by repeatedly amplifying single DNA fragments anchored to the surface, forming "bridges" as strands fold and prime off each other, allowing for robust, localized signal amplification for sequencing

8

What is Ion Torrent Semiconductor Sequencing?

A next-generation DNA sequencing method that uses a semiconductor chip to detect electrical signals (hydrogen ions) released when DNA bases are added, rather than using light or cameras

9

What is Illumina Sequencing by Synthesis?

A NGS technology that builds a complementary DNA strand, base-by-base, detecting each added nucleotide using fluorescent labels, enabling millions of DNA fragments to be sequenced in parallel for high-throughput, accurate genomic analysis

10

What is PacBio Single Molecule Real Time (SMRT) Sequencing?

  • DNA polymerase is immobilized at the bottom of a very small chamber, and it synthesizes a new strand of DNA on a template

  • The synthesis is detected in real time by fluorescent labels on the nucleotides

  • These nucleotides are phospho-linked: the fluorescent tag sits on the phosphate chain, emits light while the polymerase holds the nucleotide, and is cleaved off when the nucleotide is incorporated; the light pulses are monitored in real time (NO PCR)

11

What is Oxford Nanopore Sequencing?

  • Fragments are spread into membrane wells containing nanopores

  • A motor protein is attached/bound to the DNA fragments

  • The motor protein ratchets the strand through the pore

  • The combination of nucleotides passing through the pore creates a characteristic current disruption that is interpreted as a specific base (NO PCR)

12

What is WGS?

  • Organism’s complete DNA sequence - all genes and everything in between

  • Assesses all variation, rather than just specific areas

    • Considered the most specific fingerprint

    • Can distinguish between closely related organisms/strains

    • Can be used to identify relationships between strains - ID movement/spread of pathogens

13

What are the steps of read cleaning?

  1. Primer/Adapter removal

  2. Duplicate removal

  3. “Contaminated” read removal

  4. Low quality read removal
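
The four steps above can be sketched as a single filtering pass. This is a toy illustration only: the adapter sequence, the k-mer contamination check, and the mean-quality threshold of 20 are all assumptions made for the example, not values from the lecture or from any specific tool.

```python
ADAPTER = "GGGTTTAAACCC"  # hypothetical adapter sequence, for illustration only

def clean_reads(reads, contaminant_kmers=frozenset(), min_mean_q=20, k=8):
    """Filter reads given as (sequence, qualities) pairs; qualities are Phred ints."""
    seen, kept = set(), []
    for seq, quals in reads:
        # 1. Primer/adapter removal: trim from the first adapter match onward.
        cut = seq.find(ADAPTER)
        if cut != -1:
            seq, quals = seq[:cut], quals[:cut]
        if not seq:
            continue
        # 2. Duplicate removal: keep only the first copy of each sequence.
        if seq in seen:
            continue
        seen.add(seq)
        # 3. "Contaminated" read removal: drop reads sharing a k-mer with a contaminant.
        if any(seq[i:i + k] in contaminant_kmers for i in range(len(seq) - k + 1)):
            continue
        # 4. Low quality read removal: drop reads whose mean quality is too low.
        if sum(quals) / len(quals) < min_mean_q:
            continue
        kept.append((seq, quals))
    return kept

reads = [
    ("ACGTACGTAC" + ADAPTER + "TT", [30] * 24),  # adapter gets trimmed
    ("ACGTACGTAC", [30] * 10),                   # duplicate after trimming
    ("TTTTGGGGCCCC", [10] * 12),                 # low quality
    ("CCCCAAAATTTT", [35] * 12),                 # matches a contaminant k-mer
]
print(clean_reads(reads, contaminant_kmers={"CCCCAAAA"}))
```

Only the first read survives here, trimmed back to ACGTACGTAC.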

14

What are quality scores?

  • Each base is given a Phred (Q) score, a numerical value representing the quality of a base in sequencing data; it is a measure of the probability of an incorrect base call at a specific position in the sequence

  • If the error probability for a given site is high, the site can be re-assayed to lower the error
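
The relationship between a Phred score and the error probability is Q = -10 · log10(P), so Q20 corresponds to a 1-in-100 chance of a wrong base call and Q30 to 1 in 1,000. A minimal sketch of the conversion follows. The function names are hypothetical, and the FASTQ decoding assumes the common Phred+33 ASCII encoding, which the lecture notes do not mention.

```python
import math

def phred_to_error_prob(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p):
    """Phred score for an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)

def decode_fastq_quality(qual_string, offset=33):
    """Decode a FASTQ quality string, assuming Phred+33 ASCII encoding."""
    return [ord(ch) - offset for ch in qual_string]

print(phred_to_error_prob(30))      # about 0.001
print(decode_fastq_quality("I5!"))  # [40, 20, 0]
```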

15

What is base calling accuracy?

  • Once a variable region has been identified, one must determine if the sequences used in the prediction of the variable regions are correct

  • Genome sequences are not always 100% correct

  • Strive for many-fold coverage → a× coverage means each nucleotide in the genome is contained, on average, in a separate reads; if the clone library were perfectly random, high coverage would imply that 99% of the nucleotides are represented in at least one read
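
The link between coverage depth and the fraction of the genome represented can be made concrete with a simple Poisson model (the Lander-Waterman idealisation). This model is an assumption added for illustration: it treats reads as landing uniformly at random, in which case a base at mean coverage c is missed with probability e^(-c). The function names are hypothetical.

```python
import math

def mean_coverage(n_reads, read_len, genome_len):
    """Average number of reads covering each base (the 'x' in 'c-fold coverage')."""
    return n_reads * read_len / genome_len

def fraction_covered(c):
    """Expected fraction of bases in at least one read, for perfectly random reads."""
    return 1 - math.exp(-c)

def coverage_for_fraction(f):
    """Mean coverage needed for a fraction f of bases to be in at least one read."""
    return -math.log(1 - f)

print(round(fraction_covered(5), 4))          # 0.9933
print(round(coverage_for_fraction(0.99), 2))  # 4.61
```

Under this idealised model, roughly 5× coverage is what it takes for about 99% of nucleotides to appear in at least one read; real libraries are not perfectly random, so more is needed in practice.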

16

What is genome sequencing assembly?

  • Involves taking all the reads from whole genome shotgun sequencing and assembling them into the DNA molecule or molecules that comprise the organism’s genome (e.g. bacteria or fungi)

  • First accomplished with Haemophilus influenzae and then other bacterial genomes

  • Proven feasible in animals by the assembly of the Drosophila melanogaster genome

  • Problem: repetitive sequences are hard to resolve with short-read data

17

What is the draft vs finished sequence?

  • Finished sequences are those in which all gaps are closed and the quality of the sequence of the entire genome is raised so that the error rate is less than 1/10,000 bases

  • There is academic interest in full annotation of the organism’s genes; a finished sequence is also highly desirable for at least one isolate/strain of any pathogen used in forensic investigations

  • If this is completed once for a given microbial species, the value of a second finished sequence may be limited; high-quality draft sequences are probably sufficient for mapping SNPs and VNTRs unless the level of variability is low

  • High variability in some viruses requires multiple finished sequences

18

What is annotation?

  • Defining any special features or knowledge about specific regions of a sequence

  • Generally, the focus is on “gene” regions, but non-ORF regions also carry information - only a small percentage of the mammalian genome is transcribed for protein production

  • Annotation tracks - regions conserved across all strains/isolates; these can be further defined if regions within the conserved regions are unique when compared to all other known sequences (signature regions that represent an annotation feature)

19

What are diagnostics?

  • Many contexts

    • Culture-based

    • Clinical

    • Genomics-based

    • PCR-based

    • Rule out

  • Detection diagnostics: Presence of a pathogen at a broad level (family, genus, or species)

  • Forensic diagnostics: Strain and isolate level ID

20

What is microbial forensics diagnostics?

  • Protein Diagnostics

    • Relies on proteins on the surface of pathogens

    • Low sensitivity

    • Low cost, fast

    • Assumes thousands of copies present

  • Nucleic Acids Diagnostics

    • Highly sensitive

    • Requires DNA extraction/amplification

  • Both can be “tricked” by molecular manipulation

  • The goal is to support the rapid identification of microbes at different levels of resolution

  • Techniques used for this depend upon having an accurate determination of the variable sequence regions that direct the development of the diagnostic assay

  • It is possible that diagnostics based upon identity could be tricked by adding, subtracting, or altering genes

21

What are nucleic acid diagnostics?

  • Based upon nucleic acid signatures, sequences that are conserved across all strains of a pathogen and yet unique to that pathogen - risk of false-positive or -negative failure

  • Signatures and pathogen populations may not be static - check frequently for genetic change and presence in other species

  • Many of these diagnostics are PCR-based assays and some are chip-based assays
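
The idea of a signature (conserved across all strains of a pathogen, yet absent from everything else) can be sketched with k-mer sets. This is a toy illustration with hypothetical function names and made-up sequences; real signature pipelines work on whole genomes and tolerate mismatches.

```python
def kmers(seq, k):
    """All substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def signature_kmers(target_strains, other_genomes, k):
    """k-mers found in every target strain and in no other genome."""
    conserved = set.intersection(*(kmers(s, k) for s in target_strains))
    background = set().union(*(kmers(g, k) for g in other_genomes))
    return conserved - background

targets = ["AAACGTCC", "TTACGTCG"]  # made-up strains of one pathogen
others = ["GGACGTAA"]               # made-up near-neighbour genome
print(signature_kmers(targets, others, k=4))  # {'CGTC'}
```

ACGT is conserved in both strains but also occurs in the neighbour, so an assay built on it would risk false positives. A strain that later loses CGTC would cause a false negative, which is why signatures need to be rechecked as new sequences appear.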

22

What are some cost considerations?

Species-level detection diagnostics

  • Broad detection to determine bioterror agent release or natural spread of a pathogen

  • Cost effectiveness depends upon the nature of the pathogen

  • Variola virus is rarer; therefore, generic inexpensive means can be used

23

What are PCR-based detection assays?

  • AFLPs, RFLPs

  • MLVA (VNTRs; repeat sequence typing)

  • MLST

  • SNPs

  • rRNA Typing

  • Real-Time PCR (qPCR)

  • Molecular Beacons

24

What is the process of PCR?

  1. Denature

  2. Anneal

  3. Extend
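
Because each denature/anneal/extend cycle can copy every template once, the product grows roughly exponentially with the number of cycles. A minimal sketch follows, assuming an idealised per-cycle efficiency (1.0 means perfect doubling); the function name and the efficiency parameter are illustrative assumptions, not part of the lecture notes.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Expected copy number after a number of PCR cycles."""
    copies = initial_copies
    for _ in range(cycles):
        # 1. Denature: the two strands separate, so each can act as a template.
        # 2. Anneal: primers bind to their complementary sites.
        # 3. Extend: polymerase copies a fraction `efficiency` of the templates.
        copies += copies * efficiency
    return copies

print(pcr_copies(1, 30))       # 1073741824.0 (2**30 with perfect doubling)
print(pcr_copies(10, 3, 0.5))  # 33.75
```

Real reactions plateau as primers and nucleotides run out, so this only describes the early exponential phase.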

25

How does the 5’ Nuclease TaqMan qPCR work?

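Uses a special probe with a reporter dye (5' end) and a quencher (3' end) that "turns off" fluorescence while the probe is intact, plus Taq polymerase and primers, to detect specific DNA sequences in real time. During extension, the 5' nuclease activity of Taq polymerase cleaves the bound probe, separating the reporter from the quencher so that fluorescence increases with each cycle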

26

What are molecular beacons?

A form of detection in which hybridization of the probe results in an increased fluorescence signal

27

What are handheld qPCR platforms?

Portable devices based on insulated isothermal PCR (iiPCR) technology

28

What are Chip-Based detection assays?

  • Combining VNTR, MLVA, and SNP variation detection in a single platform

  • Fluorescently labelled sample

  • Chips can hold 500,000 or more oligomer probes

    • Many species targeted at once

  • Compare hybridization intensity to determine identification

    • Perfect match and perfect mismatch pairs

  • Limitations: high cost of chip design, printing, and per-use; keeping the chip current as new strains are sequenced; and optimization of hybridizations to thousands of probes, which can make results difficult to interpret
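
Comparing hybridization intensity across perfect match (PM) and mismatch (MM) probe pairs can be sketched as follows. This is a simplified illustration only: the decision rule (PM at least twice the MM intensity), the function names, and the intensity values are all assumptions for the example, not a description of any real chip's analysis software.

```python
def call_probe_pairs(pairs, min_ratio=2.0):
    """Call a probe pair 'detected' when PM intensity clearly exceeds MM intensity.

    pairs maps a probe name to (pm_intensity, mm_intensity).
    """
    return {name: pm >= min_ratio * mm for name, (pm, mm) in pairs.items()}

def fraction_detected(pairs, min_ratio=2.0):
    """Share of a target's probe pairs that were called detected."""
    calls = call_probe_pairs(pairs, min_ratio)
    return sum(calls.values()) / len(calls)

pairs = {"probe1": (1000, 200), "probe2": (300, 280), "probe3": (900, 100)}
print(call_probe_pairs(pairs))
print(fraction_detected(pairs))
```

Here probe2 binds the mismatch almost as strongly as the perfect match, which suggests non-specific hybridization. With thousands of probes, ambiguous pairs like this are part of what makes interpretation difficult.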