what is the principle of operation of Sanger technology
chain termination
What is Chain Termination?
Uses fluorescently labeled ddNTPs to stop DNA synthesis @ specific bases
ddNTPs = dideoxyribonucleoside triphosphates = modified DNA building blocks that lack the 3'-OH group, so the chain cannot be extended after one is added
what is the principle of operation of Illumina technology
sequencing by synthesis
What is Sequencing by Synthesis?
Reversible terminator fluorescent nucleotides are added; a camera records the "flash" of each base.
what is the principle of operation of PacBio technology
SMRT (Single Molecule Real Time)
What is SMRT (Single Molecule Real Time)?
Uses a ZMW (Zero-Mode Waveguide) to observe a single polymerase incorporate bases in real-time
what is the principle of operation of MinION technology
nanopore
What is Nanopore?
Measures changes in electrical current as a single-stranded DNA molecule passes through a protein pore
what uses fluorescently labeled ddNTPs to stop DNA synthesis at specific bases
Sanger
what uses reversible terminator fluorescent nucleotides, with a camera recording the "flash" of each base
Illumina
what uses a ZMW (zero-mode waveguide) to observe a single polymerase incorporate bases in real-time
PacBio
what measures changes in electrical current as a single-stranded DNA molecule passes through a protein pore
MinION
what are pros to sanger technology
very high accuracy, long reads (800bp)
what are cons to sanger technology
low throughput, very expensive per base. used for single genes/plasmids
what are pros of illumina technology
massive throughput (billions of reads), lowest cost per base, very accurate
what are cons of illumina
very short reads (150-350bp), difficult to assemble repetitive regions
what are pros of pacbio
long reads, can span repetitive regions and detect base modifications
what are cons of pacbio
historically higher error rates, lower throughput than illumina, higher cost
what are pros of minion
ultra long reads, real time data, highly portable (USB)
what are cons of minion
high error rate compared to illumina; requires high-quality dna input
De Novo Assembly
the process of taking short, overlapping DNA reads and stitching them together into contigs (contiguous sequences) without a reference genome
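The "stitching" idea can be illustrated with a toy sketch: find the longest suffix of one read that matches the prefix of another, then merge them. The function names `overlap` and `merge` and the `min_len` parameter are hypothetical, chosen for illustration; real assemblers build overlap graphs or de Bruijn graphs over millions of reads and handle sequencing errors, which this sketch does not.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def merge(a: str, b: str, min_len: int = 3):
    """Merge two overlapping reads into one longer sequence, or return None."""
    k = overlap(a, b, min_len)
    return a + b[k:] if k else None


# Two toy reads sharing the 3-base overlap "CGT"
print(merge("ATGCGT", "CGTACC"))  # ATGCGTACC
```

Repeats longer than the read length give many equally good overlaps, which is one way to see why short reads struggle in repetitive regions.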
what type of reads make assembly easier
longer reads (PacBio/MinION)
why do longer reads make assembly easier?
they bridge repetitive "dead zones" that short illumina reads cannot
True or False: A 200 bp read length is good for Illumina, since it only produces
short reads, but is low quality for PacBio, since that technique normally produces
much longer reads.
True
Ambiguous Bases
Represented by the letter N. A high "N" count indicates areas
where the sequencer couldn't determine the base. Lower is better.
what does a high N count indicate
areas where the sequencer couldn't determine the base
what type of N count is better
lower
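A minimal sketch of how the N count could be checked for a sequence held in a plain string. The function name `n_fraction` is hypothetical, and the sketch assumes only the letter N marks an ambiguous base (other IUPAC ambiguity codes are ignored).

```python
def n_fraction(seq: str) -> float:
    """Return the fraction of bases in `seq` that are the ambiguous base N."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return seq.count("N") / len(seq)


# 2 of the 10 bases are N
print(n_fraction("ACGTNNACGT"))  # 0.2
```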
what is the PHRED Quality Score (Q)
a measure of base-call confidence: Q = -10 × log10(P), where P is the probability that the base was called incorrectly
what is the standard benchmark for "high quality"
Q30 (99.9% base-call accuracy, i.e. 1 error in 1,000 bases)
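The Q30 benchmark follows from the standard PHRED definition, Q = -10 × log10(P), where P is the probability of a wrong base call. The sketch below converts in both directions; the function names are hypothetical and chosen for illustration.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Probability that a base call with PHRED score `q` is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """PHRED score for an error probability `p` (0 < p <= 1)."""
    return -10 * math.log10(p)


for q in (10, 20, 30):
    p = phred_to_error_prob(q)
    print(f"Q{q}: error probability {p:.4f}, accuracy {100 * (1 - p):.2f}%")
```

Each step of 10 in Q is a tenfold drop in error probability, so Q20 corresponds to 99% accuracy and Q30 to 99.9%.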
Coverage (Depth)
the average number of times a base is sequenced
What does a higher coverage increase?
confidence in consensus sequence
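Average coverage is commonly estimated as (number of reads × read length) / genome size. The sketch below applies that estimate; the function name and the example numbers are made up for illustration, and it assumes all reads have the same length.

```python
def average_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Estimated average depth: total sequenced bases divided by genome size."""
    return num_reads * read_length / genome_size


# Hypothetical run: 1 million reads of 150 bp on a 5 Mb bacterial genome
print(average_coverage(1_000_000, 150, 5_000_000))  # 30.0
```

A result of 30.0 reads as "30x coverage": on average each base was sequenced 30 times.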
Contig Length & N50
statistical measure of assembly "contiguity."
It is the length of the shortest contig such that all contigs of that length or longer sum to at least 50% of the total assembly length
N50 is a statistical measure of…
assembly "contiguity"
What type of N50 indicates a better assembly?
higher N50
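The N50 definition is easier to follow with a worked example. The sketch below sorts contigs from longest to shortest and walks down the list until the running total reaches half of the assembly length; the function name `n50` and the contig lengths are hypothetical.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # First contig at which the cumulative length reaches 50% of the assembly
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Total = 290, half = 145; 80 + 70 = 150 reaches the halfway point
print(n50([80, 70, 50, 40, 30, 20]))  # 70
```

An assembly made of a few long contigs reaches the halfway point sooner, which is why a higher N50 signals better contiguity.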
What occurs due to repetitive sequences or regions that are difficult to sequence?
gaps
ex: high GC content
What are the 2 things used to close gaps?
Primer Walking & Hybrid Assembly
Primer Walking
designing specific primers at the end of a contig & using Sanger sequencing to "walk" into the gap
Hybrid Assembly
Combining Illumina (for accuracy) with PacBio/MinION (to span long gaps)
What gap-closing method is normally used & why?
Hybrid Assembly
cheap & efficient
What are the 2 main strategies for Hybrid Assembly?
Short-read First (Scaffolding) OR Long-read First (Polishing)
Short-read First (Scaffolding)
You assemble the highly accurate Illumina reads into contigs.
You then use the Long Reads as "bridges" to tell the assembler which contigs sit
next to each other, creating a scaffold.
Step 1 in short-read first (scaffolding)
assemble highly accurate illumina reads into contigs
Step 2 in short-read first (scaffolding)
use long reads as "bridges" to tell the assembler which contigs sit next to each other, creating a scaffold
Long-read First (Polishing)
You assemble the genome using only the Long Reads. This creates a very
"complete" genome (few gaps) but with many small "typos" (Indels).
You then map the Short Reads onto that assembly to "polish" it—correcting
the 1% error rate of the long reads with the 99.9% accuracy of Illumina.
Step 1 in long-read first (polishing)
assemble genome using only long reads
Step 2 of long-read first (polishing)
map short reads onto that assembly to "polish" it
What is the role of a polisher in Hybrid Assembly?
fixes typos
What is the role of the bridge in Hybrid Assembly?
spans the repeats
Short Reads (Illumina) Overview
Accuracy: High (Q30+)
Contiguity: Fragmented (many gaps)
Cost: Cheap
Role: Polisher
Long Reads (PacBio/Nanopore) Overview
Accuracy: lower
Contiguity: High (can produce “closed” genomes)
Cost: Expensive
Role: Bridge
Node
represents a common ancestor where lineage splits
Branches
the evolution of lineage over time
Tips (leaves)
represent the existing taxa (species/sequences) being compared
Clade (monophyletic group)
a group consisting of a common ancestor and all descendants
Root
the common ancestor of all sequences in the tree
Branch Length
the horizontal length correlates with the amount of genetic change (mutations) over time
What does the horizontal length correlate with in a phylogenetic tree?
the amount of genetic change (mutations) over time
Sister Taxa
2 lineages that emerged from same immediate node
Transcriptomics
the study of the transcriptome— the sum of all RNA transcripts in a cell
What is the sum of all RNA transcripts in a cell?
transcriptome
Transcriptomics: Microarray
Detection: hybridization — pre-designed probes
Novelty: can only detect known genes on the chip
Dynamic Range: low (signals saturate/are too weak)
Cost: fixed per sample
Transcriptomics: RNA-Seq
Detection: sequences all cDNA
Novelty: can discover new transcripts, isoforms, & non-coding RNAs
Dynamic Range: high (can detect low & very high expression)
Cost: depends on sequencing depth
Microarray uses what type of probes?
pre-designed (hybridization)
Microarray can only detect what type of genes?
known genes on the chip
What is the cost of Microarray?
fixed per sample
RNA-seq has what type of sequencing?
direct sequencing (sequences all cDNA)
What type of method can discover new transcripts, isoforms, and non-coding RNAs?
RNA-seq
The logic that more mRNA molecules in the sample means…?
more cDNA produced
The logic that more cDNA is produced means…?
more reads mapped to that gene during sequencing
what is often used for differential expression?
log fold change (LogFC)
What does Log2FC of 1 mean? Of -1? Of 0?
1: expression has doubled
-1: expression has halved
0: expression is the same
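These three values follow directly from Log2FC = log2(expression in condition B / expression in condition A). The sketch below uses made-up expression values; the function name is hypothetical, and it assumes both values are greater than zero (real pipelines handle zero counts and normalization, which are not shown here).

```python
import math


def log2_fold_change(treated: float, control: float) -> float:
    """log2 of the ratio treated / control; both values must be > 0."""
    return math.log2(treated / control)


print(log2_fold_change(200, 100))  # 1.0  (doubled)
print(log2_fold_change(50, 100))   # -1.0 (halved)
print(log2_fold_change(100, 100))  # 0.0  (unchanged)
```

The log scale makes up- and down-regulation symmetric: doubling and halving are the same distance from 0.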
What is the concept of bulk RNA-seq?
take a whole piece of tissue —> grind it up —> sequence total RNA
What do you get from bulk RNA-seq?
an average expression profile for entire sample
What are the pros of bulk RNA-seq?
High sensitivity for low-abundance transcripts,
Cost-effective
Robust for comparing “Condition A” vs “Condition B”
What is the con of bulk RNA-seq?
Masks heterogeneity.
if a rare cell type is the only one expressing a gene, its signal might be drowned out by the “average”
What is the concept of Single-Cell RNA-seq / scRNA-seq?
dissociate the tissue into individual cells & “barcode” RNA from each cell before sequencing
What do you get from scRNA-seq?
list of every gene expressed in every individual cell
What are the pros of scRNA-seq?
Can identify rare cell types
Allows for pseudotime analysis (tracing how a cell changes over time, like during development or cancer progression…)
What are the cons of scRNA-seq?
You lose the "where":
Once you dissociate the cells, you don't know which cell
was sitting next to which
Prone to “dropout”
a gene is expressed but the sequencer misses it b/c starting material is so small
What is the concept of Spatial Transcriptomics?
You sequence the RNA while it is still attached to a thin slice of the tissue, using a slide with "spatial barcodes" that record coordinates
What do you get from Spatial Transcriptomics?
gene expression data mapped directly onto physical structure of tissue
Pros of Spatial Transcriptomics
Crucial for understanding microenvironments
(e.g., how immune cells interact
with the edge of a tumor).
Provides "geographic" context that scRNA-seq lacks
Cons of Spatial Transcriptomics
Expensive
Many platforms have lower resolution
(each "spot" might contain 5–10 cells rather than just one)
Bulk RNA-Seq overview
Resolution: Tissue level (average)
Cell Heterogeneity: Hidden/masked
Spatial Context: Lost
Best For: General biomarkers, comparing treatments
Single-Cell RNA-seq (scRNA-seq) overview
Resolution: Cellular level
Cell Heterogeneity: Fully visible
Spatial Context: Lost
Best For: Finding new cell types, developmental lineages
Spatial RNA-seq overview
Resolution: Tissue architecture
Cell Heterogeneity: Visible in context
Spatial Context: Preserved
Best For: Tumor microenvironments, brain mapping