Flashcards reviewing key concepts from the lecture including genome and transcriptome assembly, annotations, sequencing technologies, and relevant algorithms.
Zoonomia
A project assembling genomes from 240 different mammals.
Genome Assembly
Taking sequenced reads and piecing them together to reconstruct the genome, which can then be used for comparison.
Illumina Sequencing
Fragmented pieces of DNA are sequenced, and mapped to a reference genome if available. If not, assembly is done de novo.
Repetitive Elements
A major challenge in genome assembly: reads from repeated sequences look alike, so their correct position in the genome is ambiguous.
Human Genome
The only reference genome that has been fully completed.
T to T (T2T)
Telomere to telomere; a complete, gapless version of the human genome.
Telomeres
The ends of chromosomes.
Genome Annotation
Adding meaning to the genome, including structural and functional aspects.
Structural Annotation
Locating genes within the genome (e.g., central region of chromosome 1).
Functional Annotation
Determining the functional importance of different parts of the genome; involves gene ontology and pathway analysis.
Genome Assembly Steps
Reads are assembled into contigs, and then into scaffolds.
N50 Value
A summary statistic of assembly contiguity: the length of the shortest contig among the longest contigs that together cover at least 50% of the total assembly length.
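A minimal sketch of how N50 can be computed from a list of contig lengths. The function name `n50` and the example lengths are illustrative assumptions; the total used here is the summed assembly length, not an independent genome size estimate.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths."""
    # Sort from longest to shortest contig.
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum past half
        # of the total assembly length defines the N50.
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


example = [80, 70, 50, 40, 30, 20]  # hypothetical contig lengths
print(n50(example))  # prints 70
```

Here the total length is 290, so the two longest contigs (80 + 70 = 150) are enough to pass the halfway mark of 145.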
Reference-Based Assembly
Assembly using an existing genome as a template (e.g., chimpanzee genome using the human genome).
De Novo Assembly
Assembly performed without a reference genome, requiring assembly from scratch (e.g., for endemic species).
Sequencing Steps
Quality control, trimming, contamination removal, k-mer counting, genome size estimation, error correction, and de novo assembly.
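The k-mer counting and genome size estimation steps can be sketched as follows. This is a simplified illustration, not how any particular tool works: the function names and toy reads are assumptions, and the estimate uses the common heuristic of dividing the total number of k-mers by the most frequent k-mer depth, without filtering out low-depth error k-mers.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every substring of length k across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def estimate_genome_size(counts):
    """Rough estimate: total k-mers divided by the peak k-mer depth."""
    # Histogram of how many distinct k-mers occur at each depth.
    histogram = Counter(counts.values())
    peak_depth = max(histogram, key=histogram.get)
    return sum(counts.values()) // peak_depth


reads = ["ACGTAC", "CGTACG"]  # hypothetical toy reads
counts = count_kmers(reads, 3)
print(counts["ACG"])                 # prints 2
print(estimate_genome_size(counts))  # prints 4
```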
De Novo Assembly Algorithms
Algorithms used for de novo assembly; examples include Velvet (using the de Bruijn graph algorithm) and Trinity (for RNA).
Combining Sequencing Technologies
Short-read next-generation sequencing, like Illumina, provides abundant data, while long-read sequencing (PacBio or Oxford Nanopore) fills gaps.
SAIC
Algorithm example that uses the overlap-layout approach.
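The overlap step of an overlap-layout approach can be illustrated with a small sketch that finds the longest suffix-prefix match between two reads and merges them. The function names, the `min_length` threshold, and the toy reads are assumptions for illustration; real assemblers tolerate mismatches and use much faster indexing.

```python
def overlap(a, b, min_length=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    max_len = min(len(a), len(b))
    # Try the longest possible overlap first and shrink from there.
    for length in range(max_len, min_length - 1, -1):
        if a.endswith(b[:length]):
            return length
    return 0


def merge(a, b, min_length=3):
    """Merge two reads on their overlap, or return None if none is found."""
    olen = overlap(a, b, min_length)
    return a + b[olen:] if olen else None


print(overlap("ACGTAC", "TACGGA"))  # prints 3
print(merge("ACGTAC", "TACGGA"))    # prints ACGTACGGA
```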
Velvet
Algorithm example that uses a de Bruijn graph.
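A minimal sketch of building a de Bruijn graph from reads: each k-mer becomes an edge from its first k-1 bases to its last k-1 bases. This only illustrates the data structure, not Velvet's actual implementation; the function name and toy read are assumptions.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer prefix to the (k-1)-mer suffixes that follow it."""
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Edge from the k-mer's prefix node to its suffix node.
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


graph = de_bruijn_graph(["ACGTC"], 3)  # hypothetical toy read
print(graph)  # prints {'AC': ['CG'], 'CG': ['GT'], 'GT': ['TC']}
```

Walking the edges AC, CG, GT, TC spells out the original sequence ACGTC, which is the idea behind reconstructing contigs from paths in the graph.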
Genome Properties Affecting Assembly
Genome size, heterozygosity, and GC content.
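As a small illustration of one of these properties, GC content is the fraction of bases that are G or C. The function name and example sequences below are assumptions.

```python
def gc_content(sequence):
    """Return the fraction of G and C bases in a DNA sequence."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    gc = sequence.count("G") + sequence.count("C")
    return gc / len(sequence)


print(gc_content("ATGC"))  # prints 0.5
```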
In Silico Annotations
Refers to predicting open reading frames, signaling, coding/non-coding regions, and splicing regions.
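Open reading frame prediction can be sketched as a scan for a start codon followed in frame by a stop codon. This is a deliberately simplified illustration: it checks only the three forward-strand frames, and the function name, the `min_codons` threshold, and the example sequence are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=2):
    """Return forward-strand ORFs (ATG ... stop) in all three frames."""
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                # Keep the ORF if it has enough codons before the stop.
                if (i - start) // 3 >= min_codons:
                    orfs.append(sequence[start:i + 3])
                start = None
    return orfs


print(find_orfs("CCATGAAATAGCC"))  # prints ['ATGAAATAG']
```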
Experimental Validation
Uses experimental data to validate predictions and determine gene functions.
Mature mRNA
Contains the 5′ untranslated region (5′ UTR), start codon, stop codon, and polyadenylation signal.
Splicing
The step during which alternative splicing events occur, producing different transcripts from the same gene.
Transcriptome Assembly
Assembly of RNA sequences to identify isoforms and splicing architectures, using Trinity for de novo assembly or Cufflinks for reference-based assembly.
Trinity Strategies
Three modules that use a k-mer de Bruijn graph approach: Inchworm, Chrysalis, and Butterfly.
Transcriptome Assembly Analysis
Requires paired-end RNA sequencing, followed by de novo assembly and alignment.
Allan Wilson and Mary-Claire King Publication
Gene expression gives additional information beyond the sequence itself, which is useful for comparisons.
Eugene Myers
Developed the first assembly program (1995).
David Haussler
Scientific director who still works on the UCSC Genome Browser.