DNA replication principles: DNA sequencing involves processes similar to DNA replication but occurs in vitro.
Dideoxynucleotides (ddNTPs): Chain terminators used in DNA sequencing.
Example DNA strand:
Primer Extension Scenarios:
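Daughter strand lengths: Determined by the ratio of normal dNTPs to ddNTPs in the mixture. E.g., in a G reaction with 85% dGTP and 15% ddGTP, extension can stop at any position where a G is incorporated, so fragments of several lengths are produced (e.g., 8, 9, 11, 12, 13, 14, 20, 26).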
Essential Ratio of ddNTPs to dNTPs: Keeping ddNTPs at a low proportion is crucial to avoid early termination of all strands. Too high a ratio terminates most strands close to the primer, so longer fragments are underrepresented and positions far from the primer cannot be read.
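The effect of the ddNTP proportion can be illustrated with a small simulation. This is a minimal sketch, not a model of real reaction kinetics: it assumes a G-only reaction in which each incorporated G has a fixed chance of being a ddGTP, and the daughter sequence and the function names are hypothetical, chosen only for illustration.

```python
import random

def extend_once(daughter, dd_fraction, rng):
    """Return the length at which one daughter strand stops growing.

    daughter: the strand that would be made if no ddNTPs were present
              (hypothetical example sequence, not from the notes).
    dd_fraction: chance that an incorporated G is a ddGTP (assumed constant).
    """
    for position, base in enumerate(daughter, start=1):
        if base == "G" and rng.random() < dd_fraction:
            return position  # a ddGTP was incorporated: chain terminated
    return len(daughter)  # reached the end of the template without terminating

def fragment_lengths(daughter, dd_fraction, n_strands=10000, seed=0):
    """Count how many simulated strands end at each length."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_strands):
        length = extend_once(daughter, dd_fraction, rng)
        counts[length] = counts.get(length, 0) + 1
    return counts

# Hypothetical daughter strand with G at positions 3, 6, 9 and 10.
DAUGHTER = "ATGCCGTAGGAT"
low = fragment_lengths(DAUGHTER, 0.15)   # low ddGTP share: all G positions sampled
high = fragment_lengths(DAUGHTER, 0.90)  # high ddGTP share: most strands stop at the first G
```

Comparing `low` and `high` shows the point made above: with a high ddGTP share, nearly all strands stop at the first G and the later positions are rarely represented.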
Fluorescent ddNTPs: Each of the four ddNTP types carries a distinct fluorescent tag. A daughter fragment carries only one color, because it contains a single ddNTP: the one that terminated it at its 3' end.
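Purpose of Size Separation: Ordering fragments by length orders them by termination position, so reading the terminal colors from shortest to longest gives the nucleotide sequence; shorter fragments terminated closer to the primer.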
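The read-out step can be sketched in a few lines. This is an illustrative simplification that assumes every fragment has already been measured as a (length, terminal base) pair and that lengths are contiguous; the function name and the sample data are hypothetical.

```python
def read_sequence(fragments):
    """Reconstruct the daughter sequence from terminated fragments.

    fragments: iterable of (length, terminal_base) pairs, where terminal_base
    is the base identified by the fluorescent color of the fragment.
    Many fragments may share a length; they all report the same base.
    """
    base_at_length = {}
    for length, base in fragments:
        base_at_length[length] = base
    # Shortest fragment first: position 1, then position 2, and so on.
    return "".join(base_at_length[n] for n in sorted(base_at_length))

# Hypothetical detector output, in arbitrary order.
observed = [(3, "G"), (1, "A"), (2, "T"), (4, "C"), (2, "T")]
sequence = read_sequence(observed)
```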
Sanger Sequencing Length: Typically about 1000 base pairs can be sequenced in a single reaction.
Definition of Genome: All the genetic material of an organism; in diploid organisms, somatic cells contain two copies and gametes contain one.
Sequencing challenge: Eukaryotic genomes are large, often on the order of 10^9 base pairs (the human genome is about 3 × 10^9). Methods like Sanger sequencing yield only several hundred to about 1000 nucleotides per read, so a complete genome must be assembled from many short reads.
Shotgun Sequencing: A method to piece together complete genomes:
Importance of Random Fragments: Overlapping sequences are necessary to reconstruct longer sequences accurately.
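How overlaps allow reconstruction can be shown with a toy greedy merge. This is a minimal sketch, not how production assemblers work: it assumes error-free reads from one strand, ignores repeats and reads contained inside other reads, and all names and sequences are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            break  # no remaining overlaps: leave separate contigs
        merged = contigs[i] + contigs[j][k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (i, j)]
        contigs.append(merged)
    return contigs

# Hypothetical overlapping reads from a random fragmentation.
reads = ["ATGGCGT", "GCGTACG", "TACGTTA"]
contigs = greedy_assemble(reads)
```

If the fragments did not overlap (for example, if every copy of the genome were cut at the same positions), `overlap` would return 0 for every pair and the reads could not be ordered.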
Sequencing multiple times (10-50): Increases confidence in results and reduces errors.
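A rough worked calculation shows what this level of redundancy implies for the number of reads. The sketch below uses the approximation coverage = (number of reads × read length) / genome size; the genome size and read length are the round figures used in these notes, and the function name is hypothetical.

```python
def reads_needed(genome_size, read_length, coverage):
    """Smallest number of reads whose total length is coverage x genome_size."""
    total_bases = genome_size * coverage
    return -(-total_bases // read_length)  # ceiling division on integers

# Illustrative values: a 10^9 bp genome and reads of about 1000 bases.
at_10x = reads_needed(10**9, 1000, 10)
at_50x = reads_needed(10**9, 1000, 50)
```

Under these assumptions, 10-fold coverage already requires ten million reads, which is why assembly has to be automated.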
Read Depth: The number of times a specific region is sequenced; it varies across the genome.
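Per-position depth can be computed directly once reads have been placed on the genome. This is a minimal sketch assuming each read is described by a 0-based start position and a length; the function name and the sample reads are hypothetical.

```python
def read_depth(genome_length, reads):
    """Return a list giving, for each position, how many reads cover it.

    reads: iterable of (start, length) pairs with 0-based start positions.
    """
    depth = [0] * genome_length
    for start, length in reads:
        for pos in range(max(start, 0), min(start + length, genome_length)):
            depth[pos] += 1
    return depth

# Three hypothetical reads placed on a 10-base region.
depth = read_depth(10, [(0, 5), (3, 5), (4, 6)])
mean_depth = sum(depth) / len(depth)
```

Here the middle of the region is covered more often than the ends, a small-scale version of the variation in depth across a genome.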
Improved Assembly with Related Genomes: Understanding a related genome can simplify the assembly process by providing reference points.
Functionality: Genome annotation categorizes sequences into functional elements, including genes and regulatory sequences.
Patterns for Recognition: Sequence motifs can identify important areas, such as promoters and open reading frames (ORFs). ORFs are stretches of DNA that may code for proteins, but an ORF alone does not establish that a functional gene is present; additional evidence is needed to confirm it.
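Pattern-based ORF detection can be sketched as a scan for a start codon followed in the same frame by a stop codon. This is a simplified illustration: it assumes the standard stop codons, scans only the three forward reading frames, and ignores introns and the reverse strand. The function name, the `min_codons` threshold and the test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) for ATG...stop ORFs in the forward frames.

    start and end are 0-based, end exclusive; the stop codon is included.
    min_codons is the minimum number of codons before the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon in this frame opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None  # look for the next ORF in this frame
    return orfs

# Hypothetical sequence with one short ORF in the third frame.
orfs = find_orfs("CCATGAAATTTTAGGG")
```

A hit from such a scan is only a candidate, in line with the caveat above that an ORF needs supporting evidence before it is annotated as a gene.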
Transcriptome: mRNA sequences reveal which genes are expressed; comparing genomic DNA to mRNA reveals gene structure, since genomic segments present in the mature mRNA are exons and the intervening segments absent from it are introns.