Biochem Dec. 8th

Introduction

  • Overview of the session and the topic.
    • A question-and-answer session will take place on Wednesday.
    • Mention of possible weather conditions (snowstorm) affecting attendance.
    • Encouragement to attend if possible or engage with online resources.

Sequencing Genomes

  • Importance of Genome Sequencing
    • The genomes of humans and other organisms (e.g., mice) have been sequenced multiple times.
    • Continuous efforts are made to sequence new genomes.
    • Transgenic organisms (e.g., transgenic mice) require genome sequencing to identify genetic modifications.

Brief Overview of Genome Editing

  • Genome Editing Concept
    • In bacteria such as E. coli, proteins can be expressed from an introduced plasmid, so the genome itself does not have to be changed.
    • In animals, plasmids cannot be used this way, so the genome must be edited directly, e.g., modifying genes in mice or fish (e.g., GloFish).
  • Examples of Genetically Edited Organisms
    • GloFish: genetically modified fish that express fluorescent (colorful) proteins.
    • Mice: genetically modified to express obesity-related genes.
  • CRISPR Technology
    • A widely used method for genome editing.
    • CRISPR stands for Clustered Regularly Interspaced Short Palindromic Repeats.
    • Emmanuelle Charpentier and Jennifer Doudna awarded the Nobel Prize in 2020 for CRISPR-Cas9 technology.
  • Function of CRISPR
    • Acts as an immune system for archaea and bacteria against viruses by incorporating viral DNA fragments into their genomes.
    • Provides a defense mechanism to recognize and cut viral DNA upon re-infection.
    • The Cas9 protein, directed by an RNA copy of a stored viral fragment, recognizes viral DNA by base pairing and acts as an endonuclease that cuts at that precise sequence.
  • Application in Genome Editing
    • Guide RNA directs Cas9 protein to specific gene sequences to be modified in laboratory organisms.
    • Example: Editing the FOXP2 gene in mice to observe its effects on vocalization phenotypes.
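
The guide-directed targeting described above can be pictured as a string search: the guide sequence is compared against both DNA strands, and every perfect match is a candidate cut site. The sketch below is only an illustration under simplifying assumptions. The function names are hypothetical, the guide is written in the DNA alphabet (T instead of U), only perfect matches count, and further requirements of real Cas9 (such as the PAM motif next to the target) are ignored.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, C<->G)."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(seq))


def find_target_sites(genome: str, guide: str) -> list[tuple[int, str]]:
    """Find every place where the guide's target occurs on either strand.

    Returns (start_index_on_forward_strand, strand) pairs. Toy model:
    perfect matches only, no PAM check, no mismatch tolerance.
    """
    sites = []
    # A match to the reverse complement means the target lies on the "-" strand.
    for strand, target in (("+", guide), ("-", reverse_complement(guide))):
        start = genome.find(target)
        while start != -1:
            sites.append((start, strand))
            start = genome.find(target, start + 1)
    return sorted(sites)


# Hypothetical example sequences, chosen only for illustration.
genome = "ATGCCGTACGATTAGC"
print(find_target_sites(genome, "CCGTACGA"))  # [(3, '+')]
print(find_target_sites(genome, "TCGTACGG"))  # [(3, '-')]
```

Real guide sequences are about 20 bases long, which is what makes a match at a single genomic position likely.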

Genome Sequencing Techniques

  • Traditional Sequencing
    • Regular sequencing reads typically range from 500-1000 base pairs.
    • Example of a sequenced organism: Haemophilus influenzae (1995), the first self-replicating, free-living organism to be sequenced, with a genome of about 1,800,000 base pairs.
  • Challenges in Sequencing
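    • The traditional method requires designing a new primer after each section is sequenced, so reads are obtained one after another rather than in parallel. This makes the process very long (3,600 weeks in the lecture example, consistent with 1,800,000 bp / 500 bp per read = 3,600 rounds at about one week per round).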
  • Shotgun Sequencing
    • A more efficient method that sequences random fragments of DNA.
    • Steps include:
      • Isolating and fragmenting the bacterial genome.
      • Using fragments approximately 1.6-2 kb in size.
      • Ligating the fragments into plasmids and transforming E. coli.
      • Sequencing with primers that bind the plasmid rather than the genomic DNA, so the same primers work for every fragment.
  • Alignment Procedure
    • Overlapping sequences from different fragments are used to assemble the entire genome sequence.
    • Overlapping reads are merged into continuous sequences (contigs).
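
The assembly of overlapping reads into contigs can be illustrated with a small greedy merge: repeatedly find the two sequences whose end and start overlap the most, and join them. This is a minimal sketch, not the algorithm used by real assemblers. The function names, the `min_overlap` parameter, and the example reads are assumptions made for illustration, and sequencing errors, reverse-strand reads, and repeats are ignored.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    for length in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:length]):
            return length
    return 0


def greedy_assemble(reads: list[str], min_overlap: int = 3) -> list[str]:
    """Merge the best-overlapping pair until no pair overlaps enough.

    Returns the remaining sequences, i.e. the contigs.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best = (0, None, None)  # (overlap length, left index, right index)
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i != j:
                    length = overlap_length(left, right, min_overlap)
                    if length > best[0]:
                        best = (length, i, j)
        length, i, j = best
        if length == 0:
            return contigs
        # Join the pair, writing the shared bases only once.
        merged = contigs[i] + contigs[j][length:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)] + [merged]


# Three hypothetical reads taken from the sequence ATGGCGTACGTTAGC.
reads = ["ATGGCGTA", "CGTACGTT", "CGTTAGC"]
print(greedy_assemble(reads))  # ['ATGGCGTACGTTAGC']
```

Reads that share no overlap stay as separate contigs, which is why gaps remain in an assembly until enough random fragments have been sequenced.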

Next Generation Sequencing (NGS)

  • Overview of Illumina Sequencing
    • Emerged in 2006, revolutionizing speed and cost of sequencing.
    • Enables millions of reads in parallel, with lower costs than previous methods.
    • The trade-off is much shorter reads, approximately 15-20 bases long.
  • Illumina Sequencing Process
    • DNA fragmentation remains similar but with shorter fragments (around 50 bases).
    • DNA fragments are attached to a glass plate via adapters and amplified in place, so each fragment forms a cluster of identical copies that strengthens the signal.
    • Sequential addition of fluorescently labeled nucleotides, with imaging after each cycle to determine the sequence based on emitted signals.
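
Reading the sequence from the images taken after each cycle can be sketched as picking the brightest of four color channels per cycle. The example below is a simplified illustration for a single cluster. The channel order, the intensity values, and the function name are hypothetical, and real base callers also correct for effects this sketch ignores.

```python
CHANNELS = ("A", "C", "G", "T")  # assumed order of the four color channels


def call_bases(cycles: list[tuple[float, float, float, float]]) -> str:
    """Call one base per imaging cycle from four fluorescence intensities.

    Each tuple holds the signal in the A, C, G and T channels for one
    cluster in one cycle; the brightest channel is taken as the base
    that was incorporated in that cycle.
    """
    read = []
    for intensities in cycles:
        brightest = max(range(4), key=lambda idx: intensities[idx])
        read.append(CHANNELS[brightest])
    return "".join(read)


# Hypothetical intensities for three cycles of one cluster.
cycles = [
    (0.9, 0.1, 0.0, 0.1),  # A channel brightest
    (0.1, 0.0, 0.8, 0.2),  # G channel brightest
    (0.0, 0.1, 0.1, 0.7),  # T channel brightest
]
print(call_bases(cycles))  # AGT
```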

Annotation of Genomic Sequences

  • Moving from Sequence to Function
    • Identification of relevant features including open reading frames (ORFs) for proteins, tRNA genes, promoter sequences, and regulatory elements.
  • Finding Open Reading Frames
    • Analysis must be done in six frames (three on each DNA strand) due to the triplet nature of codons.
    • Searching for ATG (the start codon, which encodes methionine) followed by an in-frame stop codon, and keeping only sequences that are sufficiently long (at least 30 amino acids).
    • Taking codon bias into account: each organism prefers particular codons for a given amino acid.
  • Basic Local Alignment Search Tool (BLAST)
    • Used to determine whether a protein sequence has matches in other organisms.
    • Outputs alignments comparing the query sequence to known sequences, with corresponding similarity metrics.
    • Helps identify conserved amino acids, which indicate functional relevance across species.
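
The six-frame ORF search described above can be sketched in a few lines: read codons in each of the three frames on both strands, start at ATG, stop at the first in-frame stop codon, and keep only stretches of at least 30 amino acids. This is a minimal sketch under stated assumptions. The function names and the `min_codons` parameter are illustrative, the standard stop codons TAA, TAG and TGA are assumed, and codon bias is not considered.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, C<->G)."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(seq))


def find_orfs(dna: str, min_codons: int = 30) -> list[tuple[str, int, str]]:
    """Scan all six reading frames for ATG...stop stretches.

    Returns (strand, frame, orf_sequence) tuples. The ORF sequence runs
    from the ATG through the stop codon and encodes at least
    `min_codons` amino acids (the stop codon is not counted).
    """
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            start = None  # position of the first ATG of the current ORF
            for pos in range(frame, len(seq) - 2, 3):
                codon = seq[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (pos - start) // 3
                    if n_codons >= min_codons:
                        orfs.append((strand, frame, seq[start:pos + 3]))
                    start = None
    return orfs


# Short hypothetical sequence; the length cutoff is lowered so it qualifies.
dna = "CCATGGCTGCTGCTTAAG"
print(find_orfs(dna, min_codons=3))  # [('+', 2, 'ATGGCTGCTGCTTAA')]
print(find_orfs(dna))                # [] with the default 30-codon cutoff
```

The length cutoff matters because short ATG-to-stop stretches occur by chance in any sequence; candidates that pass would then be compared against known proteins, for example with BLAST.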

Conclusion

  • After identifying protein-coding potential, laboratory verification includes checking for mRNA expression and protein confirmation using techniques such as mass spectrometry.
  • Encouragement for thorough preparation ahead of exams regarding genome sequencing and related concepts.