Bioinformatics

Page 1: Bioinformatics Overview

  • Title: Bioinformatics (Data Handling)

  • Presenter: Dr. Louise Robinson (Email: L.Robinson@derby.ac.uk)

  • Key Statement: "We finally have it all mapped; we just don't know where it leads us."

  • Related Organization: Genome CURS Synovite Inc. (Website: www.tonille.com)

Page 2: Definition of Bioinformatics

  • Bioinformatics is defined as the use of software to analyze and interpret biological data, particularly sequence data, to gain insights into its biological significance.

Page 3: Growth of Molecular Data

  • A graph shows a significant increase in molecular data over 23 years, measured in billions of base pairs, for resources of the National Library of Medicine and NCBI. It also highlights user growth and several databases, including:

    • GenBank

    • NIH Public Access

    • Genome Reference Consortium

    • Genetic Testing Registry

    • 1000 Genomes

  • Users per weekday reached millions by 2012.

Page 4: High-Throughput Sequencing

  • Growth of the Sequence Read Archive database is shown in terabases (10^12 bases) from 2010 to 2020, indicating the increasing volume of sequencing data.

Page 5: The Central Dogma of Molecular Biology

  • Emphasizes the relationship between DNA, RNA, and proteins:

    • DNA is replicated and transcribed into mRNA, which is then translated into protein.

Page 6: Omics Technology

  • Terminology: "OMICS" (Ome – Whole; -ics – Study)

    • Key areas studied include:

      • Genome (DNA)

      • Transcriptome (RNA)

      • Proteome (Protein)

  • Various databases are continuously updated for molecular biology studies.

Page 7: DNA Structure

  • Structure of DNA involves:

    • Nitrogenous bases: Adenine (A), Thymine (T), Guanine (G), Cytosine (C)

    • Sugar-phosphate backbone

    • Base pairing rules: A-T and G-C

    • Important features: major and minor grooves, reading direction (5' to 3').

Page 8: Understanding DNA Mutation

  • While the structure of DNA is well understood, the implications of mutations (e.g., single base changes) on DNA folding and accessibility remain less clear.

  • Advances like CRISPR stem from our sequencing knowledge.

Page 9: Complementary DNA Sequences

  • Example DNA sequences demonstrate the complementary nature of DNA strands. Only one strand is typically deposited in databases, so the complementary strand must be derived from it.
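
Deriving the complementary strand from a deposited one can be sketched in a few lines of Python. This is a minimal illustration that assumes an unambiguous A/C/G/T sequence; the function name and the example sequence are hypothetical and not taken from the slides.

```python
# Watson-Crick base pairing rules: A-T and G-C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    # Walk the input backwards and pair each base, so the output
    # also reads in the conventional 5' to 3' direction.
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


deposited = "ATGGCCTGA"  # hypothetical deposited strand, 5' to 3'
print(reverse_complement(deposited))  # TCAGGCCAT
```

Applying the function twice returns the original strand, which is a quick sanity check on the pairing table.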

Page 10: Transcription Process

  • In transcription, mRNA is synthesized from the antisense (template) strand (3'-5'), so the resulting positive-sense RNA carries the same sequence as the sense strand (5'-3'), with U in place of T.
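
The relationship between the two strands and the mRNA can be checked with a short Python sketch. It assumes the antisense strand is supplied in 3'-5' orientation; the function name and sequences are hypothetical examples.

```python
# RNA bases paired against a DNA template: A pairs with U, not T.
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(antisense_3_to_5: str) -> str:
    """Build the mRNA (5' to 3') by pairing along the antisense strand."""
    return "".join(RNA_PAIR[base] for base in antisense_3_to_5.upper())


sense = "ATGGCCTGA"      # hypothetical sense strand, 5' to 3'
antisense = "TACCGGACT"  # its partner strand, written 3' to 5'

mrna = transcribe(antisense)
print(mrna)                             # AUGGCCUGA
print(mrna == sense.replace("T", "U"))  # True
```

The second print confirms that the transcript matches the sense strand apart from U replacing T.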

Page 11: RNA Structure and Protein Production

  • In RNA, Uracil (U) replaces Thymine (T) when base-pairing with Adenine.

  • The splicing process removes introns, which contributes to the differences in length and sequence between DNA and mRNA, because the protein-coding sequence can be split across several regions of the genome.

Page 12: Reverse Transcription

  • Reverse transcription converts unstable RNA into complementary DNA (cDNA).

  • Using reverse transcriptase in lab processes allows for examination and analysis of RNA through its DNA representation.
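
As a rough illustration of what reverse transcriptase produces, the sketch below derives a first-strand cDNA from an mRNA string. It only models base pairing, not the enzymatic reaction or priming; the function name and example sequence are hypothetical.

```python
# DNA bases paired against an RNA template: U in the RNA pairs with A.
DNA_PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna: str) -> str:
    """Return the DNA strand complementary to the mRNA, written 5' to 3'."""
    return "".join(DNA_PAIR[base] for base in reversed(mrna.upper()))


mrna = "AUGGCCUGA"  # hypothetical mRNA, 5' to 3'
print(first_strand_cdna(mrna))  # TCAGGCCAT
```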

Page 13: Reverse Transcription in Viruses

  • Overview of reverse transcription process in retroviruses, showing the lifecycle from viral RNA to DNA integration in the host nucleus.

Page 14: Real-Time PCR Process

  • A description of real-time PCR as a method for amplifying DNA from clinical samples, including RNA extraction and reverse transcription leading to test results for viral presence.

Page 15: cDNA Creation

  • Details the use of reverse transcriptase to copy mRNA into cDNA, which is then sequenced and stored in databases; cDNA is significant because it lacks introns.

Page 16: Protein Translation

  • Description of the translation process from mRNA to protein, facilitated by ribosomes reading mRNA codons (from 5' to 3') and adding corresponding amino acids consecutively.

Page 17: Genetic Code Breakdown

  • An example of coding a DNA sequence into an amino acid chain, illustrating the triplet codon format and resultant amino acids, including a stop signal.
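
A comparable worked example in Python is sketched below. It assumes the standard genetic code and a coding (sense) DNA sequence that starts at the first codon; the input sequence and function name are hypothetical, not the ones on the slide.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G
# (first base varies slowest, third base fastest). '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA read 5' to 3', stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


coding_dna = "ATGGCCAAGTGA"          # hypothetical sense strand, 5' to 3'
mrna = coding_dna.replace("T", "U")  # AUG GCC AAG UGA
print(translate(mrna))               # MAK
```

The codons AUG, GCC and AAG give Met, Ala and Lys, and UGA ends translation.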

Page 18: RNA Splicing Mechanism

  • Explains RNA splicing: how non-coding regions (introns) are removed to yield a continuous mRNA sequence that encodes the required protein.
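
Splicing can be mimicked on a string by keeping only the exon segments. The sketch below assumes the exon coordinates are already known (0-based, end-exclusive); the sequence, coordinates and function name are hypothetical.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon segments in order, discarding everything between them."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical transcript: exon 1, one intron, exon 2.
exon_1 = "AUGGCC"
intron = "GUAAGUUUUCAG"
exon_2 = "AAGUGA"
pre_mrna = exon_1 + intron + exon_2

exons = [(0, 6), (18, 24)]  # exon positions within pre_mrna
mature_mrna = splice(pre_mrna, exons)
print(mature_mrna)                      # AUGGCCAAGUGA
print(len(pre_mrna), len(mature_mrna))  # 24 12
```

The length difference between the two strings is the intron that was removed.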

Page 19: The Genetic Code and Ribosomes

  • Discusses the ribosomal structure that facilitates translation and explains how codons are recognized as amino acids are assembled into proteins.

Page 20: Codon Table

  • Detailed codon table illustrating the relationships between nucleotide triplets and respective amino acids, including stop codons and their designations.

Page 21: Reading Frames Importance

  • Explains the significance of where the reading frame begins (start codon) and its implications for correct protein synthesis.
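
The effect of the starting position can be seen by splitting one sequence into codons in each of the three forward frames. This is a small sketch with a hypothetical sequence and helper name; it only groups bases and does not translate them.

```python
def codons_in_frame(seq: str, frame: int) -> list[str]:
    """Split seq into complete triplets, starting at offset frame (0, 1 or 2)."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


mrna = "GCAUGGCCAAGUGAC"  # hypothetical mRNA, 5' to 3'

for frame in range(3):
    print(frame, codons_in_frame(mrna, frame))
# 0 ['GCA', 'UGG', 'CCA', 'AGU', 'GAC']
# 1 ['CAU', 'GGC', 'CAA', 'GUG']
# 2 ['AUG', 'GCC', 'AAG', 'UGA']

# The start codon fixes the frame that should be read.
start = mrna.find("AUG")
print(start, start % 3)  # 2 2
```

Only frame 2 begins with the start codon AUG and ends at a stop codon; the other two frames group the same bases into unrelated triplets.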

Page 22: Protein Sequencing and Databases

  • Description of protein sequences using single-letter amino acid notation, focusing on the orientation (N to C terminus).
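
The single-letter notation can be illustrated by converting a peptide from three-letter codes. The table covers the 20 standard amino acids; the peptide and function name are hypothetical.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(residues: list[str]) -> str:
    """Convert residues listed from N terminus to C terminus."""
    return "".join(THREE_TO_ONE[residue] for residue in residues)


peptide = ["Met", "Ala", "Lys"]  # hypothetical peptide, N to C terminus
print(to_one_letter(peptide))    # MAK
```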

Page 23: Complex Gene Structure

  • Overview of the gene structure showing regulatory sequences, promoters, exons, introns, and untranslated regions.

Page 24: Applications of Bioinformatics

  • Lists the diverse applications of bioinformatics in:

    • Proteomics

    • Gene expression analysis

    • DNA sequencing

    • Immunoinformatics

    • Genome analysis

    • Evolutionary biology.

Page 25: Nucleotide Databases

  • Summary of major nucleotide databases which house extensive DNA and protein sequences, including:

    • GenBank (NCBI)

    • EMBL (EBI)

    • DDBJ (Japan)

  • Daily data exchange ensures consistency across platforms.

Page 26: Protein Databases

  • Lists protein databases:

    • Swiss-Prot

    • TrEMBL

    • PIR

    • UniProt

  • Describes analyses and categorizations available through the ExPASy system.

Page 27: Initial Steps After Sequence Generation

  • Introduction to processes involved post-sequencing for analysis and interpretation, utilizing molecular markers for genetic analysis.

Page 28: BLAST Overview

  • BLAST (Basic Local Alignment Search Tool) enables searches for significant sequence matches in nucleic acid and protein databases based on similarity with queried sequences.
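
To make "local alignment" concrete, the sketch below scores a query against two sequences with the Smith-Waterman algorithm. This is not how BLAST itself works: BLAST uses a faster heuristic that approximates this kind of exhaustive search. The scoring values, sequence names and sequences are assumptions chosen for illustration.

```python
def local_alignment_score(query: str, subject: str,
                          match: int = 2, mismatch: int = -1,
                          gap: int = -2) -> int:
    """Return the best Smith-Waterman local alignment score."""
    previous = [0] * (len(subject) + 1)
    best = 0
    for q in query:
        current = [0]
        for j, s in enumerate(subject, start=1):
            diagonal = previous[j - 1] + (match if q == s else mismatch)
            # A score never drops below 0, which lets an alignment
            # start anywhere in either sequence.
            cell = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            current.append(cell)
            best = max(best, cell)
        previous = current
    return best


# Hypothetical two-entry "database".
database = {"seq1": "ATGGCCAAGTGA", "seq2": "TTTTTTTTTT"}
query = "GCCAAG"

for name, sequence in database.items():
    print(name, local_alignment_score(query, sequence))
# seq1 12
# seq2 0
```

The query occurs exactly inside seq1, so it scores 6 matches at 2 points each, while seq2 shares no bases with the query and scores 0.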

Page 29: Challenges in Sequence Assembly

  • Illustrates challenges faced during DNA sequence assembly, underscoring the importance of computational analysis and sequencing techniques to piece together overlapping sequences effectively.

Page 30: Algorithmic Assembly

  • Details the complexity of the assembly algorithms that analyze genetic fragments to find relationships between them and reconstruct full sequences.
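
The idea of piecing overlapping fragments together can be sketched with a greedy merge: repeatedly join the two reads with the longest suffix-to-prefix overlap. This is a toy illustration under strong assumptions (error-free reads, one strand, no repeats); production assemblers typically rely on overlap graphs or de Bruijn graphs instead. The reads and function names are hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Merge the best-overlapping pair of reads until no overlaps remain."""
    reads = list(reads)
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            break  # nothing left to merge
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for idx, r in enumerate(reads)
                 if idx not in (best_i, best_j)] + [merged]
    return reads


reads = ["ATGGCCA", "GCCAAGT", "AAGTGA"]  # hypothetical overlapping reads
print(greedy_assemble(reads))  # ['ATGGCCAAGTGA']
```

The nested loop compares every pair of reads on every pass, which hints at why assembly becomes computationally demanding as the number of fragments grows.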

Page 31: NCBI BLAST

  • Description of NCBI BLAST and its capabilities to find homologies between biological sequences, offering access to multiple BLAST tools for diverse applications.

Page 32: G-Query Functionality

  • G-Query facilitates the retrieval of various information regarding sequences in nucleotide databases, showcasing its analytical capabilities.

Page 33: Search Capabilities in NCBI

  • Lists diverse databases and search categories available within the NCBI framework, connecting to various biological data resources related to literature, genes, and proteins.

Page 34: Bioinformatics Practical Tasks

  • An outline of practical tasks guiding students on:

    • Performing nucleotide BLAST

    • Interpreting GenBank entries

    • Mapping gene structures through genomic cDNA alignments.

Page 35: Annotated Gene Structure Example

  • A detailed breakdown of an annotated gene sequence illustrating UTRs, exons, introns, and other relevant features impacting gene structure understanding.

Page 36: Practical Session Preparation

  • Advises students on preparative steps before practical sessions emphasizing organization, documentation, and sequence analysis instructions.

Page 37: Employment Relevance of Bioinformatics

  • Discussion on how bioinformatics has numerous applications in biological and forensic fields, aligning educational levels with relevant job expectations.

Page 38: Research References

  • References to several Forensic Science International publications on the application of real-time PCR and transcript profiling in forensic contexts.
