Ch. 17 - Intro to Bioinformatics

Introduction to Bioinformatics

  • Presenter: Dr. Toby

  • Course Context: General Biology I

Getting Started

  • Video Resource

What is Bioinformatics?

  • General Definition:

    • Computational techniques for solving biological problems.

    • Data Problems:

    • Representation (graphics)

    • Storage and Retrieval (databases)

    • Analysis (statistics, artificial intelligence, optimization, etc.)

    • Biology Problems:

    • Sequence analysis

    • Structure or function prediction

    • Data mining

  • Also known as “Data Science” for Biology

Chapter Outline

  1. Biotechnology

  2. Mapping Genomes

  3. Whole-Genome Sequencing

  4. Applying Genomics

  5. Genomics and Proteomics

Introduction

  • The study of nucleic acids began with the discovery of DNA and has since evolved into genomics.

  • Genomics:

    • Study of entire genomes, including:

    • Complete set of genes

    • Nucleotide sequence and organization

    • Genetic interactions within and between species

  • Advances in genomics enabled by DNA sequencing technology.

  • Genomic maps, much like navigation tools such as Google Maps, support many fields, including anthropology and medicine (e.g., studying human migration and mapping genetic diseases).

Molecular Biology Basics

DNA

  • DNA, a molecule made of nucleotides, can be treated as the "recipe" for an organism.

    • Four different nucleotides distinguished by bases:

    1. Adenine (A)

    2. Cytosine (C)

    3. Guanine (G)

    4. Thymine (T)

  • Structure:

    • DNA is a polymer made of repeating units (nucleotides) and can be viewed as a string of letters: A, C, G, T.

    • Example sequence: ctgctggaccgggtgctaggaccctgactgcccggg…
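
Because DNA can be viewed as a string over A, C, G, and T, ordinary string tools apply to it. The following is a minimal Python sketch, assuming the example fragment above is handled as a plain string; the function names `base_counts` and `gc_content` are illustrative and not part of the lecture.

```python
from collections import Counter


def base_counts(seq: str) -> Counter:
    """Count how often each base occurs in a DNA string, ignoring case."""
    return Counter(seq.upper())


def gc_content(seq: str) -> float:
    """Return the fraction of A/C/G/T bases that are G or C."""
    counts = base_counts(seq)
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return (counts["G"] + counts["C"]) / total


# The fragment from the notes, without the trailing ellipsis.
fragment = "ctgctggaccgggtgctaggaccctgactgcccggg"
print(base_counts(fragment))
print(f"GC content: {gc_content(fragment):.2f}")
```

GC content is only one simple summary statistic; it is used here because it needs nothing beyond counting letters.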

The Double Helix

  • DNA typically consists of two strands twisted into a double helix structure.

  • Watson-Crick Base Pairing:

    • A pairs with T

    • C pairs with G

  • Structural Components:

    • Phosphate Molecule

    • Deoxyribose Sugar Molecule

    • Weak Hydrogen Bonds between Nitrogenous Bases

    • Sugar-Phosphate Backbone

Directionality of DNA Strands

  • Each strand has a direction, denoted as 5' and 3' ends:

    • By convention a strand is read from its 5' end (free phosphate on the sugar's 5' carbon) to its 3' end (free hydroxyl on the sugar's 3' carbon).

    • The DNA strands run antiparallel to each other.
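
Base pairing and antiparallel orientation together determine one strand from the other. The sketch below is a small Python illustration under the assumption that a strand is written as a string read 5' to 3'; the names `complement` and `reverse_complement` are chosen here for illustration.

```python
# Watson-Crick pairs: A<->T and C<->G (both upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Pair each base with its partner, keeping the same left-to-right order."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Return the partner strand written in its own 5' -> 3' direction."""
    return complement(seq)[::-1]


top = "ATCG"                    # read 5' -> 3'
print(complement(top))          # TAGC, the partner strand read 3' -> 5'
print(reverse_complement(top))  # CGAT, the same partner strand read 5' -> 3'
```

The reversal step is what reflects the antiparallel arrangement: the partner strand's 5' end sits opposite the first strand's 3' end.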

DNA Replication Prior to Cell Division

  • Illustrated process of complementary strand synthesis from parent strands.

  • Example:

    • Parental Strand: ATCG

    • New Strand: TAGC

  • RNA involvement in transcription indicated.

Chromosomes

  • DNA is organized into chromosomes:

    • Prokaryotes:

    • Typically have a single circular chromosome (e.g., bacteria, archaea)

    • Eukaryotes:

    • Have multiple linear chromosomes unique to each species (e.g., plants, animals, fungi).

Human Chromosomes

  • Human genome consists of 23 pairs of chromosomes, including sex chromosomes (X and Y).

Genomes

  • Definition: Complete set of DNA for a given species.

    • Human genome: 23 pairs of chromosomes.

    • Example chromosome counts in other organisms:

    • Mosquitoes: 3 pairs

    • Camels: 37 pairs

  • Nearly every cell contains the organism's complete genome; the exceptions are sex cells, which carry half of it, and (in mammals) mature red blood cells, which have no nucleus.

Genes

  • Genes as the basic units of heredity:

    • Definition: A sequence of bases that carries information for constructing a specific protein (polypeptide).

    • Encoding Proteins:

    • The human genome has approximately 20,000 protein-coding genes (older estimates ran to about 25,000).

Gene Density

  • Not all DNA encodes proteins:

    • Bacteria: ~90% of the genome is coding, about one gene per kilobase

    • Humans: ~1.5% of the genome is coding, about one gene per 35 kilobases

RNA

  • RNA: Similar to DNA with distinctions:

    • Different backbone sugar (ribose instead of deoxyribose)

    • Often single-stranded

    • Base Uracil (U) replaces Thymine (T)

    • Structure represented as a string of characters: A, C, G, U.

Transcription

  • The enzyme RNA polymerase builds an RNA strand from a gene's DNA template to create messenger RNA (mRNA).

  • Example transcription process:

    • Coding strand of DNA example (the mRNA matches the coding strand, with U in place of T):

    • DNA: ATGCCGTTAGACCGTTAGCGGACCTGAC

    • mRNA: AUGCCGUUAGACCGUUAGCGGACCUGAC
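
When the coding strand is given, as in the example above, the mRNA can be written down by swapping T for U. The following Python sketch assumes the input is the coding strand read 5' to 3'; `transcribe` is an illustrative name, and the function models only the letter substitution, not the work RNA polymerase does on the template strand.

```python
def transcribe(coding_strand: str) -> str:
    """Return the mRNA matching a DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


dna = "ATGCCGTTAGACCGTTAGCGGACCTGAC"
print(transcribe(dna))  # AUGCCGUUAGACCGUUAGCGGACCUGAC
```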

Proteins

  • Definition: Molecules composed of one or more polypeptides.

  • Polypeptides: Chains of amino acids, made from 20 different amino acids.

  • Functions of Proteins:

    • Structural support

    • Storage of amino acids

    • Transport of substances

    • Coordination of activities

    • Response to stimuli

    • Movement

    • Disease protection

    • Acceleration of chemical reactions

Amino Acids

  • List of standard amino acids with three-letter abbreviations:

    • Alanine (Ala)

    • Arginine (Arg)

    • Aspartic Acid (Asp)

    • and others (complete list through Tyrosine (Tyr) and Valine (Val)).

Example Amino Acid Sequence: Hexokinase

  • Illustrative amino acid sequence:

    • 1: ASX2

    • 2: D

    • … (sequence continues).

  • Hexokinase catalyzes the first step of glycolysis (phosphorylation of glucose) and is found across organisms.

Hemoglobin

  • Structure: Made from 4 polypeptides.

  • Function: Responsible for oxygen transport in red blood cells.

Translation

  • Ribosomes synthesize proteins from mRNA.

  • Structure: The mRNA is read as codons (groups of three nucleotides) within a reading frame.

  • Begins at start codon and ends at stop codon.
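
A reading frame is just a choice of where to start grouping nucleotides into threes, so one mRNA string has three possible frames. The sketch below illustrates this in Python; the function name `codons` and the short example sequence are assumptions made for illustration.

```python
def codons(mrna: str, frame: int = 0) -> list[str]:
    """Split an mRNA string into complete codons, starting at offset 0, 1, or 2."""
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1, or 2")
    # Stop early enough that every slice has exactly three letters.
    return [mrna[i:i + 3] for i in range(frame, len(mrna) - 2, 3)]


mrna = "AUGCCGUUAG"  # short illustrative sequence
print(codons(mrna, 0))  # ['AUG', 'CCG', 'UUA']
print(codons(mrna, 1))  # ['UGC', 'CGU', 'UAG']
print(codons(mrna, 2))  # ['GCC', 'GUU']
```

Only the frame that begins at the start codon (AUG) gives the codons the ribosome actually reads.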

Codons and Reading Frames

  • Example Codon Sequence:

    • Codon 1: UUU (encoding phenylalanine, an amino acid)

    • Codon 2: UUC (also encoding phenylalanine; several codons can specify the same amino acid)

    • … (additional codons for a full genetic code).
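
The full genetic code can be written as a lookup table from codons to amino acids. The Python sketch below builds the standard code with one-letter amino acid symbols (`*` marks a stop codon) and translates from the first AUG to the first stop codon. The names `CODON_TABLE` and `translate` are illustrative, and the sketch ignores everything a real ribosome needs beyond the codon lookup.

```python
from itertools import product

# Standard genetic code, listed in U, C, A, G order for each codon position.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # codons starting with U
    "LLLLPPPPHHQQRRRR"   # codons starting with C
    "IIIMTTTTNNKKSSRR"   # codons starting with A
    "VVVVAAAADDEEGGGG"   # codons starting with G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(CODON_TABLE["UUU"], CODON_TABLE["UUC"])     # F F (both phenylalanine)
print(translate("AUGCCGUUAGACCGUUAGCGGACCUGAC"))  # MPLDR
```

The second call uses the mRNA from the transcription example: AUG, CCG, UUA, GAC, and CGU are read, and UAG ends translation.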

RNA Processing in Eukaryotes

  • Eukaryotes: Organisms with enclosed nuclei (animals, plants, fungi, etc.).

  • Characteristic: Genes and their initial RNA transcripts (pre-mRNAs) consist of alternating segments of exons and introns:

    • Exons: Coding parts retained for translation.

    • Introns: Non-coding parts spliced out before translation.

RNA Splicing

  • Process Diagram:

    • Gene: Exon1, Intron1, Exon2, Intron2, Exon3

    • Result after splicing: Exon1, Exon2, Exon3 (processed mRNA).
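
Splicing can be pictured as keeping the exon segments in order and discarding the introns. The following Python sketch represents a transcript as a list of labeled segments; the segment sequences are made up for illustration, and the function name `splice` is an assumption, not a reference to any library.

```python
def splice(segments: list[tuple[str, str]]) -> str:
    """Join exon sequences in their original order, dropping introns."""
    return "".join(seq for kind, seq in segments if kind == "exon")


# Hypothetical pre-mRNA laid out as Exon1, Intron1, Exon2, Intron2, Exon3.
pre_mrna = [
    ("exon", "AUGGCC"),
    ("intron", "GUAAGUUUUCAG"),
    ("exon", "UUUGAC"),
    ("intron", "GUGAGUCCCAG"),
    ("exon", "CGUUAA"),
]
print(splice(pre_mrna))  # AUGGCCUUUGACCGUUAA
```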

Protein Synthesis: Eukaryotes vs Prokaryotes

  • In eukaryotes: Introns spliced out before mRNA is exported to cytoplasm for translation.

  • Comparison chart of processes between eukaryotic and prokaryotic systems.

Impact of DNA Sequence Variation

  • Example Gene A sequences demonstrating how changes can affect the amino acid produced.

  • Discussion on genetic variation impacts on phenotypes and traits.
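
One way to see how a sequence change can alter a protein is to compare the amino acids encoded before and after a single-codon change. The Python sketch below uses the standard genetic code and hypothetical codons; the function name `classify_substitution` and the labels "silent", "missense", and "nonsense" as return values are choices made for this illustration.

```python
from itertools import product

# Standard genetic code with one-letter amino acids; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Describe the protein-level effect of replacing one codon with another."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "silent"    # same amino acid, protein unchanged
    if alt_aa == "*":
        return "nonsense"  # new stop codon, protein cut short
    return "missense"      # a different amino acid is inserted


# Hypothetical single-base changes, written as mRNA codons.
print(classify_substitution("UUU", "UUC"))  # silent (Phe -> Phe)
print(classify_substitution("GAG", "GUG"))  # missense (Glu -> Val)
print(classify_substitution("UAC", "UAG"))  # nonsense (Tyr -> stop)
```

The sketch covers substitutions only; insertions and deletions, which can shift the reading frame, are outside its scope.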

RNA Genes

  • Not all genes produce proteins; some encode RNA products such as:

    • Ribosomal RNA (rRNA)

    • Transfer RNA (tRNA)

    • MicroRNAs (miRNAs)

The Dynamics of Cells

  • All cells of an organism share the same genome; however, gene expression varies by cell type, time, and environment.

  • Networks exist for biochemical interactions including:

    • Metabolism

    • Signaling

    • Gene regulation

Overview of Metabolic Pathways

  • Illustration: E. coli pathways outlining various metabolism routes (carbohydrates, lipids, amino acids, etc.).

Databases in Bioinformatics

  • Reference to Kyoto Encyclopedia of Genes and Genomes (KEGG) for mining molecular datasets and metabolic pathways.

  • Various genomic and protein databases listed with specific entries and significant statistics.

Significance of Genomics Revolution

  • Data-Driven Biology:

    • Functional genomics

    • Comparative genomics

    • Systems biology

    • Molecular medicine (gene therapy, pharmacogenomics)

    • Toxicogenomics

Summary of Bioinformatics Focus

  • Focus on representation, storage, retrieval, and analysis of biological data:

    • Including sequences, structures, functions, activity levels, and interactions among biomolecules.

    • Encompassing textual data from literature.