BIOINFORMATICS (Module1).ppt
Page 1
BIOINFORMATICS
Module 1 Introduction
Page 2
"Artificial Life" Breakthrough
Craig Venter announces potential creation of the first artificial life form
Aimed to combat illness and global warming
Synthetic chromosome built from laboratory-made chemicals
Formal announcement expected soon
Page 3
Nobel Prize in Physiology or Medicine 2005
Awarded for the discovery of Helicobacter pylori’s role in gastritis and peptic ulcer disease
Recipients: Barry J. Marshall and J. Robin Warren
Page 4
Nobel Prize in Physiology or Medicine 2006
Awarded for the discovery of RNA interference
Recipients: Andrew Z. Fire and Craig C. Mello
Page 5
Nobel Prize in Physiology or Medicine 2007
Awarded for discoveries related to gene modifications in mice via embryonic stem cells
Recipients: Mario R. Capecchi, Oliver Smithies, and Sir Martin J. Evans
Page 6
Molecular Biology Advances
Over 500 years' worth of research challenges in biology
DNA structure discovery in 1953 paved the way for molecular biology advancements
Increasing biological data requires interdisciplinary approaches involving math and computer science
Emergence of Computational Molecular Biology and Bioinformatics as fields
Page 7
Commercial Market Overview
Current bioinformatics market valued at $300 million/year
Predicted to grow to $2 billion/year in 5-6 years
Key bioinformatics companies include:
Genomatix Software, Genaissance Pharmaceuticals, deCODE genetics, etc.
Page 8
Computational Molecular Biology & Bioinformatics
Combination of computer science and mathematical techniques to solve problems in molecular biology
Page 9
Bioinformatics Units
Basic Concepts
Suffix Trees and Applications
Sequence Alignment: Pairwise Alignment, Multiple Alignments
Sequencing
Motif Prediction
Page 10
Unit 1: Basic Concepts of Molecular Biology
Focus on Cellular Architecture, Nucleic Acids (RNA & DNA), DNA replication, repair, and recombination
Understanding transcription, the genetic code, and protein structure
Statistical methods including estimation, hypothesis testing, and Markov models
Page 11
Unit 2: Suffix Trees
Definition, examples, and algorithms (Ukkonen’s linear-time)
Applications include exact string matching and finding the longest common substring
Understanding pairwise sequence alignment (edit distances, dynamic programming)
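As an illustration of the dynamic programming idea named on this slide, the following is a minimal sketch of edit distance with unit costs. The function name and the unit-cost scheme are assumptions made for the example, not something the slides specify.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance (insert, delete, substitute) between a and b."""
    # prev[j] is the distance between the prefix of a processed so far and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))
```

Only two rows of the table are kept, so memory use grows with the length of the second string rather than with the product of both lengths.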
Page 12
Unit 3: Sequence Alignment
Local pairwise sequence alignment
Need and methodology for multiple sequence alignments
Searches for similar sequences in databases (using FASTA, BLAST)
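Local pairwise alignment can be sketched with the Smith-Waterman recurrence, in which a cell score never drops below zero so that an alignment may start anywhere. The scoring values below (match 2, mismatch -1, linear gap -2) are illustrative assumptions, and the sketch returns only the best score, not the alignment itself.

```python
def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (Smith-Waterman, linear gaps)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # The 0 option lets a local alignment restart at any cell.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


print(local_alignment_score("TTTACGTTT", "GGACGTGG"))
```

Heuristic tools such as FASTA and BLAST approximate this kind of search to make database-scale comparison practical.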
Page 13
Unit 4: Sequencing
Techniques including fragment assembly and sequencing by hybridization
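Sequencing by hybridization works from the set of short k-mers found in a target, and fragment assembly relies on overlaps between pieces. A minimal sketch of both ideas, with hypothetical helper names `spectrum` and `overlap_edges`, might look like this:

```python
def spectrum(seq: str, k: int) -> list[str]:
    """All k-mers of seq, sorted, as a hybridization experiment might report them."""
    return sorted(seq[i:i + k] for i in range(len(seq) - k + 1))


def overlap_edges(kmers: list[str]) -> list[tuple[str, str]]:
    """Pairs (x, y) where the last k-1 letters of x equal the first k-1 of y."""
    return [(x, y) for x in kmers for y in kmers
            if x != y and x[1:] == y[:-1]]


kmers = spectrum("ATGCA", 3)
print(kmers)
print(overlap_edges(kmers))
```

Following the overlap edges in order (ATG, TGC, GCA) spells out the original sequence; real data needs a proper path-finding step and has to cope with repeats and errors.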
Page 14
Unit 5: Motif Prediction
Motif prediction processes and methods for protein structure prediction
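One simple way to summarize a motif, shown here as a sketch rather than the method used in the course, is to take the most frequent letter in each column of a set of aligned sites. The example sites are made up for illustration.

```python
from collections import Counter


def consensus(sites: list[str]) -> str:
    """Most common letter per column of equal-length, pre-aligned sites."""
    if len({len(s) for s in sites}) != 1:
        raise ValueError("sites must be non-empty and of equal length")
    # zip(*sites) walks the alignment column by column.
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*sites))


sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]  # hypothetical aligned sites
print(consensus(sites))
```

Keeping the per-column counts instead of only the winning letter gives a position frequency matrix, which is the usual starting point for scoring candidate sites.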
Page 15
Recommended Books
"Algorithms on Strings, Trees and Sequences" by Dan Gusfield
"Introduction to Computational Molecular Biology" by J. Setubal & J. Meidanis
"Statistical Methods in Bioinformatics" by W.J. Ewens & G.R. Grant
Page 16-18
Continuation of Recommended Literature
Works by R. Durbin et al., N.C. Jones & P.A. Pevzner, D.E. Krane et al., and more
Page 19
Class 2 Introduction
Page 20
Unit 1 Focus
Key elements: DNA, RNA, Protein, Genetic Code
Page 21
Craig Venter's Breakthrough
Overview of Venter’s creation of Mycoplasma laboratorium and its implications for global warming mitigation
Page 22
Basics of Genetics
Each cell contains a copy of the genome (the blueprint for an individual's traits)
Discussion of chromosomes as chapters in a genomic book containing genes
Page 23
Venter's Language Understanding
Insights into how Venter comprehended genetic coding language
Page 24
Venter's Genetic Advancements
Development of a synthetic chromosome with 381 genes to produce new life forms
Page 25
Advancements in Genome Creation
Historical context of genomic research culminating in Venter's achievements
Page 26
Bioinformatics Need Emergence
Clarifying the urgency in bioinformatics due to increased biological data complexity
Page 27
Computational Molecular Biology Explanation
Emphasis on computational molecular biology (CMB) as the combination of computer science and biology for problem solving
Page 28
Living vs. Nonliving
The distinctions based on movement, reproduction, and environmental interaction
Page 29
Characteristics of Living Organisms
The role of chemical reactions in sustaining life and the interaction with surroundings
Page 30
Origins of Life
Life began approximately 3.5 billion years ago, evolving into diverse forms compatible with Earth's molecular chemistry
Page 31
Key Biological Molecules
Proteins define physical traits while nucleic acids convey genetic information
Page 32
Functions of Proteins
Various roles proteins play, such as enzymes, transport molecules, and cellular structure builders
Page 33
Amino Acids Overview
Explanation of hydrophobic and hydrophilic amino acid properties in protein construction
Page 34
Polypeptide Chain Structure
Description of polypeptide orientation from N-terminal to C-terminal
Page 35
Protein Structure Types
Different levels of protein structure: primary, secondary, tertiary, and quaternary
Page 36
Basic Genetic Code
Exploration of mRNA codon mapping and relationship to amino acids
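The codon-to-amino-acid mapping can be sketched as a lookup table. The example below builds the standard genetic code from a compact 64-letter string (bases in U, C, A, G order) and translates an mRNA until the first stop codon; the function name and the choice to stop silently at a stop codon are assumptions for the example.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str) -> str:
    """Translate an mRNA read from position 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("AUGGCCUAA"))
```

Several codons map to the same amino acid (for example, all four GCN codons give alanine), which is why the table has 64 keys but only 20 amino acids plus the stop signal.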
Page 37
DNA Fundamentals
Definition of DNA structure focusing on nucleotides and base pairing
Page 38
DNA Molecular Structure
Description of DNA as a double-stranded helix with sugar-phosphate backbones
Page 39
Nucleotide Components
Components of nucleotides: sugars, phosphates, and nitrogenous bases
Page 40
Purines vs Pyrimidines
Explanation of base types in nucleotides (A, G as purines; C, T as pyrimidines in DNA, with U replacing T in RNA)
Page 41
Complementary Base Pairing
Overview of Watson-Crick base pairing rules in DNA structure
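The Watson-Crick rules (A pairs with T, C pairs with G) make it possible to derive one strand from the other. A minimal sketch, assuming an unambiguous uppercase or lowercase DNA string with no N or other ambiguity codes:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    # Complement each base, then reverse because the strands are antiparallel.
    return dna.upper().translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))
```

A sequence such as GAATTC is its own reverse complement, which is the property that many restriction enzyme recognition sites share.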
Page 42
RNA Overview
Discussion on the structure and function of RNA compared to DNA
Page 43
Key Differences: RNA and DNA
Comparative analysis of DNA and RNA based on structure and roles in protein synthesis
Page 44
Class 3 Introduction
Page 45
Central Dogma of Molecular Biology
The flow of genetic information: DNA -> RNA -> Protein
Page 46
Transcription and Translation Processes
Overview of gene transcription to mRNA and subsequent translation to protein
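At the sequence level, transcription can be sketched in two equivalent ways: the mRNA matches the coding strand with U in place of T, or it is the complement of the template strand. The function names below are hypothetical, and both strands are assumed to be written 5' to 3'.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def transcribe_coding(coding_strand: str) -> str:
    """mRNA from the coding (sense) strand: same letters, T replaced by U."""
    return coding_strand.upper().replace("T", "U")


def transcribe_template(template_strand: str) -> str:
    """mRNA from the template (antisense) strand, given 5' to 3'."""
    coding = template_strand.upper().translate(_COMPLEMENT)[::-1]
    return transcribe_coding(coding)


print(transcribe_coding("ATGGCC"))
print(transcribe_template("GGCCAT"))
```

Both calls print the same mRNA, since GGCCAT is the reverse complement of ATGGCC.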
Page 47
Intron-Exon Dynamics
Description of how introns are spliced out of the pre-mRNA before protein synthesis
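Splicing can be pictured as joining the exon intervals of a transcript and discarding everything between them. The sketch below assumes the exon coordinates are already known and given as 0-based, end-exclusive pairs; the sequence and coordinates are invented for illustration.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon slices (0-based start, exclusive end) into a mature mRNA."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical transcript: exon 1, an intron (GU...AG), then exon 2.
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "UAA"
print(splice(pre_mrna, [(0, 6), (18, 21)]))
```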
Page 48
Summary of Central Dogma
Visualization of transcription and translation processes from DNA to protein
Page 49
Concept of Junk DNA
Genomic regions with no known function, often termed "junk DNA"
Page 50
Open Reading Frame (ORF) Definition
Description of ORF in DNA sequence and its significance in translation
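A simple reading of the ORF idea is a stretch that begins with ATG and runs in the same frame to the first stop codon. The sketch below scans only the three forward frames of a DNA string; the function name, the `min_codons` parameter, and the decision to ignore the reverse strand are assumptions made to keep the example short.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 1) -> list[tuple[int, int]]:
    """Return (start, end) of ATG...stop stretches, end exclusive, forward frames only."""
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Count the codons before the stop, including the ATG.
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


dna = "CCATGGCCTAAGG"
for start, end in find_orfs(dna):
    print(start, end, dna[start:end])
```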
Page 51
Genome Definition
Complete set of chromosomes characterizing a species, with examples from humans and mice
Page 52
Genome as a Computer Program Analogy
Genome equated to a computer program governing organism functionality
Page 53
Class 4 Introduction
Page 54
Eye Development Gene Studies
Case study on the eyeless gene in fruit flies and its human counterpart
Page 55
Gene Function Comparison
Exploring functional similarities between eyeless and aniridia genes across species
Page 56
Historical Context of Sequence Analysis
Evolution of sequence analysis from manual methods to computer-assisted techniques
Page 57
Bioinformatics Tools Evolution
Advancements in software tools significantly impacting molecular biology practices
Page 58
Genome Study Techniques
Overview of sequencing and its challenges in studying human genetic materials
Page 59
Cutting and Manipulating DNA
Usage of restriction enzymes as tools for DNA manipulation
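A restriction digest can be simulated as cutting a sequence at every occurrence of a recognition site. The sketch below uses EcoRI, which recognizes GAATTC and cuts after the first G; it models one strand of a linear molecule only, and the function and parameter names are assumptions.

```python
def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut dna at each occurrence of site, cut_offset bases into the site."""
    fragments, start = [], 0
    pos = dna.find(site)
    while pos != -1:
        fragments.append(dna[start:pos + cut_offset])
        start = pos + cut_offset
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])  # remainder after the last cut
    return fragments


fragments = digest("AAGAATTCTTGAATTCGG")
print(fragments)
print(sorted(len(f) for f in fragments))
```

The sorted fragment lengths are the kind of information a gel separation would reveal for such a digest.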
Page 60
DNA Cloning Processes
Methods of copying DNA using host organisms for amplification
Page 61
DNA Analysis Techniques
Gel electrophoresis as a primary method for DNA fragment analysis
Page 62
Overview of the Human Genome Project
Page 63
HGP Components
Multi-disciplinary research involving chemistry, biology, engineering, physics, ethics, informatics
Page 64
Objectives of the Human Genome Project
Aims to identify human genes, sequence the human genome, and address ethical concerns
Page 65
DOE Involvement in HGP
Historical context linking radiation studies to genome research
Page 66
Reference Genome Composition
First reference genome made from multiple individual samples across ethnicities
Page 67
Benefits of HGP Research
Advancements in medicine, agriculture, forensic science, and evolutionary biology
Page 68
Ethical Implications in HGP
Addressing concerns involving genetic data privacy, testing, and social issues
Page 69
Further HGP Information
Page 70
Collaborative Nature of HGP
Importance of databases and computational analysis for genome research
Page 71
Genetic Disease Treatment Advances
Pioneering results emerging from HGP data application for disease treatment
Page 72
Class 5 Introduction
Page 73
Understanding Databases
Definition and significance of databases in biological research
Page 74
History of Biological Databases
Timeline of significant developments in biological database systems
Page 75
Functions of Biological Databases
Roles of databases in data accessibility and computational research needs
Page 76
Database Types Overview
Different classes of biological databases based on data types and entry methods
Page 77
Data Quality Control Mechanisms
Importance of data curation and validation in biological databases
Page 78
Database Technical Design
Various database architectures employed in managing biological data
Page 79
Accession Codes and Identifiers
Explanation of how database entries are uniquely defined and identified
Page 80
Identifier Characteristics
Discussion on the nature of identifiers in database entries
Page 81
Accession Code Stability
Importance of stable accession codes for consistent entry tracking
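In the nucleotide databases, a stable accession is commonly paired with a version suffix in the form accession.version, so the accession stays fixed while the version increases when the sequence changes. The sketch below splits such an identifier; whether the slides cover this exact convention is an assumption, and the identifier is used for illustration only.

```python
from typing import Optional


def split_accession(identifier: str) -> tuple[str, Optional[int]]:
    """Split 'ACCESSION.VERSION' into its parts; version is None if absent."""
    accession, _, version = identifier.partition(".")
    return accession, int(version) if version.isdigit() else None


print(split_accession("U49845.1"))
print(split_accession("U49845"))
```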
Page 82
Primary Nucleotide Sequence Databases
Key examples (EMBL, GenBank, DDBJ) and their characteristics
Page 83
Detailed Description of Databases
Overview of EMBL, GenBank, DDBJ operational roles in sequencing data management
Page 84
Secondary Nucleotide Sequence Databases
Explanation of databases that build upon primary data for enhanced features
Page 85
Protein Sequence Databases Overview
Distinction of curated databases focusing on protein sequences
Page 86
SWISS-PROT vs PIR
Comparison of two notable protein databases, with emphasis on annotation quality
Page 87
PIR Database Insights
Overview of the Protein Information Resource’s capabilities and history
Page 88
Other Relevant Databases
Examples of databases catering to specific biological or genetic information needs
Page 89
Popular Biological Databases
Overview of well-regarded databases for ease of access and information consolidation
Page 90
Bioinformatics Database Resources
List of popular bioinformatics database websites for research and analysis
Page 91
Growth of GenBank
Visualization of the expansion of the GenBank database over time
Page 92
NCBI Overview
History, mission, and role in public databases and computational biology
Page 93
NCBI Database Overview
List of various NCBI database offerings for nucleotides and proteins
Page 94
Nucleotide Database Components
Comprehensive overview of available sequence databases at NCBI
Page 95
NCBI Database Types
Differentiation between primary and derivative databases in the NCBI framework
Page 96
Entrez Database Search Engine
Summary of capabilities provided by NCBI's Entrez search engine
Page 97
Literature and Text Resource
Access to biomedical literature and related databases at NCBI
Page 98
Overview of Nucleotide Databases
Summary of primary nucleotide database statistics
Page 99
EMBL/GenBank/DDBJ Collaborative Nature
Description of how these databases synchronize sequences and data
Page 100
Protein Databases Overview
Insight into the features of major protein databases
Page 101
Secondary Protein Database Insights
Details on SWISS-PROT and PIR's notable features, advantages, and uses
Page 102
UniProt Description
Overview of UniProt as an extensive protein information repository
Page 103
NCBI Derivative Sequence Data
Example genetic sequences illustrating NCBI data curation methods
Page 104
High-throughput DNA Sequencing Visualization
Images depicting sequences and technological advancements in sequencing
Page 105
Data Growth in Bioinformatics
Trends in biotechnology and implications for computational bioinformatics
Page 106
Managing Information Overload
The role of bioinformatics in processing large amounts of biological data
Page 107
Bioinformatics Needs and Algorithms
Historical context of bioinformatics development and its algorithmic requirements
Page 108
Internet and Bioinformatics
Importance of internet access to databases for biological research
Page 109
Bioinformatics Workflow Visualization
Overview of bioinformatics data processing workflow and tools
Page 110
Market Overview for Bioinformatics
Current market valuation and projection for growth in bioinformatics
Page 111
Scope of Bioinformatics Resources
Understanding the resources created for biologists accessing data
Page 112
Critical Database Interactions
Discussion of the interaction between major databases in bioinformatics
Page 113
Specialized Bioinformatics Databases
Examples of specialized databases with links to various resources
Page 114
High-Level Protein Databases
Overview of specific databases focused on protein sequence information
Page 115
Database Homology Searching Techniques
Introduction to algorithms and scoring methodologies for sequence analysis
Page 116
Scoring Systems in Sequence Alignments
Overview of raw scores and scoring matrices in alignments
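A raw score is the sum of the per-column scores of an alignment. As a minimal sketch, assuming a simple match/mismatch scheme and a linear gap penalty in place of a full substitution matrix:

```python
def raw_score(aligned_a: str, aligned_b: str, match: int = 1,
              mismatch: int = -1, gap: int = -2) -> int:
    """Sum column scores of two gapped strings of equal length ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            score += gap
        elif x == y:
            score += match
        else:
            score += mismatch
    return score


print(raw_score("ACG-T", "ACGAT"))
```

With a real matrix, the match and mismatch branches would be replaced by a lookup of the score for the pair (x, y).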
Page 117
Creation of Scoring Matrices
Methodology for developing scoring matrices to assess sequence similarity
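Substitution matrices of the PAM and BLOSUM kind are generally built from log-odds ratios: how often two residues are seen aligned in trusted alignments, relative to how often they would pair by chance. The sketch below computes one such entry; the frequencies are made-up numbers, and the scale of 2 (half-bit units) is an assumption for the example.

```python
import math


def log_odds_score(q_ab: float, p_a: float, p_b: float, scale: float = 2.0) -> int:
    """Rounded, scaled log2 of observed pair frequency over chance expectation."""
    return round(scale * math.log2(q_ab / (p_a * p_b)))


# Pair seen twice as often as chance predicts: positive score.
print(log_odds_score(0.125, 0.25, 0.25))
# Pair seen half as often as chance predicts: negative score.
print(log_odds_score(0.03125, 0.25, 0.25))
```

Positive entries therefore mark substitutions that are tolerated more often than chance, and negative entries mark those that are selected against.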
Page 118
Influence of Scoring Matrices
Importance of scoring matrix choice on analysis outcomes
Page 119
Sequence Alignment Methodologies
Differentiation between global and local sequence alignment strategies
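Global alignment scores the two sequences end to end, so leading and trailing gaps are penalized and the score can be negative; local alignment differs by resetting negative cells to zero and taking the best cell anywhere in the table. A minimal sketch of the global (Needleman-Wunsch) score, with illustrative scoring values:

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Optimal end-to-end alignment score with a linear gap penalty."""
    # First row: aligning the empty prefix of a against b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * gap]  # first column: a[:i] against the empty string
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


print(global_alignment_score("ACGT", "AGT"))
```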
Page 120
Algorithm Use in Database Search
Comparative analysis of common algorithms used for similarity searches
Page 121
Overview of Genomic Sequencing by 2002
Summary of progress in genomic sequencing across numerous organisms
Page 122
Comparison Dilemma: DNA vs Protein
Discussion on accuracy in nucleotide vs protein sequence comparisons
Page 123
Implications of Sequence Comparison Approaches
Importance of using appropriate comparison methods based on sequence type
Page 124
BLAST and FASTA Variants
Summary of different variants of search tools for sequence comparison
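The five classic BLAST programs differ in the type of query, the type of database, and whether a six-frame translation is applied. The lookup below is a small sketch of that classification; the dictionary layout and the helper name `programs_for` are assumptions for the example.

```python
# program: (query type, database type, uses translation)
BLAST_PROGRAMS = {
    "blastn": ("nucleotide", "nucleotide", False),
    "blastp": ("protein", "protein", False),
    "blastx": ("nucleotide", "protein", True),      # query is translated
    "tblastn": ("protein", "nucleotide", True),     # database is translated
    "tblastx": ("nucleotide", "nucleotide", True),  # both are translated
}


def programs_for(query_type: str, database_type: str) -> list[str]:
    """Names of the programs that accept this query/database combination."""
    return [name for name, (q, d, _) in BLAST_PROGRAMS.items()
            if q == query_type and d == database_type]


print(programs_for("nucleotide", "protein"))
print(programs_for("nucleotide", "nucleotide"))
```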
Page 125
Practical Example of Sequence Analysis
Visualization of NCBI tools for protein analysis and alignment
Page 126
Explanation of E-Value in Sequence Searches
Discussion of E-value implications in assessing search significance
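The E-value estimates how many hits with at least a given score would be expected by chance in a search of that size. Using the standard relation between E-value and bit score, E = m * n * 2^(-S'), a rough sketch looks like this; real tools use corrected effective lengths, so the numbers here are only indicative, and the parameter names are assumptions.

```python
def e_value_from_bit_score(bit_score: float, query_len: int, db_len: int) -> float:
    """Expected number of chance hits scoring at least bit_score."""
    # The search space is the product of query length and total database length.
    return query_len * db_len * 2.0 ** (-bit_score)


print(e_value_from_bit_score(10, 32, 32))
print(e_value_from_bit_score(20, 32, 32))
```

Each additional bit of score halves the E-value, while doubling the database size doubles it, which is why the same hit looks less significant in a larger database.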
Page 127
Database Searching Recommendations
Guidelines for effective searches in biological databases
Page 128
Popular Bioinformatics Analysis Sites
List of widely used alignment and translation tools in bioinformatics