BIOINFORMATICS (Module1).ppt

Page 1

BIOINFORMATICS

Module 1 Introduction

Page 2

"Artificial Life" Breakthrough

Craig Venter announces potential creation of the first artificial life form
Aimed to combat illness and global warming
Synthetic chromosome built from laboratory-made chemicals
Upcoming announcement expected soon

Page 3

Nobel Prize in Physiology or Medicine 2005

Awarded for the discovery of Helicobacter pylori’s role in gastritis and peptic ulcer disease
Recipients: Barry J. Marshall and J. Robin Warren

Page 4

Nobel Prize in Physiology or Medicine 2006

Awarded for the discovery of RNA interference
Recipients: Andrew Z. Fire and Craig C. Mello

Page 5

Nobel Prize in Physiology or Medicine 2007

Awarded for discoveries of principles for introducing specific gene modifications in mice by the use of embryonic stem cells
Recipients: Mario R. Capecchi, Oliver Smithies, and Sir Martin J. Evans

Page 6

Molecular Biology Advances

More than 500 years' worth of research challenges in biology
DNA structure discovery in 1953 paved the way for molecular biology advancements
Increasing biological data requires interdisciplinary approaches involving math and computer science
Emergence of Computational Molecular Biology and Bioinformatics as fields

Page 7

Commercial Market Overview

Current bioinformatics market valued at $300 million/year
Predicted to grow to $2 billion/year in 5-6 years
Key bioinformatics companies include:
- Genomatix Software, Genaissance Pharmaceuticals, deCODE genetics, etc.

Page 8

Computational Molecular Biology & Bioinformatics

Combination of computer science and mathematical techniques to solve molecular biology issues

Page 9

Bioinformatics Units

Basic Concepts
Suffix Trees and Applications
Sequence Alignment: Pairwise Alignment, Multiple Alignments
Sequencing
Motif Prediction

Page 10

Unit 1: Basic Concepts of Molecular Biology

Focus on Cellular Architecture, Nucleic Acids (RNA & DNA), DNA replication, repair, and recombination
Understanding transcription, the genetic code, and protein structure
Statistical methods including estimation, hypothesis testing, and Markov models

Page 11

Unit 2: Suffix Trees

Definition, examples, and construction algorithms (Ukkonen’s linear-time algorithm)
Applications include exact string matching and the longest common substring problem
Understanding pairwise sequence alignment (edit distance, dynamic programming)
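
The edit-distance and dynamic-programming idea named above can be sketched as follows. This is a minimal illustration, assuming unit costs for insertion, deletion, and substitution (the Levenshtein variant); the function name edit_distance is hypothetical and not taken from the slides.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning a into b.
    Unit costs are an assumption of this sketch."""
    # prev[j] = distance between the prefix of a seen so far and b[:j]
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))
```

Only two rows of the table are kept at a time, so memory is proportional to the length of the second string while the running time stays proportional to the product of the two lengths.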

Page 12

Unit 3: Sequence Alignment

Local pairwise sequence alignment
Need and methodology for multiple sequence alignments
Searches for similar sequences in databases (using FASTA, BLAST)

Page 13

Unit 4: Sequencing

Techniques including fragment assembly and sequencing by hybridization
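
As a rough illustration of the input to sequencing by hybridization, the sketch below computes the l-mer spectrum of a known sequence, that is, the set of all substrings of length l. In practice the spectrum is what the experiment yields and the sequence must be reconstructed from it; the reconstruction step is not shown here. The function name spectrum is hypothetical.

```python
def spectrum(seq: str, l: int) -> list[str]:
    """Return the sorted set of all length-l substrings (l-mers) of seq."""
    return sorted({seq[i:i + l] for i in range(len(seq) - l + 1)})


print(spectrum("TATGGTGC", 3))
```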

Page 14

Unit 5: Motif Prediction

Motif prediction processes and methods for protein structure prediction

Page 15

Recommended Books

"Algorithms on Strings, Trees and Sequences" by Dan Gusfield
"Introduction to Computational Molecular Biology" by J. Setubal & Meidanis
"Statistical Methods in Bioinformatics" by W.J. Ewens & G.R. Grant

Page 16-18

Continuation of Recommended Literature

Works by R. Durbin et al., N.C. Jones & P.A. Pevzner, D.E. Krane et al., and more

Page 19

Class 2 Introduction

Page 20

Unit 1 Focus

Key elements: DNA, RNA, Protein, Genetic Code

Page 21

Craig Venter's Breakthrough

Overview of Venter’s creation of Mycoplasma laboratorium and its implications for global warming mitigation

Page 22

Basics of Genetics

One cell contains a copy of the genome (blueprint for individual traits)
Discussion of chromosomes as chapters in a genomic book containing genes

Page 23

Venter's Language Understanding

Insights into how Venter comprehended genetic coding language

Page 24

Venter's Genetic Advancements

Development of a synthetic chromosome with 381 genes to produce new life forms

Page 25

Advancements in Genome Creation

Historical context of genomic research culminating in Venter's achievements

Page 26

Bioinformatics Need Emergence

Clarifying the urgency in bioinformatics due to increased biological data complexity

Page 27

Computational Molecular Biology Explanation

Emphasis on CMB combining computer science and biology for problem solving

Page 28

Living vs. Nonliving

The distinctions based on movement, reproduction, and environmental interaction

Page 29

Characteristics of Living Organisms

The role of chemical reactions in sustaining life and the interaction with surroundings

Page 30

Origins of Life

Life began approximately 3.5 billion years ago and evolved into diverse forms compatible with Earth's molecular chemistry

Page 31

Key Biological Molecules

Proteins define physical traits while nucleic acids convey genetic information

Page 32

Functions of Proteins

Various roles proteins play, such as enzymes, transport molecules, and cellular structure builders

Page 33

Amino Acids Overview

Explanation of hydrophobic and hydrophilic amino acid properties in protein construction

Page 34

Polypeptide Chain Structure

Description of polypeptide orientation from N-terminal to C-terminal

Page 35

Protein Structure Types

Different levels of protein structure: primary, secondary, tertiary, and quaternary

Page 36

Basic Genetic Code

Exploration of mRNA codon mapping and relationship to amino acids
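
The codon-to-amino-acid mapping can be sketched in a few lines. This example assumes the standard genetic code, written in the conventional T, C, A, G ordering, and reads the mRNA from its first base in a single frame; the names CODON_TABLE and translate are illustrative, not from the slides.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna: str) -> str:
    """Translate an mRNA (or coding-strand DNA) sequence into one-letter
    amino acid codes, stopping at the first stop codon."""
    seq = mrna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("AUGUUUGGGUGA"))
```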

Page 37

DNA Fundamentals

Definition of DNA structure focusing on nucleotides and base pairing

Page 38

DNA Molecular Structure

Description of DNA as a double-stranded helix with sugar-phosphate backbones

Page 39

Nucleotide Components

Components of nucleotides: sugars, phosphates, and nitrogenous bases

Page 40

Purines vs Pyrimidines

Explanation of base types in nucleotides (A and G are purines; C and T are pyrimidines, as is U, which replaces T in RNA)

Page 41

Complementary Base Pairing

Overview of Watson-Crick base pairing rules in DNA structure
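
Watson-Crick pairing (A with T, G with C) means one strand determines the other. A minimal sketch, assuming an unambiguous DNA alphabet of A, C, G, and T; the function name reverse_complement is illustrative.

```python
# Translation table implementing the pairing rules A<->T and C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))
```

The reversal is needed because the two strands run antiparallel: the complement of a strand read 5' to 3' is itself read 3' to 5'.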

Page 42

RNA Overview

Discussion on the structure and function of RNA compared to DNA

Page 43

Key Differences: RNA and DNA

Comparative analysis of DNA and RNA based on structure and roles in protein synthesis

Page 44

Class 3 Introduction

Page 45

Central Dogma of Molecular Biology

The flow of genetic information: DNA -> RNA -> Protein

Page 46

Transcription and Translation Processes

Overview of gene transcription to mRNA and subsequent translation to protein

Page 47

Intron-Exon Dynamics

Description of how introns are spliced out of the pre-mRNA before protein synthesis

Page 48

Summary of Central Dogma

Visualization of transcription and translation processes from DNA to protein

Page 49

Concept of Junk DNA

Genomic regions with no known function, often termed "junk DNA"

Page 50

Open Reading Frame (ORF) Definition

Description of ORF in DNA sequence and its significance in translation
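
One simple way to locate ORFs is to scan each reading frame for an ATG followed by the first in-frame stop codon. The sketch below assumes that definition (definitions vary between tools), searches only the forward strand, and uses hypothetical names find_orfs and min_codons.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 1) -> list[tuple[int, int, str]]:
    """Return (start, end, sequence) for each forward-strand ORF.
    An ORF here runs from an ATG to the first in-frame stop codon,
    inclusive; start and end are 0-based, end exclusive."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                # count coding codons, excluding the stop codon
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None
    return sorted(orfs)


print(find_orfs("CCATGAAATAGGG"))
```

To cover both strands, the same function could be run on the reverse complement of the sequence.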

Page 51

Genome Definition

The complete set of chromosomes that characterizes a species, with examples from humans and mice

Page 52

Genome as a Computer Program Analogy

Genome equated to a computer program governing organism functionality

Page 53

Class 4 Introduction

Page 54

Eye Development Gene Studies

Case study on the eyeless gene in fruit flies and its human counterpart

Page 55

Gene Function Comparison

Exploring functional similarities between eyeless and aniridia genes across species

Page 56

Historical Context of Sequence Analysis

Evolution of sequence analysis from manual methods to computer-assisted techniques

Page 57

Bioinformatics Tools Evolution

Advancements in software tools significantly impacting molecular biology practices

Page 58

Genome Study Techniques

Overview of sequencing and its challenges in studying human genetic materials

Page 59

Cutting and Manipulating DNA

Usage of restriction enzymes as tools for DNA manipulation
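
A restriction digest can be simulated as a string search. The sketch below assumes a linear molecule and models only the top strand, using EcoRI as the default enzyme (recognition site GAATTC, cut after the first G); the function name digest and its parameters are illustrative.

```python
def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut a linear DNA sequence at every occurrence of a recognition site.
    cut_offset is the number of bases of the site left on the upstream
    fragment (1 for EcoRI, which cuts G^AATTC)."""
    dna = dna.upper()
    fragments = []
    last_cut = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(dna[last_cut:cut])
        last_cut = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[last_cut:])  # remainder after the final cut
    return fragments


print(digest("AAGAATTCGGGAATTCTT"))
```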

Page 60

DNA Cloning Processes

Methods of copying DNA using host organisms for amplification

Page 61

DNA Analysis Techniques

Gel electrophoresis as a primary method for DNA fragment analysis

Page 62

Overview of the Human Genome Project

Page 63

HGP Components

Multi-disciplinary research involving chemistry, biology, engineering, physics, ethics, informatics

Page 64

Objectives of the Human Genome Project

Aims to identify human genes, sequence the human genome, and address ethical concerns

Page 65

DOE Involvement in HGP

Historical context linking radiation studies to genome research

Page 66

Reference Genome Composition

First reference genome made from multiple individual samples across ethnicities

Page 67

Benefits of HGP Research

Advancements in medicine, agriculture, forensic science, and evolutionary biology

Page 68

Ethical Implications in HGP

Addressing concerns involving genetic data privacy, testing, and social issues

Page 69

Further HGP Information

Page 70

Collaborative Nature of HGP

Importance of databases and computational analysis for genome research

Page 71

Genetic Disease Treatment Advances

Pioneering results emerging from HGP data application for disease treatment

Page 72

Class 5 Introduction

Page 73

Understanding Databases

Definition and significance of databases in biological research

Page 74

History of Biological Databases

Timeline of significant developments in biological database systems

Page 75

Functions of Biological Databases

Roles of databases in data accessibility and computational research needs

Page 76

Database Types Overview

Different classes of biological databases based on data types and entry methods

Page 77

Data Quality Control Mechanisms

Importance of data curation and validation in biological databases

Page 78

Database Technical Design

Various database architectures employed in managing biological data

Page 79

Accession Codes and Identifiers

Explanation of how database entries are uniquely defined and identified

Page 80

Identifier Characteristics

Discussion on the nature of identifiers in database entries

Page 81

Accession Code Stability

Importance of stable accession codes for consistent entry tracking

Page 82

Primary Nucleotide Sequence Databases

Key examples (EMBL, GenBank, DDBJ) and their characteristics

Page 83

Detailed Description of Databases

Overview of EMBL, GenBank, DDBJ operational roles in sequencing data management

Page 84

Secondary Nucleotide Sequence Databases

Explanation of databases that build upon primary data for enhanced features

Page 85

Protein Sequence Databases Overview

Distinction of curated databases focusing on protein sequences

Page 86

SWISS-PROT vs PIR

Comparison of two notable protein databases, with emphasis on annotation quality

Page 87

PIR Database Insights

Overview of the Protein Information Resource’s capabilities and history

Page 88

Other Relevant Databases

Examples of databases catering to specific biological or genetic information needs

Page 89

Popular Biological Databases

Overview of well-regarded databases for ease of access and information consolidation

Page 90

Bioinformatics Database Resources

List of popular bioinformatics database websites for research and analysis

Page 91

Growth of GenBank

Visualization of the expansion of the GenBank database over time

Page 92

NCBI Overview

History, mission, and role in public databases and computational biology

Page 93

NCBI Database Overview

List of various NCBI database offerings for nucleotides and proteins

Page 94

Nucleotide Database Components

Comprehensive overview of available Sequence databases at NCBI

Page 95

NCBI Database Types

Differentiation between primary and derivative databases in the NCBI framework

Page 96

Entrez Database Search Engine

Summary of capabilities provided by NCBI's Entrez search engine

Page 97

Literature and Text Resource

Access to biomedical literature and related databases at NCBI

Page 98

Overview of Nucleotide Databases

Summary of primary nucleotide database statistics

Page 99

EMBL/GenBank/DDBJ Collaborative Nature

Description of how these databases synchronize sequences and data

Page 100

Protein Databases Overview

Insight into the features of major protein databases

Page 101

Secondary Protein Database Insights

Details on SWISS-PROT and PIR's notable features, advantages, and uses

Page 102

UniProt Description

Overview of UniProt as an extensive protein information repository

Page 103

NCBI Derivative Sequence Data

Example genetic sequences illustrating NCBI data curation methods

Page 104

High-throughput DNA Sequencing Visualization

Images depicting sequences and technological advancements in sequencing

Page 105

Data Growth in Bioinformatics

Trends in biotechnology and implications for computational bioinformatics

Page 106

Managing Information Overload

The role of bioinformatics in processing large amounts of biological data

Page 107

Bioinformatics Needs and Algorithms

Historical context of bioinformatics development and its algorithmic requirements

Page 108

Internet and Bioinformatics

Importance of internet access to databases for biological research

Page 109

Bioinformatics Workflow Visualization

Overview of bioinformatics data processing workflow and tools

Page 110

Market Overview for Bioinformatics

Current market valuation and projection for growth in bioinformatics

Page 111

Scope of Bioinformatics Resources

Understanding the resources created for biologists accessing data

Page 112

Critical Database Interactions

Discussion of the interaction between major databases in bioinformatics

Page 113

Specialized Bioinformatics Databases

Examples of specialized databases with links to various resources

Page 114

High-Level Protein Databases

Overview of specific databases focused on protein sequence information

Page 115

Database Homology Searching Techniques

Introduction to algorithms and scoring methodologies for sequence analysis

Page 116

Scoring Systems in Sequence Alignments

Overview of raw scores and scoring matrices in alignments

Page 117

Creation of Scoring Matrices

Methodology for developing scoring matrices to assess sequence similarity
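
Substitution matrices such as PAM and BLOSUM are built from log-odds ratios: how often two residues are seen aligned in trusted alignments, relative to how often they would pair by chance. The sketch below shows that single calculation, assuming base-2 logarithms and no rounding or scaling; the frequencies in the example are made up for illustration, and whether the slides use exactly this formulation is an assumption.

```python
import math


def log_odds_score(q_ab: float, p_a: float, p_b: float) -> float:
    """Log-odds score in bits for aligning residues a and b.
    q_ab: observed frequency of the pair in trusted alignments.
    p_a, p_b: background frequencies of each residue."""
    return math.log2(q_ab / (p_a * p_b))


# Hypothetical frequencies: the pair occurs twice as often as chance predicts.
print(log_odds_score(0.02, 0.1, 0.1))
```

A positive score means the pair is seen more often than expected by chance, a negative score less often, and zero exactly as often.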

Page 118

Influence of Scoring Matrices

Importance of scoring matrix choice on analysis outcomes

Page 119

Sequence Alignment Methodologies

Differentiation between global and local sequence alignment strategies
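
The difference between the two strategies shows up directly in the dynamic-programming recurrences. The sketch below computes scores only (no traceback), in the style of Needleman-Wunsch for global alignment and Smith-Waterman for local alignment. The match, mismatch, and linear gap values are arbitrary assumptions, and the function names are illustrative.

```python
def global_score(a: str, b: str, match=1, mismatch=-1, gap=-2) -> int:
    """Best score aligning all of a against all of b."""
    prev = [j * gap for j in range(len(b) + 1)]  # leading gaps are penalized
    for i, ca in enumerate(a, start=1):
        curr = [i * gap]
        for j, cb in enumerate(b, start=1):
            s = match if ca == cb else mismatch
            curr.append(max(prev[j - 1] + s, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def local_score(a: str, b: str, match=1, mismatch=-1, gap=-2) -> int:
    """Best score over all pairs of substrings of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)  # an alignment may start anywhere at no cost
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            s = match if ca == cb else mismatch
            # the extra 0 lets the alignment restart instead of going negative
            curr.append(max(0, prev[j - 1] + s, prev[j] + gap, curr[j - 1] + gap))
        best = max(best, max(curr))
        prev = curr
    return best


print(global_score("TTTACGTTTT", "GGACGGG"), local_score("TTTACGTTTT", "GGACGGG"))
```

In the example the two sequences share only the short region ACG, so the local score is positive while the global score is dragged down by the unrelated flanks.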

Page 120

Algorithm Use in Database Search

Comparative analysis of common algorithms used for similarity searches

Page 121

Overview of Genomic Sequencing by 2002

Summary of progress in genomic sequencing across numerous organisms

Page 122

Comparison Dilemma: DNA vs Protein

Discussion on accuracy in nucleotide vs protein sequence comparisons

Page 123

Implications of Sequence Comparison Approaches

Importance of using appropriate comparison methods based on sequence type

Page 124

BLAST and FASTA Variants

Summary of different variants of search tools for sequence comparison

Page 125

Practical Example of Sequence Analysis

Visualization of NCBI tools for protein analysis and alignment

Page 126

Explanation of E-Value in Sequence Searches

Discussion of E-value implications in assessing search significance
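
For intuition, the E-value can be approximated from a normalized bit score using the Karlin-Altschul relation E = m * n * 2^(-S'), where m is the query length and n the database length. The sketch below is a simplification: real search tools use effective lengths and other corrections that are ignored here, and the function name e_value is hypothetical.

```python
def e_value(bit_score: float, query_len: int, db_len: int) -> float:
    """Expected number of chance hits scoring at least bit_score
    in a search space of query_len * db_len."""
    return query_len * db_len * 2.0 ** (-bit_score)


# Each additional bit of score halves the expected number of chance hits.
print(e_value(30, 250, 1_000_000))
print(e_value(31, 250, 1_000_000))
```

Because E scales with the search space, the same alignment score is less significant against a larger database.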

Page 127

Database Searching Recommendations

Guidelines for effective searches in biological databases

Page 128

Popular Bioinformatics Analysis Sites

List of widely used alignment and translation tools in bioinformatics