Bioinformatics Overview

Definition: Bioinformatics covers the gathering, storage, handling, analysis, interpretation, and dissemination of large volumes of biological information, typically organized in databases.
- Includes data on gene sequences, biological activities, pharmacological activities, biological structures, molecular structures, protein interactions, and gene expression.
- Utilizes powerful computers and statistical methods for research purposes, e.g., discovering new pharmaceuticals or herbicides.

Mathematics and Statistics: Necessary for data analysis and interpretation.
Biology: Provides the foundational knowledge for understanding biological processes and data.
Computer Science: Essential for developing algorithms and managing complex biological data sets.

Growth Areas:
- Molecular Biology and Genetics: Integral for research and application.
- Phylogenetics and Evolutionary Biology: Understanding the evolutionary relationships among species.
- Biotechnology Applications: Particularly in pharmaceuticals and microbiology.
- Medicine: Personalized medicine and disease research.
- Agriculture: Enhancements in crop resilience and productivity.
- Eco-management: Environmental conservation efforts.
Current Trends:
- Rapidly growing investment in bioinformatics.
- Continuous demand for trained professionals.
- Diversified applications across multiple biological sectors.

Central Dogma of Molecular Biology: Describes the flow of genetic information from DNA to RNA to protein.
- Genotype: Genetic makeup (e.g., Aa).
- Phenotype: Observable traits (e.g., pink flower).
- Key processes: replication (DNA copied to DNA), transcription (DNA to RNA), and translation (RNA to protein).
Genetic Code:
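- Amino acids are encoded by codons (triplets of nucleotides).
- Of the 64 possible codons, 61 specify the 20 amino acids and 3 are stop signals; because most amino acids are encoded by more than one codon, the code is degenerate.
- Deletions or insertions whose length is not a multiple of three shift the reading frame (a frameshift) and usually disrupt protein production.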
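
The points above can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard genetic code and a coding-strand DNA sequence as input; the names `CODON_TABLE`, `transcribe`, and `translate` and the toy sequence are hypothetical and not part of the original notes.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code; amino acids are listed in T, C, A, G order
# for codon positions 1, 2 and 3. "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(coding_dna):
    """Translate codon by codon from position 0, stopping at the first stop codon."""
    seq = coding_dna.upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Degeneracy: count how many codons encode each amino acid (stops excluded).
codons_per_aa = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

# Frameshift: deleting a single base changes every downstream codon.
original = "ATGGCCAAGTAA"               # ATG GCC AAG TAA
shifted = original[:3] + original[4:]   # one base deleted after the start codon

print(transcribe(original))  # AUGGCCAAGUAA
print(translate(original))   # MAK
print(translate(shifted))    # MPS
```

In this toy case the single-base deletion changes every amino acid after the start codon and removes the in-frame stop codon, which is why frameshifts are usually more damaging than substitutions.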

20 Common Amino Acids in Living Organisms:
- Each amino acid has a 3-letter and 1-letter abbreviation (e.g., Alanine - Ala, A).
Protein Structure: Example of Green Fluorescent Protein (GFP) with specified amino acid sequence.
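
The two abbreviation systems can be converted into each other with a simple lookup table. The sketch below is illustrative only: the function names and the example peptide are hypothetical, and the hyphen-separated three-letter format is an assumption about how such a sequence might be written.

```python
# Three-letter and one-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def to_one_letter(three_letter_seq, sep="-"):
    """Convert e.g. 'Met-Ser-Lys' to 'MSK' (case-insensitive input)."""
    return "".join(
        THREE_TO_ONE[residue.capitalize()]
        for residue in three_letter_seq.split(sep)
    )


def to_three_letter(one_letter_seq, sep="-"):
    """Convert e.g. 'MSK' to 'Met-Ser-Lys'."""
    return sep.join(ONE_TO_THREE[residue] for residue in one_letter_seq.upper())


print(to_one_letter("Met-Ser-Lys-Gly-Glu"))  # MSKGE
print(to_three_letter("MSKGE"))              # Met-Ser-Lys-Gly-Glu
```

The one-letter form is the one used in most sequence files and database records, because it keeps long protein sequences compact.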

Organization:
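- Genome: In humans, nuclear DNA is packaged in 23 pairs of chromosomes.
- Genes: Early estimates put the human gene count at about 30,000; current estimates are closer to 20,000 protein-coding genes. Protein-coding sequence makes up only a small fraction of the genome.
- Nucleotides: Over 3 billion base pairs in the haploid human genome.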
Eukaryotic Gene Complexity:
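- Genes consist of promoters, exons, and introns: the promoter is the regulatory region where transcription is initiated, exons are retained in the mature mRNA (and carry the protein-coding sequence), and introns are spliced out after transcription.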
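
The exon/intron layout can be illustrated with a toy splicing routine. This is a simplified sketch, not a model of the real splicing machinery: the sequence and coordinates are invented, exons are assumed to be given as 0-based half-open `(start, end)` intervals, and for simplicity the code works on the DNA alphabet rather than on the pre-mRNA.

```python
def splice(transcribed_region, exons):
    """Join the exon segments of a transcribed region, in positional order."""
    return "".join(transcribed_region[start:end] for start, end in sorted(exons))


def introns_from_exons(exons):
    """Return the intervals lying between consecutive exons."""
    ordered = sorted(exons)
    return [
        (prev_end, next_start)
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]


# Toy gene: two exons separated by one intron (intron starts with GT, ends with AG).
exon1, intron1, exon2 = "ATGGCC", "GTAAGTTTTCAG", "AAGTAA"
gene = exon1 + intron1 + exon2
exons = [(0, 6), (18, 24)]

print(splice(gene, exons))        # ATGGCCAAGTAA
print(introns_from_exons(exons))  # [(6, 18)]
```

Gene prediction tools face the inverse problem: given only the genomic sequence, they have to infer where the exon boundaries lie.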

1972: Establishment of the first biological database (Protein Identification Resource) by Margaret Dayhoff.
- Organized proteins into families based on sequence similarity.
1979: The first DNA database was created, leading to prominent databases like GenBank.
Important Developments:
- Sequence retrieval methods and alignment principles in the 1980s.
- Prediction of RNA and protein structures.
- Introduction of the FASTA and BLAST heuristic methods for fast database searches.
- Efforts in genome analysis and gene prediction in subsequent years.

Data Management:
- Collection, retrieval, and storage of biological data.
- Alignment methods for comparing sequences.
Prediction and Classification Tasks:
- Secondary and 3D structure prediction of proteins/RNA, gene prediction, phylogeny reconstruction.
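
As an illustration of what an alignment method does, the sketch below implements global pairwise alignment by dynamic programming (the Needleman-Wunsch approach). The scoring values (match +1, mismatch -1, linear gap penalty -2) and the function name are assumptions chosen for the example; real tools use substitution matrices and more elaborate gap models.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = n, m
    top, bottom = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


best, aligned_a, aligned_b = needleman_wunsch("GATTACA", "GATACA")
print(best)
print(aligned_a)
print(aligned_b)
```

The table has (len(a) + 1) x (len(b) + 1) cells, so time and memory grow with the product of the sequence lengths. That cost is the reason database searches rely on faster heuristic methods rather than full dynamic programming against every entry.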

Understanding Requirements:
- A fundamental grasp of molecular biology principles and some mathematical/computer science background.
- Emphasis on computational methods and analysis rather than complex algorithms.
- Hands-on experience alongside theoretical knowledge is essential for practical applications in bioinformatics.