Introduction to Bioinformatics

Bioinformatics and Computational Biology

Bioinformatics analyzes raw biological data, particularly along the path from DNA through transcription to translation. Computational biology simulates biological processes, including molecular processes at the atomic and electronic levels. Because sequencing produces large amounts of sequence data (reads), bioinformatics involves data processing, storage in databases, and statistical analysis. Genomics focuses on variants, annotations, methylation profiles, and the impact of mutations (pathogenic, neutral, or beneficial), drawing on databases like ClinVar.

Data and Databases

ClinVar: A database of human genomic variants, detailing whether each mutation is pathogenic or neutral.
NCBI: The National Center for Biotechnology Information, which hosts comprehensive databases of protein sequences, CDS (coding sequences), and genome sequences.
RCSB PDB, UniProt: Protein-focused databases. UniProt covers protein sequences and their annotations, while RCSB PDB holds 3D structures of proteins and also of RNA and other nucleic acids.
FHIR: Fast Healthcare Interoperability Resources, a standard for interoperability between clinical data systems. It is used by platforms such as SATUSEHAT, which aims to integrate clinical data in Indonesia and is still in development.

Computational Biology

Computational biology includes molecular docking, which simulates molecule interactions, requiring significant computational power. In vitro analysis remains essential for validation, but computational methods can streamline and improve efficiency. Key considerations for drug candidates include:

ADMET: Absorption, Distribution, Metabolism, Excretion, and Toxicity.
Lipinski's Rule of Five: A rule of thumb for the drug-likeness of a potential compound, based on molecular weight, lipophilicity (the octanol-water partition coefficient, log P), and the numbers of hydrogen bond donors and acceptors.
Bioavailability: The fraction of an administered dose that reaches systemic circulation (1 = the entire dose, 0 = none). Intravenous injection gives a bioavailability of 1 by definition.
Synthetic Accessibility: Assesses the ease of synthesizing a compound (scale of 1 to 10, with 1 being easiest). Considers how rare a compound is within natural sources like plants.
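
As a rough illustration of the Rule of Five, the sketch below counts violations from precomputed descriptors. The thresholds are the commonly cited ones (molecular weight at most 500 Da, log P at most 5, at most 5 hydrogen bond donors, at most 10 acceptors). The Descriptors container, the function names, and the example values are assumptions made for this sketch, not data from the talk, and tolerating one violation is a common convention rather than a fixed rule.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container for this sketch)."""
    mol_weight: float   # molecular weight in daltons
    logp: float         # octanol-water partition coefficient (log P)
    h_donors: int       # number of hydrogen bond donors
    h_acceptors: int    # number of hydrogen bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four Rule of Five criteria are violated."""
    checks = [
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ]
    return sum(checks)


def is_drug_like(d: Descriptors) -> bool:
    """Assumed convention: at most one violation is tolerated."""
    return lipinski_violations(d) <= 1


# Illustrative values only; these are not measured data.
small = Descriptors(mol_weight=336.4, logp=3.1, h_donors=0, h_acceptors=4)
large = Descriptors(mol_weight=1034.2, logp=0.5, h_donors=13, h_acceptors=22)

print(lipinski_violations(small), is_drug_like(small))  # 0 True
print(lipinski_violations(large), is_drug_like(large))  # 3 False
```

In practice the descriptors would come from a cheminformatics toolkit rather than being typed in by hand.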

Molecular Docking and Dynamics

Molecular docking is a semi-statistical process used for drug screening, allowing flexibility in ligand positioning. Programs like AutoDock Vina treat the ligand as flexible but often keep the protein receptor static. The goal is to identify binding sites and estimate the binding affinity between ligands and receptors (proteins, RNA, etc.).

Types of Docking:

Forward Docking: Testing multiple compounds against a single protein target.
Reverse Docking: Testing numerous proteins against a single drug.
Hybrid Docking: Interchangeably testing various compounds and proteins.
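
The difference between forward and reverse docking can be sketched as two ways of slicing the same table of docking scores. The scores, compound names, and protein names below are hypothetical, and the sketch assumes the usual convention that a more negative score (in kcal/mol) means stronger predicted binding.

```python
# Hypothetical docking scores in kcal/mol, keyed by (compound, protein).
# More negative means stronger predicted binding.
scores = {
    ("compound_A", "protein_1"): -7.2,
    ("compound_B", "protein_1"): -9.1,
    ("compound_C", "protein_1"): -6.4,
    ("compound_A", "protein_2"): -8.0,
    ("compound_A", "protein_3"): -5.5,
}


def forward_docking(scores, protein):
    """Rank all compounds docked against a single protein target."""
    hits = [(c, s) for (c, p), s in scores.items() if p == protein]
    return sorted(hits, key=lambda item: item[1])


def reverse_docking(scores, compound):
    """Rank all proteins docked against a single compound."""
    hits = [(p, s) for (c, p), s in scores.items() if c == compound]
    return sorted(hits, key=lambda item: item[1])


print(forward_docking(scores, "protein_1"))
print(reverse_docking(scores, "compound_A"))
```

Hybrid docking would fill in the whole table and rank across both axes.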

Molecular Dynamics and Mechanics

Molecular dynamics simulates the time-dependent interaction between molecules using force fields, solvent effects, and neutralizing ions (e.g., Na+ and Cl-). Simulations typically run for around 20 nanoseconds. Free energy methods based on molecular mechanics, such as MM/GBSA (Generalized Born Surface Area) or MM/PBSA (Poisson-Boltzmann Surface Area), can follow up molecular dynamics to estimate binding free energy; PBSA is generally considered more accurate.
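
The idea behind these follow-up calculations can be sketched as an average over trajectory frames of the complex energy minus the receptor and ligand energies. This is a minimal sketch that assumes per-frame energies have already been computed by a simulation package and that ignores the entropy term; the function name and the numbers are hypothetical.

```python
from statistics import mean


def binding_free_energy(g_complex, g_receptor, g_ligand):
    """Average over frames of G_complex - G_receptor - G_ligand (kcal/mol)."""
    per_frame = [c - r - l for c, r, l in zip(g_complex, g_receptor, g_ligand)]
    return mean(per_frame)


# Hypothetical per-frame energies in kcal/mol; not real simulation output.
g_complex = [-1250.0, -1254.0, -1252.0]
g_receptor = [-1000.0, -1002.0, -1001.0]
g_ligand = [-220.0, -221.0, -222.0]

dg_bind = binding_free_energy(g_complex, g_receptor, g_ligand)
print(dg_bind)  # -30.0
```

A more negative value suggests more favorable binding, which is why such estimates are used to re-rank docking hits.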

Density Functional Theory (DFT)

DFT maps and approximates the position and movement of electrons within a molecule. The electron density indicates the probability of finding electrons in a given region. The method is used to assess the frontier molecular orbital (FMO) energy gap, which indicates compound stability, and to analyze the excitation profile of compounds. Simulations can be performed in the gas phase or with solvents (water, DMSO, etc.).

DFT is used to reduce the number of compounds needed for in vitro screening, potentially cutting down hundreds of compounds to a more manageable number of around ten.

Rafflesia Example

Rafflesiaceae is a family of parasitic plants comprising three genera (Rafflesia, Sapria, and Rhizanthes). The plants lack leaves, roots, and stems and parasitize Tetrastigma vines of the grape family. Local populations boil dried Rafflesia blooms, believing the preparation helps with postpartum bleeding, lowers cholesterol, treats hemorrhages, and acts as an aphrodisiac. Analysis of Rafflesia reveals:

It contains 21 metabolites, including alkaloids such as caffeine and nicotine.
Its extracts show antioxidant, wound healing, and antibacterial properties.
It has compounds with anti-cholesterol, anti-influenza, antifungal, wound healing, and anticancer activities.
The highest binding affinities are for HMGCR (HMG-CoA reductase, an enzyme in cholesterol synthesis), acetylcholinesterase (the target of dementia drugs), and VEGFR2 (a growth factor receptor implicated in breast cancer).

Compounds with similar benefits are found in coffee, matcha, berries, pecans, and apples. These can serve as alternative sources, which allows Rafflesia to be conserved.

Mimba (Neem) Example

Mimba (Azadirachta indica), or Neem, is studied for anti-malarial properties due to increasing quinine resistance. The goal is to find accessible, low-cost natural sources for quinine-like derivatives. The process involves:

Metabolomics analysis to identify quinine-like compounds.
Screening and DFT analysis.
Identification of Compound C as a promising candidate by DFT.
Evaluation of HOMO (highest occupied molecular orbital) and LUMO (lowest unoccupied molecular orbital) energy gaps to assess stability. Larger gaps indicate more stable compounds.
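
The gap evaluation in the last step amounts to subtracting the HOMO energy from the LUMO energy and ranking compounds by the result. The sketch below assumes the orbital energies have already been obtained from a DFT calculation; the molecule names and energies (in eV) are hypothetical placeholders, not results from the Neem study.

```python
# Hypothetical (HOMO, LUMO) energies in eV; not results from the study.
orbitals = {
    "mol_1": (-5.8, -2.1),
    "mol_2": (-6.1, -1.2),
    "mol_3": (-6.5, -0.9),
}


def homo_lumo_gap(homo: float, lumo: float) -> float:
    """Energy gap between the frontier orbitals (LUMO minus HOMO)."""
    return lumo - homo


gaps = {name: homo_lumo_gap(h, l) for name, (h, l) in orbitals.items()}

# Larger gap first, following the rule that larger gaps indicate more stable compounds.
ranked = sorted(gaps, key=gaps.get, reverse=True)
for name in ranked:
    print(f"{name}: gap = {gaps[name]:.2f} eV")
```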

Molecular docking can also involve RNA targets, specifically the programmed ribosomal frameshifting (PRF) element found in viruses. The PRF element allows viruses to produce more proteins by shifting the reading frame during translation. Compounds that disrupt or interfere with the PRF mechanism are of interest. Examples of interfering compounds include:

Alkaloids: Berberine (from Berberis vulgaris), colchicine (Colchicum autumnale), nicotine (Nicotiana tabacum), and tomatine (tomato plants).
Control Drug: Merafloxacin, also targeting PRF.

Berberine shows promising ADMET scores and binding affinity, while tomatine, though effective, has poor synthetic accessibility (score of 10) and high molecular weight.
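
To make the frameshifting mechanism behind the PRF target concrete, the sketch below translates a toy sequence in its original frame and again with a -1 shift (the ribosome slipping back one nucleotide) at a chosen position. The sequence is made up, the slip position is supplied by hand rather than detected from a slippery site, and the function names are assumptions for this sketch. It shows how a shift can bypass a stop codon and yield a longer product.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Standard genetic code, built from the conventional TCAG codon ordering.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate codon by codon from position 0 until a stop codon or the end."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def translate_with_minus1_shift(seq: str, slip_at: int) -> str:
    """Read in frame up to slip_at, then continue one nucleotide back (-1 frame)."""
    return translate(seq[:slip_at]) + translate(seq[slip_at - 1:])


# Toy coding sequence (DNA alphabet); the in-frame stop codon TAA starts at index 12.
toy = "ATGGGATTTAAATAACGGCTGA"

print(translate(toy))                        # MGFK
print(translate_with_minus1_shift(toy, 12))  # MGFKITA
```

A compound that blocks the shift would, in this picture, leave only the shorter in-frame product.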

Artificial Intelligence (AI) in Drug Discovery

AI tools such as GPT can be used to generate candidate compounds, but they require specific problem definitions to avoid "hallucinations." Tools like AlphaFold (Google DeepMind) can predict protein structures, even with mutations or missing peptides, which is useful for assessing functional alterations. However, AI output requires careful validation with experimental data. AI improves drug screening by making inhibitory drug testing more efficient and cost-effective. A screening workflow can use DFT for recomposability based on electron density maps, ADMET for assessing drug-likeness, and PASS for activity screening.

Questions and Answers

Question 1: How to Minimize Drug Interaction with Other Proteins?

Drugs inevitably interact with multiple proteins. The goal is to minimize unintended interactions by:

Ensuring that desired traits (e.g., DNA interference) are maintained.
Targeting RNA locally to specific areas.

Question 2: Application of Bioinformatics in Drug Discovery

Bioinformatics supports drug discovery by:

Using databases to access existing compounds and testing them in silico before in vitro.
Annotating metabolites from metabolomic analysis using databases like ChEMBL and HMDB to identify compounds similar to target drugs. This may involve converting the structure to PDB format and then testing it.
Finding alternative drug solutions using natural compounds or repurposing existing drugs.
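
One common way to quantify "compounds similar to target drugs" is the Tanimoto coefficient between structural fingerprints. The sketch below represents each fingerprint as a set of "on" bit positions. The fingerprints and candidate names are hypothetical, and a real workflow would derive fingerprints from the structures with a cheminformatics toolkit.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient: shared bits divided by the total number of distinct bits."""
    union = fp_a | fp_b
    if not union:
        return 0.0  # two empty fingerprints: treated as dissimilar in this sketch
    return len(fp_a & fp_b) / len(union)


# Hypothetical fingerprints, given as sets of "on" bit positions.
reference_drug = {1, 4, 7, 9, 15}
library = {
    "candidate_1": {1, 4, 7, 9, 20},
    "candidate_2": {2, 3, 15},
    "candidate_3": {1, 4, 7, 9, 15},
}

similarity = {name: tanimoto(reference_drug, fp) for name, fp in library.items()}
for name, value in sorted(similarity.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.2f}")
```

Candidates above a chosen similarity cutoff would then move on to docking.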

Question 3: How to Test Drugs for Anti-Cholesterol and Anti-Cancer Properties

Identify the protein target associated with the disease (e.g., HMGCR for cholesterol inhibition).
Collect protein data (PDB) for the receptor.
Dock compounds to the protein to see if effects are similar to known drugs (e.g., statins).
Compare binding affinities to conclude potential.
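
The final comparison step can be sketched as a filter that keeps candidates whose docking score is at least as favorable as that of a known drug docked to the same receptor. The scores and names are hypothetical, and the sketch assumes more negative scores (in kcal/mol) mean stronger predicted binding.

```python
def at_least_as_good(candidates: dict, control_score: float) -> list:
    """Return candidates scoring no worse than the control, best first."""
    hits = [(name, s) for name, s in candidates.items() if s <= control_score]
    return sorted(hits, key=lambda item: item[1])


# Hypothetical docking scores against one receptor, in kcal/mol.
control_score = -8.5  # score of the known drug used as the control
candidates = {
    "candidate_1": -9.3,
    "candidate_2": -7.1,
    "candidate_3": -8.6,
}

shortlist = at_least_as_good(candidates, control_score)
print(shortlist)
```

A shortlist like this only suggests potential; it still needs in vitro confirmation.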

Question 4: Validating Molecular Docking Results

In vitro assays such as cytotoxicity tests can confirm whether in silico results agree with experimental data. While molecular docking helps streamline the process and reduce testing, drugs still have a long journey to clinical trials.

Question 5: Differences in Targeting RNA vs. Protein

Protein: The target is usually the active site of the protein. This is well defined.
RNA: The target site is less well defined, so active sites or other regions of the sequence must first be identified as targets. The structure needs to be converted to PDB format for preprocessing.
For molecular dynamics, force field selection differs (Amber or CHARMM27 can be used for RNA).

Question 6: Impact of AI-Based Drug Discovery

AI-based drug discovery makes progress more efficient but requires stronger supervision to ensure accuracy. Starting now is key to advancing this field in Indonesia.

Question 7: Designing Compounds for Covalent Docking

Docking is static by nature. Covalent modifications therefore need to be tested to determine their impact and the best course of action.

Question 8: Indonesia's Regulation on Genomics and Drug Discovery

There is a need for better regulation with additional funding to support drug discovery based on proper testing and clinical trials before human use.

Question 9: Confirming In Silico Compounds by Testing

Failure is possible; the objective is to improve the process and lower the proportion of failed candidates. Combining in vitro and in silico approaches gives the best results.

Question 10: Drug Candidate from Animal or Plant Sources

For seaweed-based compounds, the size of the compound matters for binding strength: a compound that is too large may be repelled from the binding site. Molecular dynamics is useful here because it tests how the complex behaves over time, rather than relying on the static picture that docking provides.

Prospect of Drug Discovery in Indonesia

Indonesia's rich biodiversity offers many opportunities to discover and study species as sources of new compounds. Further drug screening can open up new research prospects.