Pathway Analysis and Drug Discovery Lecture Notes
Introduction
- This lecture – traditionally delivered near July 4th in the Biotechnology program – focuses on Pathway Analysis & Drug Discovery.
- Over-arching aim: show how understanding complex molecular pathways can be leveraged to identify, validate and optimize new therapeutic drugs.
What Is a “Pathway”?
- No single, universally accepted definition.
- Wikipedia (biochemistry perspective): a metabolic pathway is “a series of chemical reactions occurring within a cell”, in which a principal molecule is successively modified by enzymes.
- Practical research usage extends the term to:
- Protein–protein or protein–ligand interaction networks.
- Gene-regulatory or signal-transduction networks.
- Functional core: a pathway describes one specific biological function, linking molecules, reactions and regulatory relationships to a phenotypic outcome.
- Iconic illustration: posters of the cell-cycle/cell-proliferation map showing hundreds of interacting proteins and gene products; serves as a reminder of biological complexity.
Pathways vs. Gene Sets
- Gene Set (GS)
- Collection of genes grouped for a reason.
- E.g., all genes in a GO term, all genes measured in an assay, or all genes co-expressed in a cluster.
- Pathway (PW)
- Sub-type of gene set with well-defined mechanistic interactions and order.
- Relationship: every pathway is a gene set, but not every gene set is a pathway.
- In the literature and in software the two terms are sometimes used interchangeably; check which one is meant in context.
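The distinction can be made concrete as a data structure. The sketch below is only an illustration: the class names `GeneSet` and `Pathway`, the cluster label and the three-gene cascade are assumptions chosen for the example, not taken from any database schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneSet:
    """Any named collection of genes; no order or interactions implied."""
    name: str
    genes: frozenset


@dataclass(frozen=True)
class Pathway(GeneSet):
    """A gene set plus directed, typed interactions between its members."""
    # Each edge is (source_gene, target_gene, interaction_type).
    edges: tuple = ()

    def __post_init__(self):
        # Every interaction must connect genes that belong to the pathway.
        for src, dst, _kind in self.edges:
            if src not in self.genes or dst not in self.genes:
                raise ValueError(f"edge {src}->{dst} uses a gene outside the set")


# Hypothetical co-expression cluster: a gene set, but not a pathway.
cluster = GeneSet("co-expressed cluster 7", frozenset({"TP53", "MYC", "GAPDH"}))

# Toy three-step kinase cascade: a gene set with mechanism and order.
mapk = Pathway(
    "toy MAPK cascade",
    frozenset({"RAF1", "MAP2K1", "MAPK1"}),
    (("RAF1", "MAP2K1", "phosphorylates"), ("MAP2K1", "MAPK1", "phosphorylates")),
)
```

Here `mapk` can be passed anywhere a `GeneSet` is expected, while `cluster` cannot stand in for a `Pathway` because it carries no interactions.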
Gene Ontology (GO) – The Standard Vocabulary
- GO is an international, curated bioinformatics effort that standardises gene/product attributes across species & databases.
- Three ontologies (hierarchies; strictly, directed acyclic graphs, since a term can have more than one parent):
- Biological Process (BP) – e.g., angiogenesis, glycolysis, cell cycle.
- Molecular Function (MF) – e.g., ATP binding, kinase activity.
- Cellular Component (CC) – e.g., nucleus, ribosome, plasma membrane.
- GO’s web interface lets researchers input a gene and explore all GO terms it maps to; links out to KEGG, IPA and other pathway resources.
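Because GO terms are arranged hierarchically, a gene annotated to a specific term is implicitly annotated to that term's more general ancestors. A minimal sketch of that propagation follows; the `PARENTS` map is a hand-made simplification for illustration and is not the real GO graph, and the function names are assumptions.

```python
# Toy child -> parents map. Labels look like GO Molecular Function terms,
# but the structure is simplified and is NOT the actual ontology.
PARENTS = {
    "protein kinase activity": ["kinase activity"],
    "kinase activity": ["catalytic activity"],
    "catalytic activity": ["molecular_function"],
    "ATP binding": ["molecular_function"],
    "molecular_function": [],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links upward."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def expand_annotations(direct_terms, parents=PARENTS):
    """Add all ancestor terms to a gene's directly assigned terms."""
    full = set(direct_terms)
    for term in direct_terms:
        full |= ancestors(term, parents)
    return full


# A hypothetical gene annotated only to the most specific term.
all_terms = expand_annotations({"protein kinase activity"})
```

Enrichment tools typically perform this expansion before counting, which is why broad terms near the root collect very large gene sets.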
The Grand Challenge in Drug Discovery
- “Systemic generation of novel biological & therapeutic insights.”
- Sequential workflow:
- Gene discovery – find genes implicated in a disease.
- Pathway/process mapping – place those genes in cellular context via GO, KEGG, IPA, etc.
- Mechanistic hypothesis – propose how gene perturbation drives phenotype.
- Experimental validation – in vitro & in vivo assays.
- Target–lead identification – map druggable nodes and screen for chemical leads.
- Pharmacology & toxicology – study xenobiotic interactions, side-effects, pathologies.
Step 1 – Building the Initial Gene List
- Comparative studies
- Microarray, RNA-Seq, qRT-PCR, proteomics: compare normal vs. disease tissue.
- Clustering/Classifications
- Identify co-expressed gene clusters; changes often occur in concert.
- Homology analysis
- Use BLAST to locate orthologs already implicated in other species or disorders.
- “Any source you can think of” – literature mining, CRISPR screens, GWAS hits, etc.
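For the comparative route, the simplest possible filter is a fold-change cut-off between normal and disease samples. The sketch below assumes made-up expression values, placeholder gene names and an arbitrary threshold of a two-fold change; real analyses would also test statistical significance across replicates.

```python
import math
from statistics import mean

# Hypothetical normalised expression values (arbitrary units), three
# replicates per condition. Gene names are placeholders.
EXPRESSION = {
    "GENE_A": {"normal": [10.0, 12.0, 11.0], "disease": [44.0, 40.0, 48.0]},
    "GENE_B": {"normal": [20.0, 21.0, 19.0], "disease": [22.0, 18.0, 20.0]},
    "GENE_C": {"normal": [64.0, 60.0, 68.0], "disease": [15.0, 17.0, 16.0]},
}


def log2_fold_change(normal, disease, pseudocount=1.0):
    """log2(disease / normal) on group means; the pseudocount avoids log(0)."""
    return math.log2((mean(disease) + pseudocount) / (mean(normal) + pseudocount))


def candidate_genes(expression, min_abs_log2fc=1.0):
    """Keep genes whose expression changes at least two-fold in either direction."""
    hits = {}
    for gene, groups in expression.items():
        lfc = log2_fold_change(groups["normal"], groups["disease"])
        if abs(lfc) >= min_abs_log2fc:
            hits[gene] = lfc
    return hits


gene_list = candidate_genes(EXPRESSION)
```

With these toy numbers the up-regulated `GENE_A` and the down-regulated `GENE_C` pass the filter, while the unchanged `GENE_B` does not.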
Step 2 – Enrichment & Commonality Analysis
- Questions asked of the gene list:
- Do genes share molecular functions, biological processes, or cellular compartments?
- Are they annotated to a common pathway?
- Do they share TF-binding sites, miRNA targets, protein domains?
- Are they co-mentioned in disease databases?
- GO/Pathway Enrichment Analysis
- Quantifies whether overlaps are greater than expected by chance.
- Classic statistic: Fisher’s Exact Test.
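$$p = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{a!\,b!\,c!\,d!\,n!}$$
where a, b, c, d are the four cells of the 2×2 contingency table and n = a + b + c + d is the total number of genes. This is the probability of the observed table with its margins fixed; the test's p-value sums this probability over all tables at least as extreme.
- Modern tools perform thousands of such tests automatically and adjust p for multiple hypotheses.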
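A minimal sketch of the calculation, using only the standard library. Two things are assumptions not stated in the lecture: the table layout (a = list genes in the pathway, b = list genes outside it, c = other background genes in the pathway, d = other background genes outside it) and the choice of Benjamini-Hochberg as the multiple-testing adjustment. The function names and example counts are illustrative.

```python
from math import comb


def table_probability(a, b, c, d):
    """Probability of one 2x2 table with fixed margins.

    Equal to (a+b)!(c+d)!(a+c)!(b+d)! / (a! b! c! d! n!),
    written with binomial coefficients to avoid huge factorials.
    """
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


def enrichment_p_value(a, b, c, d):
    """One-sided p-value: chance of seeing a or more list genes in the pathway."""
    p = 0.0
    k = 0
    # Move genes into cell `a` one at a time while keeping all margins fixed.
    while b - k >= 0 and c - k >= 0:
        p += table_probability(a + k, b - k, c - k, d + k)
        k += 1
    return p


def benjamini_hochberg(p_values):
    """Adjust p-values for multiple testing (assumed method, see lead-in)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts: 8 of 10 list genes fall in a pathway that holds
# 20 of the 100 background genes.
p_pathway = enrichment_p_value(a=8, b=2, c=12, d=78)
```

In practice one p-value is computed per GO term or pathway, and the whole list is passed through the adjustment step before anything is called enriched.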
Software & Databases for Pathway Analysis
- Two major business models:
- Free / Community-Driven
- Rely on grant-funded and community curation; broadly accessible.
- Examples: GO website, Reactome, STRING, Cytoscape plug-ins.
- Commercial / Subscription
- Paid staff perform manual curation, QA & continuous updates; generally higher confidence.
- Examples: Ingenuity Pathway Analysis (IPA), MetaCore, KEGG (bulk downloads now require a paid subscription; the website remains free for academic browsing).
Notable Databases Highlighted in Lecture
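- Pathguide (pathguide.org; referred to in these notes as PathwayGuide)
- A catalogue of pathway and interaction resources rather than a single pathway database. Rapidly expanding: grew from 222 listed resources ~6 y ago to 702 (checked on the morning of the lecture).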
- Contents: PPI networks, metabolic/signalling maps, TF interactions, protein–compound links, gene-interaction networks.
- KEGG (Kyoto Encyclopedia of Genes & Genomes)
- Now licensed via Pathway Solutions; offers academic & commercial plans.
- Produces colourful pathway maps. Example given: Tryptophan Metabolism, in which every arrow (an enzyme-catalysed step) is a potential drug target.
Illustrative Pathway Maps Discussed
- Tryptophan metabolism – highlights multiple enzymes/cofactors; drug leads could inhibit or enhance any step.
- Angiogenesis map – critical in tumour vascularisation; anti-angiogenic drugs (e.g., VEGF inhibitors) exploit nodes here.
- Uterine smooth-muscle contraction – important for labour pharmacology; each signalling molecule is a conceivable obstetric drug target.
Why Organise Data into Pathways?
- Reveals biological meaning behind joint expression changes.
- Groups genes/proteins into manageable themes rather than isolated hits.
- Pinpoints crucial intervention points where a drug can modify outcome.
- Recognises that biology is redundant & robust – often several paralogous genes perform similar roles → multiple therapeutic “bites at the apple”.
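The last two points can be illustrated on a toy network: a node is a crucial intervention point if blocking it alone stops the signal, whereas redundant paralogs must be blocked together. The graph, node names and function names below are hypothetical and chosen only for the example.

```python
from collections import deque

# Hypothetical toy signalling pathway; edges point upstream -> downstream.
# PARALOG_1 and PARALOG_2 are redundant routes from RECEPTOR to KINASE.
TOY_PATHWAY = {
    "LIGAND": ["RECEPTOR"],
    "RECEPTOR": ["PARALOG_1", "PARALOG_2"],
    "PARALOG_1": ["KINASE"],
    "PARALOG_2": ["KINASE"],
    "KINASE": ["PHENOTYPE"],
    "PHENOTYPE": [],
}


def signal_reaches(graph, source, target, blocked=frozenset()):
    """Breadth-first search that ignores nodes a drug is assumed to block."""
    if source in blocked:
        return False
    queue = deque([source])
    seen = {source}
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in graph.get(node, []):
            if nxt not in seen and nxt not in blocked:
                seen.add(nxt)
                queue.append(nxt)
    return False


def single_node_chokepoints(graph, source, target):
    """Intermediate nodes whose blockade alone cuts source off from target."""
    return [
        node
        for node in graph
        if node not in (source, target)
        and not signal_reaches(graph, source, target, blocked={node})
    ]


chokepoints = single_node_chokepoints(TOY_PATHWAY, "LIGAND", "PHENOTYPE")
```

In this toy map `RECEPTOR` and `KINASE` are single-node chokepoints, while inhibiting only one paralog leaves the signal intact. Real pathway maps also contain feedback and dosage effects that a pure reachability test ignores.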
Practical/Philosophical Implications
- Drug discovery is no longer “one gene → one drug” but “network → modulator”.
- Requires integration of wet-lab assays, in silico modelling, and statistical genomics.
- Ethical dimension: better pathway understanding may reduce late-stage drug failures, saving cost, time, and patient risk.
Key Takeaways
- A pathway captures a functional molecular narrative; a gene set is any purposeful list of genes.
- Robust drug discovery pipelines begin with accurate gene lists, expand to enriched pathways, and culminate in validated targets/leads.
- Bioinformatics databases (GO, PathwayGuide, KEGG, IPA) are indispensable – know their strengths, weaknesses & licensing terms.
- Statistical enrichment (e.g., Fisher’s Exact) underpins confidence that observed overlaps are unlikely to have arisen by chance.
- Every interaction arrow on a pathway map can be envisioned as a potential therapeutic lever – the art is picking which lever to pull.