Proteomics, MS and Biomedical Applications – Vocabulary

Lecture Context

  • Course/Unit: TAiBMS 6H5Z1036_2425

  • Lecture 2 title: Proteomics, MS and Biomedical Applications: Quantification and Experimental Strategies

  • Lecturers/Contributors: Dr Jon Humphries (presenter), Prof. Zoltan Takats (host institution link provided)

  • Format: Slide-based lecture, page numbers 1–37

Stated Learning Outcomes

  • Describe a typical MS-proteomics workflow

  • Understand quantitative approaches employed in proteomics

  • Recognise research & biomedical applications of quantitative MS proteomics

Proteomics ‑ Recap & Core Concepts

  • Proteomics definition: Large-scale, global study of proteins (analogous to genomics for DNA)

  • Unique difficulty: Proteins cannot be amplified like DNA → sensitivity, dynamic-range issues

  • First use of “proteome”: 1997

  • Proteome = complete set of proteins produced or post-translationally modified by a cell, tissue, organism or system

  • Field is acronym-heavy: SILAC, iTRAQ, TMT, ESI, MALDI, TOF, SRM, MS1, MS2 etc.

  • Proteomics aims to identify AND quantify system components

Generic MS-Proteomics Workflow

  • Inter-dependent steps (all critical):
    • Sample collection / lysis / enrichment
    • Enzymatic digestion (classically trypsin)
    • Separation (LC or other chromatographies)
    • Ionisation (ESI for LC, MALDI for spot-based)
    • Mass spectrometry acquisition (MS1 survey → MS2 fragmentation)
    • Bioinformatics & statistics (database search, FDR, volcano plots, PCA, clustering, network analysis)

  • Experiment does not stop at the peptide list → downstream biological interpretation essential
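
The FDR step in the bioinformatics stage can be illustrated with a minimal sketch. It assumes the common target-decoy approach (the notes only say "FDR", so this choice is an assumption), and the input format, function name and scores are hypothetical.

```python
def target_decoy_fdr(psms, threshold):
    """Estimate the FDR among peptide-spectrum matches scoring >= threshold.

    psms: iterable of (score, is_decoy) tuples (hypothetical input format).
    The estimate is (number of decoy hits) / (number of target hits).
    """
    targets = 0
    decoys = 0
    for score, is_decoy in psms:
        if score >= threshold:
            if is_decoy:
                decoys += 1
            else:
                targets += 1
    if targets == 0:
        return 0.0  # nothing accepted, so nothing to estimate
    return decoys / targets


# Made-up search scores, for illustration only.
example_psms = [
    (95, False), (90, False), (88, False), (80, True),
    (75, False), (60, True), (55, False),
]
print(target_decoy_fdr(example_psms, 70))  # 1 decoy / 4 targets = 0.25
```

Raising the threshold to 85 leaves three target hits and no decoys, so the estimated FDR drops to 0. In practice the threshold is usually chosen so that the estimate sits at or below a preset level such as 1%.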

Instrument Illustration (LC-MS/MS)
  • Bench-top dual-module: LC left, MS right (Cravatt et al., 2007)

  • MS measures m/z (mass-to-charge ratio) of ionised peptides
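
As a small worked example of what m/z means for a positively charged peptide, the sketch below assumes the ion carries `charge` extra protons, as is typical in ESI. The function name and the 1000 Da example mass are illustrative only.

```python
PROTON_MASS = 1.007276  # Da


def mz(neutral_mass, charge):
    """Return m/z for a peptide of given neutral monoisotopic mass (Da)
    carrying `charge` protons: (M + z * proton) / z."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# The same 1000 Da peptide appears at different m/z depending on charge.
for z in (1, 2, 3):
    print(z, round(mz(1000.0, z), 4))
```

The same peptide therefore gives several MS1 peaks, one per charge state, which is why the charge has to be determined before a mass can be assigned.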

Peptide Sequencing Basics (abc/xyz Ions)
  • Peptide chosen in MS1 is fragmented in MS2

  • Backbone breaks primarily at peptide bonds → a, b, c (N-terminal) & x, y, z (C-terminal) ion series

  • Resulting spectrum reflects all possible break points; identification is probability-based DB matching
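
To make the ion series concrete, here is a minimal sketch that computes theoretical fragment m/z values. It is simplified under stated assumptions: only b and y ions (the series that usually dominate collision-induced fragmentation), singly charged, monoisotopic masses, no modifications. The peptide "GAK" is a made-up example.

```python
# Monoisotopic residue masses in Da (unmodified amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # Da
PROTON = 1.007276   # Da


def by_ions(peptide):
    """Return (b_ions, y_ions): singly charged m/z lists for fragments
    of length 1 .. n-1, counted from the N- and C-terminus respectively."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(masses)):
        b_ions.append(sum(masses[:i]) + PROTON)           # N-terminal piece
        y_ions.append(sum(masses[-i:]) + WATER + PROTON)  # C-terminal piece
    return b_ions, y_ions


b, y = by_ions("GAK")
print([round(m, 4) for m in b])
print([round(m, 4) for m in y])
```

A search engine compares lists like these, generated for every candidate peptide in the database, against the observed MS2 peaks and scores the match.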

In-class Exercise – Trypsin Rule
  • Trypsin cleaves C-terminal to Lys (K) or Arg (R) unless followed by Pro

  • Sequence example (positions 1–37): QPQPAQNVLA APRGLGAAEF GGKAGNVEAP GETFAQ

  • Expected theoretical peptides: 3 (confirmed via Expasy Peptide Cutter)
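
The exercise can be checked with a short in-silico digest. This is a minimal sketch of the rule as stated above (cleave after K or R unless the next residue is P), with no missed cleavages, applied to the sequence exactly as printed in these notes. The function name is illustrative.

```python
def trypsin_digest(sequence):
    """Split a protein sequence after every K or R that is not
    immediately followed by P (no missed cleavages considered)."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides


# Sequence from the exercise, with the spacing removed.
exercise = "QPQPAQNVLAAPRGLGAAEFGGKAGNVEAPGETFAQ"
for peptide in trypsin_digest(exercise):
    print(peptide)
# QPQPAQNVLAAPR
# GLGAAEFGGK
# AGNVEAPGETFAQ
```

The single R and single K are both followed by a non-proline residue, so both sites cut and three peptides result, matching the expected answer.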

Quantitative Proteomics – Strategic Decisions

  1. Relative vs Absolute quantification
    • Most studies are relative (fold-change)
    • Absolute (concentration, e.g. pmol/µL) demands external calibration / validation

  2. Label vs Label-Free

  3. Targeted vs Discovery
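
Relative quantification is usually reported as a log2 fold-change between conditions. The sketch below assumes replicate intensities are simply averaged per condition; the function name and all intensity values are made up for illustration.

```python
import math


def log2_fold_change(treated, control):
    """log2 of (mean treated intensity / mean control intensity).

    Positive values mean the protein is more abundant in `treated`.
    """
    mean_treated = sum(treated) / len(treated)
    mean_control = sum(control) / len(control)
    if mean_treated <= 0 or mean_control <= 0:
        raise ValueError("mean intensities must be positive")
    return math.log2(mean_treated / mean_control)


# Three replicates per condition (arbitrary intensity units).
print(log2_fold_change([400, 420, 380], [100, 110, 90]))  # 4-fold up -> 2.0
```

Working on the log2 scale makes up- and down-regulation symmetric (a 4-fold increase is +2, a 4-fold decrease is -2), which is why volcano plots use it on the x-axis.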

Label-Based vs Label-Free (LFQ)
  • Label Approaches (heavy isotopes or isobaric tags)
    • Additional cost (15N/13C amino acids, TMT/iTRAQ reagents)
    • Experimental constraints: cell culture easier than whole-animal
    • Multiplexing lowers MS run count & reduces prep-derived variability

  • Label-Free
    • Cheaper design, no chemical handling, suits any sample
    • Requires more LC-MS runs; quant accuracy relies on ion intensity (XIC) or spectral counting (the latter less accurate)
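
For intensity-based label-free quantification, the quantity compared between runs is typically the area under a peptide's extracted ion chromatogram (XIC). The sketch below assumes simple trapezoidal integration over the elution peak. Real software also handles peak detection, retention-time alignment and normalisation, which are omitted here. The data points are invented.

```python
def xic_area(times, intensities):
    """Integrate an extracted ion chromatogram with the trapezoidal rule.

    times: retention times (e.g. seconds), strictly increasing.
    intensities: ion intensity of one precursor m/z at each time point.
    """
    if len(times) != len(intensities):
        raise ValueError("times and intensities must have equal length")
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        area += 0.5 * (intensities[i] + intensities[i - 1]) * width
    return area


# A small triangular elution peak (made-up values).
print(xic_area([0, 1, 2, 3, 4], [0, 10, 20, 10, 0]))  # 40.0
```

Spectral counting, by contrast, only tallies how many MS2 spectra matched a protein, which helps explain why it is the less accurate of the two.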

Two Major Labelling Families
  • SILAC – Stable Isotope Labelling by Amino acids in Cell Culture
    • Metabolic → labelled proteins before extraction
    • Quantification from MS1 peak intensities

  • TMT / iTRAQ – Tandem Mass Tag / Isobaric Tags
    • Chemical tagging of peptides post-digestion
    • Quantification from MS2 reporter ions
    • Up to 16-plex nowadays

  • Rule of thumb: equal protein levels give 1:1 heavy/light or tag reporter ratios
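
The 1:1 rule of thumb can be shown with a minimal SILAC-style sketch. It assumes each peptide yields a (light, heavy) pair of MS1 intensities and that the protein ratio is summarised as the median of its peptide ratios; the median is a common choice but an assumption here. All numbers are invented.

```python
from statistics import median


def protein_heavy_light_ratio(peptide_pairs):
    """Median heavy/light ratio over a protein's peptides.

    peptide_pairs: list of (light_intensity, heavy_intensity) tuples.
    Pairs with a non-positive light intensity are skipped.
    """
    ratios = [heavy / light for light, heavy in peptide_pairs if light > 0]
    if not ratios:
        raise ValueError("no usable peptide pairs")
    return median(ratios)


unchanged = [(100, 100), (40, 40), (250, 250)]
doubled = [(100, 200), (50, 110), (80, 150)]
print(protein_heavy_light_ratio(unchanged))  # 1.0, i.e. equal protein levels
print(protein_heavy_light_ratio(doubled))    # 2.0, i.e. about 2-fold more in heavy
```

The same arithmetic applies to isobaric tags, except that the intensities come from MS2 reporter ions and there is one channel per sample instead of a light/heavy pair.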

Targeted vs Discovery Workflows
  • Discovery (shotgun/DDA)
    • Goal: max coverage
    • Acquisition: precursors selected data-dependently by intensity

  • Targeted (SRM/MRM, PRM, DIA)
    • Focus: predefined peptides → high sensitivity & quantitative precision
    • Classical hardware: triple quadrupole (QQQ)
    • Process:
      – Q1 isolates precursor m/z
      – Collision cell fragments
      – Q3 monitors selected product ions
    • Absolute amounts via heavy synthetic standards
    • Practical multiplex: 50–100 proteins per run

  • DIA / SWATH-MS: Hybrid targeted-like quant without SRM optimisation; all precursors fragmented in sequential m/z windows; identification via spectral libraries
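
Absolute quantification with a heavy synthetic standard reduces to a ratio calculation. The sketch below assumes a known amount of heavy peptide is spiked into the sample and that the light (endogenous) and heavy forms respond identically in the instrument. The function name, units and values are illustrative.

```python
def absolute_amount(light_area, heavy_area, heavy_spiked_fmol):
    """Estimate the endogenous peptide amount (fmol) from the
    light/heavy peak-area ratio and the known spiked heavy amount."""
    if heavy_area <= 0:
        raise ValueError("heavy standard was not detected")
    return (light_area / heavy_area) * heavy_spiked_fmol


# Endogenous peak is twice the standard's area; 50 fmol of standard spiked.
print(absolute_amount(3.0e6, 1.5e6, 50.0))  # 100.0 fmol
```

Because light and heavy peptides co-elute and fragment alike, the ratio cancels most run-to-run variation, which helps explain the high quantitative precision of targeted assays.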

Fundamental Compromise (Targeted vs Discovery)
  • Trade-off triangle:
    • Proteome breadth
    • Detection sensitivity
    • Assay scalability

  • Decide on absolute or relative needs before committing to workflow

Biomedical & Research Applications

Cell-ECM Adhesion & Integrins
  • Integrins = heterodimeric receptors linking cytoskeleton ↔ ECM

  • Control: mechano-signalling, migration, survival, proliferation, differentiation

ECM Production In Vitro (Rashid et al., 2012; Byron et al., 2014)
  • MS used to catalogue & quantify secreted ECM under cell-culture cross-talk

  • Workflows: ECM enrichment → LC-MS/MS → statistics (volcano plots) & protein-protein interaction (PPI) networks

Protein–Protein Interaction Mapping
  • GFP-TRAP IPs (Jacquemet et al., 2013): isolate GFP-tagged small GTPase complexes; output analysed with clustering, heat-maps, network reconstructions

  • Integrin ligand pull-downs (Humphries et al., 2009; Jones et al., 2015): affinity purification vs fibronectin/VCAM ligands → modelling α5β1 and α4β1 adhesome networks

Post-Translational Modifications
  • Phospho-adhesome (Robertson et al., 2015)
    • Enrichment of phosphopeptides from adhesion complexes
    • Revealed far more phosphoproteins than prior estimates
    • Data mined through ontologies & PPI networks

Spatial Proteomics (Proximity Labelling)
  • BioID (Roux et al., 2012; Lundberg & Börner 2019)
    • Mutant BirA* biotin-ligase fused to bait → labels proteins within ~10 nm
    • Advantages: in situ, no need to keep interactions intact, reveals nano-topology
    • Disadvantage: genetic fusion/expression needed
    • Alternative enzymes: APEX peroxidase, TurboID etc.

  • BioID-generated adhesome (Chastney et al., 2020)
    • 16 bait proteins; LFQ via MaxQuant + SAINT
    • Identified 146 enriched proteins → 360 proximity edges, 81% previously unreported (BioGRID)
    • Combined hierarchical clustering with network topology

Cancer Diagnostics & Therapeutics
  • Tumour micro-environment (Carr & Fernandez-Zapico, 2016): stroma, fibroblasts, immune cells yield multiple biomarker sources (plasma, biopsy, liquid biopsy, histology)

  • Need markers for entire patient journey: predisposition → early detection → personalised therapy

iKnife (REIMS Technology)
  • Surgical diathermy coupled to rapid-evaporative ionisation MS

  • Classifier trained on tumour vs normal tissue “fingerprints”

  • Advantage: real-time guidance during resection; note that it does not measure proteins per se

MS Imaging
  • Discovery mode molecular imaging; comparatively low spatial resolution

  • Produces ion maps without explicit biomolecule ID (requires orthogonal validation)

Clinical Proteomics Workflow Snapshot (Zhu et al., 2021)

  • Workflow encompasses:

    1. Sample selection (tissue, fluid, FFPE, cell culture)

    2. Protein extraction & clean-up

    3. Separation (SDS-PAGE, SEC, OFFGEL, LC)

    4. Digestion (trypsin, Lys-C, etc.)

    5. Optional labelling/enrichment steps (SILAC, TMT, PTM enrichment)

    6. LC runtime (nanoLC/UHPLC)

    7. MS acquisition (Orbitrap, Q-Exactive, TOF, QQQ)

    8. Identification & Quantification (search engines, FDR)

    9. Bioinformatics (stat tests, pathway, network, machine learning)

Strengths & Weaknesses of Quantitative Workflows (General)

  • Label-based: high precision, multiplex, lower run count; but costly & sample-mixing complexity

  • Label-free: universal applicability, cost-effective; but run-to-run variability & larger instrument time

  • Targeted: exquisite sensitivity & absolute-quant option; but limited breadth & assay development overhead (unless DIA)

  • Discovery: global view & hypothesis generation; but semi-quantitative & under-samples low-abundance proteins

Key Numeric / Technical References

  • Typical SRM panel size: 50–100 proteins

  • BioID labelling radius: ~10 nm

  • BioID adhesome: 146 enriched proteins; 360 edges; 81% novel

  • TMT plexing currently up to 16

Ethical, Philosophical & Practical Implications

  • Clinical translation requires balancing experimental rigour with cost, throughput, and regulatory demands

  • Quantification strategy influences data reproducibility and biological interpretability

  • Patient benefit (e.g. iKnife) hinges on robust training datasets & ongoing validation

Essential & Recommended Reading (as per slide 37)

  • Essential:
    • Steen & Mann (2004) “The abc's (and xyz's) of peptide sequencing”
    • Zhu et al. (2021) “SnapShot: Clinical proteomics”

  • Recommended:
    • Cravatt et al. (2007) – Biological impact of MS proteomics
    • Doerr (2013) – Targeted proteomics
    • Lundberg & Börner (2019) – Spatial proteomics
    • Samavarchi-Tehrani, Gingras (2020) – Proximity biotinylation

  • Additional cited open-access studies embedded throughout lecture

Recap of Learning Outcomes Achieved

  • Detailed breakdown of MS-proteomics workflow (sample → bioinformatics)

  • Exhaustive comparison of quantitative strategies (relative/absolute; label/label-free; targeted/discovery)

  • Multiple research & biomedical case studies: ECM/integrin biology, phospho-adhesome, BioID spatial mapping, cancer diagnostics, iKnife, imaging

Closing Prompts

  • Think about experimental requirements (precision, depth, cost, speed) when designing your own proteomics study.