Proteomics and Mass Spectrometry – Core Vocabulary
Defining the Proteome
The proteome = the entire set of proteins expressed by a cell, tissue, organ, organism, or biological system at a given time.
Dynamic; changes with development, environmental cues, disease states.
Complements the genome: while the genome is relatively constant, the proteome reflects real-time cellular state and function.
Hierarchical views
Core (constitutive) proteome
Conditional proteome (stimulus/disease-specific)
Post-translationally modified (PTM) sub-proteomes (phospho-, glyco-, acetyl-, ubiquitin-omes, etc.)
Importance of Proteomics in Biomedical Sciences
Fills the gap between genotype and phenotype.
Identifies biomarkers for diagnosis/prognosis (e.g. \text{PSA} in prostate cancer, \text{HbA}_{1c} in diabetes).
Maps drug targets & off-target effects (precision medicine).
Deciphers signaling pathways and networks → system-level understanding.
Monitors therapeutic response and resistance mechanisms in real time.
Enables discovery of PTMs that regulate protein activity.
Supports development of vaccines and anti-microbial strategies by mapping antigenic proteins.
Experimental Strategies for Proteome Analysis
Broad categories: gel-based and gel-free (MS-based) methods.
Two-Dimensional Gel Electrophoresis (2D-GE)
Separation principle:
First dimension (IEF): proteins separated by isoelectric point pI (the pH at which net charge = 0). pI = \frac{pK_{a1} + pK_{a2}}{2} for simple ampholytes, where pK_{a1} and pK_{a2} flank the neutral species.
Second dimension (SDS-PAGE): separation by molecular weight (M_w).
Visualisation: Coomassie, silver, or fluorescent stains.
Pros: high resolving power, can detect PTM-induced shifts.
Cons: low throughput, hydrophobic & very large/small proteins under-represented, labor-intensive spot excision → MS.
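As a small worked example of the first-dimension principle, the sketch below applies the simple-ampholyte pI formula to glycine. The pKa values (about 2.34 and 9.60) are common textbook figures and are assumptions here, not values from these notes; the function name is illustrative.

```python
def isoelectric_point(pka_low: float, pka_high: float) -> float:
    """pI of a simple ampholyte: mean of the two pKa values flanking the neutral form."""
    return (pka_low + pka_high) / 2.0


# Glycine with assumed textbook pKa values (alpha-carboxyl ~2.34, alpha-amino ~9.60).
glycine_pi = isoelectric_point(2.34, 9.60)
print(f"Glycine pI = {glycine_pi:.2f}")
```

Real proteins have many ionisable side chains, so their pI has to be found by solving for the pH of zero net charge rather than by this two-pKa average.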
Two-Dimensional Difference Gel Electrophoresis (2D-DIGE)
Samples pre-labeled with spectrally distinct fluorescent dyes (e.g. Cy2, Cy3, Cy5).
Mixed & run on one gel → reduces gel-to-gel variability.
Differential expression calculated from fluorescence intensity ratios.
Sensitivity down to low-nanogram range; multiplexing limited by the number of dyes (up to 3 samples per gel with Cy2/Cy3/Cy5, commonly two samples plus a pooled internal standard).
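The intensity-ratio calculation can be sketched as a log2 fold change per spot. The spot names and intensities below are made up for illustration, and real DIGE software also applies background subtraction and normalisation that this sketch omits.

```python
import math


def log2_fold_change(cy5_intensity: float, cy3_intensity: float) -> float:
    """log2 ratio of two dye channels measured for the same spot on one gel."""
    return math.log2(cy5_intensity / cy3_intensity)


# Hypothetical spot intensities: (Cy5 channel, Cy3 channel).
spots = {"spot_17": (8000.0, 2000.0), "spot_42": (1500.0, 1500.0)}
fold_changes = {name: log2_fold_change(cy5, cy3) for name, (cy5, cy3) in spots.items()}
for name, value in fold_changes.items():
    print(name, round(value, 2))
```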
Mass Spectrometry (MS)
Central analytical tool for modern proteomics. Measures mass-to-charge ratio m/z:
\frac{m}{z} = \frac{M + nH}{n} (where M = neutral mass, n = charge state, H = proton mass ≈ 1.00728 Da).
Advantages: high sensitivity (zeptomole to attomole), high mass accuracy (<1 ppm in Orbitraps), amenable to complex mixtures.
Can identify, quantify, and characterize PTMs.
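To make the m/z relation concrete, the sketch below computes m/z for a peptide at several charge states and inverts the formula to recover the neutral mass. The 1500 Da peptide mass is an arbitrary example and the function names are illustrative.

```python
PROTON_MASS = 1.00728  # Da, the value quoted in these notes


def mz(neutral_mass: float, charge: int) -> float:
    """m/z of an ion carrying `charge` protons: (M + n*H) / n."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def neutral_mass_from_mz(mz_value: float, charge: int) -> float:
    """Invert the relation: M = n * (m/z - H)."""
    return charge * (mz_value - PROTON_MASS)


# The same 1500 Da peptide appears at different m/z for each charge state.
for n in (1, 2, 3):
    print(n, round(mz(1500.0, n), 4))
```

This is why ESI spectra show one peptide as several peaks: each charge state lands at a different m/z.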
How Mass Spectrometry Works (Context of Proteomics)
Ionisation
Electrospray Ionisation (ESI): produces multiply charged ions, compatible with LC-MS online coupling.
Matrix-Assisted Laser Desorption/Ionisation (MALDI): mainly singly charged ions; plate-based.
Mass Analysers
Time-of-Flight (TOF): t = \frac{L}{\sqrt{2zeV/m}} (L = flight-path length, V = accelerating voltage, e = elementary charge; flight time proportional to \sqrt{m/z}).
Quadrupole: mass filtering via oscillating electric fields.
Orbitrap: harmonic electrostatic trap; frequency f \propto 1/\sqrt{m/z}.
Ion Trap (3D or linear), FT-ICR, Q-TOF hybrids.
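A rough numerical sketch of the TOF relation follows, for an idealised linear instrument. The 20 kV accelerating voltage and 1 m flight path are illustrative assumptions, not specifications of any real instrument, and the ion mass is approximated as (m/z) × z daltons.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge, C
DALTON = 1.66053906660e-27  # unified atomic mass unit, kg


def tof_flight_time(mz_value: float, voltage: float = 20_000.0, length: float = 1.0) -> float:
    """Flight time in seconds for an ideal linear TOF.

    With m = (m/z) * z * Da, the charge cancels in t = L / sqrt(2*z*e*V / m),
    so the time depends only on m/z.
    """
    return length * math.sqrt(mz_value * DALTON / (2.0 * E_CHARGE * voltage))


t_1000 = tof_flight_time(1000.0)
t_4000 = tof_flight_time(4000.0)
print(f"m/z 1000: {t_1000 * 1e6:.1f} microseconds")
print(f"ratio t(4000)/t(1000) = {t_4000 / t_1000:.2f}")
```

Quadrupling m/z doubles the flight time, which is the square-root dependence stated above.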
Detectors: electron multiplier, image current detection (Orbitrap/FT-ICR).
Tandem MS (MS/MS): isolates precursor ion, fragments (CID, HCD, ETD), records fragment spectrum → sequence inference.
Database search algorithms: Mascot, Sequest, Andromeda; match experimental MS/MS to in-silico digests.
False Discovery Rate (FDR) control: target-decoy strategy (≤1 % threshold typical).
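The target-decoy idea can be sketched as follows: search a target database plus a decoy database, then estimate FDR at a score threshold as decoy hits divided by target hits. The scores below are toy values, and production tools use more refined estimators, so treat this as a minimal illustration rather than any search engine's actual procedure.

```python
def fdr_at_threshold(psms, threshold):
    """Estimated FDR among matches scoring >= threshold (decoys / targets)."""
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    if targets == 0:
        return 1.0 if decoys else 0.0
    return decoys / targets


def score_cutoff(psms, max_fdr=0.01):
    """Lowest score threshold whose estimated FDR is within max_fdr (None if none)."""
    best = None
    for threshold in sorted({score for score, _ in psms}, reverse=True):
        if fdr_at_threshold(psms, threshold) <= max_fdr:
            best = threshold
    return best


# Toy peptide-spectrum matches: (score, is_decoy).
psms = [(90, False), (85, False), (80, False), (70, True), (60, False), (50, True)]
print(score_cutoff(psms, max_fdr=0.01))
print(score_cutoff(psms, max_fdr=0.25))
```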
Proteomic Workflow Overview (Lecture 2)
Biological Question & Experimental Design.
Sample Collection & Lysis.
Protein Extraction & Cleanup.
Digestion to peptides (Trypsin, LysC, etc.).
Peptide Separation (nano-LC, capillary electrophoresis).
Mass Spectrometry Acquisition (DDA, DIA, PRM, SRM).
Data Analysis (identification, quantification, statistics).
Biological Interpretation (pathways, networks, validation).
Reporting (MIAPE compliance).
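The digestion step is also what database search engines simulate in silico. A minimal sketch of a tryptic digest follows, using the common rule of cleaving C-terminal to K or R except when the next residue is P. The sequence is a made-up example, and missed cleavages are not modelled.

```python
import re


def tryptic_digest(sequence: str) -> list[str]:
    """Split a protein sequence after K or R, but not when followed by P."""
    # Zero-width split: position preceded by K/R and not followed by P (Python 3.7+).
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if p]


# Hypothetical sequence: the K before P is not cleaved.
peptides = tryptic_digest("AKPRGKC")
print(peptides)
```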
Quantitative Proteomic Approaches
Goal: measure relative or absolute abundance differences.
Label-Free Quantification (LFQ)
Spectral counting or MS1 peak area/intensity.
Pros: unlimited sample numbers; no labeling cost.
Cons: higher technical variance; requires stringent LC-MS reproducibility.
Stable Isotope Labeling by/with Amino acids in Cell culture (SILAC)
Cells grown in media containing “light” vs “heavy” amino acids (e.g. ^{13}C-labeled Lys/Arg).
Complete metabolic incorporation → peptides differ by known mass shift.
Ratio calculated from co-eluting isotopic pairs.
Limited to metabolically active systems.
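A sketch of how a SILAC pair shows up in the spectrum: the heavy peptide is offset from the light one by the label mass divided by the charge. The shifts used are the Arg-6 and Lys-8 values listed in these notes; the peptide sequence, m/z, and intensities are hypothetical.

```python
# Mass shifts in Da for heavy residues (Lys-8, Arg-6), as listed in the notes.
SILAC_SHIFTS = {"K": 8.0142, "R": 6.0201}


def heavy_mz(light_mz: float, charge: int, sequence: str) -> float:
    """Expected m/z of the heavy partner, assuming every K/R is fully labeled."""
    shift = sum(SILAC_SHIFTS.get(residue, 0.0) for residue in sequence)
    return light_mz + shift / charge


def silac_ratio(heavy_intensity: float, light_intensity: float) -> float:
    """Heavy/light abundance ratio from a co-eluting pair."""
    return heavy_intensity / light_intensity


# Hypothetical doubly charged tryptic peptide ending in lysine.
pair_heavy_mz = heavy_mz(500.0, 2, "ELVISK")
print(round(pair_heavy_mz, 4), silac_ratio(3.0e6, 1.5e6))
```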
Chemical/Isobaric Tagging
iTRAQ (4-plex, 8-plex) / TMT (6-, 11-, 16-, 18-plex) reagents.
Tags are isobaric; reporter ions released on fragmentation (≈ m/z 113–121 for iTRAQ, 126–135 for TMT/TMTpro) give relative quantitation in MS2.
Pros: high multiplexing; samples mixed early.
Cons: reporter ion interference (ratio compression), cost.
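The sketch below shows reporter-ion ratios relative to a reference channel, plus a toy model of ratio compression in which co-isolated background adds the same signal to every channel. The channel labels and intensities are hypothetical, and the interference model is deliberately simplistic.

```python
def reporter_ratios(intensities: dict, reference: str) -> dict:
    """Ratio of each reporter channel to a chosen reference channel."""
    ref = intensities[reference]
    return {channel: value / ref for channel, value in intensities.items()}


def compressed_ratio(signal_a: float, signal_b: float, interference: float) -> float:
    """Observed ratio when equal background is added to both channels."""
    return (signal_a + interference) / (signal_b + interference)


# Hypothetical reporter intensities for three channels.
ratios = reporter_ratios({"126": 100.0, "127": 400.0, "128": 50.0}, reference="126")
print(ratios)

# A true 4:1 difference reads as 2.5:1 once interference of 100 is added to both.
print(compressed_ratio(400.0, 100.0, 100.0))
```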
Absolute Quantification
AQUA peptides: spiked-in heavy synthetic peptides of known concentration → \text{[Endogenous]} = \text{ratio} \times \text{[Spike]}, where ratio = light (endogenous) signal / heavy (spike) signal.
QconCAT: concatenated heavy protein standard.
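The AQUA calculation reduces to one line of arithmetic. In this sketch the peak areas and the 50 fmol spike are invented numbers, and the ratio is taken as light (endogenous) over heavy (spike) signal.

```python
def endogenous_amount(light_area: float, heavy_area: float, spike_amount: float) -> float:
    """Endogenous amount = (light / heavy) * spiked amount, in the spike's units."""
    return light_area / heavy_area * spike_amount


# Hypothetical peak areas with 50 fmol of heavy standard spiked in.
amount_fmol = endogenous_amount(light_area=2.0e6, heavy_area=4.0e6, spike_amount=50.0)
print(amount_fmol)
```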
Data-Independent Acquisition (DIA) / SWATH
MS cycles through sequential m/z windows, fragments all ions.
Generates comprehensive digital map; quantification via spectral libraries.
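To illustrate what "sequential m/z windows" means, the sketch below builds a fixed-width isolation scheme. The 400–1000 m/z range and 25 m/z width are illustrative assumptions; real methods often use variable or overlapping windows.

```python
def dia_windows(mz_start: float, mz_stop: float, width: float, overlap: float = 0.0):
    """Return a list of (lower, upper) isolation windows covering the m/z range."""
    step = width - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than the window width")
    windows = []
    lower = mz_start
    while lower < mz_stop:
        windows.append((lower, min(lower + width, mz_stop)))
        lower += step
    return windows


windows = dia_windows(400.0, 1000.0, 25.0)
print(len(windows), windows[0], windows[-1])
```

Every precursor in a window is fragmented together during each cycle, which is why spectral libraries are needed to untangle the resulting mixed fragment spectra.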
Applications in Research & Biomedicine
Cancer subtype stratification & personalized therapy selection (e.g. phosphoproteome of EGFR mutants).
Neurodegenerative disease biomarker discovery (CSF proteome in Alzheimer’s).
Infectious disease: serum proteomics for early sepsis detection.
Cardiovascular: post-MI plasma proteome to predict heart failure.
Drug mechanism of action: chemoproteomics; thermal shift (CETSA).
Large-scale projects: Human Proteome Project (HPP), Clinical Proteomic Tumor Analysis Consortium (CPTAC).
Ethical, Philosophical & Practical Considerations
Patient consent & privacy for clinical specimens.
Data sharing: PRIDE, ProteomeXchange; FAIR principles.
Reproducibility crisis: need for rigorous QC, replicates, open methods.
Cost vs benefit in low-resource settings; equity of access to MS facilities.
Environmental impact: solvent consumption, high-energy instruments.
Numerical & Statistical References
Typical mass accuracy goals: <1\,\text{ppm} (Orbitrap) or <50\,\text{ppm} (TOF).
Dynamic range of human plasma proteome ≈ 10^{10}.
Protein FDR threshold: \le 1\%; peptide/PSM FDR commonly \le 1\% as well, with stricter cutoffs (e.g. \le 0.1\%) sometimes applied.
SILAC heavy/light mass shifts: +6.0201\,\text{Da} (Arg-6), +8.0142\,\text{Da} (Lys-8).
Signal-to-noise (S/N) threshold for valid peaks commonly ≥3:1.
Coefficients of variation (CV) accepted in biological replicates: \le 20\% (LFQ), \le 10\% (isobaric).
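Two of the figures above, mass accuracy in ppm and coefficient of variation, are simple to compute. The sketch below uses made-up measurements and takes CV as the sample standard deviation divided by the mean.

```python
import statistics


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def cv_percent(values) -> float:
    """Coefficient of variation (%), using the sample standard deviation."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


# Hypothetical measurement: 0.0005 m/z off at m/z 500 is about 1 ppm.
print(round(ppm_error(500.0005, 500.0), 2))

# Hypothetical replicate intensities for one protein.
print(round(cv_percent([100.0, 110.0, 90.0]), 1))
```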
Key Terms & Definitions
Proteotypic peptide: uniquely identifies a protein.
Post-Translational Modification (PTM): chemical change after translation (e.g. phosphorylation +79.966\,\text{Da}).
Dynamic exclusion: prevents re-sampling of abundant ions during DDA.
Isotope envelope: cluster of peaks spaced ≈ 1 Da apart in mass (^{13}C/^{15}N isotopologues), i.e. ≈ 1/z apart in m/z; the spacing is used to assign charge.
Neutral loss: characteristic loss (e.g. \text{H}_3\text{PO}_4) aiding PTM localization.
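Charge assignment from an isotope envelope can be sketched as follows: neighbouring isotope peaks sit about 1.00335/z apart in m/z, so the charge is the rounded inverse of the observed spacing. The peak list is a made-up example, and the constant is the ^{13}C − ^{12}C mass difference.

```python
C13_C12_DIFF = 1.00335  # Da, mass difference between 13C and 12C


def charge_from_spacing(peak_mzs) -> int:
    """Estimate charge state from the mean m/z spacing of adjacent isotope peaks."""
    spacings = [b - a for a, b in zip(peak_mzs, peak_mzs[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return round(C13_C12_DIFF / mean_spacing)


# Hypothetical envelope with ~0.5 m/z spacing, indicating a doubly charged ion.
charge = charge_from_spacing([751.007, 751.509, 752.010])
print(charge)
```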
Connections to Foundational Principles
Central Dogma (DNA→RNA→Protein) positions proteomics downstream of genomics/transcriptomics.
Complements metabolomics in systems biology to close genotype-phenotype gap.
Builds on chemical principles of ionisation, chromatography, and electrophoresis.
Real-World Relevance
Translational pipelines from discovery to clinical LC-MS assays (e.g. multiple reaction monitoring for troponin I).
Regulatory acceptance: FDA & EMA guidance for bioanalytical MS.
Industry: quantitative proteomics in biopharmaceutical QC and lot release.