Proteomics and Mass Spectrometry – Core Vocabulary

Defining the Proteome

  • The proteome = the entire set of proteins expressed by a cell, tissue, organ, organism, or biological system at a given time.

    • Dynamic; changes with development, environmental cues, disease states.

    • Complements the genome: while the genome is relatively constant, the proteome reflects real-time cellular state and function.

  • Hierarchical views

    • Core (constitutive) proteome

    • Conditional proteome (stimulus/disease-specific)

    • Post-translationally modified (PTM) sub-proteomes (phospho-, glyco-, acetyl-, ubiquitin-omes, etc.)

Importance of Proteomics in Biomedical Sciences

  • Fills the gap between genotype and phenotype.

  • Identifies biomarkers for diagnosis/prognosis (e.g. \text{PSA} in prostate cancer, \text{HbA}_{1c} in diabetes).

  • Maps drug targets & off-target effects (precision medicine).

  • Deciphers signaling pathways and networks → system-level understanding.

  • Monitors therapeutic response and resistance mechanisms in real time.

  • Enables discovery of PTMs that regulate protein activity.

  • Supports development of vaccines and anti-microbial strategies by mapping antigenic proteins.

Experimental Strategies for Proteome Analysis

  • Broad categories: gel-based and gel-free (MS-based) methods.

Two-Dimensional Gel Electrophoresis (2D-GE)

  • Separation principle:

    1. First dimension (isoelectric focusing, IEF): proteins separated by isoelectric point pI (the pH at which net charge = 0). pI = \frac{pK_{a1} + pK_{a2}}{2} for simple ampholytes.

    2. Second dimension (SDS-PAGE): separation by molecular weight (M_w).

  • Visualisation: Coomassie, silver, or fluorescent stains.

  • Pros: high resolving power, can detect PTM-induced shifts.

  • Cons: low throughput, hydrophobic & very large/small proteins under-represented, labor-intensive spot excision → MS.
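
The midpoint formula for simple ampholytes given under the separation principle can be checked numerically. The sketch below is illustrative only: the function name is hypothetical, and the glycine pK_a values (about 2.34 and 9.60) are standard textbook figures rather than values from these notes. Real proteins carry many ionisable groups, so their pI is not given by this two-term average.

```python
def isoelectric_point(pka1: float, pka2: float) -> float:
    """pI of a simple ampholyte: mean of the two pKa values
    that flank the electrically neutral species."""
    return (pka1 + pka2) / 2.0


# Glycine (textbook values, assumed here): carboxyl pKa ~2.34, amino pKa ~9.60
glycine_pi = isoelectric_point(2.34, 9.60)
print(f"Glycine pI = {glycine_pi:.2f}")
```

With these inputs the average is 5.97, the pH at which glycine would stop migrating in an IEF gradient.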

Two-Dimensional Difference Gel Electrophoresis (2D-DIGE)

  • Samples pre-labeled with spectrally distinct fluorescent dyes (e.g. Cy2, Cy3, Cy5).

  • Mixed & run on one gel → reduces gel-to-gel variability.

  • Differential expression calculated from fluorescence intensity ratios.

  • Sensitivity down to low-nanogram range; multiplexing of typically up to 3 samples per gel (one per dye, often two samples plus a pooled internal standard).
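
As a rough illustration of how differential expression follows from fluorescence intensity ratios, the sketch below computes a log2 ratio per spot. The spot names, intensities, and the two-fold cut-off are all hypothetical; real DIGE software also applies background subtraction and normalisation, which are omitted here.

```python
import math


def log2_ratio(intensity_a: float, intensity_b: float) -> float:
    """log2 of channel B over channel A for a single spot."""
    return math.log2(intensity_b / intensity_a)


# Hypothetical spot volumes: (Cy3 channel, Cy5 channel)
spots = {
    "spot_1": (1200.0, 2400.0),
    "spot_2": (900.0, 880.0),
    "spot_3": (3000.0, 750.0),
}

for name, (cy3, cy5) in spots.items():
    lfc = log2_ratio(cy3, cy5)
    # Assumed cut-off: at least two-fold change in either direction
    status = "changed" if abs(lfc) >= 1.0 else "unchanged"
    print(f"{name}: log2(Cy5/Cy3) = {lfc:+.2f} ({status})")
```

Working in log2 space makes a two-fold increase (+1) and a two-fold decrease (-1) symmetric around zero.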

Mass Spectrometry (MS)

  • Central analytical tool for modern proteomics. Measures the mass-to-charge ratio m/z; for an ion carrying n protons:
    \frac{m}{z} = \frac{M + nH}{n} (where M = neutral mass, n = charge state, H = proton mass ≈ 1.00728 Da).

  • Advantages: high sensitivity (zeptomole-attomole), high mass accuracy (<1 ppm in Orbitraps), amenable to complex mixtures.

  • Can identify, quantify, and characterize PTMs.
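
The m/z relation above is easy to evaluate for different charge states. This is a minimal sketch: the function names and the 1500 Da peptide mass are made up for illustration, and the proton mass is the rounded value quoted in these notes.

```python
PROTON_MASS = 1.00728  # Da, rounded value used in these notes


def mz_from_mass(neutral_mass: float, charge: int) -> float:
    """m/z of an ion formed by adding `charge` protons to a neutral molecule."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def mass_from_mz(mz: float, charge: int) -> float:
    """Invert the relation: recover the neutral mass from m/z and charge."""
    return mz * charge - charge * PROTON_MASS


peptide_mass = 1500.0  # Da, hypothetical peptide
for z in (1, 2, 3):
    print(f"z = {z}: m/z = {mz_from_mass(peptide_mass, z):.4f}")
```

The same peptide appears near m/z 1501, 751, and 501 for z = 1, 2, and 3, which is why ESI spectra of one analyte can contain several peaks.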

How Mass Spectrometry Works (Context of Proteomics)

  • Ionisation

    • Electrospray Ionisation (ESI): produces multiply charged ions, compatible with LC-MS online coupling.

    • Matrix-Assisted Laser Desorption/Ionisation (MALDI): mainly singly charged ions; plate-based.

  • Mass Analysers

    • Time-of-Flight (TOF): t = \frac{L}{\sqrt{2zeV/m}} (L = flight-path length, V = accelerating voltage, e = elementary charge; flight time proportional to \sqrt{m/z}).

    • Quadrupole: mass filtering via oscillating electric fields.

    • Orbitrap: harmonic electrostatic trap; frequency f \propto 1/\sqrt{m/z}.

    • Ion Trap (3D or linear), FT-ICR, Q-TOF hybrids.

  • Detectors: electron multiplier, image current detection (Orbitrap/FT-ICR).

  • Tandem MS (MS/MS): isolates precursor ion, fragments (CID, HCD, ETD), records fragment spectrum → sequence inference.

  • Database search algorithms: Mascot, Sequest, Andromeda; match experimental MS/MS to in-silico digests.

  • False Discovery Rate (FDR) control: target-decoy strategy (threshold typically ≤1 %).
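
The target-decoy idea can be sketched in a few lines. The estimator used here (decoy hits divided by target hits above a score cut-off) is one common simple form, not necessarily what Mascot, Sequest, or Andromeda compute internally. The scores and function names are hypothetical.

```python
def fdr_at_threshold(psms, threshold):
    """Estimate FDR as (#decoy hits) / (#target hits) among PSMs
    scoring at or above `threshold`. psms: list of (score, is_decoy)."""
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


def lowest_threshold_for_fdr(psms, max_fdr=0.01):
    """Return the lowest score cut-off whose estimated FDR is <= max_fdr,
    or None if no cut-off qualifies."""
    for score in sorted({s for s, _ in psms}):
        if fdr_at_threshold(psms, score) <= max_fdr:
            return score
    return None


# Hypothetical peptide-spectrum matches: (search score, matched a decoy sequence?)
psms = [(95, False), (90, False), (88, False), (80, True),
        (75, False), (60, True), (55, False)]

print("Estimated FDR at score >= 55:", fdr_at_threshold(psms, 55))
print("Cut-off for 1 % FDR:", lowest_threshold_for_fdr(psms, 0.01))
```

In this toy list, accepting everything gives two decoys against five targets, so the cut-off has to rise above the highest-scoring decoy before the 1 % criterion is met.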

Proteomic Workflow Overview (Lecture 2)

  1. Biological Question & Experimental Design.

  2. Sample Collection & Lysis.

  3. Protein Extraction & Cleanup.

  4. Digestion to peptides (trypsin, Lys-C, etc.).

  5. Peptide Separation (nano-LC, capillary electrophoresis).

  6. Mass Spectrometry Acquisition (DDA, DIA, PRM, SRM).

  7. Data Analysis (identification, quantification, statistics).

  8. Biological Interpretation (pathways, networks, validation).

  9. Reporting (MIAPE compliance).
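
Step 4 of the workflow, and the in-silico digests used by database search engines, can be illustrated with a small tryptic digest. The sketch assumes the widely used convention that trypsin cleaves C-terminal to K or R unless the next residue is P. The function name and the example sequence are hypothetical.

```python
import re


def tryptic_digest(sequence: str, missed_cleavages: int = 0) -> list:
    """In-silico trypsin digest: cleave after K or R, but not before P.
    Peptides spanning up to `missed_cleavages` uncleaved sites are included."""
    # Zero-width split: position preceded by K/R and not followed by P
    fragments = [f for f in re.split(r"(?<=[KR])(?!P)", sequence) if f]
    peptides = []
    for i in range(len(fragments)):
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


protein = "MKWVTFISLLRPAKGSR"  # made-up sequence for illustration
print(tryptic_digest(protein))
print(tryptic_digest(protein, missed_cleavages=1))
```

Note that the R in "LLRP" is not cleaved because it is followed by proline, so "WVTFISLLRPAK" stays intact.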

Quantitative Proteomic Approaches

  • Goal: measure relative or absolute abundance differences.

Label-Free Quantification (LFQ)

  • Spectral counting or MS1 peak area/intensity.

  • Pros: unlimited sample numbers; no labeling cost.

  • Cons: higher technical variance; requires stringent LC-MS reproducibility.

Stable Isotope Labeling by/with Amino acids in Cell culture (SILAC)

  • Cells grown in “light” vs “heavy” amino acid media (e.g. ^{13}C-labeled Lys/Arg).

  • Complete metabolic incorporation → peptides differ by known mass shift.

  • Ratio calculated from co-eluting isotopic pairs.

  • Limited to metabolically active systems.
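
To make the "known mass shift" concrete, the sketch below computes the m/z gap expected between a light and heavy peptide pair and a heavy/light ratio. It uses the Arg-6 and Lys-8 shifts quoted in these notes and assumes every K and R is fully labeled. The peptide sequence, intensities, and function names are hypothetical.

```python
ARG6_SHIFT = 6.0201  # Da, heavy arginine (Arg-6)
LYS8_SHIFT = 8.0142  # Da, heavy lysine (Lys-8)


def heavy_light_mz_gap(sequence: str, charge: int) -> float:
    """m/z spacing between heavy and light forms of a peptide,
    assuming complete labeling of every K and R residue."""
    mass_shift = sequence.count("K") * LYS8_SHIFT + sequence.count("R") * ARG6_SHIFT
    return mass_shift / charge


def silac_ratio(heavy_intensity: float, light_intensity: float) -> float:
    """Heavy/light abundance ratio from co-eluting isotopic pair intensities."""
    return heavy_intensity / light_intensity


peptide = "LVNELTEFAK"  # hypothetical tryptic peptide with one lysine
print(f"Expected gap at z = 2: {heavy_light_mz_gap(peptide, 2):.4f} m/z units")
print(f"H/L ratio: {silac_ratio(3.0e6, 1.5e6):.2f}")
```

Because the gap scales with 1/z, a doubly charged Lys-8 pair sits about 4.007 m/z units apart rather than 8.014.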

Chemical/Isobaric Tagging

  • iTRAQ (4-plex, 8-plex) / TMT (6-, 11-, 16-, 18-plex) reagents.

  • Tags are isobaric; fragmentation releases reporter ions (m/z ≈ 113–121 for iTRAQ, ≈ 126–135 for TMT/TMTpro) whose MS2 intensities give relative quantitation.

  • Pros: high multiplexing; samples mixed early.

  • Cons: reporter ion interference (ratio compression), cost.
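
Ratio compression can be illustrated with a deliberately simplified two-channel model. It assumes the co-isolated interfering peptides are unchanged between samples, so they add equal reporter signal to both channels. The function name and the numbers are hypothetical.

```python
def observed_ratio(true_ratio: float, background: float) -> float:
    """Observed A/B reporter ratio when unregulated (1:1) background is co-isolated.

    The target peptide contributes `true_ratio` to channel A and 1.0 to channel B.
    `background` is the interfering signal added equally to both channels,
    expressed relative to the target's channel-B signal.
    """
    return (true_ratio + background) / (1.0 + background)


for bg in (0.0, 0.25, 1.0):
    print(f"background = {bg:.2f}: observed ratio = {observed_ratio(10.0, bg):.2f}")
```

Under this model a true 10:1 change reads as 5.5:1 once the interference equals the target's weaker channel, which is the sense in which ratios are "compressed" toward 1.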

Absolute Quantification

  • AQUA peptides: spiked-in heavy synthetic peptides of known concentration → \text{[Endogenous]} = \text{ratio} \times \text{[Spike]}, where ratio = endogenous (light) signal / spiked (heavy) signal.

  • QconCAT: concatenated heavy protein standard.
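
The AQUA relation is a one-line calculation. In the sketch below the peak areas and the 50 fmol spike are invented numbers, and the function name is hypothetical. It assumes the light and heavy peptides ionise and fragment identically, which is the premise of the method.

```python
def endogenous_amount(light_area: float, heavy_area: float, spike_amount: float) -> float:
    """[Endogenous] = (light / heavy peak-area ratio) x [Spike].
    The result carries the same units as `spike_amount`."""
    return (light_area / heavy_area) * spike_amount


# Hypothetical measurement: 50 fmol heavy standard spiked into the digest
amount = endogenous_amount(light_area=2.4e6, heavy_area=8.0e5, spike_amount=50.0)
print(f"Endogenous peptide: {amount:.1f} fmol")
```

Here the light peak is three times the heavy peak, so the endogenous peptide is estimated at 150 fmol.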

Data-Independent Acquisition (DIA) / SWATH

  • MS cycles through sequential m/z isolation windows, fragmenting all ions within each window.

  • Generates comprehensive digital map; quantification via spectral libraries.
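
The sequential-window scheme can be sketched as a simple tiling of the precursor range. The 400 to 1200 m/z range, 25-unit width, and optional overlap below are assumed values for illustration, not settings taken from these notes. The function name is hypothetical.

```python
def dia_windows(mz_start: float, mz_end: float, width: float, overlap: float = 0.0):
    """Return (lower, upper) isolation windows of fixed width that tile
    the precursor range [mz_start, mz_end]."""
    if width <= 0 or not 0 <= overlap < width:
        raise ValueError("width must be positive and overlap smaller than width")
    windows = []
    lower = mz_start
    while lower < mz_end:
        upper = min(lower + width, mz_end)
        windows.append((lower, upper))
        if upper >= mz_end:
            break
        lower = upper - overlap  # next window starts inside the previous one
    return windows


windows = dia_windows(400.0, 1200.0, 25.0)
print(f"{len(windows)} windows per cycle; first {windows[0]}, last {windows[-1]}")
```

Every precursor in the range falls into at least one window per cycle, which is what makes the acquisition independent of which ions happen to be most intense.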

Applications in Research & Biomedicine

  • Cancer subtype stratification & personalized therapy selection (e.g. phosphoproteome of EGFR mutants).

  • Neurodegenerative disease biomarker discovery (CSF proteome in Alzheimer’s).

  • Infectious disease: serum proteomics for early sepsis detection.

  • Cardiovascular: post-MI plasma proteome to predict heart failure.

  • Drug mechanism of action: chemoproteomics; thermal shift (CETSA).

  • Large-scale projects: Human Proteome Project (HPP), Clinical Proteomic Tumor Analysis Consortium (CPTAC).

Ethical, Philosophical & Practical Considerations

  • Patient consent & privacy for clinical specimens.

  • Data sharing: PRIDE, ProteomeXchange; FAIR principles.

  • Reproducibility crisis: need for rigorous QC, replicates, open methods.

  • Cost vs benefit in low-resource settings; equity of access to MS facilities.

  • Environmental impact: solvent consumption, high-energy instruments.

Numerical & Statistical References

  • Typical mass accuracy goals: <1\,\text{ppm} (Orbitrap) or <50\,\text{ppm} (TOF).

  • Dynamic range of human plasma proteome ≈ 10^{10}.

  • Protein FDR threshold: \le 1\%; PSM/peptide-level FDR is commonly also set at \le 1\% (stricter cut-offs such as \le 0.1\% are sometimes applied).

  • SILAC heavy/light mass shifts: +6.0201\,\text{Da} (Arg-6), +8.0142\,\text{Da} (Lys-8).

  • Signal-to-noise (S/N) threshold for valid peaks commonly ≥3:1.

  • Coefficients of variation (CV) accepted in biological replicates: \le 20\% (LFQ), \le 10\% (isobaric).
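
Two of the figures above, mass accuracy in ppm and the coefficient of variation, are simple to compute. The sketch below uses made-up m/z values and intensities, and the function names are hypothetical.

```python
import statistics


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Mass error in parts per million relative to the theoretical value."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def cv_percent(values) -> float:
    """Coefficient of variation: sample standard deviation / mean, in percent."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


# Hypothetical measurement of an ion expected at m/z 500.0000
print(f"Mass error: {ppm_error(500.0005, 500.0):.2f} ppm")

# Hypothetical intensities of one peptide across three replicates
print(f"CV: {cv_percent([90.0, 100.0, 110.0]):.1f} %")
```

An offset of 0.0005 at m/z 500 corresponds to 1 ppm, and the three replicate values give a CV of 10 %.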

Key Terms & Definitions

  • Proteotypic peptide: uniquely identifies a protein.

  • Post-Translational Modification (PTM): chemical change after translation (e.g. phosphorylation +79.966\,\text{Da}).

  • Dynamic exclusion: prevents re-sampling of abundant ions during DDA.

  • Isotope envelope: cluster of peaks spaced ≈1.003/z apart in m/z (≈1 Da in mass, mainly from natural ^{13}C); the spacing is used to assign charge state z.

  • Neutral loss: characteristic loss (e.g. \text{H}_3\text{PO}_4) aiding PTM localization.
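
Charge assignment from the isotope envelope relies on the peak spacing in m/z being roughly 1.00335/z, where 1.00335 Da is the ^{13}C to ^{12}C mass difference. The sketch below is minimal and uses hypothetical peak lists and a hypothetical function name. Real software also checks the intensity pattern of the envelope.

```python
C13_C12_DELTA = 1.00335  # Da, mass difference between 13C and 12C


def charge_from_spacing(peak_mzs) -> int:
    """Infer charge state from the mean m/z spacing of consecutive isotope peaks."""
    gaps = [b - a for a, b in zip(peak_mzs, peak_mzs[1:])]
    mean_gap = sum(gaps) / len(gaps)
    return round(C13_C12_DELTA / mean_gap)


# Hypothetical isotope clusters
print(charge_from_spacing([751.0073, 751.5090, 752.0106]))  # spacing ~0.50
print(charge_from_spacing([501.0073, 501.3417, 501.6762]))  # spacing ~0.33
```

Spacings near 0.50 and 0.33 correspond to doubly and triply charged ions, respectively.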

Connections to Foundational Principles

  • Central Dogma (DNA→RNA→Protein) positions proteomics downstream of genomics/transcriptomics.

  • Complements metabolomics in systems biology to close genotype-phenotype gap.

  • Builds on chemical principles of ionisation, chromatography, and electrophoresis.

Real-World Relevance

  • Translational pipelines from discovery to clinical LC-MS assays (e.g. multiple reaction monitoring for troponin I).

  • Regulatory acceptance: FDA & EMA guidance for bioanalytical MS.

  • Industry: quantitative proteomics in biopharmaceutical QC and lot release.