Proteomics

Munster Technological University (MTU)

Course: Proteomics - BIOL8023

Instructor: Dr. Saravana Sivagnanam
Multi Omics
  • Website: www.mtu.ie


What is the Proteome?

  • Proteins are biological molecules composed of building blocks known as amino acids.

  • Proteins are essential to life, serving a wide variety of functions including:

    • Structural

    • Metabolic

    • Transport

    • Immune response

    • Signaling and regulatory roles

  • The term "proteome" was introduced by Australian Ph.D. student Marc Wilkins during a symposium in 1994 in Siena, Italy.


What is Proteomics?

  • Proteomics is the study of the proteome, focusing on how different proteins interact with one another and their roles within the cell.

Key Concepts in Proteomics

  1. Protein Expression:

    • It is essential to recognize that mRNA expression levels do not always correlate well with protein expression levels.

    • Studying mRNA alone fails to account for:

      • Post-translational modifications

      • Protein cleavage

      • Formation of complexes

      • Variant mRNA transcripts (alternative splicing)

    • All of these are crucial for protein function.

  2. Historical Context:

    • The first proteomic studies began in 1975 with the development of two-dimensional (2D) protein electrophoresis.


Applications of Proteomics

Case Study: Hemoglobin

  • Hemoglobin plays a crucial role in picking up oxygen in the lungs, transporting it through the blood, and delivering it to the cells.

  • Example of a disease related to protein mutation:

    • Sickle Cell Disease: Caused by a single amino acid change in the β-globin chain of hemoglobin (glutamate replaced by valine).
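
The single amino acid change in sickle cell disease can be traced back to a single nucleotide change. The sketch below is a minimal illustration, assuming the commonly cited mRNA codon change GAG to GUG in the β-globin gene; the two-entry codon table and the helper name `point_mutation` are illustrative only and are not part of the course material.

```python
# Minimal sketch: one nucleotide change alters one amino acid.
# Only the two codons needed for this example are included (illustrative subset).
CODON_TO_AA = {"GAG": "Glu", "GUG": "Val"}

def point_mutation(codon: str, position: int, new_base: str) -> str:
    """Return the codon with the base at `position` (0-based) replaced by `new_base`."""
    return codon[:position] + new_base + codon[position + 1:]

normal_codon = "GAG"                                 # glutamate in normal beta-globin
sickle_codon = point_mutation(normal_codon, 1, "U")  # A -> U at the middle base

print(normal_codon, "->", CODON_TO_AA[normal_codon])  # GAG -> Glu
print(sickle_codon, "->", CODON_TO_AA[sickle_codon])  # GUG -> Val
```

Because the change alters the protein rather than only the transcript level, it is the kind of variation that is ultimately observed at the protein level.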


Tools of Proteomics

1. Protein Separation Technology
  • Simplifies complex protein mixtures and targets specific proteins for analysis.

2. Mass Spectrometry (MS)
  • Provides accurate molecular mass measurements of intact proteins and peptides.

3. Database Resources
  • Access to protein databases, expressed sequence tags (EST), and complete genome sequence databases.

4. Software Collection
  • Software is used to match MS data with specific protein sequences in databases.


Difference Between Genomics and Proteomics

| Aspect         | Genomics                                                          | Proteomics                                                                     |
|----------------|-------------------------------------------------------------------|--------------------------------------------------------------------------------|
| Study Focus    | Study of genomes and their functions                              | Study of proteomes and their functions                                         |
| Methods        | Genome sequence mapping, variant analysis                         | Protein sequence mapping, 3D structure modeling, protein-protein interactions  |
| Sequencing     | Utilizes Sanger sequencing and next-generation sequencing methods | Employs mass spectrometry, affinity proteomics, and protein microarray methods |
| Interpretation | Indirectly suggests physiological states                          | Directly specifies physiological states with spatio-temporal resolution        |


What are Proteins?

Definition and Function

  • A protein is a macromolecule composed of one or more chains of amino acids.

  • Examples of protein functions include:

    • Catalyzing metabolic processes (e.g., pepsin)

    • Facilitating replication processes (e.g., DNA polymerase)

    • Maintaining cell structure (e.g., keratin)

    • Regulating cell signaling (e.g., hormones such as insulin)

    • Enabling transport (e.g., hemoglobin)

    • Functioning in storage (e.g., ferritin)

    • Protecting through cell defense (e.g., immunoglobulins)

    • Assisting in cell movement (e.g., actin, myosin)


Structural Organization of Proteins

Proteins have several structural levels:

  1. Primary Structure

  2. Secondary Structure

  3. Tertiary Structure

  4. Quaternary Structure

  • The native structure of a protein is essential for its biological function; any loss of structure can lead to a loss of function.


Primary Structure of Proteins

Definition

  • The primary structure of a protein is a linear polypeptide chain consisting of amino acids linked by peptide bonds.

Classification of Peptides
  • Peptides:

    • Dipeptides: 2 amino acids

    • Tripeptides: 3 amino acids

    • Tetrapeptides: 4 amino acids

  • Oligopeptides: up to 20 amino acids

  • Polypeptides: 20 to 50 amino acids

  • Proteins: more than 50 amino acids

Genetic Encoding

  • The amino acid sequence is primarily dictated by the DNA sequence of the corresponding gene (genetic code).

Codon Definition
  • A codon is a sequence of three nucleotides in DNA or RNA that corresponds to a specific amino acid.

    • There are 64 possible codons formed from combinations of the four nitrogenous bases found in DNA/RNA.

    • There are 20 amino acids universally encoded by most organisms, with some amino acids being specified by more than one codon (referred to as degeneracy).

    • Each codon encodes only one specific amino acid, and the code is nearly universal across organisms (minor variations occur, for example in mitochondrial genomes).


Further Codon Analysis

  • There are a total of 61 codons that code for individual amino acids, while 3 act as stop codons.

  • Example: The codon ACU codes for the amino acid Threonine.
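
The codon counts above can be checked by enumerating all triplets. The following is a minimal sketch that builds the standard genetic code from a compact 64-letter string (bases in U, C, A, G order, with `*` marking stop codons); the variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# 4 bases at each of 3 positions gives 4**3 = 64 codons.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
CODON_TABLE = dict(zip(codons, AMINO_ACIDS))

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]
degeneracy = Counter(AMINO_ACIDS)  # number of codons per amino acid (or stop)

print(len(CODON_TABLE), len(sense_codons), stop_codons)  # 64 61 ['UAA', 'UAG', 'UGA']
print(CODON_TABLE["ACU"])                                # T (threonine)
print(degeneracy["L"], degeneracy["M"])                  # 6 1
```

Leucine is specified by six codons while methionine has only one, which is the degeneracy described above.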


Example Codon Tables

  • Sequence options (the same mRNA sequence read in each of its three reading frames):

    • Option 1: CAA UGC GAC CUA AGA UCU AA

    • Option 2: C AAU GCG ACC UAA GAU CUA A

    • Option 3: CA AUG CGA CCU AAG AUC UAA

  • Analysis: Translating with the codon table assigns each triplet to a particular amino acid, such as Phenylalanine (Phe) or Leucine (Leu), or to a stop codon.

  • Knowing which reading frame is expected is vital when relating sequence data to proteins in functional proteomics.
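
The example sequence above can be translated in each of its three possible reading frames. This is a minimal sketch using the standard genetic code (one-letter amino acid codes, `*` for stop); the function name `translate` is illustrative, and incomplete trailing codons are simply ignored.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(("".join(t) for t in product(BASES, repeat=3)), AMINO_ACIDS))

def translate(mrna: str, frame: int = 0) -> str:
    """Translate every complete codon starting at offset `frame` (0, 1 or 2)."""
    codons = (mrna[i:i + 3] for i in range(frame, len(mrna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)

sequence = "CAAUGCGACCUAAGAUCUAA"
for frame in range(3):
    print(f"Frame {frame + 1}: {translate(sequence, frame)}")
# Frame 1: QCDLRS
# Frame 2: NAT*DL
# Frame 3: MRPKI*
```

Only the third frame begins with the start codon AUG (Met) and ends at a stop codon (UAA), so it is the frame that reads as a complete short open reading frame.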


Secondary Structure of Proteins

  • The secondary structure refers to the local folding of the polypeptide backbone into 3-D configurations.

  • It is stabilized by hydrogen bonding between backbone N-H groups and C=O groups, resulting in:

    • α-helix: Formed by hydrogen bonds within a single stretch of the polypeptide chain.

    • β-sheets: Formed by hydrogen bonds between adjacent strands, which may run parallel or antiparallel and may belong to the same or different chains.


Tertiary Structure of Proteins

  • Represents the final 3-D conformation of a protein resulting from the folding of various secondary structures.

  • Stabilization occurs through interactions:

    • Hydrophobic interactions

    • Hydrophilic interactions

    • Ionic interactions (salt bridges)

    • Disulfide bridges (cysteine residues)

    • Hydrogen bonding

  • Example: Myoglobin, found primarily in striated muscle, is a single polypeptide chain whose function depends on its tertiary structure.


Amino Acid Properties

  • Amino acids are categorized based on various properties, such as hydropathy, volume, chemical properties, charge, and polarity.

  • Table representation: Each amino acid has a standard abbreviation and a set of characteristics that determine how it behaves in a biological context.

    • Example properties:

      • Alanine (Ala): Hydrophobic, small, aliphatic.

      • Arginine (Arg): Hydrophilic, basic, and positively charged.
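
Hydropathy can be made quantitative. The sketch below assumes the Kyte-Doolittle hydropathy scale, which is one commonly used scale and is not named in these notes; it averages the per-residue values to give a simple overall hydropathy score (often called GRAVY). The function name is illustrative.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(peptide: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value over all residues."""
    if not peptide:
        raise ValueError("peptide must contain at least one residue")
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)

print(KYTE_DOOLITTLE["A"])  # alanine: positive, hydrophobic
print(KYTE_DOOLITTLE["R"])  # arginine: strongly negative, hydrophilic
print(round(gravy("AR"), 2))
```

On this scale alanine scores positive and arginine strongly negative, matching the qualitative descriptions in the list above.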


Quaternary Structure of Proteins

  • The quaternary structure describes the arrangement of multiple protein subunits.

  • Stabilization results from:

    • Hydrogen bonding

    • Van der Waals forces

    • Disulfide bridges (cysteine)

  • Example: Hemoglobin consists of four subunits (two α and two β chains in adult hemoglobin).


Summary of Protein Structure

  • Primary: Sequence of amino acids in a polypeptide chain.

  • Secondary: Localized folding through hydrogen bonds (α-helices and β-sheets).

  • Tertiary: Overall 3-D shape formed by interactions between secondary structures.

  • Quaternary: Combination of multiple polypeptide chains into a single functional protein.


Translation Process

Overview

  • Translation is the process in which the mRNA sequence is decoded into a sequence of amino acids during protein synthesis; it is essential for all living organisms.

Mechanism

  1. The mRNA sequence directs protein synthesis by converting the genetic code (codons) into an amino acid sequence.

  2. Translation takes place in the cytoplasm.

  3. Ribosomes, composed of rRNA and proteins, perform the translation process.

    • Ribosome structure:

      • Prokaryotes have 70S ribosomes (30S + 50S).

      • Eukaryotes have 80S ribosomes (40S + 60S).

  4. rRNA catalyzes the addition of amino acids through peptide bond formation.

  5. tRNA delivers appropriate amino acids to the ribosome through mRNA codon-anticodon complementarity.
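
Codon-anticodon complementarity from step 5 can be sketched in a few lines. This is a simplified illustration that assumes perfect Watson-Crick pairing and ignores wobble pairing and modified tRNA bases; the function name `anticodon` is illustrative.

```python
# RNA base-pairing rules: A pairs with U, G pairs with C.
COMPLEMENT = str.maketrans("AUGC", "UACG")

def anticodon(codon: str) -> str:
    """Return the perfectly complementary anticodon, written 5'->3'.

    The codon is given 5'->3'; strands pair antiparallel, so the complement
    is reversed to keep the conventional 5'->3' orientation.
    """
    return codon.translate(COMPLEMENT)[::-1]

print(anticodon("AUG"))  # CAU, pairs with the start codon
print(anticodon("ACU"))  # AGU, pairs with a threonine codon
```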


Post-Translational Modifications (PTM)

Definition

  • PTMs are covalent modifications that proteins undergo during or after synthesis, often before the final functional form is established.

Characteristics of PTMs
  • Modifications may be irreversible or reversible.

  • Examples include:

    • Enzymatic cleavage of peptide bonds, such as the processing of proinsulin to insulin.

    • Addition of chemical groups to amino acid side chains (e.g., phosphorylation).

  • PTMs may occur at any point in the protein lifecycle.
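
Adding a chemical group changes a protein's mass by a fixed amount, which is how such PTMs become detectable. The sketch below assumes standard monoisotopic residue masses and a phosphorylation shift of about 79.966 Da (HPO3) on serine, threonine or tyrosine; the function name and the example peptide are hypothetical.

```python
# Monoisotopic residue masses in daltons (amino acid minus one water).
MONOISOTOPIC_RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056          # added once per peptide for the free N- and C-termini
PHOSPHO_SHIFT = 79.96633  # HPO3 added to a Ser, Thr or Tyr side chain

def peptide_mass(sequence: str, n_phospho: int = 0) -> float:
    """Monoisotopic mass of a peptide carrying `n_phospho` phosphate groups."""
    sites = sum(sequence.count(aa) for aa in "STY")
    if n_phospho > sites:
        raise ValueError("more phosphate groups than Ser/Thr/Tyr residues")
    residues = sum(MONOISOTOPIC_RESIDUE_MASS[aa] for aa in sequence)
    return residues + WATER + n_phospho * PHOSPHO_SHIFT

peptide = "GASPK"  # hypothetical peptide with one serine
unmodified = peptide_mass(peptide)
phosphorylated = peptide_mass(peptide, n_phospho=1)
print(round(phosphorylated - unmodified, 3))  # mass shift caused by one phosphate
```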


Importance of Post-Translational Modifications

  • PTMs expand the coding capacity of the genome: although DNA typically encodes only 20 primary amino acids, modification of these residues produces a highly diversified proteome.

  • As a result, proteins can contain many chemically distinct residues beyond the 20 encoded amino acids.


Functions of Specific PTMs

| PTM Type        | Function Example            |
|-----------------|-----------------------------|
| Proteolysis     | Activation                  |
| Phosphorylation | Activation                  |
| Glycosylation   | Secretion                   |
| Methylation     | Modulating protein function |
| Hydroxylation   | Modulating structure        |
| Ubiquitination  | Degradation                 |


Why Study Proteomics?

  • Protein diversity cannot be predicted solely from the genetic code or gene expression studies:

    • Variant mRNAs can arise from a single gene (alternative splicing).

    • Protein abundance is not reliably predicted by mRNA levels due to unknown translation rates and degradation.

    • PTMs cannot be detected from mRNA.

    • Protein localization and interactions are not predictable from mRNA data.

  • Conclusion: Proteomics offers techniques to capture protein diversity effectively.


Proteomics Techniques

Proteomics encompasses various technologies for identifying and quantifying proteins present in a specific cell, tissue, or organism:

  1. Separation

  2. Identification

  3. Quantification

  4. Functional Analysis

  5. Structural Analysis

Considerations in Studying the Proteome
Sample Preparation
  • Precise conditions needed:

    • Cold environments, protease inhibitors, and organelle isolation (as appropriate).

Proteomic Workflow

  1. Pre-preparation Steps

  2. Sample Separation:

    • 1D and 2D gel electrophoresis

    • Reverse Phase HPLC

    • Strong cation exchange (SCX HPLC)

  3. Structure or Mass Information via Mass Spectrometry

  4. Protein Identification through searching databases such as SWISS-PROT, TrEMBL, RefSeq (XP_ protein records), and Ensembl.


Protein Separation Techniques

  1. Need for Separation: Since the proteome consists of complex mixtures of proteins, separation is critical for identification and characterization.

  2. Methods for Separation:

    • 1-D gel electrophoresis (based on molecular mass, e.g., SDS-PAGE)

    • 2-D gel electrophoresis (based on net charge and mass, e.g., Isoelectric focusing + SDS-PAGE)

    • Liquid chromatography (LC, separating based on interactions with liquid and stationary phases)

2-D Gel Electrophoresis
  • Process:

    • In isoelectric focusing, proteins migrate through a pH gradient in an electric field until they reach their isoelectric point (pI), where their net charge is zero; they are then separated by mass using SDS-PAGE.

  • 2D-DIGE: Fluorescent protein labeling allows multiple samples to co-electrophorese on one gel for enhanced analysis.
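
The isoelectric point used in the first dimension can be estimated from sequence. The following is a rough sketch, assuming one approximate set of pKa values (published sets differ, so the result is only an estimate) and the Henderson-Hasselbalch relationship; the function names are illustrative.

```python
# Approximate pKa values (an assumption; other published sets give slightly different pI).
PKA_POSITIVE = {"N_term": 8.6, "K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"C_term": 3.6, "D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}

def net_charge(sequence: str, ph: float) -> float:
    """Estimated net charge of a peptide at a given pH."""
    charge = 0.0
    for group, pka in PKA_POSITIVE.items():
        count = 1 if group == "N_term" else sequence.count(group)
        charge += count / (1 + 10 ** (ph - pka))  # fraction still protonated (+1)
    for group, pka in PKA_NEGATIVE.items():
        count = 1 if group == "C_term" else sequence.count(group)
        charge -= count / (1 + 10 ** (pka - ph))  # fraction deprotonated (-1)
    return charge

def isoelectric_point(sequence: str, tolerance: float = 0.001) -> float:
    """Find the pH at which the net charge is zero by bisection over pH 0 to 14."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged: the pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2

print(round(isoelectric_point("DDDD"), 1))  # acidic peptide, low pI
print(round(isoelectric_point("KKKK"), 1))  # basic peptide, high pI
```

Bisection works here because the estimated net charge only decreases as pH increases, so there is a single crossing point.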


Fluorescence Spectroscopy

  • Configuration:

    • Uses a xenon lamp as the excitation source, a monochromator to select wavelengths, and a lens to collect the photons emitted by the sample.

    • Critical in detecting light emission post-excitation.


Chromatography Techniques

  1. Gel Filtration: Separates proteins based solely on molecular size.

  2. Hydrophobic Interaction Chromatography (HIC): Separates proteins based on their hydrophobic interactions with ligands.

  3. Ion Exchange (IE): Separates proteins based on net charge.

  4. Reverse Phase Chromatography: Relies on hydrophobic interactions between molecules in the mobile phase and a hydrophobic stationary phase.

  5. Affinity Chromatography: Utilizes specific binding interactions between an immobilized ligand and its target binding partner.


Protein Identification Techniques

  • Following separation, proteins are identified by:

  1. Immunoassays:

    • Based on specific antibody reactions (ELISA, Western blotting, protein microarrays).

  2. Mass Spectrometry (MS):

    • Measures mass-to-charge ratios of ions for detection.

    • Can quantify proteins and detect interactions and PTMs.

    • High-throughput and suited for proteome characterization.

Western Blotting
  • Detects specific proteins in mixtures using antibodies.

  • Capable of monitoring expression changes and PTMs.

  • Can quantify proteins and be combined with SDS-PAGE.

Protein Microarrays
  • Enables simultaneous detection of numerous proteins using antibodies and labeled probes.

  • Very sensitive; can quantify proteins and analyze PTMs and interactions.


Mass Spectrometry in Protein Analysis

  • MS measures mass-to-charge ratios of ions to create spectra specific for protein identification.

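  • Highly accurate and sensitive; can be coupled with LC for automation.

  • Tandem MS (MS/MS): Selected ions are fragmented and the fragments are analyzed in a second stage, providing peptide sequence information and increasing specificity.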

  • Key ionization techniques:

    • Electrospray Ionization (ESI)

    • Matrix-Assisted Laser Desorption/Ionization (MALDI)
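
The mass-to-charge ratio measured by the instrument follows directly from a peptide's mass and the number of protons it carries. The sketch below assumes positive-mode ionization by protonation and standard monoisotopic residue masses; the function names and example peptide are hypothetical.

```python
# Monoisotopic residue masses in daltons (amino acid minus one water).
MONOISOTOPIC_RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide for the free termini
PROTON = 1.007276  # mass of each charge-carrying proton

def peptide_mass(sequence: str) -> float:
    """Monoisotopic mass of the neutral peptide."""
    return sum(MONOISOTOPIC_RESIDUE_MASS[aa] for aa in sequence) + WATER

def mz(sequence: str, charge: int) -> float:
    """m/z of the peptide ion carrying `charge` protons: (M + z * proton) / z."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON) / charge

peptide = "GASPVK"  # hypothetical peptide
for z in (1, 2, 3):
    print(z, round(mz(peptide, z), 4))
```

ESI typically produces multiply charged ions, so one peptide appears at several m/z values, whereas MALDI mostly produces singly charged ions.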


Example Applications of Mass Spectrometry

  • Rapid and affordable screening of blood abnormalities such as haemoglobinopathies and pre-diabetes, demonstrating value in clinical laboratories.


Protein Quantification Techniques

  1. ICAT (Isotope Coded Affinity Tag):

    • Identifies and quantifies proteins in mixtures using isotope-coded chemical tags that label cysteine residues.

    • Enables analysis of low-abundance proteins and direct testing of mixed samples, and allows comparison of protein expression changes under different conditions.

  2. Limitations of ICAT:

    • Database limitations and specific labeling constraints (only cysteine).

    • Potential errors in quantification.

  3. Proteome Analysis Approaches:

    • Top-Down: Intact proteins analyzed for isoforms and PTMs.

    • Bottom-Up (Shotgun): Proteins digested into peptides for analysis.
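
The bottom-up digestion step, and the cysteine-only constraint of ICAT noted above, can be sketched together. This example assumes trypsin as the protease (a common choice, though not named in these notes) with the usual simplified rule of cleaving after lysine or arginine unless the next residue is proline; the function names and the protein sequence are hypothetical.

```python
def tryptic_digest(protein: str) -> list[str]:
    """Cleave after K or R, except when the following residue is P (simplified rule)."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide without K/R at its end
    return peptides

def icat_labelable(peptides: list[str]) -> list[str]:
    """Keep only peptides that contain cysteine, the residue ICAT tags react with."""
    return [peptide for peptide in peptides if "C" in peptide]

protein = "MKRPAAKGCR"  # hypothetical sequence
peptides = tryptic_digest(protein)
print(peptides)                  # ['MK', 'RPAAK', 'GCR']
print(icat_labelable(peptides))  # ['GCR']
```

Only one of the three peptides here would carry an ICAT label, which illustrates why proteins with few or no cysteines are difficult to quantify with this method.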


Functional Proteomics

  • Driven by genome sequencing efforts and aims to:

    • Determine biological functions of unknown proteins.

    • Investigate cellular mechanisms and signaling pathways.


Structural Proteomics

  • Determines the three-dimensional structure of proteins, crucial for understanding biochemical function and protein interaction mechanisms.

  • Techniques employed include:

    • X-ray Crystallography

    • Nuclear Magnetic Resonance (NMR) Spectroscopy

X-ray Crystallography
  • Determines atomic and molecular structure at atomic (ångström-scale) resolution from the diffraction pattern of a protein crystal, allowing the protein structure to be visualized in detail.

NMR Spectroscopy
  • Studies molecular structure and interactions by measuring how atomic nuclei in a strong magnetic field respond to radiofrequency radiation.

    • Effective for proteins of up to about 350 amino acids, with no crystallization required.


Summary of Proteomic Technologies

  • Purification: Chromatographic techniques

  • Analysis: ELISA, Western blotting, protein microarray

  • Characterization: Gel-based approaches, mass spectrometry

  • Structural Analysis: X-ray crystallography, NMR spectroscopy

  • Quantification: ICAT, SILAC, iTRAQ

  • Bioinformatics Analysis requires integration of technologies for comprehensive data management.


Proteome Database Utilization

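  • Public databases store the large volumes of data generated in proteomic and genomic studies and make them accessible. Examples include:

    • GenBank: Nucleotide sequence database whose annotated coding regions provide protein translations.

    • RefSeq: Curated, non-redundant reference sequences for genomes, transcripts, and proteins.

    • UniProt: Protein sequence and functional information database.

    • CATH: Hierarchical classification of protein domain structures (Class, Architecture, Topology, Homologous superfamily).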
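
Records downloaded from these databases are commonly exchanged as FASTA files. The sketch below parses a header written in the UniProtKB FASTA layout (`>db|accession|entry_name description OS=...`); the example header is illustrative and should be checked against the actual entry, and the function name is hypothetical.

```python
def parse_uniprot_header(header: str) -> dict[str, str]:
    """Split a UniProtKB-style FASTA header into its main fields."""
    database, accession, rest = header.lstrip(">").split("|", 2)
    entry_name, _, description = rest.partition(" ")
    protein_name = description.split(" OS=")[0]  # text before the organism field
    return {
        "database": database,      # "sp" = Swiss-Prot (reviewed), "tr" = TrEMBL
        "accession": accession,
        "entry_name": entry_name,
        "protein_name": protein_name,
    }

# Illustrative header in UniProtKB FASTA layout (verify against the live record).
example = ">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha OS=Homo sapiens OX=9606 GN=HBA1 PE=1 SV=2"
record = parse_uniprot_header(example)
print(record["accession"], record["entry_name"], record["protein_name"])
```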