Liu3304_fall2023_lecture_6

Importance of Protein Sequences

  • Structural studies rely heavily on sequence information to understand protein function and architecture, as the specific arrangement of amino acids greatly influences protein capabilities.

  • By comparing sequences among various species, researchers can deduce potential functions and trace evolutionary lineage among different proteins, revealing how proteins have adapted over time to perform specialized functions in diverse biological settings.

  • Many diseases, such as sickle cell anemia and cystic fibrosis, arise from just one or two mutations in a protein’s sequence, highlighting the importance of precise sequencing technologies in diagnosing and understanding genetic disorders.

Steps in Protein Sequencing

Understanding the Steps Required for Protein Sequencing

  1. Identification of Polypeptide Chains: Determine the number of polypeptide chains (subunits) present in the protein, which is vital for understanding the protein's overall structure.

  2. Cleavage of Bonds: Cleave disulfide bonds to separate inter- and intrachain connections, which is crucial for accurate structural analysis.

  3. Amino Acid Composition Analysis: Analyze the amino acid composition of each polypeptide chain to understand the protein's characteristics and functions.

  4. Fragmentation: If chains are extensive, they should be fragmented into shorter chains (about 50 amino acids long) to facilitate sequencing.

  5. Sequencing Fragments: Use the Edman degradation method to sequence each fragment, which allows detailed examination of the primary structure.

  6. Overlap Completion: Complete the overall sequence by analyzing overlaps in two sets of fragments, ensuring coherence in sequence determination.

Primary Structure of Bovine Insulin

  • The primary structure of bovine insulin consists of two peptide chains (A and B chains) connected by disulfide bonds, underscoring the importance of disulfide linkages in stabilizing the protein's three-dimensional conformation.

  • Recognizing these bonds is crucial for identifying protein structure and function since alterations can lead to a loss of biological activity.

Historical Context of Protein Sequencing

  • Fred Sanger was awarded the Nobel Prize in 1958 for his pioneering work in protein sequencing, particularly insulin, which laid the foundation for modern sequencing techniques.

  • Initially, sequencing this protein took a decade, required a significant amount of protein (100 g), and involved substantial manpower, posing a challenge to researchers.

  • Modern techniques, such as mass spectrometry, have drastically reduced the necessary amount of protein to just a few micrograms and cut down the time required to mere days, making the process more accessible and efficient.

Chemical Protein Sequencing Process

Detailed Breakdown of Steps

  • Identification of Polypeptide Chains: Assessing the number of polypeptide chains is essential for accurate sequencing and understanding protein interactions.

  • Cleavage of Bonds: Specific reagents disrupt disulfide bonds, essential for separating polypeptide chains and enabling detailed analysis of primary structures.

  • Amino Acid Composition: An essential step to identify the function of proteins based on their amino acid profiles, which also sheds light on potential biological roles.

  • Fragmentation: Longer chains are fragmented to manageable lengths for sequencing, thereby enhancing resolution of overlapping sequences during analysis.

  • Sequencing Fragments: The Edman degradation method, involving incremental removal of amino acids from the N-terminus, provides detailed insight into the protein’s sequence, albeit with potential error margins with longer sequences.

  • Overlap Completion: Integrating the sequences from fragmented chains ensures constructing a comprehensive profile of the protein’s structure.

Determining Peptide Chains in Proteins

  • The peptide chain can be identified by locating end groups in the sequence, particularly looking for N-terminal and C-terminal ends. This step is crucial as it lays the groundwork for understanding the protein's functional potential.

Example Peptide Sequences

  • A Chain: Gly-Ile-Val-Glu-Gln-Cys-Cys-Ala-Ser-Val-Cys-Ser-Leu-Tyr-Gln-Leu-Glu-Asn-Tyr-Cys-Asn

  • B Chain: Phe-Val-Asn-Gln-His-Leu-Cys-Gly-Ser-His-Leu-Val-Glu-Ala-Leu-Tyr-Leu-Val-Cys-Gly-Glu-Arg-Gly-Phe-Phe-Tyr-Thr-Pro-Lys-Ala

  • Using 1-Dimethylaminonaphthalene-5-sulfonyl chloride (Dansyl chloride) can help identify N-terminal residues based on its fluorescence properties, adding another layer to structural characterization.

Enzymatic Cleavage of Peptides

  • Carboxypeptidase A and B: These enzymes sequentially cleave residues from the C-terminus, revealing insights into the peptide sequence's terminal structures. Carboxypeptidase A can work on residues like Arg, Lys, and Pro, whereas Carboxypeptidase B primarily acts on similar residues but varies slightly in specificity.

  • Endopeptidases vs Exopeptidases: Endopeptidases cleave peptide bonds at specific internal sites, providing more targeted cleavage, while exopeptidases act at the terminal ends by gradually removing amino acids from either end, essential for detailed sequence unraveling.

Cleavage of Disulfide Bonds

  • Disulfide bond cleavage is fundamental to separating polypeptide chains and preventing refolding that could mask important structural information. Accurate methods are crucial for retaining biological relevance.

  • Reagents like performic acid oxidation or 2-Mercaptoethanol (2-ME) are commonly used to break these bonds, turning cysteine into cysteic acid or maintaining them in a reduced form, allowing for better structural assessments.

Amino Acid Composition Analysis

  • The overall amino acid profile is obtained through complete hydrolysis of the peptide chains, followed by a quantitative analysis of the liberated amino acids to transcend insights on protein functionality.

  • Different hydrolysis conditions are used to assure optimal recovery of amino acids.

    • Acid Hydrolysis (6 N HCl with phenol): High temperature can damage certain amino acids like tryptophan (Trp) and partially degrade serine, threonine, and tyrosine.

    • Base Hydrolysis (2 to 4 N NaOH): Similar cautions apply, as some residues remain unaffected while others may degrade during the process.

Amino Acid Analyzers and Techniques

  • To quantitate amino acids post-hydrolysis, derivatization is an integral step to maximize recovery of amino acids for analysis, ensuring high-quality results.

  • Automated setups using HPLC separate the derivatized amino acids based on known elution times for efficient quantification of amino acid composition.

Significance of Amino Acid Composition

  • Common amino acids in proteins include leucine, alanine, glycine, serine, valine, glutamic acid, and isoleucine. In contrast, histidine, methionine, cysteine, and tryptophan are less represented and often used as indicators of protein functionalities.

  • The polar to non-polar ratios defined within a protein may indicate whether the protein is globular or membrane-spanning. Additionally, proteins like collagen consist of repetitive peptide structures, which is essential for structural integrity in biological systems.

Final Steps in Chemical Protein Sequencing

  • Reiterating the steps confirms that the identification of polypeptide chains, cleavage of disulfide bonds, amino acid composition determination, fragmentation, sequencing using the Edman degradation method, and overlap analysis culminate in completing the protein sequence study, paving the path for advanced research in biochemistry and molecular biology.

Proteases and Protein Fragmentation

  • Proteases like trypsin and chymotrypsin specifically cleave peptide bonds at defined sequences, allowing for the fragmentation of proteins into manageable pieces for the sequencing process.

  • For instance:

    • Trypsin cleaves at residues like lysine (K) and arginine (R), critical for precise mapping of protein sequences.

    • Chymotrypsin targets large hydrophobic residues (Phe, Trp, Tyr), showcasing the diversity of enzyme specificity in peptide bond cleavage, emphasizing the importance of recognizing substrate specificity in enzyme studies.

Specific Chemical Cleavage Techniques

  • Different reagents are utilized during protein cleavage, indicating a need for tailored approaches to successfully fragment proteins based on their structural nuances.

  • Unique patterns formed during sequencing allow researchers to align fragments by overlapping sequences, leading to the construction of the complete peptide chain and increasing understanding of protein functionality.

Edman Degradation Methodology

Process Overview

  • The Edman degradation technique employs phenyl isothiocyanate (PITC) to sequentially remove amino acids from the N-terminus, allowing for identification through various analytical methods such as Thin Layer Chromatography (TLC) and High-Performance Liquid Chromatography (HPLC).

  • Automation of Edman degradation has significantly enhanced throughput for protein sequencing, although errors can occur with longer sequences due to the limitations of biochemical efficiency in the process, reinforcing the need for continual improvements in sequencing methodologies.

Concluding Remarks on Protein Sequencing

  • The critical step of identifying disulfide bonds plays a fundamental role in successful protein sequencing, serving as a common task in the broader context of protein studies, emphasizing the need for precise techniques to yield accurate results for biological research.

robot