Protein Structure and Function

Overview of Proteins

Proteins constitute approximately 50% of the dry weight of cells.
They are the most abundant macromolecules in cellular structures.
Functions of proteins include:
- Regulation
- Defense
- Catalysts
- Transporters

Structure of Proteins

The structure of a protein is determined by its amino acid sequence, which constitutes the first level of protein structure.
Proteins can have up to four levels of structure:
1. Primary (1°) Structure:
- Refers to the sequence of amino acid (AA) residues linked by amide/peptide bonds.
- All or most of the 20 naturally occurring amino acids are typically present.
- The sequence is critical as it determines the higher order of protein structure.
- Peptide bonds form through dehydration condensation, releasing one water molecule per bond.
2. Secondary (2°) Structure:
- Stabilized by hydrogen bonds between the backbone atoms.
- Common forms include:
  - Alpha-helix
  - Beta-pleated sheet
3. Tertiary (3°) Structure:
- The overall three-dimensional shape formed by interactions among R groups (side chains).
- Involves:
  - Hydrogen bonding
  - Ionic interactions (salt bridges)
  - Hydrophobic interactions
  - Disulfide bonds (between cysteine residues)
4. Quaternary (4°) Structure:
- Involves the assembly of multiple polypeptide chains into a larger complex.
- Example: Hemoglobin consists of four subunits (oligomeric protein).

Determining Amino Acid Sequence of Proteins

Steps to determine the amino acid composition and sequence include:
1. Step 1: Analyze Composition
- Use acid hydrolysis (e.g., heating with 6 M HCl at 110 °C for 10-24 hours) to break the protein down into individual amino acids.
2. Step 2: End Group Analysis
- Chemical and enzymatic methods are used to identify the terminal residues:
  - Sanger’s method: Uses 1-fluoro-2,4-dinitrobenzene (FDNB) to identify the N-terminal residue.
  - Edman degradation: Involves sequential removal of amino acids starting from the N-terminus.
  - Enzymatic methods: Exopeptidases remove residues from the chain ends (aminopeptidases from the N-terminus, carboxypeptidases from the C-terminus).
3. Step 3: Use of Endopeptidases
- These enzymes cleave peptide bonds at specific sites within the polypeptide (e.g., trypsin cleaves C-terminal to Lys or Arg).
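
The difference between composition (Step 1) and specific cleavage (Step 3) can be sketched in a few lines of Python. This is only an illustration: the peptide is hypothetical, the function names are made up for this example, and the trypsin rule is the simplified one stated above (cut after every Lys or Arg, with no exceptions).

```python
from collections import Counter


def composition(sequence: str) -> Counter:
    """Count each residue, as total hydrolysis would report (the order is lost)."""
    return Counter(sequence)


def trypsin_digest(sequence: str) -> list[str]:
    """Cut on the C-terminal side of every Lys (K) or Arg (R).

    Simplified rule: real trypsin has exceptions that are ignored here.
    """
    fragments = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Whatever remains after the last cut is the C-terminal fragment.
        fragments.append(sequence[start:])
    return fragments


peptide = "GAKLFRMSE"  # hypothetical peptide in one-letter codes
print(composition(peptide))
print(trypsin_digest(peptide))
```

The composition tells how many of each residue are present, while the digest yields ordered fragments that can then be sequenced individually.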

Types of Protein Structures

Secondary Structures

Alpha-helix:
- Right-handed helix observed in many proteins, with 3.6 AA residues per turn, stabilized by hydrogen bonds.
- The alignment causes R groups to protrude outward from the helical backbone.
Beta-pleated sheet:
- Consists of beta strands aligned next to each other and stabilized by hydrogen bonds. Strands can be:
  - Antiparallel: aligned in opposite directions.
  - Parallel: aligned in the same direction.
Specific amino acids also shape secondary structure (e.g., proline induces kinks).
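
The figure of 3.6 residues per turn fixes the basic geometry of an alpha-helix. The sketch below works it out for a helix of a given length. The rise of 1.5 Å per residue is not stated in these notes; it is an assumed textbook value, and the function name is made up for this example.

```python
RESIDUES_PER_TURN = 3.6   # from the notes above
RISE_PER_RESIDUE_A = 1.5  # assumed textbook value, in angstroms


def helix_geometry(n_residues: int) -> dict[str, float]:
    """Return the rotation per residue, number of turns, and axial length of an ideal alpha-helix."""
    return {
        "degrees_per_residue": 360.0 / RESIDUES_PER_TURN,
        "turns": n_residues / RESIDUES_PER_TURN,
        "length_angstrom": n_residues * RISE_PER_RESIDUE_A,
    }


geometry = helix_geometry(18)
for name, value in geometry.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions each residue is rotated about 100° from the previous one, which is why side chains spaced three to four residues apart end up on the same face of the helix.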

Tertiary Structure

The tertiary structure is affected by the hydrophobic effect, salt bridges, and hydrogen bonding between distant parts of the polypeptide chain.

Quaternary Structure

Multi-chain proteins have various subunit arrangements.

Protein Denaturation

Denaturation is the loss of native conformation resulting in loss of biological function.
Causes of denaturation include:
- Heat (over 60-70 °C)
- pH changes
- Chemical agents (urea, detergents, etc.)

Anfinsen’s Experiment

Christian Anfinsen's experiments with ribonuclease (RNase) demonstrated the capability of proteins to refold into their functional form when denaturing agents were removed.
They revealed the importance of the primary structure in determining the three-dimensional structure of proteins.

Functional Implications

The structure of proteins dictates their function. For instance:
- Myoglobin is a globular protein that binds oxygen, while collagen is a fibrous protein providing structural support.
Changes in amino acid sequence or structure can lead to functional changes or loss of functionality (as seen in various diseases).

PROTEIN STRUCTURE + FUNCTION

Overview of Proteins

Proteins constitute approximately 50% of the dry weight of cells.
They are the most abundant macromolecules in cellular structures.
Functions of proteins include:
- Regulation: Enzymes regulating metabolic pathways, hormones (e.g., insulin).
- Defense: Antibodies (immunoglobulins) recognizing foreign invaders.
- Catalysts: Enzymes (e.g., amylase, pepsin) accelerating biochemical reactions without being consumed.
- Transporters: Carrying molecules through the body or across membranes (e.g., hemoglobin carrying oxygen in the blood, membrane channels).
- Structural Support: Providing framework (e.g., collagen in connective tissues, keratin in hair and nails).
- Movement: Muscle contraction (e.g., actin, myosin).
- Signaling: Receptors detecting signals (e.g., G-protein coupled receptors).

Structure of Proteins

The structure of a protein is intrinsically determined by its amino acid sequence, which represents the fundamental primary (1°) structure. Disruption of this sequence often leads to altered or lost function.
Proteins can have up to four levels of structural organization:
1. Primary (1°) Structure:
- Refers to the linear sequence of amino acid (AA) residues linked covalently by amide/peptide bonds. This sequence is read from the N-terminus (amino end) to the C-terminus (carboxyl end).
- All or most of the 20 naturally occurring proteinogenic amino acids are typically present, each contributing unique side chain properties.
- The specific sequence is paramount as it dictates the subsequent higher-order protein structures and ultimately its biological function. A single amino acid change can have profound effects (e.g., sickle cell anemia).
- Peptide bonds form through a dehydration condensation reaction between the carboxyl group of one amino acid and the amino group of another, releasing a water molecule:
 
 $\text{R}_{1}\text{-CH(NH}_{2})\text{-COOH} + \text{R}_{2}\text{-CH(NH}_{2})\text{-COOH} \longrightarrow \text{R}_{1}\text{-CH(NH}_{2})\text{-CO-NH-CH(R}_{2})\text{-COOH} + \text{H}_{2}\text{O}$
- Peptide bonds have partial double-bond character, making them rigid and planar.
2. Secondary (2°) Structure:
- Involves the local folding of the polypeptide chain into specific conformations, primarily stabilized by hydrogen bonds between the backbone amide (N-H) and carbonyl (C=O) groups of non-adjacent amino acids.
- Common forms include:
 - Alpha-helix ( $\alpha$ -helix): A right-handed coiled structure where the polypeptide backbone twists. It is stabilized by hydrogen bonds formed between the carbonyl oxygen of one amino acid and the amide hydrogen of an amino acid four residues away ( $n$ and $n+4$ ). There are approximately 3.6 amino acid residues per turn. R-groups protrude outward from the helical backbone, minimizing steric hindrance.
 - Beta-pleated sheet ( $\beta$ -sheet): Consists of multiple beta strands (extended polypeptide segments) lying side-by-side, stabilized by hydrogen bonds between backbone atoms of adjacent strands. Can be antiparallel (strands run in opposite N-to-C directions, forming stronger, more linear H-bonds) or parallel (strands run in the same N-to-C direction, forming weaker, distorted H-bonds).
 - Beta-turns ( $\beta$ -turns): Short, sharp turns connecting strands in $\beta$ -sheets, particularly common in antiparallel sheets. Often involve 4 residues and a hydrogen bond between the carbonyl oxygen of residue $i$ and the amide hydrogen of residue $i+3$ . Proline is often found in $\beta$ -turns due to its rigid ring structure, which introduces a kink.
3. Tertiary (3°) Structure:
- This is the overall three-dimensional shape of a single polypeptide chain, formed by the cumulative interactions between the R-groups (side chains) of the amino acids.
- These interactions include:
 - Hydrogen bonding: Between polar side chains.
 - Ionic interactions (salt bridges): Between oppositely charged acidic and basic R-groups (e.g., Lys and Asp).
 - Hydrophobic interactions: Nonpolar R-groups tend to cluster together in the interior of the protein, away from the aqueous environment, driven by the hydrophobic effect.
 - Van der Waals forces: Weak, transient attractive forces between all atoms, especially significant when many atoms are close together.
 - Disulfide bonds: Strong covalent bonds formed between the sulfhydryl (-SH) groups of two cysteine residues, creating a stable cross-link (S-S). This is a crucial covalent stabilization mechanism unique to tertiary and quaternary structures.
- Folding is often assisted by molecular chaperones (e.g., heat shock proteins).
4. Quaternary (4°) Structure:
- Describes the arrangement and interaction of multiple polypeptide chains (subunits) to form a larger, functional multi-subunit protein complex. Proteins exhibiting this level of structure are called oligomeric proteins.
- Interactions between subunits are similar to those in tertiary structure: hydrogen bonds, ionic bonds, hydrophobic interactions, and sometimes disulfide bonds.
- Example: Hemoglobin consists of four polypeptide subunits (two $\alpha$ and two $\beta$ chains), which cooperate to bind oxygen efficiently. Other examples include antibodies and viral capsids.
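
Because each peptide bond forms with the loss of one water molecule, the molar mass of a peptide is the sum of its free amino acid masses minus one water per bond. The sketch below applies that bookkeeping. The masses are approximate average values for a handful of amino acids only, and the function name is made up for this example.

```python
WATER = 18.015  # g/mol, released once per peptide bond formed

# Approximate average molar masses (g/mol) of a few free amino acids.
FREE_AA_MASS = {
    "G": 75.07,   # glycine
    "A": 89.09,   # alanine
    "S": 105.09,  # serine
    "V": 117.15,  # valine
    "L": 131.17,  # leucine
}


def peptide_mass(sequence: str) -> float:
    """Molar mass of a linear peptide: free residues minus one water per peptide bond."""
    if not sequence:
        raise ValueError("sequence must contain at least one residue")
    total = sum(FREE_AA_MASS[residue] for residue in sequence)
    n_bonds = len(sequence) - 1
    return total - n_bonds * WATER


print(f"Gly-Ala: {peptide_mass('GA'):.2f} g/mol")
print(f"Gly-Ala-Ser: {peptide_mass('GAS'):.2f} g/mol")
```

A chain of n residues contains n - 1 peptide bonds, so a single amino acid loses no water and a dipeptide loses exactly one.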

Determining Amino Acid Sequence of Proteins

Accurate determination of the amino acid sequence (primary structure) is fundamental for understanding protein function, evolution, and disease.
Steps commonly involve:
1. Step 1: Analyze Amino Acid Composition
- The protein is first subjected to acid hydrolysis (e.g., heating with 6M HCl at 110°C for 10-24 hours) to break all peptide bonds and yield a mixture of free amino acids.
- The individual amino acids are then separated and quantified using techniques like ion-exchange chromatography or high-performance liquid chromatography (HPLC), often followed by derivatization with a chromogenic or fluorogenic reagent (e.g., ninhydrin). This gives the total count of each amino acid but not their order.
2. Step 2: End Group Analysis (N-terminal and C-terminal residue identification)
- Various chemical and enzymatic methods are used to identify the amino acid residues at the ends of the polypeptide chain:
  - Sanger’s method: Uses 1-fluoro-2,4-dinitrobenzene (FDNB), also known as Sanger's reagent. FDNB reacts specifically with the free $\alpha$ -amino group of the N-terminal residue to form a stable dinitrophenyl (DNP)-derivative. After subsequent acid hydrolysis, the DNP-amino acid can be identified (e.g., by chromatography), while other amino acids are released as free AAs. This method determines one N-terminal residue per polypeptide chain.
  - Edman degradation: A more commonly used sequential method for determining the N-terminal sequence of a polypeptide. It involves reacting the N-terminal amino acid with phenylisothiocyanate (PITC) to form a phenylthiocarbamoyl (PTC) adduct. This adduct is then selectively cleaved from the polypeptide under mild acidic conditions, releasing the N-terminal amino acid as a phenylthiohydantoin (PTH) derivative. This PTH-amino acid can be identified, and the process can be repeated on the shortened polypeptide chain, allowing for sequencing of up to 50-60 residues from the N-terminus.
  - Enzymatic methods: Aminopeptidases (a type of exopeptidase) can sequentially remove amino acids from the N-terminus, while carboxypeptidases remove them from the C-terminus. By monitoring the release of amino acids over time, the sequence of a few terminal residues can be inferred.
3. Step 3: Fragmentation of Large Proteins and Overlapping Sequences
- For larger proteins, direct sequencing of the entire chain is impractical. Therefore, the protein is typically cleaved into smaller, more manageable peptide fragments using specific proteolytic enzymes (endopeptidases) or chemical reagents.
- Endopeptidases: These enzymes cleave peptide bonds at specific recognition sites within the polypeptide chain. Examples include:
  - Trypsin: Cleaves on the C-terminal side of Lysine (Lys) and Arginine (Arg) residues.
  - Chymotrypsin: Cleaves on the C-terminal side of large hydrophobic amino acids like Phenylalanine (Phe), Tryptophan (Trp), and Tyrosine (Tyr).
  - Pepsin: Cleaves on the N-terminal side of hydrophobic residues (less specific than trypsin or chymotrypsin).
  - V8 protease: Cleaves on the C-terminal side of Glutamate (Glu) and Aspartate (Asp) (phosphate buffer) or only Glu (ammonium bicarbonate buffer).
- Chemical cleavage: Cyanogen bromide (BrCN) specifically cleaves peptide bonds on the C-terminal side of methionine (Met) residues.
- To sequence the entire protein, overlapping peptide fragments are generated using at least two different cleavage methods. The sequences of these smaller fragments are then determined (e.g., by Edman degradation), and the overlapping information is used to reconstruct the full amino acid sequence of the original protein.
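
The overlap strategy can be illustrated with a small brute-force sketch: given the tryptic fragments in unknown order, try every ordering and keep those whose simulated chymotryptic digest matches the observed chymotryptic fragments. The protein and function names are hypothetical, the cleavage rules are the simplified ones listed above, and trying all permutations is only practical for a handful of fragments.

```python
from itertools import permutations


def cleave_after(sequence: str, residues: str) -> list[str]:
    """Cut on the C-terminal side of every residue listed in `residues`."""
    fragments = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue in residues:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments


def reconstruct(tryptic: list[str], chymotryptic: list[str]) -> list[str]:
    """Return every ordering of the tryptic fragments that is consistent
    with the observed chymotryptic fragments (cuts after F, W, Y)."""
    target = sorted(chymotryptic)
    candidates = set()
    for order in permutations(tryptic):
        candidate = "".join(order)
        if sorted(cleave_after(candidate, "FWY")) == target:
            candidates.add(candidate)
    return sorted(candidates)


# Fragments as they might come off the column: sequenced, but in unknown order.
tryptic_fragments = ["MSEYTK", "AW", "GAK", "LFR"]
chymotryptic_fragments = ["TKAW", "GAKLF", "RMSEY"]

print(reconstruct(tryptic_fragments, chymotryptic_fragments))
```

Each chymotryptic fragment spans a junction between two tryptic fragments (for example, GAKLF links GAK to LFR), which is what pins down the order.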

Protein Denaturation

Denaturation is the process where a protein loses its native three-dimensional conformation (secondary, tertiary, and quaternary structures, but usually not primary) due to disruption of non-covalent interactions and disulfide bonds, leading to a loss of biological function.
Causes of denaturation include:
- Heat (over 60-70 °C): Increases kinetic energy, disrupting weak interactions like hydrogen bonds and hydrophobic interactions. Excessive heat can cause irreversible aggregation.
- pH changes: Alter the ionization states of amino acid side chains (e.g., Lys, Arg, His, Asp, Glu, Cys, Tyr), disrupting ionic interactions (salt bridges) and hydrogen bonds, leading to changes in charge distribution and protein conformation.
- Chemical agents:
  - Chaotropic agents (e.g., urea, guanidinium chloride): Disrupt hydrogen bonds and hydrophobic interactions by interfering with the water structure around the protein.
  - Detergents (e.g., SDS - sodium dodecyl sulfate): Interact with hydrophobic regions of proteins, causing unfolding and aggregation, often irreversibly. SDS also imparts a uniform negative charge, overriding native charges.
  - Reducing agents (e.g., $\beta$ -mercaptoethanol, dithiothreitol - DTT): Break disulfide bonds by reducing the S-S bridges back to two -SH groups.
- Heavy metals (e.g., lead, mercury): Bind to sulfhydryl groups, disrupting disulfide bonds and sometimes forming new, unnatural cross-links.
- Mechanical stress: Vigorous shaking or stirring can disrupt weak forces, causing denaturation (e.g., whipping egg whites).
Denaturation can be reversible (renaturation, if denaturing agent is removed and primary structure is intact) or irreversible (e.g., cooking an egg).
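
The effect of pH on side-chain ionization can be estimated with the Henderson-Hasselbalch relation. The sketch below is a rough illustration only: the pKa values are approximate textbook figures (real values shift with the local environment inside a protein), and the function names are made up for this example.

```python
# Approximate side-chain pKa values; assumed, and environment-dependent in real proteins.
PKA = {"Asp": 3.9, "Glu": 4.1, "His": 6.0, "Cys": 8.3, "Tyr": 10.5, "Lys": 10.5, "Arg": 12.5}

# Acidic groups are neutral when protonated and carry -1 when deprotonated.
# The remaining (basic) groups carry +1 when protonated and are neutral otherwise.
ACIDIC = {"Asp", "Glu", "Cys", "Tyr"}


def fraction_protonated(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch: fraction of the group in its protonated form."""
    return 1.0 / (1.0 + 10 ** (ph - pka))


def average_charge(residue: str, ph: float) -> float:
    """Average charge of a side chain at the given pH."""
    protonated = fraction_protonated(PKA[residue], ph)
    return protonated - 1.0 if residue in ACIDIC else protonated


for ph in (2.0, 7.0, 12.0):
    print(f"pH {ph}: Asp {average_charge('Asp', ph):+.2f}, Lys {average_charge('Lys', ph):+.2f}")
```

With these values, Asp is almost fully negative and Lys almost fully positive near pH 7, so a Lys-Asp salt bridge can form. At pH 2 the Asp side chain is mostly protonated and neutral, and the salt bridge is lost.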

Anfinsen’s Experiment

Christian Anfinsen's groundbreaking experiments with the enzyme ribonuclease A (RNase A) in the 1950s provided crucial insights into protein folding.
He denatured RNase A (which has four disulfide bonds and a specific 3D structure necessary for its enzymatic activity) by treating it with:
- Urea: A strong chaotropic agent that disrupts non-covalent interactions (hydrogen bonds, hydrophobic interactions).
- Beta-mercaptoethanol ( $\beta$ -ME): A reducing agent that breaks disulfide bonds.
Upon complete denaturation, RNase A lost all its enzymatic activity and adopted a random coil conformation.
Key finding: When urea and $\beta$ -ME were removed (by dialysis) and the disulfide bonds were allowed to re-form by air oxidation, the denatured RNase A spontaneously refolded into its native, biologically active conformation, and its enzymatic activity was almost fully restored. In contrast, when the disulfide bonds re-formed while urea was still present, the cysteines paired largely at random and the resulting "scrambled" enzyme had very little activity.
This experiment supports what is known as Anfinsen's dogma, or the thermodynamic hypothesis of protein folding: the native three-dimensional structure of a protein is determined solely by its amino acid sequence (primary structure). It implies that all the information required for proper folding is encoded within the primary sequence, and that the folded state is the thermodynamically most stable conformation.
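
A short counting argument shows why correct refolding is so striking. RNase A has four disulfide bonds, hence eight cysteines, and only one way of pairing them is native. The sketch below counts the possible pairings, assuming every cysteine ends up in exactly one disulfide bond. The function name is made up for this example.

```python
def disulfide_pairings(n_cysteines: int) -> int:
    """Number of ways to pair all cysteines into disulfide bonds.

    The first cysteine can pair with any of the other n - 1, the next
    unpaired one with any of the remaining n - 3, and so on.
    """
    if n_cysteines < 0 or n_cysteines % 2:
        raise ValueError("need a non-negative, even number of cysteines")
    count = 1
    for choices in range(n_cysteines - 1, 0, -2):
        count *= choices
    return count


pairings = disulfide_pairings(8)  # RNase A: 8 cysteines, 4 disulfide bonds
print(pairings, f"{1 / pairings:.1%}")
```

With eight cysteines there are 105 possible pairings, so purely random disulfide formation would give the native arrangement only about 1% of the time. Recovery of nearly full activity therefore implies that the sequence itself guides the chain to the correct pairing.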

Functional Implications

The precise three-dimensional structure of a protein is absolutely essential for its biological function. Any alteration in structure, even subtle changes, can lead to a loss or modification of function.
For instance:
- Myoglobin is a single-chain globular protein found in muscle tissue, specialized for oxygen storage; its compact, soluble structure enables this function.
- Collagen is a fibrous protein and a major component of connective tissues, forming long, strong fibers that provide structural support and tensile strength to skin, tendons, and bones. Its triple-helix structure is critical for this role.
- Enzymes have specific active sites that are complementary to their substrates, allowing them to catalyze specific reactions.
- Antibodies have variable regions with unique binding sites to recognize specific antigens.
Changes in amino acid sequence (e.g., point mutations) or modifications to the protein structure can lead to functional changes or complete loss of functionality, often observed in various human diseases (e.g., sickle cell anemia, where a single amino acid substitution in hemoglobin causes red blood cells to deform; Alzheimer's disease and Parkinson's disease, where misfolded proteins aggregate and cause neurotoxicity). Understanding protein structure-function relationships is critical for drug design and disease treatment.
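
A point mutation is simply a one-residue change in the primary structure. The sketch below applies the commonly cited sickle cell substitution, Glu to Val at position 6 of the mature hemoglobin $\beta$ chain, to the first eight residues of that chain. Only this short N-terminal stretch is used, and the function name is made up for this example.

```python
def apply_substitution(sequence: str, position: int, expected: str, replacement: str) -> str:
    """Replace the residue at a 1-based position, checking the original residue first."""
    index = position - 1
    if not 0 <= index < len(sequence):
        raise ValueError("position is outside the sequence")
    if sequence[index] != expected:
        raise ValueError(f"expected {expected} at position {position}, found {sequence[index]}")
    return sequence[:index] + replacement + sequence[index + 1:]


beta_globin_start = "VHLTPEEK"  # first 8 residues of the mature beta chain (one-letter codes)
sickle_start = apply_substitution(beta_globin_start, 6, "E", "V")

print(beta_globin_start)
print(sickle_start)
```

The change swaps a charged, hydrophilic side chain (Glu) for a nonpolar one (Val), which is the kind of small sequence change that can alter how the folded protein behaves.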