Introduction to Biochemistry: Proteins

Course Code: BCHM 2024
Lecturers: Dr. S.C. Marine and Graduate Teaching Scholar Priya Mukherjee

Lecture Overview

Focus Areas:
- Protein folding and levels of structure, emphasizing the intricate relationship between amino acid sequence and 3D conformation.
- Sequence specifies conformation (and relevant exceptions), highlighting how certain proteins deviate from this fundamental principle.
Text Reference: Chapter 6 in Tymoczko’s "Biochemistry: A Short Course", which provides foundational knowledge on protein structure and function.

Introduction to Proteins

The precise 3D structure of proteins is absolutely essential for their biological function, dictating their interaction with other molecules and their catalytic activity.
Historically, proteins were viewed as monomorphic polymers (Greek for "single shaped"), believed to assemble into one unique and stable morphology, known as the native state.
Dynamic structural changes significantly influence protein behavior and function, allowing for allosteric regulation, signal transduction, and modulation of enzymatic activity.
Most biochemical reactions are acutely pH-sensitive; they typically occur optimally in a narrow pH range to maintain protein stability and reactivity.
pH fluctuations dramatically affect the ionization state and the net charge of amino acid side chains and terminal groups, which critically influences intramolecular interactions and thus overall protein structure.
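The effect of pH on side-chain ionization can be made concrete with the Henderson-Hasselbalch relation. The sketch below is illustrative only: the pKa values are approximate textbook figures assumed for this example (real values shift with the local protein environment), and the function and table names are hypothetical.

```python
def fraction_protonated(pH: float, pKa: float) -> float:
    """Fraction of an ionizable group in its protonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10 ** (pH - pKa))


# Approximate side-chain pKa values (assumed for illustration).
# "acid": protonated form is neutral, deprotonated form carries -1.
# "base": protonated form carries +1, deprotonated form is neutral.
SIDE_CHAIN_PKA = {
    "Asp": (3.9, "acid"),
    "Glu": (4.1, "acid"),
    "His": (6.0, "base"),
    "Lys": (10.5, "base"),
}


def average_side_chain_charge(residue: str, pH: float) -> float:
    """Population-averaged charge of one side chain at the given pH."""
    pKa, kind = SIDE_CHAIN_PKA[residue]
    protonated = fraction_protonated(pH, pKa)
    return protonated - 1.0 if kind == "acid" else protonated


if __name__ == "__main__":
    for pH in (2.0, 6.0, 7.4, 11.0):
        charges = {r: round(average_side_chain_charge(r, pH), 2) for r in SIDE_CHAIN_PKA}
        print(pH, charges)
```

At a pH equal to the pKa, a group is half protonated, so histidine (assumed pKa near 6) changes charge within the physiological range, while aspartate and lysine stay almost fully charged at pH 7.4.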
Protein structure relies heavily on a multitude of non-covalent interactions (e.g., hydrogen bonds, ionic interactions, hydrophobic interactions, van der Waals forces), which are individually weak but collectively provide great stability.
Proteins, similar to nucleic acids like DNA and RNA, exhibit multiple hierarchical levels of structure, each contributing to the final functional conformation.

Levels of Protein Structure

Primary Structure (1°)

Defined as the linear sequence of amino acids in a single polypeptide chain, from the N-terminus (amino end) to the C-terminus (carboxyl end). This sequence is unique to each protein and is determined by genetic information.
Average sizes:
- Eukaryotic proteins: approximately 472 amino acids (AA), reflecting more complex cellular processes.
- Prokaryotic proteins: approximately 320 AA, generally simpler in composition.
Stabilization is primarily through uncharged peptide bonds, which are amide linkages formed between the carboxyl group of one amino acid and the amino group of another. Because these bonds are uncharged, polypeptides can fold and pack tightly; the bonds themselves are rigid and planar.
The partial double-bond character of the peptide (C-N) bond, a result of resonance stabilization, makes the bond planar and prevents rotation around it. However, rotation is possible around the N-$C_α$ bond (designated $\phi$) and the $C_α$-C bond (designated $\psi$), allowing the polypeptide backbone to adopt various conformations.
Peptide bonds are predominantly found in the trans configuration (about $99.6\%$) to minimize steric clashes between neighboring side chains (R groups), as the cis configuration would bring bulky R groups into closer proximity.
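Counting residues and peptide bonds in a linear chain follows directly from the definition of primary structure: n residues are joined by n - 1 peptide bonds. A minimal sketch, assuming a one-letter sequence written from N-terminus to C-terminus; the function name and the example segment are hypothetical.

```python
def count_residues_and_peptide_bonds(sequence: str) -> tuple[int, int]:
    """Return (residues, peptide bonds) for a linear, unbranched polypeptide."""
    residues = len(sequence)
    # Each peptide bond joins two neighboring residues, so a chain has one fewer bond than residues.
    peptide_bonds = max(residues - 1, 0)
    return residues, peptide_bonds


# Hypothetical 8-residue segment, N-terminus on the left.
segment = "MKTAYIAK"
residues, bonds = count_residues_and_peptide_bonds(segment)
print(residues, bonds)  # prints: 8 7
```

By the same rule, an average eukaryotic protein of about 472 residues would contain 471 peptide bonds.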

Secondary Structure (2°)

Involves localized, repeating structures formed by the regular rotation of single covalent bonds around the α carbon in each amino acid residue. These structures are thermodynamically favorable arrangements of the polypeptide backbone.
Stabilized predominantly by backbone hydrogen bonds formed between the carbonyl oxygen of one peptide bond and the amide hydrogen of another. In an α helix the partners are four residues apart within the same chain (n and n+4); in a β sheet they lie on adjacent strands and may be far apart in the sequence.
- Specific Features:
- Peptide bonds between amino acids do not rotate due to their partial double-bond character.
- Covalent bonds within amino acids (specifically the $\phi$ and $\psi$ angles around the $C_α$ carbon) possess rotational freedom, allowing secondary structures to form.
Common secondary structures include:
- β sheet: An extended, zigzag formation composed of two or more β strands, which are segments of a polypeptide chain arranged side-by-side. Side chains alternate above and below the plane of the sheet.
- These sheets are stabilized by hydrogen bonds between backbone atoms of adjacent strands. Most stable forms arise from anti-parallel arrangements, where strands run in opposite directions, resulting in linearly oriented hydrogen bonds, although parallel β sheets also exist.
- α helix: A rod-like structure with a tightly coiled backbone and outwardly extending side chains. It is stabilized by hydrogen bonds formed between the C=O group of residue n and the N-H group of residue n+4. Each turn of an α helix contains $3.6$ amino acid residues and extends about $5.4$ Å ( $0.54$ nm) along the helical axis.
- Certain amino acids are known as “helix breakers” and cannot be readily accommodated within an α helix without disrupting its stability: proline (introduces kinks due to its rigid ring structure and lack of an available amide hydrogen), glycine (high conformational flexibility leading to instability), aspartate, and cysteine (often involved in disulfide bonds or a source of steric hindrance).
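The α-helix figures quoted above (3.6 residues per turn, 5.4 Å per turn, hydrogen bonds from residue n to n+4) imply a rise and a rotation per residue. The sketch below works through that arithmetic; the function names are hypothetical, and it assumes an ideal, uninterrupted helix.

```python
RESIDUES_PER_TURN = 3.6   # from the lecture notes
PITCH_ANGSTROM = 5.4      # length of one turn along the helical axis


def rise_per_residue() -> float:
    """Axial distance advanced by each residue (about 1.5 Å)."""
    return PITCH_ANGSTROM / RESIDUES_PER_TURN


def rotation_per_residue_deg() -> float:
    """Angle turned around the axis by each residue (about 100 degrees)."""
    return 360.0 / RESIDUES_PER_TURN


def helix_length_angstrom(n_residues: int) -> float:
    """Approximate axial length of an ideal α helix of n residues."""
    return n_residues * rise_per_residue()


def hbond_donor_index(acceptor_index: int) -> int:
    """Residue whose N-H donates to the C=O of residue `acceptor_index`."""
    return acceptor_index + 4


if __name__ == "__main__":
    print(round(rise_per_residue(), 2), "Å per residue")
    print(round(helix_length_angstrom(18), 1), "Å for an 18-residue helix")
```

An 18-residue ideal helix therefore makes five turns and spans roughly 27 Å.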

Tertiary Structure (3°)

Represents the overall 3D structure of a single polypeptide chain, determined by interactions between the side chains (R groups) of amino acids that may be far apart in the primary sequence but are brought into close proximity by folding.
Stabilized chiefly by a diverse array of non-covalent interactions, including powerful hydrophobic interactions (the primary driving force in protein folding, causing nonpolar side chains to bury themselves in the protein interior away from water), ionic bonds (salt bridges between oppositely charged R groups), hydrogen bonds (between polar R groups or between R groups and the backbone), and weak van der Waals forces.
Cross-linking of the polypeptide chain provides additional stabilization and may occur through non-covalent salt bridges or through covalent disulfide bonds, formed between the thiol groups of two cysteine residues ($2RSH \rightarrow RS-SR + 2H^+ + 2e^-$, an oxidative process).

Quaternary Structure (4°)

Formed by the precise aggregation of multiple independent polypeptide chains (subunits), each with its own tertiary structure, into a single functional complex. Not all proteins exhibit quaternary structure.
Examples include hemoglobin (four subunits) and enzymes like lactate dehydrogenase (four subunits) or DNA polymerase (multiple subunits).
Stability is derived from the same types of chemical bonds and non-covalent interactions as tertiary structure, including hydrophobic interactions, hydrogen bonds, and ionic interactions. Furthermore, covalent disulfide bonds for cross-linking can occur between adjacent subunits, enhancing the overall stability of the multi-subunit complex.

Visualizing Protein Structures

Flat paper = Primary structure (linear sequence)
Flat, folded paper = Secondary structure (local, repetitive folds like α-helices or β-sheets)
Folded 3D paper = Tertiary structure (the overall 3D shape of a single polypeptide)
Two or more folded, 3D papers = Quaternary structure (assembly of multiple polypeptide subunits)

Post-Translational Modifications

Proteins can be further modified after translation, significantly altering their properties, localization, stability, and interactions.
- Phosphorylation of specific amino acid side chains (primarily serine, threonine, and tyrosine residues), catalyzed by kinases and reversed by phosphatases, is crucial for enzyme regulation, signal transduction pathways, and protein-protein interactions.
- Glycosylation, in which carbohydrate groups (glycans) are covalently attached to the protein exterior, converts the protein into a glycoprotein and can substantially alter its function, solubility, and cellular localization; it is particularly important for cell surface recognition.
- Fatty acylation, in which lipid groups are covalently attached (e.g., myristoylation, palmitoylation, prenylation), creates a proteolipid with modified function and localization, often anchoring the protein to cellular membranes.

Glycoproteins

Glycoproteins are produced by glycosylation, a post-translational modification that occurs predominantly in the endoplasmic reticulum and Golgi apparatus of eukaryotic cells:
- A monosaccharide or oligosaccharide chain is covalently attached to an amino acid's side chain (R group), typically via the asparagine (N-linked) or serine/threonine (O-linked) residues.
- Functions include facilitating critical cell-to-cell interactions, participating in the complex immune response (e.g., antibodies), serving as biological lubricants (e.g., mucins), and providing structural integrity to the extracellular matrix.
- Often includes common sugars such as GlcNAc (N-acetylglucosamine), GalNAc (N-acetylgalactosamine), mannose, fucose, and sialic acid.

Proteolipids

Proteolipids are produced by lipid attachment, another crucial post-translational modification, which typically occurs in the endoplasmic reticulum or cytoplasm:
- Various types of fatty acids or lipid moieties are covalently attached to the polypeptide's N- or C-terminus (e.g., myristoylation, prenylation) or via the thiol group of a cysteine’s side chain (e.g., palmitoylation).
- These modifications play vital roles in initiating apoptosis, regulating signal transduction pathways by recruiting proteins to membranes, and acting as stable anchors within the plasma membrane or other cellular membranes, thereby controlling protein localization and activity.

Key Concepts in Protein Folding

Sequence determines conformation: The groundbreaking principle stating that the specific order of amino acids uniquely dictates the protein’s native 3D structure. The native state represents the most thermodynamically stable form under physiological conditions, with the lowest free energy.
- Proteins typically fold into one unique shape (the native fold), a process that is remarkably spontaneous for many proteins and primarily driven by the hydrophobic effect, which minimizes the unfavorable interactions between nonpolar residues and water.
The intricate structure of water (H2O) and its high entropy (a measure of disorder) play a significant role in folding dynamics. The hydrophobic effect is driven by the entropic gain of water molecules when nonpolar surfaces are removed from an aqueous environment.
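The free-energy bookkeeping behind these statements can be written as a worked relation. This is a standard thermodynamic sketch rather than a formula taken from the lecture; splitting the entropy change into a chain term and a water term is an illustrative assumption, and the subscripts are labels introduced here.

$$\Delta G_{\text{fold}} = \Delta H_{\text{fold}} - T\,\left(\Delta S_{\text{chain}} + \Delta S_{\text{water}}\right)$$

Folding is spontaneous when $\Delta G_{\text{fold}} < 0$. The chain becomes more ordered on folding ($\Delta S_{\text{chain}} < 0$), which opposes folding, but burying nonpolar side chains releases ordered water ($\Delta S_{\text{water}} > 0$). When the water term outweighs the chain term, the $-T\,\Delta S$ contribution is negative and helps make the native state the lowest free-energy form.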
Protein folding is a progressive process, not random:
- Initially, rapid collapse of the polypeptide chain occurs, driven by the hydrophobic effect, leading to the formation of molten globule states.
- This is followed by the stabilization of near-native conformations, which are slightly more stable due to nascent secondary structure formation and early side chain interactions.
- Subsequent side chain interactions help precisely lock the fluctuating polypeptide into its final, highly specific, and stable 3D shape.
- Molecular chaperones (e.g., GroEL/GroES, Hsp70, Hsp90), often classified as heat shock proteins (Hsp), are essential cellular assistants that bind to unfolded or partially folded proteins, preventing premature aggregation into nonfunctional structures and guiding them towards their correct native conformation.
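A back-of-the-envelope estimate, in the spirit of Levinthal's classic argument, shows why folding cannot be a random search. The numbers below are assumptions chosen for illustration (three conformations per residue, one conformation sampled every 10^-13 s); they are not from the lecture, and the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def random_search_years(
    n_residues: int,
    conformations_per_residue: int = 3,        # assumed for illustration
    seconds_per_conformation: float = 1e-13,   # assumed sampling time
) -> float:
    """Years needed to visit every backbone conformation once, one at a time."""
    total_conformations = conformations_per_residue ** n_residues
    return total_conformations * seconds_per_conformation / SECONDS_PER_YEAR


if __name__ == "__main__":
    years = random_search_years(100)
    print(f"{years:.1e} years for a 100-residue chain")
```

Under these assumptions a 100-residue chain would need on the order of 10^27 years, far longer than any protein actually takes to fold, which is consistent with folding proceeding through progressively stabilized intermediates rather than exhaustive trial and error.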

Exceptions to the Central Principle of Protein Folding

Some proteins defy the notion that a fixed primary structure always dictates a single, rigid tertiary structure, showcasing remarkable structural plasticity:
- Intrinsically unstructured proteins (IUPs), also known as intrinsically disordered proteins (IDPs), do not possess a fixed 3D structure in isolation. Instead, they adapt their forms (undergo a disorder-to-order transition) upon specific interactions with other molecules, playing crucial roles in cellular regulation and signaling. They are characterized by a high proportion of charged and polar amino acids.
- They may contain short “chameleon sequences” that possess the capability to adopt different types of secondary structures (e.g., α-helix in one context, β-strand in another), enabling dramatic shifts in their overall secondary and tertiary structure.
- Metamorphic proteins represent an extreme form of structural plasticity; they have two or more distinct, stable native states of nearly equal energy that are in dynamic equilibrium, and they switch between these states through significant conformational rearrangement. This reconfiguration can lead to new active sites and entirely different biological functions (e.g., lymphotactin, a chemokine that interconverts between a monomeric form containing an α-helix and a dimeric all-β-sheet form, with each form having distinct biological roles).
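The phrase "nearly equal energy" can be illustrated with a two-state equilibrium: the closer the free-energy difference between two native states is to zero, the closer their populations are to 50:50. A minimal sketch assuming a simple A ⇌ B model at body temperature; the function name and example energies are hypothetical and do not describe any specific protein.

```python
import math

R = 8.314  # gas constant, J/(mol*K)


def fraction_in_state_b(delta_g_j_per_mol: float, temperature_k: float = 310.15) -> float:
    """Equilibrium fraction of molecules in state B, where delta_g = G_B - G_A."""
    k_eq = math.exp(-delta_g_j_per_mol / (R * temperature_k))  # K = [B]/[A]
    return k_eq / (1.0 + k_eq)


if __name__ == "__main__":
    for delta_g in (0.0, 2_000.0, 10_000.0):  # J/mol, illustrative values
        print(delta_g, round(fraction_in_state_b(delta_g), 3))
```

With no energy difference the two states are equally populated; a difference of 10 kJ/mol leaves only a few percent in the higher-energy state, which is why metamorphic behavior requires states of nearly equal stability.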

Significant Contributors to Protein Chemistry

Christian Anfinsen: Showed with one globular protein (ribonuclease A) that the amino acid sequence determines the native state, demonstrating that the primary sequence contains all the necessary information for folding (1972 Nobel Prize in Chemistry).
Ron Laskey: Coined the term "molecular chaperone" in 1978, describing proteins that assist in the correct folding of other proteins.
Alexey Murzin: Coined the term "metamorphic protein" in 2008, recognizing proteins with multiple stable conformations.
Richard Kriwacki: Identified the intrinsically unstructured protein p21 in 1996, a key cell cycle regulator whose flexibility is essential for its function.

Topics for Examination

Compare the hierarchical levels of structure in proteins and nuclear DNA; note the specific level of structure retained in denatured polymers and why.
Identify and understand peptide bonds, accurately count amino acid residues in a given protein segment, and describe their chemical properties.
Comprehend the fundamental principle that sequence specifies conformation and articulate the key exceptions to this rule (IUPs, metamorphic proteins) with examples.
Distinguish which post-translational modifications (phosphorylation, glycosylation, and fatty acylation) are performed on an amino acid’s side chain versus the polypeptide’s terminal ends, and explain their functional implications.

Next Lecture Preview

Focus will shift to the intricate processes of protein digestion and amino acid metabolism, exploring how proteins are broken down and their constituent amino acids are utilized or catabolized by the body.