Protein Domains: Structure, Function, and Modularity

Domains: modular regions within proteins

  • Domains are separate from primary, secondary, tertiary, and quaternary structure
  • Definition: a region of a protein that has its own discrete fold. If that stretch of the polypeptide were separated from the rest, it would still adopt essentially the same fold. A domain therefore sits between secondary structure and the overall tertiary fold; it can be built from different secondary structures (often a mix of alpha helices and beta sheets) but has its own characteristic fold
  • Domains generally have their own functions too, such as:
    • enabling dimerization (protein–protein interactions)
    • helping localize the protein by binding other molecules (proteins, DNA, lipids)
    • having enzymatic activity
  • Many proteins are composed of multiple domains, contributing to modular construction of protein function

SRC protein kinase: a concrete domain example

  • SRC (sarcoma protein kinase) is a signaling enzyme with three domains, color-coded in diagrams: a C-terminal kinase domain (yellow/orange), an SH2 domain (blue), and an SH3 domain (green)
  • Kinase domain (C-terminal):
    • ATP is sandwiched between two lobes; this domain functions as a kinase and carries out phosphorylation
    • The domain is highly structured and contains both alpha helices and beta sheets
  • SH2 domain (blue): binds phosphotyrosine residues
  • SH3 domain (green): binds sequences containing proline and hydrophobic amino acids
  • SH2 and SH3 domains provide regulatory functions for SRC; these domains are common in many proteins and will be revisited in later topics on protein modification and regulation

Visual representations of protein domains

  • SH2 domain (approx. 100 amino acids) shown in four representations to illustrate folds
    • Backbone model (top-left): used for overlays of domain folds; shows only backbone carbons and nitrogens
    • Ribbon model (top-right): highlights secondary structure; makes alpha helices and beta sheets visually distinct
    • Space-filling model (bottom-right): uses van der Waals radii to show how much space the domain occupies
    • Wireframe model (bottom-left): shows amino acid side chains; useful for inspecting active sites and interaction surfaces
  • Proteins and domains are often shown with a mix of these representations (e.g., bulk of the protein in backbone or ribbon, with active-site residues in wireframe or the substrate in space-fill)

Additional domain examples (visual slides)

  • Cytochrome B562 (left): single-domain protein composed of alpha helices; involved in electron transport; shown using ribbon representation
  • NAD-binding domain of lactate dehydrogenase (center): core contains a mix of alpha helices and beta sheets
  • Immunoglobulin variable domain (right): beta-sheet structure, largely antiparallel; contains unstructured regions (linkers) represented in yellow that connect adjacent secondary structure elements
  • Unstructured regions (linkers): flexible sequences that connect helices to sheets or sheets to sheets, enabling dynamic interactions

Takeaways about domains

  • Domains are generally small, modular parts of proteins that can be composed of alpha helices, beta sheets, or a mix
  • Each domain has its own fold and function, contributing to the overall properties of the protein

Homeodomain: a DNA-binding domain and evolutionary conservation of fold

  • Homeodomain (DNA-binding domain) shown in ribbon (left) and backbone overlay (right)
  • Ortholog comparison: yeast vs. Drosophila (two billion years of evolution, $2\times 10^9$ years)
  • Sequence conservation is low yet structural fold is highly conserved:
    • 60 amino acids examined, with only 17 identical: $\frac{17}{60} \approx 0.283$ (about 28.3%)
    • Despite this, the backbone overlay shows nearly identical fold, indicating that primary sequence can diverge while the domain fold remains conserved
  • Concept reinforced: at the level of domains, amino acid sequences can diverge substantially while still producing the same conserved structural fold
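
The identity figure above is simple arithmetic over aligned positions. As a minimal sketch, assuming two sequences that are already aligned to the same length with no gaps, percent identity could be computed as follows. The function name `percent_identity` and the toy sequences are hypothetical illustrations, not the real yeast or Drosophila homeodomain sequences:

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions where both sequences have the same residue.

    Assumes the sequences are pre-aligned, ungapped, and of equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


# Figure quoted in the notes: 17 identical positions out of 60.
homeodomain_identity = round(100 * 17 / 60, 1)
print(homeodomain_identity)  # 28.3

# Hypothetical 10-residue toy alignment (NOT real homeodomain data).
toy_a = "MKRARTAFTS"
toy_b = "MRRPRTSFSS"
print(percent_identity(toy_a, toy_b))  # 60.0
```

A low percentage from a calculation like this says nothing about the fold on its own; the point of the overlay is that structure has to be compared separately from sequence.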

Modularity and repeated domains in proteins

  • Fibronectin example: extracellular matrix protein, shown here as four adjacent, highly similar domains (fibronectin type III domains)
    • These four domains are nearly identical because they arose by tandem duplication at the genomic level
    • Concept: tandem duplication increases the number of copies of a domain within a protein
    • Similar phenomena occur with cadherins (cell–cell adhesion proteins) showing repeated domains
  • Domain architecture as a recurrent theme in extracellular and signaling proteins
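
One way to make tandem duplication concrete is to treat a protein as an ordered list of domain labels. The sketch below is only an illustration under that assumption: `tandem_duplicate` and `collapse_repeats` are hypothetical helper names, and the label "FN3" is shorthand for a fibronectin type III domain.

```python
from itertools import groupby


def tandem_duplicate(architecture, index):
    """Return a new architecture with the domain at `index` copied right next to itself."""
    return architecture[: index + 1] + [architecture[index]] + architecture[index + 1 :]


def collapse_repeats(architecture):
    """Collapse runs of adjacent identical domain labels into (label, count) pairs."""
    return [(label, len(list(run))) for label, run in groupby(architecture)]


# Start from a single type III domain and duplicate it three times in tandem.
segment = ["FN3"]
for _ in range(3):
    segment = tandem_duplicate(segment, 0)

print(segment)                    # ['FN3', 'FN3', 'FN3', 'FN3']
print(collapse_repeats(segment))  # [('FN3', 4)]
```

The run-length form makes repeated domains easy to spot when comparing architectures, although real duplicated domains drift apart in sequence over time and are rarely exact copies.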

Domain shuffling: creating multi-domain proteins through genetic rearrangements

  • Domain shuffling slides show multiple proteins built from a combination of domains
  • Mechanism: accidental joining of DNA sequences encoding different domains during evolution; if the new gene/protein is useful, it is conserved
  • Visual takeaway: domains act as building blocks shared across many proteins; proteins—especially those involved in signaling—often assemble from common domain modules found across different genes

A classic multi-domain example set: proteases with shared domains

  • Five proteins are shown; all except EGF (a growth factor, at the top) are proteases with a common protease domain (brown) at the C-terminus
  • Examples:
    • Chymotrypsin (simple digestive enzyme): protease domain alone, with no other domains
    • Urokinase, Factor IX, Plasminogen: multi-domain proteases with additional regulatory domains
  • Factor IX: multi-domain architecture with
    • Calcium-binding domain (yellow) that enables binding to phospholipids in a calcium-dependent fashion
    • Two EGF-like domains (green) that facilitate binding to tissue factor on sub-endothelial cells and platelets, directing activity to the right place at the right time during blood clotting
  • Plasminogen: protease domain plus five kringle domains (blue) which mediate binding to clots and localization of activity; enables breakdown of clots
  • The recurring single protease domain at the C-terminus, combined with different N-terminal domains, demonstrates how domain shuffling can pair a catalytic domain with regulatory or targeting domains to achieve precise control of activity
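
The building-block idea in this example set can be sketched as a small lookup over domain lists. This is a hedged illustration, not a database query: the dictionary `ARCHITECTURES` and the function `index_by_domain` are hypothetical names, the domain counts come from the notes above, and the N- to C-terminal ordering of the non-protease domains is an assumption. Urokinase is left out because the notes do not list its domains.

```python
from collections import defaultdict

# Domain lists written N-terminus to C-terminus (ordering partly assumed).
ARCHITECTURES = {
    "EGF": ["EGF"],
    "Chymotrypsin": ["protease"],
    "Factor IX": ["calcium-binding", "EGF", "EGF", "protease"],
    "Plasminogen": ["kringle"] * 5 + ["protease"],
}


def index_by_domain(architectures):
    """Map each domain label to the set of proteins that contain it."""
    index = defaultdict(set)
    for protein, domains in architectures.items():
        for domain in domains:
            index[domain].add(protein)
    return dict(index)


domain_index = index_by_domain(ARCHITECTURES)

# Domains that appear in more than one protein are the shared building blocks.
shared = {d: sorted(p) for d, p in domain_index.items() if len(p) > 1}
print(shared)

# Check that every protease in the set carries its protease domain at the C-terminus.
proteases = [name for name, doms in ARCHITECTURES.items() if "protease" in doms]
c_terminal = all(ARCHITECTURES[name][-1] == "protease" for name in proteases)
print(c_terminal)  # True
```

In this toy set, only the protease and EGF-type domains are shared across proteins, while the calcium-binding and kringle domains are what distinguish Factor IX and plasminogen from plain chymotrypsin.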

Final summary and implications

  • Domains are a separate class of structural organization from the classic four levels of structure; they are folding units that can fold independently
  • They provide specific properties: catalytic activity, binding to other proteins or molecules, or regulatory roles
  • Domain sharing is common: the same or similar domains appear in many different proteins due to domain shuffling and duplication
  • The modular nature of domains underpins evolution of complex signaling networks and multifunctional enzymes
  • Takeaway: understanding domains helps explain protein function, evolution, and how multi-domain proteins achieve precise spatial and temporal control of activity

Looking ahead

  • Next video topic: covalent modification of proteins and protein regulation (to connect domain structure with regulation and control of activity)