Protein Domains: Structure, Function, and Modularity
Domains: modular regions within proteins
- Domains are separate from primary, secondary, tertiary, and quaternary structure
- Definition: a region of a protein that has its own discrete fold. If that stretch of the polypeptide were isolated from the rest of the protein, it would still fold in the same way. Thus, a domain sits between secondary structure and the overall tertiary fold; it can be built from different secondary structures (often a mix of alpha helices and beta sheets) but has its own characteristic fold
- Domains generally have their own functions too, such as:
- enabling dimerization (protein–protein interactions)
- helping localize the protein by binding other molecules (proteins, DNA, lipids)
- having enzymatic activity
- Many proteins are composed of multiple domains, contributing to modular construction of protein function
SRC protein kinase: a concrete domain example
- SRC (sarcoma protein kinase) is a signaling enzyme with three domains, color-coded in diagrams: a C-terminal kinase domain (yellow/orange), an SH2 domain (blue), and an SH3 domain (green)
- Kinase domain (C-terminal):
- ATP is sandwiched between two lobes; this domain functions as a kinase and carries out phosphorylation
- The domain is highly structured and contains both alpha helices and beta sheets
- SH2 domain (blue): binds phosphotyrosine residues
- SH3 domain (green): binds sequences containing proline and hydrophobic amino acids
- SH2 and SH3 domains provide regulatory functions for SRC; these domains are common in many proteins and will be revisited in later topics on protein modification and regulation
Visual representations of protein domains
- SH2 domain (approx. 100 amino acids) shown in four representations to illustrate folds
- Backbone model (top-left): used for overlays of domain folds; shows only backbone carbons and nitrogens
- Ribbon model (top-right): highlights secondary structure; makes alpha helices and beta sheets visually distinct
- Space-filling model (bottom-right): uses van der Waals radii to show how much space the domain occupies
- Wireframe model (bottom-left): shows amino acid side chains; useful for inspecting active sites and interaction surfaces
- Often, many proteins or domains are shown as a mix of these representations (e.g., bulk of the protein in backbone or ribbon, with residues at active sites shown as wireframe or substrate in space-fill)
Additional domain examples (visual slides)
- Cytochrome B562 (left): single-domain protein composed of alpha helices; involved in electron transport; shown using ribbon representation
- NAD-binding domain of lactate dehydrogenase (center): core contains a mix of alpha helices and beta sheets
- Immunoglobulin variable domain (right): beta-sheet structure, largely antiparallel; contains unstructured regions (linkers) represented in yellow that connect adjacent secondary structure elements
- Unstructured regions (linkers): flexible sequences that connect helices to sheets or sheets to sheets, enabling dynamic interactions
Takeaways about domains
- Domains are generally small, modular parts of proteins that can be composed of alpha helices, beta sheets, or a mix
- Each domain has its own fold and function, contributing to the overall properties of the protein
Homeodomain: a DNA-binding domain and evolutionary conservation of fold
- Homeodomain (DNA-binding domain) shown in ribbon (left) and backbone overlay (right)
- Ortholog comparison: yeast vs Drosophila (two billion years of evolution, 2 × 10^9 years)
- Sequence conservation is low yet structural fold is highly conserved:
- 60 amino acids compared, with only 17 identical: 17/60 ≈ 0.283 (about 28.3%)
- Despite this, the backbone overlay shows nearly identical fold, indicating that primary sequence can diverge while the domain fold remains conserved
- Concept reinforced: amino acid sequences that have diverged substantially can still adopt the same structural fold at the level of domains
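The identity figure above can be reproduced with a short calculation. The following is a minimal sketch, assuming the two sequences have already been aligned to the same length; the function name `percent_identity` and the toy sequences are illustrative and are not the real yeast or Drosophila homeodomain sequences.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions that carry the same residue.

    Assumes seq_a and seq_b are pre-aligned (same length, '-' for gaps).
    Gap-to-gap positions are not counted as matches; the denominator is
    the full alignment length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    return 100.0 * matches / len(seq_a)


# Figure from the notes: 17 identical positions out of 60 compared.
print(round(100.0 * 17 / 60, 1))  # 28.3

# Toy 10-residue example (hypothetical sequences): 8 of 10 positions match.
print(percent_identity("ACDEFGHIKL", "ACDQFGHIKV"))  # 80.0
```

Percent identity says nothing about fold similarity on its own; the point of the homeodomain example is that the backbone overlay stays nearly identical even at this low identity.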
Modularity and repeated domains in proteins
- Fibronectin example: extracellular matrix protein composed of four adjacent, highly similar domains (fibronectin type III domains)
- These four domains are practically identical due to tandem duplication at the genomic level
- Concept: tandem duplication increases the number of identical domains in a protein
- Similar phenomena occur with cadherins (cell–cell adhesion proteins) showing repeated domains
- Domain architecture as a recurrent theme in extracellular and signaling proteins
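As a rough illustration of tandem duplication, a protein can be modeled as an ordered list of domain labels, with duplication inserting a copy of a domain directly next to the original. This is a toy model under that assumption; the function name `tandem_duplicate` and the label `"FN3"` are illustrative, and real duplication acts on DNA rather than on domain lists.

```python
def tandem_duplicate(domains, index):
    """Return a new architecture with the domain at `index` copied in tandem.

    `domains` is an N- to C-terminal list of domain labels; the input list
    is not modified.
    """
    if not 0 <= index < len(domains):
        raise IndexError("index out of range")
    return domains[: index + 1] + [domains[index]] + domains[index + 1 :]


# Starting from a single fibronectin type III domain, three rounds of
# duplication give the four adjacent domains described in the notes.
architecture = ["FN3"]
for _ in range(3):
    architecture = tandem_duplicate(architecture, len(architecture) - 1)

print(architecture)  # ['FN3', 'FN3', 'FN3', 'FN3']
```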
Domain shuffling: creating multi-domain proteins through genetic rearrangements
- Domain shuffling slides show multiple proteins built from a combination of domains
- Mechanism: accidental joining of DNA sequences encoding different domains during evolution; if the new gene/protein is useful, it is conserved
- Visual takeaway: domains act as building blocks shared across many proteins; proteins—especially those involved in signaling—often assemble from common domain modules found across different genes
A classic multi-domain example set: proteases with shared domains
- Of the five proteins shown, all except EGF (a growth factor, at the top) are proteases that share a common protease domain (brown) at the C-terminus
- Examples:
- Chymotrypsin (simple digestive enzyme): protease domain alone, with no other domains
- Urokinase, Factor IX, Plasminogen: multi-domain proteases with additional regulatory domains
- Factor IX: multi-domain architecture with
- Calcium-binding domain (yellow) that enables binding to phospholipids in a calcium-dependent fashion
- Two EGF-like domains (green) that facilitate binding to tissue factor on sub-endothelial cells and platelets, directing activity to the right place at the right time during blood clotting
- Plasminogen: protease domain plus five kringle domains (blue) which mediate binding to clots and localization of activity; enables breakdown of clots
- The recurring protease domain and its consistent placement (at the C-terminus) demonstrate how domain shuffling can combine a catalytic domain with regulatory or targeting domains to achieve precise control of activity
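The shared-module idea in this example can be sketched by writing each protein as an ordered list of domain labels and asking which labels recur across proteins. This is a minimal sketch: the labels and the names `ARCHITECTURES`, `shared_domains`, and `domain_counts` are illustrative, only proteins whose domains are spelled out in these notes are included (urokinase is omitted), and the order of the non-protease domains is assumed for illustration.

```python
from collections import Counter

# N- to C-terminal domain lists based on the notes. The protease domain is
# placed last (C-terminus); the order of the other domains is an assumption.
ARCHITECTURES = {
    "EGF": ["EGF"],
    "Chymotrypsin": ["protease"],
    "Factor IX": ["calcium-binding", "EGF", "EGF", "protease"],
    "Plasminogen": ["kringle"] * 5 + ["protease"],
}


def shared_domains(architectures):
    """Map each domain type found in more than one protein to those proteins."""
    owners = {}
    for protein, domains in architectures.items():
        for domain in set(domains):  # count each domain type once per protein
            owners.setdefault(domain, []).append(protein)
    return {d: sorted(ps) for d, ps in owners.items() if len(ps) > 1}


def domain_counts(domains):
    """Number of copies of each domain type within one protein."""
    return Counter(domains)


shared = shared_domains(ARCHITECTURES)
print(shared["protease"])  # ['Chymotrypsin', 'Factor IX', 'Plasminogen']
print(shared["EGF"])       # ['EGF', 'Factor IX']
print(domain_counts(ARCHITECTURES["Plasminogen"])["kringle"])  # 5
```

Domains that appear in only one of these proteins, such as the kringle and calcium-binding domains here, are left out of the shared set, although they may still recur in proteins not listed.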
Final summary and implications
- Domains are a separate class of structural organization from the classic four levels of structure; they are folding units that can fold independently
- They provide specific properties: catalytic activity, binding to other proteins or molecules, or regulatory roles
- Domain sharing is common: the same or similar domains appear in many different proteins due to domain shuffling and duplication
- The modular nature of domains underpins evolution of complex signaling networks and multifunctional enzymes
- Takeaway: understanding domains helps explain protein function, evolution, and how multi-domain proteins achieve precise spatial and temporal control of activity
Looking ahead
- Next video topic: covalent modification of proteins and protein regulation (to connect domain structure with regulation and control of activity)