Protein Structure and Function - Vocabulary Flashcards

Overview

Proteins are the most diverse macromolecules in terms of structure and function.
A protein’s three-dimensional structure is called its conformation.
Correct folding is critical for function; function is derived from structure.
The chapters cover topics from primary sequence to quaternary assemblies, as well as regulation, folding, motifs, domains, and methods for studying proteins.

Monomers, Polymers, and Key Terms

Amino acid: the building block of proteins; the general structure has a central α-carbon bonded to a hydrogen (H), a carboxyl group, an amino group, and a unique side chain (R).
Peptide bond: linkage between amino acids; peptide bonds are planar due to partial double-bond character of the C–N bond.
Monomer to polymer:
- Amino acid → dipeptide → tripeptide → peptide (2–30 residues) → polypeptide (longer chain, up to ~4000+) → protein (a polypeptide folded into a specific conformation).
Memorize amino acids and their side chains (R groups) for understanding folding and interactions.
In diagrams: peptide bond forms between carbonyl carbon of one amino acid and amine nitrogen of the next, producing a repeating backbone with side chains (R groups) protruding outward.

Peptide Bonds and Connectivity (Fig. 3-3 variants)

Peptide bonds are planar because the C–N bond has partial double-bond character.
The carbonyl oxygen and amide nitrogen contribute to the stability and planarity of the peptide bond.
Short peptides illustrate this: amide linkages between adjacent amino acids form the backbone.

Key Terminology

peptide: short chain of amino acids (2–30 residues).
polypeptide: longer chain (up to 4000+ residues).
protein: a polypeptide that has folded into a specific conformation.

Levels of Protein Structure

1) Primary structure

Linear sequence of amino acids.
The specific sequence contains all intrinsic information governing folding into the native 3D structure.
Example sequence: Ala-Glu-Val-Thr-Asp-Pro-Gly-
Primary structure is depicted as a linear chain with N-terminus and C-terminus.
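As a small illustration, the example sequence above can be rewritten in one-letter codes, reading from N-terminus to C-terminus. This is a minimal Python sketch; the function name to_one_letter is hypothetical, and the lookup table holds the standard three-letter and one-letter codes.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(seq):
    """Convert a hyphen-separated three-letter sequence (N- to C-terminus)."""
    # A trailing hyphen leaves an empty item after splitting; drop it.
    residues = [r for r in seq.split("-") if r]
    return "".join(THREE_TO_ONE[r] for r in residues)


print(to_one_letter("Ala-Glu-Val-Thr-Asp-Pro-Gly-"))  # AEVTDPG
```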

2) Secondary structure

Local folding patterns: alpha (α) helix, beta (β) strand, reverse (β) turns, random coil.
α-helix: right-handed coil stabilized by intramolecular hydrogen bonds between backbone amide N–H and carbonyl O of the amino acid four residues earlier (i to i+4).
β-sheet: formed by β-strands connected laterally by two or more backbone hydrogen bonds; strands can be parallel or antiparallel.
Random coil: unstructured regions connecting helices and sheets.
Roughly 60% of an average protein’s 3D structure is due to secondary structure elements.
Notation in figures uses α and β; turns are often displayed as β turns.

3) Tertiary structure

Overall folding of the polypeptide chain into a single 3D structure (a full domain or multiple domains).
Key interactions driving folding:
- Hydrophobic interactions (driving burying of nonpolar side chains in the core).
- Hydrogen bonds (within backbone and sometimes side chains).
- Ionic (electrostatic) interactions.
- van der Waals interactions.
- Disulfide bonds (covalent; sometimes present) between cysteine residues.
Noncovalent interactions are typically dominant; covalent disulfide bonds are optional depending on protein.

4) Quaternary structure

The assembly of multiple polypeptide chains (subunits) into a functional protein complex.
Interactions are similar to those that drive tertiary structure (noncovalent and sometimes covalent).
Example: hemagglutinin, a trimeric protein built from three subunits, each composed of HA1 and HA2 chains.
Proteins can be monomeric (one subunit) or multimeric (>1 subunit).

Summary Figure (Fig. 3-1a, 3-1b)

Structural levels: Primary → Secondary → Tertiary → Quaternary; these levels can be viewed alongside the concept that regulation and function arise from structure and conformation.

Protein Structure Details and Visuals

Secondary structures can be visualized by backbone traces, ball-and-stick, ribbons, or surface plots.
α-helix specifics:
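- 3.6 residues per turn; a typical α-helix has a macrodipole because the backbone carbonyl and amide groups all point the same way along the helix axis, so their individual dipoles add up (partial positive charge at the N-terminal end, partial negative charge at the C-terminal end).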
- R groups project from the helix sides; one side can be polar/charged and the opposite hydrophobic (amphipathic helix).
β-sheet specifics:
- 5–8 amino acids per strand; hydrogen bonds between backbone amide and carbonyl groups hold strands together, forming a pleated sheet.
- Strands can be parallel or antiparallel; side chains project above and below the sheet.
Reverse turns (β turns): typically 4 amino acids forming a U-turn; glycine and proline are common in turns (special amino acids or helix breakers).
Random coils/loops: connect helices and β-sheets; lack defined structure but can be functionally important.
Amphipathic helices: hydrophobic and polar faces; used in interactions and membranes (e.g., Leucine zippers).
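The 3.6 residues per turn figure explains why an amphipathic helix can keep one face hydrophobic: each residue is rotated about 100 degrees from the previous one, so positions i, i+3, i+4, and i+7 land close together on a helical wheel. The sketch below, in Python, computes those angles; the function names are hypothetical and the helix is treated as ideal.

```python
DEG_PER_RESIDUE = 100.0  # 360 degrees / 3.6 residues per turn (ideal helix)


def wheel_angle(i):
    """Angular position of residue i on a helical wheel, in degrees."""
    return (i * DEG_PER_RESIDUE) % 360.0


def offset_from_face(i):
    """Angular distance (0 to 180 degrees) between residue i and residue 0."""
    angle = wheel_angle(i)
    return min(angle, 360.0 - angle)


for i in range(8):
    print(i, wheel_angle(i), offset_from_face(i))
```

Residues 3, 4, and 7 fall within 60 degrees of residue 0, while residues 2 and 5 sit on the opposite side, which is the spacing pattern expected for hydrophobic residues on one face.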

Protein Motifs and Domains (Protein Structural Motifs)

Motif: a region containing various secondary structures with a defined 3D topology and a function; motifs recur in different proteins.
Domain: larger, functional modules that may be composed of one or more motifs; domains can be structural, functional, or topological.
Four main domain classes:
- Functional domain: catalytic, DNA-binding, RNA-binding, ligand-binding, Ca2+-binding, etc.
- Structural domain: collection of secondary structures folded into a distinct structure (often at least ~40 amino acids).
- Topological domain: membrane-related partitions (extracellular, cytosolic, membrane-spanning regions).
- Domain definitions can be based on enrichment of specific amino acids (acidic, basic, hydrophobic, etc.).
Examples of motifs:
- Coiled-coil motif (leucine zipper): amphipathic helices, forming dimers; Leu every 7th residue; often used in DNA-binding proteins.
- EF-hand/Helix-loop-helix (HLH) motif: Ca2+-binding regulatory proteins; contains Ca2+-binding loops and helices.
- Zinc-finger motif: often DNA-binding; consists of Cys and His coordinating Zn2+.
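The "Leu every 7th residue" rule for the leucine zipper can be checked on a sequence by counting leucines in each of the seven possible registers. This is a minimal Python sketch; the function name and the toy sequence are hypothetical and not taken from a real protein.

```python
def best_leucine_register(seq):
    """Return (register, count): the offset 0-6 with the most Leu at every 7th residue."""
    counts = []
    for register in range(7):
        # Count Leu at positions register, register + 7, register + 14, ...
        n_leu = sum(1 for i in range(register, len(seq), 7) if seq[i] == "L")
        counts.append(n_leu)
    best = max(range(7), key=lambda r: counts[r])
    return best, counts[best]


toy_zipper = "LEDKVEE" * 4  # hypothetical toy sequence with Leu at 0, 7, 14, 21
print(best_leucine_register(toy_zipper))
```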

Hemagglutinin Example and Domain Modularity

Hemagglutinin (HA) comprises distal globular domain and proximal fibrous domain; HA1 and HA2 subunits form a trimer in the viral membrane.
Protein domains are modular and can be found in different proteins; cleaved domains can still fold and function.
EGF precursor, integrins, and other proteins illustrate how domains (modules) can be combined in different proteins to achieve diverse functions.

Protein Folding Principles

The Hydrophobic Effect (oil-drop model): hydrophobic side chains cluster in the protein core; water molecules reorganize to minimize exposed hydrophobic surface, driving folding and stability.
Folding is thermodynamically favorable through multiple steps: secondary structures form first, followed by domain formation, then final tertiary structure.
There are many potential folding pathways, but a native state is usually reached as the most thermodynamically stable conformation.
Number of possible folds grows exponentially with chain length (e.g., a protein with 100 AAs could theoretically fold in up to $2 \times 10^{90}$ ways), yet real proteins fold into a single native state.
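One way to see where a number of that size comes from: if each residue could independently adopt k conformations, a chain of n residues would have k^n conformations. The value k = 8 used below is an assumption chosen because it reproduces the figure quoted above; the notes do not state how the figure was derived.

```python
def n_conformations(n_residues, states_per_residue=8):
    """Count conformations if every residue independently has the same number of states."""
    # states_per_residue=8 is an assumed value, not taken from the notes.
    return states_per_residue ** n_residues


for n in (10, 50, 100):
    print(n, f"{n_conformations(n):.2e}")  # about 2e90 for n = 100
```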
Folding is aided by cellular factors:
- Molecular chaperones (e.g., Hsp70, Hsp90) stabilize unfolded/partially folded proteins and assist folding using ATP cycles.
- Chaperonins (GroEL/GroES in bacteria; TRiC in eukaryotes) form barrel-like chambers that provide an isolated folding environment.
Disulfide bonds can stabilize folding by covalently linking cysteine residues.
Proline isomerases (cis-trans isomerases) catalyze cis/trans isomerization at X-Pro peptide bonds, influencing folding kinetics.
Denaturation/renaturation experiments show that many proteins can refold, but some do not renature easily; denaturing agents include urea, guanidine HCl, salt, and pH changes.

Protein Misfolding and Disease

Abnormally folded proteins can aggregate into amyloid fibrils and plaques, contributing to diseases such as Alzheimer's, Parkinson's, prion diseases, and various systemic amyloidoses.
Misfolded proteins are normally degraded, but accumulation can lead to pathology.
The aggregation often involves cross-β sheet architectures in amyloids.

Functional Design of Proteins

Ligand: a molecule that binds a protein with high specificity (lock-and-key analogy).
Substrate: reactant that binds an enzyme.
Binding can induce conformational changes (induced fit) that optimize catalysis or signaling.
Affinity: strength of protein-ligand binding; quantified by equilibrium constant Keq or dissociation constant Kd (lower Kd = higher affinity).
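For simple one-to-one binding, the fraction of protein occupied by ligand is [L] / (Kd + [L]), so half of the protein is bound when the free ligand concentration equals Kd. The sketch below assumes this single-site model; the function name and the example concentrations are hypothetical.

```python
def fraction_bound(ligand_conc, kd):
    """Fraction of protein with ligand bound, for single-site binding."""
    return ligand_conc / (kd + ligand_conc)


ligand = 1e-6  # free ligand concentration in mol/L (example value)
for kd in (1e-9, 1e-6, 1e-3):
    print(f"Kd = {kd:.0e} M -> fraction bound = {fraction_bound(ligand, kd):.3f}")
```

At a fixed ligand concentration, the lower the Kd, the larger the bound fraction, which is what "lower Kd = higher affinity" means in practice.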
Specificity: ability to distinguish between ligands; designed by complementary shapes and chemistries.
Enzymes act by stabilizing the transition state, thereby lowering activation energy and increasing reaction rates.
Active site: region where substrate binds; catalytic site: region where chemical transformation occurs.
Example: Protein Kinase A (PKA) illustrates induced fit and domain movements in catalysis with a glycine lid and nucleotide-binding pocket; serine/threonine residues targeted by phosphorylation.
The concept: enzymes lower activation energy by binding the transition state (a pentavalent transition state with Mg2+ coordination is shown in the PKA example).
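To get a feel for how much a lower activation energy speeds up a reaction, the Arrhenius relation can be used: if the activation energy drops by some amount and everything else stays the same, the rate increases by exp(drop / RT). The sketch below assumes Arrhenius behavior with an unchanged pre-exponential factor; the function name and the example energies are hypothetical.

```python
import math

R = 8.314  # gas constant in J/(mol*K)


def rate_enhancement(delta_ea_kj_per_mol, temperature_k=298.15):
    """Fold increase in rate when activation energy is lowered by delta_ea (kJ/mol)."""
    return math.exp(delta_ea_kj_per_mol * 1000.0 / (R * temperature_k))


for drop in (5.7, 20.0, 40.0):  # example reductions in kJ/mol
    print(f"{drop} kJ/mol lower -> {rate_enhancement(drop):.3g}-fold faster")
```

Under these assumptions, a reduction of about 5.7 kJ/mol near room temperature corresponds to roughly a tenfold faster reaction.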

Enzyme Kinetics (Bare-bones minimum)

Km (Michaelis constant): the substrate concentration at which v = Vmax/2; often used as a measure of substrate affinity, with a lower Km indicating higher affinity.
Vmax: maximum velocity at saturating substrate concentration; equal to kcat times the total enzyme concentration.
Turnover number (kcat): number of substrate molecules converted to product per enzyme molecule per second at saturating substrate.
Michaelis-Menten equation (steady-state):
$v = \frac{V_{max} [S]}{K_m + [S]}.$
For multiple substrates, Km values differ between substrates; higher affinity corresponds to a lower Km for that substrate.
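The Michaelis-Menten relationship can be tried out numerically. This is a minimal Python sketch with hypothetical function names and arbitrary example values; it also uses the standard relation Vmax = kcat x [E]total.

```python
def mm_velocity(s, vmax, km):
    """Michaelis-Menten rate v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


def vmax_from_kcat(kcat, enzyme_total):
    """Vmax = kcat * [E]total (kcat in 1/s, enzyme in concentration units)."""
    return kcat * enzyme_total


vmax = vmax_from_kcat(kcat=100.0, enzyme_total=0.1)  # example values
km = 2.0  # example Km, same concentration units as [S]
for s in (0.2, 2.0, 20.0, 200.0):
    print(f"[S] = {s}: v = {mm_velocity(s, vmax, km):.2f}")
```

When [S] equals Km the rate is exactly half of Vmax, and at [S] far above Km the rate approaches Vmax.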

Regulation of Protein Function

Environmental regulation: pH affects enzyme activity (e.g., lysosomal enzymes optimum near pH ~5; cytosolic pH is higher).
Physical association of catalytic domains: scaffolding proteins can bring enzymes into proximity, enabling sequential reactions.
Multiple catalytic domains within one protein can enable coordinated catalysis.
Degradation and turnover:
- Extracellular proteases (e.g., trypsin, chymotrypsin) and exopeptidases (aminopeptidases, carboxypeptidases).
- Intracellular pathways: lysosomal degradation at low pH; ubiquitin-proteasome pathway in cytosol.
- Ubiquitin pathway: ubiquitination tags proteins for proteasomal degradation via a cascade of E1, E2, and E3 enzymes; chains can also signal in regulation (e.g., signaling, immune functions).
Regulatory subunits vs catalytic subunits in protein function (e.g., PKA): regulatory subunits bind ligands (e.g., cAMP) and control catalytic subunits via conformational changes.
Allosteric (noncovalent cooperative) binding:
- Example: PKA regulatory subunits cooperatively bind 4 cAMP molecules; hemoglobin cooperatively binds 4 O2 molecules.
- Allosteric switches can turn activity on or off via ligand-induced conformational changes.
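Cooperative binding is often summarized with the Hill equation, which is not part of these notes and is used here only as an illustration: fraction bound = [L]^n / (K^n + [L]^n), where K is the half-saturation concentration and n is the Hill coefficient. With n = 1 there is no cooperativity; a value near 3 is often quoted for hemoglobin. The function name and numbers below are hypothetical.

```python
def hill_fraction(ligand_conc, k_half, n):
    """Fraction of sites occupied according to the Hill equation."""
    return ligand_conc ** n / (k_half ** n + ligand_conc ** n)


k_half = 1.0  # half-saturation concentration (arbitrary units)
for ligand in (0.5, 1.0, 2.0):
    non_coop = hill_fraction(ligand, k_half, n=1)
    coop = hill_fraction(ligand, k_half, n=3)
    print(f"[L] = {ligand}: n=1 -> {non_coop:.2f}, n=3 -> {coop:.2f}")
```

The cooperative curve (n = 3) stays lower below the half-saturation point and rises higher above it, which is the switch-like behavior described above.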
Allosteric switches examples:
- Ca2+ binding to calmodulin (Ca2+-calmodulin activates target peptides via four Ca2+-binding EF-hand motifs).
- Guanine nucleotide-binding switch proteins (Ras, Ran) use GTPase cycling controlled by GEFs (guanine-nucleotide exchange factors) and GAPs (GTPase-activating proteins).
Post-translational covalent modifications:
- Phosphorylation: reversible, typically on Tyr, Ser, or Thr; regulated by kinases and phosphatases.
- Ubiquitination: covalent attachment of ubiquitin can signal degradation or regulatory roles; poly-ubiquitination often targets for proteasome; isopeptide bonds form between ubiquitin and lysine residues on target proteins.
- Proteolytic cleavage/processing: proprotein processing to activate function (e.g., proinsulin to insulin; furin, PC2/PC3 proteases); regulated processing in the secretory pathway.
- Other covalent modifications can alter function, localization, and interactions.
Important: post-translational modifications can regulate protein activity beyond degradation (signaling, localization, and complex formation).

Protein Degradation Pathways

Extracellular proteolysis: digestive proteases like trypsin and chymotrypsin; exopeptidases trim residues from ends of proteins.
Intracellular degradation:
- Lysosomal pathway: hydrolytic enzymes operate in acidic environments (pH ~5).
- Proteasome pathway: cytosolic degradation via ubiquitin tagging (E1, E2, E3 enzymes) leading to proteasomal proteolysis.
Ubiquitin details:
- Ub is a 76-aa protein; ubiquitination involves E1 (activating), E2 (conjugating), E3 (ligase) enzymes.
- Ub is recycled by deubiquitinases (DUBs), which remove it before the tagged protein is degraded.
- Destruction boxes (signal sequences) on target proteins direct ubiquitination.
Visual: Ub chains attach to lysines on targets; proteasome unfolds and digests into peptides.

Post-Translational Modifications and Processing (Examples)

Cleavage/Processing in the secretory pathway:
- Prohormone processing (e.g., proinsulin to insulin) by proteases such as furin and PC family proteases; disulfide bonds can stabilize intermediates.
Intein splicing (self-splicing proteins): rare autocatalytic events in which an intein (an internal protein segment, analogous to an intron in RNA) is excised and the flanking segments are joined; not likely to be on exams here.

Protein Purification and Analytical Techniques (Methods)

Purification principles: proteins differ by size, charge, solubility, stability, and binding affinity; purification exploits these properties.
Centrifugation:
- Differential centrifugation separates by mass/density; larger particles pellet earlier; smaller particles remain in supernatant.
- Rate-zonal centrifugation uses a density gradient (e.g., sucrose) to separate particles by mass as they migrate through gradient layers.
- Sedimentation coefficient S (Svedberg units) depends on mass, shape, and density; used to estimate size and mass.
Equilibrium density-gradient centrifugation: separates organelles by density in a gradient; useful for organelle purification and DNA purification.
Electrophoresis:
- SDS-PAGE denatures proteins and coats them with negative charge; separates by size as they migrate through polyacrylamide gel.
- Two-dimensional gel electrophoresis combines IEF (isoelectric focusing) by pI in first dimension and SDS-PAGE by size in second dimension.
- Isoelectric point (pI) is the pH at which a protein has no net charge.
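The pI of a peptide can be estimated by summing the charges of its ionizable groups at a given pH and searching for the pH where the total is zero. The sketch below is a rough estimate only: the pKa values are approximate assumed numbers (published tables differ), the function names are hypothetical, and effects of the folded environment are ignored.

```python
# Approximate pKa values (assumed; published tables differ).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERM_PKA = 8.0
C_TERM_PKA = 3.1


def net_charge(seq, ph):
    """Net charge of a peptide (one-letter codes) at the given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))   # protonated N-terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))  # deprotonated C-terminus
    for aa in seq:
        if aa in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return charge


def isoelectric_point(seq, tol=1e-4):
    """Find the pH of zero net charge by bisection (charge falls as pH rises)."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(seq, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


for peptide in ("KKKK", "DDDD", "ACDKKEH"):  # toy peptides
    print(peptide, round(isoelectric_point(peptide), 2))
```

Lysine-rich peptides come out with a high pI and aspartate-rich peptides with a low pI, which is the property isoelectric focusing exploits.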
Chromatography:
- Gel filtration (size-exclusion): separates by size; larger proteins elute first.
- Ion-exchange chromatography: separates by charge; depends on protein’s net charge and salt gradient for elution.
- Affinity chromatography: exploits specific binding interactions (e.g., antibody–antigen; enzyme–inhibitor);
  eluted by changing pH or competing ligands.
Detection methods:
- Western blotting (immunoblot): antibodies detect specific proteins after gel transfer.
- Immunoprecipitation (IP) to pull down proteins and interactors for analysis.
Radioisotopes and labeling:
- 32P, 35S, 125I, 14C, 3H used for tracking molecules; pulse-chase experiments study dynamics of synthesis, processing and turnover.
- Pulse-chase example: use 35S-Met to label newly synthesized proteins briefly, then chase with unlabeled Met to observe maturation, trafficking, and degradation.
Protein sequencing and identification:
- Edman degradation: sequential N-terminal amino acid identification via HPLC.
- DNA sequencing and in silico translation to deduce protein sequence.
- Mass spectrometry (MS): peptide mass fingerprint after proteolysis; MS/MS to sequence peptides; MALDI-TOF and ESI (electrospray) approaches.
- Tandem MS (MS/MS) fragments selected peptides so that amino acid sequences can be derived from the fragment masses.
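A peptide mass fingerprint can be simulated by digesting a sequence in silico and computing the mass of each peptide. The sketch below uses the common simplified rule for trypsin (cut after Lys or Arg unless the next residue is Pro) and standard monoisotopic residue masses; the function names and the toy sequence are hypothetical, and missed cleavages and modifications are ignored.

```python
# Monoisotopic residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_digest(seq):
    """Cut after K or R, except when the next residue is P (simplified rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        next_aa = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # C-terminal peptide without K/R at its end
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


for pep in tryptic_digest("AEKGRPLKM"):  # toy sequence
    print(pep, round(peptide_mass(pep), 4))
```

In a real experiment the list of measured masses would be compared against lists predicted in this way for every protein in a database.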
In vitro peptide synthesis: short peptides (10–100 aa) for antibody production, substitution studies, and folding analyses; iterative protection/deprotection chemistry used to build sequences.

Protein Structure Determination Technologies

X-ray crystallography: determines precise 3D structure from diffraction patterns; requires crystals; a Fourier transform converts the diffraction data into an electron density map from which the 3D structure is built.
Cryo-electron microscopy (cryo-EM): high-resolution structure from images of frozen-hydrated samples; suitable for large complexes.
Nuclear magnetic resonance (NMR): uses magnetic fields to determine distances and angles between atoms; best for smaller proteins/domains (< ~20 kDa).
The proteomics workflow often combines LC-MS/MS with database searching to identify proteins in complex samples.

Protein Families, Evolution, and Heme Cofactors

Proteins in a family share a common evolutionary ancestor; homologous proteins show high sequence/structure similarity across species.
Evolutionary trees illustrate divergence of related proteins (e.g., globins like hemoglobin and myoglobin across vertebrates and invertebrates).
Heme as a cofactor (prosthetic group) in hemoproteins (e.g., hemoglobin, myoglobin, leghemoglobin) providing oxygen binding capability.

Protein Domains and Modularity

Proteins are modular: domains are larger functional blocks that can be swapped or recombined across proteins.
A domain can be defined independently by its topology and function; folding of domains can be autonomous when isolated.
Examples: Epidermal Growth Factor (EGF) domains, Ig-like domains, transmembrane domains, Kringle domains, etc.
The modular design enables domain shuffling and creation of multidomain proteins with diverse functions.

Four Broad Structural Categories of Proteins (Ch. 3)

Globular proteins: roughly spherical, compact, often enzymes or transport proteins (e.g., myoglobin).
Fibrous proteins: elongated, structural roles (e.g., keratin, collagen).
Integral membrane proteins (IMPs): span membranes with hydrophobic regions (often α-helical in membranes).
Intrinsically disordered proteins: lack a single native state until bound; can be induced to fold by interactions; important in signaling and regulation; often involved in transient complexes.
Note: proteins can contain regions that belong to different categories (e.g., a single protein may have ordered domains and disordered segments).

Allosteric Regulation and Signaling

Allosteric regulation involves noncovalent cooperative binding that changes the protein’s conformation and activity.
Calmodulin model: Ca2+ binding to calmodulin leads to conformational changes that regulate target interactions; involves four EF-hand motifs.
Ras/Ran GTPases: regulatory switches controlled by GEFs and GAPs; bidirectional control by exchange of GDP for GTP and hydrolysis, not by phosphorylation alone.
Cyclic nucleotides (cAMP) and kinases: cAMP binding to regulatory subunits can activate catalytic subunits (as in PKA) via conformational shifts.
Allosteric switches allow proteins to respond to cellular signals with high cooperativity and precision.

Posttranslational Modifications: Broad Roles

Phosphorylation: reversible addition of phosphate groups by kinases; modulates activity, interactions, localization; often on Ser/Thr/Tyr.
Ubiquitination: tagging with Ub to signal degradation or regulatory roles; involves E1, E2, E3; can form chains; reversible by deubiquitinases.
Proteolytic processing: activation of proproteins (e.g., proinsulin to insulin); occurs via specific proteases in the secretory pathway (furin, PC family).
Covalent modifications beyond ubiquitination: other isopeptide bond formations and covalent crosslinks that regulate function and interactions.

Key Takeaways: Structure Governs Function

The central thesis: structure determines function; proteins adopt conformations that enable specific binding, catalysis, signaling, and mechanical work.
Function emerges from the integrated behavior of primary sequence, secondary structures, domain organization, and quaternary assembly.
Stability and dynamics (breathing) allow conformational changes required for activity and regulation.

Quick Reference: Core Equations and Concepts

Michaelis–Menten kinetics (enzyme-substrate reactions):
$v = \frac{V_{max} [S]}{K_m + [S]}$
Km reflects affinity: lower Km indicates higher affinity for substrate.
v at saturating substrate equals Vmax.
3.6 residues per turn in an α-helix; take note for helices and dipole considerations.
Sedimentation coefficient S and its relevance to protein size and shape in centrifugation.

Connecting to Practice and Real-World Relevance

Enzymes as catalysts: lower activation energy by stabilizing transition states; relevance to drug design and biotechnology.
Protein purification and proteomics: isolating and identifying proteins from complex mixtures to study function, interactions, and disease mechanisms.
Structural determination techniques enable drug design, understanding disease mutations, and engineering proteins with novel functions.
Misfolding and amyloids are central to several neurodegenerative and systemic diseases; understanding folding pathways informs therapeutic strategies.

Summary: Hierarchy and Connectivity

Amino acids → primary structure → local secondary structures → tertiary structure (domain formation) → quaternary structure (multimeric assemblies) → function and regulation.
Modular design and domains enable functional diversity and evolutionary plasticity.
Regulation occurs at multiple levels: conformational changes, allosteric binding, covalent modifications, and controlled degradation.
Analytical and purification techniques allow the isolation, characterization, and identification of proteins within complex biological systems.