BIOL4210 – Lecture 3: Proteins – Vocabulary Flashcards

Amino Acids: Building Blocks of Proteins

General formula of an amino acid
- Central alpha carbon (Cα) attached to four substituents: a hydrogen (H), an R (side-chain) group, an amino group (–NH₂), and a carboxyl group (–COOH).
- At physiological pH (≈7), both the amino and carboxyl groups are ionized: the amino group is protonated (–NH₃⁺) and the carboxyl group is deprotonated (–COO⁻).
- At pH 7 a free amino acid is therefore a zwitterion, with $\mathrm{NH_3^+}$ and $\mathrm{COO^-}$ attached to the same Cα: $\mathrm{H_3N^+{-}CH(R){-}COO^-}$.
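The ionization states above follow from the Henderson-Hasselbalch relationship. The sketch below estimates how much of each group is protonated at pH 7. The pKa values (about 2.2 for the α-carboxyl group and about 9.5 for the α-amino group) are typical textbook approximations assumed for illustration, not values from the lecture, and they vary between amino acids.

```python
def fraction_protonated(pH: float, pKa: float) -> float:
    """Fraction of a titratable group in its protonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))

# Assumed, approximate pKa values (not taken from the lecture notes).
PKA_ALPHA_CARBOXYL = 2.2
PKA_ALPHA_AMINO = 9.5

if __name__ == "__main__":
    pH = 7.0
    cooh = fraction_protonated(pH, PKA_ALPHA_CARBOXYL)
    nh3 = fraction_protonated(pH, PKA_ALPHA_AMINO)
    print(f"carboxyl still protonated (-COOH): {cooh:.5f}")
    print(f"amino protonated (-NH3+):          {nh3:.5f}")
```

Under these assumed pKa values, almost none of the carboxyl group stays protonated and almost all of the amino group does, which matches the –COO⁻ / –NH₃⁺ picture.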
Chirality and labeling
- Most amino acids (except glycine) have an asymmetric Cα, allowing enantiomers (L- and D-forms).
- In proteins, only L-amino acids are used. D-amino acids do occur in some bacterial cell walls but not in bacterial proteins; we focus on L-amino acids unless stated otherwise.
- Glycine is achiral because its side chain is hydrogen; it is the only amino acid without a chiral center.
Amino acid side chains define identity
- Each of the 20 standard amino acids has a distinct R-group which determines its chemical properties and behavior in proteins.
- For coding and communication, amino acids have:
  - Full name
  - 3-letter abbreviation
  - 1-letter abbreviation
- Examples:
  - Alanine: Ala, A
  - Glycine: Gly, G
  - Phenylalanine: Phe, F
- The 3-letter abbreviations are generally intuitive; 1-letter abbreviations often correlate with the amino acid name, but not always (e.g., Tryptophan -> W; Glutamic acid -> E).
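One way to drill the abbreviations is a small lookup table. The sketch below maps 1-letter codes to 3-letter codes for the 20 standard amino acids and converts a short sequence. The function name `to_three_letter` is made up for this example, and `MDLY` is the four-residue peptide used later in these notes.

```python
# One-letter -> three-letter codes for the 20 standard amino acids.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

def to_three_letter(sequence: str) -> str:
    """Convert a one-letter sequence (N- to C-terminus) to hyphenated three-letter codes."""
    return "-".join(ONE_TO_THREE[residue] for residue in sequence.upper())

print(to_three_letter("MDLY"))  # Met-Asp-Leu-Tyr
```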
Importance of memorization
- Memorizing the 3-letter and 1-letter abbreviations is useful for exams, publications, and course work.
Grouping amino acids by side-chain properties
- Polar vs nonpolar (hydrophilic vs hydrophobic tendencies).
- Polar group subdivided by charge:
  - Polar uncharged
  - Polar charged: acidic (negative) or basic (positive)
- Examples discussed:
- Basic (positively charged) side chains: Lysine (Lys, K), Arginine (Arg, R), Histidine (His, H)
  - Arginine annotation: “this group is very basic because its positive charge is stabilized by resonance.”
  - Lysine can undergo post-translational modifications (PTMs): acetylation (addition of acetyl group) and ubiquitination (attachment of ubiquitin via an isopeptide bond).
- Acidic (negatively charged) side chains: Aspartic acid (Asp, D) and Glutamic acid (Glu, E)
  - Both contain carboxyl groups in their side chains.
- Nonpolar (hydrophobic) side chains: e.g., Glycine (Gly, G), Methionine (Met, M), Leucine (Leu, L), etc.
  - Glycine is the simplest; its side chain is just H and it lacks a bulky group, contributing to flexibility.
  - Methionine is one of only two sulfur-containing amino acids. It is encoded by the start codon AUG, so newly synthesized proteins begin with methionine at the extreme N-terminus. This initiator methionine may be cleaved off in the mature protein, but methionine is still frequently the first residue.
  - Cysteine (Cys, C) also contains sulfur and can form disulfide bonds (see below).
- Special cases and other notes:
  - Proline (Pro, P) is sometimes called an imino acid because its side chain bonds back to the backbone amino nitrogen, forming a 5-membered ring and leaving a secondary amine. It is still one of the 20 standard amino acids.
  - The ring makes proline relatively rigid, so it disrupts regular secondary structure; it often acts as a helix breaker and is commonly found at or near the start or end of α-helices.
  - Phenylalanine (Phe, F) and Tyrosine (Tyr, Y) both contain a benzene ring; Tyrosine adds a hydroxyl group to the ring.
  - Both are precursors of monoamine neurotransmitters: phenylalanine is converted to tyrosine, which is a precursor of dopamine, norepinephrine, and epinephrine.
  - Tyrosine is also a precursor for melanin synthesis.
  - Uncharged polar side chains: Serine (Ser, S), Threonine (Thr, T), and Tyrosine (Tyr, Y) possess polar groups (e.g., hydroxyl groups) that enable hydrogen bonding.
  - These residues can be sites of post-translational modification, most notably phosphorylation by kinases to form phosphoester bonds, which can be removed by phosphatases.
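The grouping above can be written down as a lookup and used to summarize a sequence. In this sketch the basic, acidic, and Ser/Thr/Tyr entries come from the notes; the remaining assignments (for example Asn and Gln as polar uncharged, and Cys among the nonpolar residues) are an assumed, commonly used textbook grouping, and other sources classify some residues differently.

```python
# Side-chain classes. Entries not listed in the notes are an assumed textbook grouping.
SIDE_CHAIN_CLASS = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "polar uncharged": set("STYNQ"),
    "nonpolar": set("GAVLIPFMWC"),
}

def classify(residue: str) -> str:
    """Return the side-chain class of a one-letter residue code."""
    for name, members in SIDE_CHAIN_CLASS.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"not a standard amino acid: {residue!r}")

def composition(sequence: str) -> dict:
    """Count how many residues of a sequence fall into each class."""
    counts = {name: 0 for name in SIDE_CHAIN_CLASS}
    for residue in sequence:
        counts[classify(residue)] += 1
    return counts

print(composition("MDLY"))
```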
Summary table and visuals
- A comprehensive summary table lists the 20 amino acids with their full names, 3-letter abbreviations, and 1-letter abbreviations.
- The amino acids are displayed in two broad groups (polar vs nonpolar) with polar further subdivided into acidic, basic, and uncharged.
- A goal is to recognize all amino acid structures from memory, aided by the table.

Protein Primary Structure and Folding: From Sequence to Shape

Short polypeptide as example
- Proteins are long, unbranched, linear chains of amino acids linked head-to-tail by peptide bonds.
- The direction of synthesis is from N-terminus to C-terminus during translation.
- Example sequence shown: methionine → aspartic acid → leucine → tyrosine (N- to C-terminus).
- The N-terminus is the left end; the C-terminus is the right end.
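Because the ribosome reads mRNA 5'→3' and adds residues from the N-terminus toward the C-terminus, the example peptide can be reproduced by translating a short message. The sketch below is a toy translator: the mRNA string is made up for illustration, and the codon table is deliberately partial (only the standard-code codons needed here), so any other codon raises a `KeyError`.

```python
# Partial codon table: only the codons needed for this example are included.
CODON_TABLE = {"AUG": "Met", "GAU": "Asp", "CUG": "Leu", "UAU": "Tyr", "UAA": "Stop"}

def translate(mrna: str) -> list:
    """Read codons 5'->3' from the first AUG; residues are returned N- to C-terminus."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon found
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide

# Hypothetical message with a two-nucleotide leader before the start codon.
print(translate("GGAUGGAUCUGUAUUAA"))  # ['Met', 'Asp', 'Leu', 'Tyr']
```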
Primary structure determines folding and domains
- The identity and properties of each amino acid in the linear sequence influence how the chain folds into secondary and tertiary structures.
Backbone architecture and phi/psi angles
- Each amino acid contributes three backbone bonds: the peptide bond, the N–Cα bond, and the Cα–C(=O) bond.
- Peptide bonds are rigid; rotation occurs around the two bonds adjacent to the α-carbon, characterized by the dihedral angles φ (phi) and ψ (psi).
- The angles φ and ψ for each residue influence the local and overall folding of the protein.
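A dihedral angle is defined by four consecutive atoms: φ uses C(i−1)–N–Cα–C and ψ uses N–Cα–C–N(i+1). As a minimal sketch, the function below computes a dihedral from four 3D points using the standard atan2 formulation; the helper names are made up for this example and the test coordinates are artificial, not taken from a real structure.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle in degrees about the p1-p2 bond (0 = cis, +/-180 = trans)."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Artificial points: the last atom is rotated a quarter turn about the central bond.
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applying the same function to the backbone atoms of every residue gives the (φ, ψ) pairs that are plotted in a Ramachandran plot.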
Factors influencing folding beyond backbone angles
- Side-chain properties affect conformational flexibility and steric hindrance.
- Bulky side chains can constrain conformations; proline is particularly rigid and reduces the likelihood of forming regular secondary structure like α-helices.
- Noncovalent interactions (beyond the backbone) contribute to stabilization (see next section).
Secondary structure motifs and backbone hydrogen bonding
- α-helix and β-sheet are the two main canonical motifs, stabilized primarily by hydrogen bonds between backbone N–H and C=O groups.
- Side-chain interactions are less critical for forming these motifs but can influence the stability and geometry.
- Historical notes:
  - The α-helix was first identified in α-keratin (hair, nails).
  - The β-sheet was first identified in fibroin (silk).
Hydrophobic effect and protein folding
- Nonpolar (hydrophobic) side chains tend to cluster in the protein core, away from water, driving compact folding.
- Polar (hydrophilic) side chains tend to be exposed to solvent, forming interactions with water.
- This segregation into a hydrophobic core and polar surface stabilizes the folded state.
Coiled-coils and higher-order motifs
- Some α-helices display a hydrophobic stripe along one side, formed by nonpolar residues at positions a and d of a repeating seven-residue (heptad) sequence, abcdefg. Such a helix can wrap around another helix with a complementary stripe to form a coiled-coil.
- Coiled-coils are common in structural proteins (e.g., α-keratin) and some transcription factors.
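The heptad pattern is easy to see by labeling each residue with its position in the repeat. The sketch below assumes an unbroken heptad repeat with a known starting position; the function names are invented for this example and the sequence is made up, not a real protein.

```python
HEPTAD = "abcdefg"

def heptad_register(sequence: str, first_position: str = "a") -> list:
    """Pair each residue with its heptad position, assuming an unbroken repeat."""
    offset = HEPTAD.index(first_position)
    return [(residue, HEPTAD[(offset + i) % 7]) for i, residue in enumerate(sequence)]

def stripe_residues(sequence: str, first_position: str = "a") -> str:
    """Residues at positions a and d, which form the hydrophobic stripe."""
    return "".join(res for res, pos in heptad_register(sequence, first_position)
                   if pos in "ad")

# Made-up two-heptad sequence for illustration only.
print(stripe_residues("LKELEDKLKELEDK"))  # LLLL
```

In a real coiled-coil the register has to be inferred from the sequence, which this sketch does not attempt.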
Serine proteases: elastase and chymotrypsin as folding exemplars
- Elastase and chymotrypsin share a highly similar fold (structure) despite divergent amino-acid sequences.
- Only a subset of residues (green-labeled in a figure) are identical; the overall fold is conserved while sequence varies.
- Take-away: fold and function can be conserved even with substantial primary sequence variation.
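"Only a subset of residues are identical" is usually quantified as percent sequence identity over an alignment. The sketch below is a minimal version that assumes the two sequences are already aligned (same length, `-` for gaps) and ignores gap columns; the strings are made-up fragments, not the elastase or chymotrypsin sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned (non-gap) columns where both sequences have the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)

# Hypothetical aligned fragments for illustration.
print(percent_identity("ACDE", "ACDF"))  # 75.0
```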
Quaternary structure and protein complexes
- Separate proteins can form stable interactions to create complexes (quaternary structure).
- Example: Cro repressor from bacteriophage λ forms a dimer (two subunits, green and red) with a binding interface (yellow) that involves noncovalent interactions (hydrophobic interactions, van der Waals forces).
- Definitions:
  - Homodimer: two identical proteins.
  - Heterodimer: two different proteins.
  - Multimers (e.g., trimer, tetramer, hexamer) are complexes of multiple subunits.
Disulfide bonds and cellular context
- Cysteine residues can form disulfide bonds (S–S) within or between polypeptide chains.
- Interchain disulfide bonds link separate polypeptides; intrachain bonds link two regions within the same polypeptide.
- Cellular environment matters:
  - The cytosol is reducing, so disulfide bonds are uncommon in cytosolic proteins.
  - Disulfide bonds are common in secreted proteins (e.g., insulin, immunoglobulins) and cell-surface proteins; they form in the endoplasmic reticulum during processing.
- In laboratory settings, reducing agents like β-mercaptoethanol are used to break disulfide bonds and linearize proteins for analysis.

Protein Domains: Modular, Independently Folding Units

What is a domain?
- A domain is a region of a protein that folds independently and has its own discrete fold, sitting between secondary structure elements and the overall tertiary structure.
- A protein may be composed of multiple domains, each with its own structure and often function.
- Domains can be built from different secondary structures (α-helices, β-sheets, or both).
- Domains frequently carry their own functions (e.g., enabling dimerization, binding DNA/lipids, or harboring enzymatic activity).
Src protein kinase as a paradigmatic example
- Src kinase contains three domains: a C-terminal catalytic kinase domain (ATP-binding pocket) and two regulatory domains, SH2 and SH3.
- SH2 binds phosphotyrosine residues; SH3 binds proline-rich sequences with hydrophobic residues.
- Structural representations of domains can vary: backbone model, ribbon, space-filling, and wireframe; often a mixed representation is used in figures.
Visualizing and comparing domains
- SH2 domain visualization shows differences in representation styles (backbone, ribbon, space-filling, wireframe) to highlight fold, active sites, and interaction surfaces.
- Examples of single-domain proteins: Cytochrome b562 (α-helical bundle), NAD-binding domain of lactate dehydrogenase (α/β mix), and the immunoglobulin variable domain (β-sheet core).
- Orthologue comparison: overlaying domains from distant species (e.g., yeast vs Drosophila) reveals highly conserved folds despite low sequence identity (e.g., only a subset of residues identical; the fold remains nearly identical).
Modularity and domain architecture in larger proteins
- Tandem duplication can create repeated identical domains within a protein, as seen in fibronectin (four consecutive Fn3 domains) and cadherins (repeated cadherin domains).
- Domain shuffling: evolution can join DNA encoding different domains to create new multi-domain proteins with novel functions.
- Many proteins implicated in signaling or digestion are built from multiple domains; proteases often combine a catalytic protease domain with regulatory or targeting domains (e.g., factor IX with a calcium-binding domain and EGF domains; plasminogen with a serine protease domain and multiple Kringle domains).
Take-home messages about domains
- Domains are separate, independently folding modules that endow specific properties (enzymatic activity, binding capabilities, regulatory roles).
- Many proteins share the same or similar domains due to domain duplication and domain shuffling, contributing to functional diversity across the proteome.
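Domain architecture can be treated as an ordered list of modules, which makes shared and repeated domains easy to query. In the sketch below the domain names follow the notes, but the N- to C-terminal order and the repeat counts are simplified, illustrative assumptions rather than a curated annotation.

```python
# Simplified N- to C-terminal domain lists. Domain names follow the notes;
# the order and repeat counts are illustrative assumptions.
ARCHITECTURES = {
    "Src": ["SH3", "SH2", "kinase"],
    "Factor IX": ["calcium-binding", "EGF", "EGF", "serine protease"],
    "Plasminogen": ["Kringle", "Kringle", "Kringle", "Kringle", "Kringle",
                    "serine protease"],
}

def proteins_with(domain: str) -> list:
    """Proteins whose architecture contains the given domain (shared module)."""
    return sorted(name for name, domains in ARCHITECTURES.items() if domain in domains)

def repeated_domains(protein: str) -> dict:
    """Domains that occur more than once in a protein (candidate duplications)."""
    domains = ARCHITECTURES[protein]
    return {d: domains.count(d) for d in set(domains) if domains.count(d) > 1}

print(proteins_with("serine protease"))
print(repeated_domains("Plasminogen"))
```

A domain found in several proteins is consistent with domain shuffling, and a domain repeated within one protein is consistent with duplication.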
Examples that illustrate domains in action
- Fibronectin: extracellular matrix protein containing tandem fibronectin type III (Fn3) domains (four consecutive repeats in the example above), illustrating domain repetition.
- Domain shuffling is a mechanism by which new multi-domain proteins arise, combining domains from different genes.
- Proteases with multiple domains often have regulated activity tied to the presence of additional domains.

Covalent Modifications and Regulation of Proteins

What is covalent modification?
- Covalent modification is the covalent conjugation of chemical groups onto a protein, typically enzyme-catalyzed, and often reversible by another enzyme.
- In this course, most covalent modifications are treated as enzyme-directed and reversible by specific enzymes (erasers).
Summary of major covalent modifications (conceptual table)
- Phosphorylation (Ser/Thr/Tyr): add phosphate groups via kinases; remove via phosphatases; can drive conformational changes and assembly of large complexes; involved in regulation of receptor tyrosine kinases and many signaling pathways.
- Methylation and acetylation: alter chromatin accessibility and gene expression; especially important for histones and transcriptional regulation.
- Palmitoylation: addition of a fatty acid (palmitoyl group) to promote membrane association.
- GlcNAcylation (N-acetylglucosamine): addition of sugars; involved in various regulatory roles including glucose homeostasis; less common.
- Ubiquitination: conjugation of ubiquitin (a small protein, 76 amino acids; 8.6 kDa) to target proteins; highly versatile and central to signaling and protein turnover.
Ubiquitin: structure and signaling codes
- Ubiquitin can be attached as:
  - Monoubiquitination: single ubiquitin on a lysine residue of the substrate; example: histones can be ubiquitinated; regulatory roles beyond degradation.
  - Multiubiquitination: multiple ubiquitins attached to several lysines on the substrate.
  - Polyubiquitination: chains formed on substrates; ubiquitin itself has lysine residues that can form chains (Lys-48, Lys-63, among others).
- Key linkage types and outcomes:
  - K48-linked chains: target proteins to proteasomes for degradation.
  - K63-linked chains: generally alter protein activity or promote protein–protein interactions and complex formation; important for DNA repair and other signaling processes.
  - Other linkages (K6, K11, K27) exist but are less central in this course.
- Ubiquitin can itself be ubiquitinated; chains can be expanded on ubiquitin molecules.
- Phosphorylation can occur on ubiquitin or ubiquitin chains (e.g., phospho-ubiquitin at Ser65), adding complexity to ubiquitin signaling.
- The ubiquitin code concept: different types of ubiquitination convey distinct cellular outcomes and are recognized by specific proteins.
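The "code" idea can be sketched as a lookup from the type of ubiquitination to its outcome. The function below is a deliberately simplified reader that uses only the outcomes summarized in these notes; its name and parameters are invented for illustration, and real ubiquitin signaling (mixed or branched chains, phospho-ubiquitin) is far richer.

```python
# Outcomes as summarized in the notes; other linkages are left uninterpreted here.
LINKAGE_OUTCOME = {
    "K48": "proteasomal degradation",
    "K63": "altered activity / complex formation (e.g., DNA repair)",
}

def read_ubiquitin_code(n_ubiquitins: int, n_sites: int = 1, linkage: str = None) -> str:
    """Very simplified reader for a substrate's ubiquitination state."""
    if n_ubiquitins <= 0:
        return "unmodified"
    if linkage is not None:  # a chain with a defined lysine linkage
        outcome = LINKAGE_OUTCOME.get(linkage, "outcome not covered in these notes")
        return f"polyubiquitination ({linkage}-linked): {outcome}"
    if n_sites > 1:  # single ubiquitins on several lysines
        return "multiubiquitination"
    return "monoubiquitination"

print(read_ubiquitin_code(4, linkage="K48"))
```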
Enzymatic machinery for ubiquitination
- A three-enzyme cascade is required:
  - E1: ubiquitin-activating enzyme
  - E2: ubiquitin-conjugating enzyme
  - E3: ubiquitin ligase
- This cascade provides specificity for which substrates get ubiquitinated; substrate recognition is mainly the job of the E3 ligase.
Other ubiquitin-like modifiers
- SUMO and other ubiquitin-like proteins can be conjugated to substrates, expanding the regulatory repertoire beyond ubiquitin alone.
Modifications as regulatory logic (Boolean-like behavior)
- Phosphorylation can act as a molecular switch, turning protein activity on/off reversibly.
- Proteins may require multiple modifications at multiple sites for full activity (AND logic), or may be inhibited/activated by different combinations of signals.
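The AND-logic case can be written as a set comparison: the protein is fully active only if every required site is modified and no inhibitory site is. This is a toy model under that assumption; the site names are hypothetical and do not refer to a real protein.

```python
def is_fully_active(modified_sites: set, required_sites: set,
                    inhibitory_sites: set = frozenset()) -> bool:
    """AND logic: every required site is modified and no inhibitory site is."""
    return required_sites <= modified_sites and not (inhibitory_sites & modified_sites)

# Hypothetical site names, for illustration only.
required = {"pSer-A", "pThr-B"}
print(is_fully_active({"pSer-A"}, required))            # False
print(is_fully_active({"pSer-A", "pThr-B"}, required))  # True
```

Passing an `inhibitory_sites` set models the case where a different signal overrides activation.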
p53 as an example of multisite regulation
- p53 can be phosphorylated, ubiquitinated, acetylated, and sumoylated at various sites; different enzymes drive each modification in response to signals.
- The integration of multiple signals allows for nuanced control of p53 activity and cellular outcomes (DNA damage response, cell cycle regulation, apoptosis).
Kinase-regulated example: Src protein kinase revisited
- In the inactive state, Src is phosphorylated on a tyrosine in its C-terminal tail; the SH2 domain binds this phosphotyrosine and holds the kinase in a closed, inactive conformation.
- Dephosphorylation of this tyrosine frees SH2 to bind a phosphotyrosine on an activating ligand; SH3 binding to the same ligand partially activates the kinase.
- Autophosphorylation of a tyrosine within the kinase domain yields full activation and substrate phosphorylation.
- This illustrates how phosphorylation integrates with domain interactions to regulate activity.
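The activation sequence above can be summarized as a small decision function over three inputs. This is only a sketch of the logic in these notes; the function and parameter names are invented, and the "tail released but no ligand" branch is an assumption, since the notes do not describe that state.

```python
def src_state(tail_tyr_phosphorylated: bool, ligand_bound: bool,
              activation_tyr_phosphorylated: bool) -> str:
    """Map three inputs onto the activation states described in the notes."""
    if tail_tyr_phosphorylated:
        # SH2 binds the phosphorylated tail tyrosine and keeps the kinase closed.
        return "inactive (closed)"
    if not ligand_bound:
        # Assumption: without an activating ligand the kinase is not yet switched on.
        return "not activated (no ligand bound)"
    if activation_tyr_phosphorylated:
        return "fully active"
    return "partially active"

print(src_state(True, False, False))
print(src_state(False, True, False))
print(src_state(False, True, True))
```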
GTP-binding proteins as molecular switches (contrast to phosphorylation)
- GTP-binding proteins are on when bound to GTP and off when bound to GDP; their intrinsic GTPase activity hydrolyzes the bound GTP to GDP, which switches them off.
- GDP dissociation is slow; guanine nucleotide exchange factors (GEFs) promote GDP release and GTP binding, while GTPase-activating proteins (GAPs) accelerate GTP hydrolysis.
- GTPases regulate processes like nuclear import, vesicular trafficking, etc., and can be regulated similarly to phosphorylation, but via a different molecular mechanism.
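The opposing roles of GEFs and GAPs can be illustrated with a two-state model. The sketch assumes simple first-order interconversion between the GDP-bound and GTP-bound forms, so the steady-state "on" fraction is k_exchange / (k_exchange + k_hydrolysis). The rate values are arbitrary illustrative numbers, not measurements.

```python
def fraction_gtp_bound(k_exchange: float, k_hydrolysis: float) -> float:
    """Steady-state 'on' fraction for GDP-form <-> GTP-form with first-order rates."""
    return k_exchange / (k_exchange + k_hydrolysis)

# Arbitrary illustrative rates (per second); not measured values.
basal = fraction_gtp_bound(k_exchange=0.01, k_hydrolysis=0.01)
with_gef = fraction_gtp_bound(k_exchange=1.0, k_hydrolysis=0.01)  # GEF speeds exchange
with_gap = fraction_gtp_bound(k_exchange=0.01, k_hydrolysis=1.0)  # GAP speeds hydrolysis

print(f"basal {basal:.2f}, +GEF {with_gef:.2f}, +GAP {with_gap:.2f}")
```

In this toy model a GEF pushes the population toward the on state and a GAP pushes it toward the off state, without any covalent modification of the protein itself.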

Protein–Ligand and Protein–Protein Interactions

Protein–ligand selectivity and binding sites
- Enzymes and receptors exhibit high specificity for ligands; binding pockets are shaped to fit particular ligands with complementary chemistry.
- Binding requires multiple noncovalent interactions (e.g., hydrogen bonds, ionic interactions, hydrophobic contacts) between ligand groups and amino acid side chains within the binding site.
- The cumulative strength of many weak noncovalent interactions yields a strong overall binding.
- Water exclusion from binding sites helps stabilize binding by preventing disruption of hydrogen-bond networks; the binding site is effectively kept dry by energetically favoring ligand–site interactions over water–site interactions.
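The claim that many weak interactions add up to strong binding can be made concrete with the relation Kd = exp(ΔG/RT), where ΔG is the (negative) binding free energy. The sketch below assumes that contacts contribute independently and additively, about −1 kcal/mol each; both the additivity and that per-contact value are simplifying assumptions for illustration only.

```python
import math

R_KCAL = 0.001987  # gas constant in kcal/(mol*K)

def dissociation_constant(n_interactions: int, energy_per_interaction: float = -1.0,
                          temperature_k: float = 310.0) -> float:
    """Kd (M) if n independent contacts each add the same free energy (kcal/mol)."""
    delta_g = n_interactions * energy_per_interaction    # total binding free energy
    return math.exp(delta_g / (R_KCAL * temperature_k))  # Kd = exp(dG/RT), 1 M standard state

for n in (1, 5, 10):
    print(n, f"{dissociation_constant(n):.2e}")
```

Because the energies add while Kd depends on them exponentially, each extra contact tightens binding by a constant multiplicative factor in this model.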
Active sites and catalytic architecture
- Binding sites may rearrange to position reactive residues for catalysis (e.g., a Ser-Asp-His catalytic triad in serine proteases).
- An example catalytic triad: Aspartate, Histidine, and Serine coordinate to facilitate peptide bond hydrolysis in serine proteases.
Allosteric regulation
- Many enzymes have more than one ligand-binding site; binding at one site can modulate binding or activity at another site (allostery).
- Positive regulation: binding of a second molecule (X) enhances substrate binding or activity; in the lecture example, a glucose-binding protein has a higher affinity for glucose when X is bound.
- Negative regulation: binding of the second molecule inhibits activity (e.g., product inhibition) to prevent excess production.
- Allosteric regulation is a common way to couple metabolic control and signal integration.
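Positive allosteric regulation can be pictured as a change in the dissociation constant for the substrate. The sketch below uses simple one-site binding, fraction bound = [L] / (Kd + [L]), and assumes that X lowers the Kd for glucose tenfold; the numbers and units are hypothetical.

```python
def fraction_bound(ligand_conc: float, kd: float) -> float:
    """Fraction of protein occupied by ligand for simple one-site binding."""
    return ligand_conc / (kd + ligand_conc)

# Hypothetical numbers (arbitrary concentration units): X lowers Kd for glucose tenfold.
KD_WITHOUT_X = 10.0
KD_WITH_X = 1.0
glucose = 1.0

print(f"without X: {fraction_bound(glucose, KD_WITHOUT_X):.2f}")
print(f"with X:    {fraction_bound(glucose, KD_WITH_X):.2f}")
```

A negative regulator would be modeled the same way, with the bound regulator raising Kd instead of lowering it.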
Protein–protein interactions and complex formation
- Proteins interact via complementary surfaces; surface shapes determine compatibility (lock-and-key analogy).
- Interaction surfaces can be:
  - Unstructured region inserting into a groove (surface-string interaction).
  - Helix–helix interactions (coiled-coils driven by hydrophobic stripes).
  - Rigid complementary surfaces that interlock (fits like puzzle pieces).
- Complexes can be static or dynamic; they can rearrange and undergo conformational changes powered by energy sources like ATP hydrolysis, enabling molecular machines.
- Scaffold proteins help organize multiple components by providing multivalent interaction surfaces, increasing the efficiency and likelihood of complex formation.
Functional consequences of protein complexes
- Large enzyme complexes enable substrate channeling, where products pass directly from one enzyme to the next without diffusion into solvent, increasing efficiency.
- Complexes are especially important in signaling cascades, metabolism, and structural assemblies.

Key Takeaways and Exam-Relevant Concepts

Proteins are built from 20 standard amino acids with distinct properties that dictate folding, stability, and function.
The hierarchy of structure (primary, secondary, tertiary, quaternary) is shaped by backbone chemistry, side-chain interactions, and noncovalent forces; covalent modification adds another regulatory layer.
The concept of domains highlights how proteins are built from modular units with discrete folds and functions; domain duplication and shuffling are major evolutionary strategies for generating complexity.
Covalent modifications (phosphorylation, methylation, acetylation, palmitoylation, GlcNAcylation, ubiquitination, etc.) regulate activity, interactions, localization, and turnover; many modifications form complex regulatory codes (e.g., ubiquitin code with K48 and K63 linkages).
GTP-binding proteins function as molecular switches, providing timing and coordination for processes like trafficking and signaling, with GEFs and GAPs controlling their on/off state.
Protein–ligand and protein–protein interactions rely on complementary surfaces and noncovalent bonds; water exclusion and conformational changes are critical for binding and activity.
Modularity and domain architecture enable proteins to participate in large, dynamic networks and complexes, often acting as molecular machines or scaffolds.

Notation and Equations (Quick Reference)

Physiological ionization state of amino acids at pH ≈ 7:
- $\mathrm{NH_3^+}$ on the α-amino group and $\mathrm{COO^-}$ on the α-carboxyl group (in a polypeptide, these are the free groups at the N- and C-termini).
Start codon for methionine in translation:
- $\text{AUG}$ (codes for methionine).
Ubiquitin linkage impact (conceptual):
- K48-linked chains target proteins to proteasomes for degradation.
- K63-linked chains modulate activity and promote complex formation and DNA repair processes.
Key domain and interaction terminology
- SH2 domain: binds phosphotyrosine residues.
- SH3 domain: binds proline-rich sequences with hydrophobic residues.

Topics for Further Reading (If You Want to Dive Deeper)

Protein folding and Ramachandran plots (phi/psi angle space)
Structural representations: backbone, ribbon, space-filling, and wireframe models
Examples of domain architectures in Src kinase, Cytochrome b562, LDH NAD-binding domain, and immunoglobulin domains
Mechanisms of domain duplication and domain shuffling in fibronectin and cadherins
Ubiquitin signaling codes and the E1/E2/E3 enzymatic cascade
Allosteric regulation principles across metabolic enzymes

Explanation notes: The content above tracks the material from the lectures on Proteins (Amino acids, structure, domains, covalent modifications, and interactions). It emphasizes the building blocks, how sequence dictates folding, the concept of domains, covalent regulatory mechanisms, and how proteins interact with ligands and other proteins to perform cellular functions. It integrates specific examples (Src kinase, Cro repressor, serine proteases, fibronectin, GTPases, p53) to illustrate concepts and highlight common exam topics and foundational principles in protein biochemistry.