Protein Three-Dimensional Structure – Study Notes

Primary Structure

Polypeptides are chains of amino acids linked by peptide bonds; each amino acid in a protein is called a residue.
Polypeptide chains have directionality: the amino-terminal end is taken as the start of the chain and the carboxyl-terminal end as its end.
Primary structure is always written from the amino terminal to the carboxyl terminal (left to right).
The polypeptide consists of a repeating backbone (main chain) and a variable side chain (R group).
The backbone has hydrogen-bonding potential due to carbonyl groups and hydrogens bonded to the amide nitrogen.
Most proteins contain roughly $50$ to $2000$ amino acids ($50 \le N \le 2000$).
The mean molecular weight for an amino acid is approximately $M \approx 110\ \mathrm{g\,mol^{-1}}$.
In some proteins, cross-links can occur via disulfide bonds; disulfide bonds form by the oxidation of two cysteines to form cystine.
The sequence and arrangement of residues along the chain determine the protein’s structure and properties.

Bonding and Backbone Characteristics

The peptide bond is essentially planar.
In the peptide linkage, six atoms lie in one plane: $C_\alpha(i)$, C, O, N, H, and $C_\alpha(i+1)$.
The peptide bond has partial double-bond character due to resonance, which restricts rotation around the bond.
The peptide bond is uncharged.

Torsion Angles and Chain Conformation

Rotation is allowed about two bonds:
- the N–C$\alpha$ bond, defining the torsion angle $\phi$ (phi)
- the C$\alpha$–carbonyl bond, defining the torsion angle $\psi$ (psi)
The allowed rotation about these bonds (the torsion angles) determines the path of the polypeptide chain.
Not all combinations of $\phi$ and $\psi$ are permitted; steric constraints limit conformations.

Ramachandran Plots

Ramachandran plots depict allowed and disallowed regions of the $\phi$ (horizontal) and $\psi$ (vertical) angles.
Typical favored regions correspond to common secondary structures; disfavored regions are sterically prohibited for most residues.
The plot illustrates that only a subset of the $(\phi, \psi)$ space is compatible with a viable backbone conformation.
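
The idea of favored regions can be sketched as a lookup on a $(\phi, \psi)$ pair. The rectangular limits below are illustrative assumptions chosen for this sketch, not published boundaries; real Ramachandran plots use contoured regions derived from observed structures. The function name `classify_phi_psi` is hypothetical.

```python
def classify_phi_psi(phi, psi):
    """Roughly label a (phi, psi) pair given in degrees.

    The box limits are illustrative assumptions, not authoritative
    boundaries of the favored regions.
    """
    # Right-handed alpha-helical region (phi and psi both negative).
    if -100 <= phi <= -30 and -80 <= psi <= -10:
        return "alpha-helix region"
    # Beta-strand region (phi negative, psi strongly positive).
    if -180 <= phi <= -45 and 90 <= psi <= 180:
        return "beta-sheet region"
    # Left-handed helical region (phi positive), rare for most residues.
    if 30 <= phi <= 90 and 0 <= psi <= 90:
        return "left-handed helix region"
    return "other / disfavored"


print(classify_phi_psi(-57, -47))
print(classify_phi_psi(-120, 130))
print(classify_phi_psi(60, -120))
```

A box-based lookup is only a teaching device; it ignores residue-specific behavior such as the wider range available to glycine.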

Secondary Structure

Secondary structure is the 3D arrangement formed by hydrogen bonds between peptide N-H and C=O groups of amino acids near one another in sequence.
Major examples include:
- α-helix
- β-sheet
- turns
Hydrogen bonding pattern in α-helix: hydrogen bonds occur between C=O of residue $i$ and N–H of residue $i+4$ (i → i+4); each turn typically contains about 3.6 residues.

α-Helix (Overview)

The α-helix is a right-handed helix in which the backbone is stabilized by intramolecular hydrogen bonds between carbonyl O of residue $i$ and amide N–H of residue $i+4$.
Each turn contains ~$3.6$ residues.
Side chains project outward from the helix axis.
The helical geometry results from favorable packing and hydrogen bonding; deviations can occur due to residues like proline.

β-Sheet

The β-sheet is another common form of secondary structure.
β-strands align side-by-side to form β-sheets; strands can be adjacent and hydrogen-bond to each other.
Hydrogen bonds link the strands; the strands may be parallel, antiparallel, or mixed in orientation.
β-sheets can be flat or adopt a twisted conformation.

Keratin and Collagen: Structural Proteins

α-Keratin is a structural protein found in wool and hair.
- Composed of two right-handed α-helices intertwined to form a left-handed superhelix (a coiled-coil).
- Helices interact via ionic bonds or van der Waals interactions.
- Keratin belongs to the coiled-coil superfamily, which includes some cytoskeletal and muscle proteins.
Collagen is a structural protein found in skin, bone, tendons, cartilage, and teeth.
- Collagen consists of three intertwined helical polypeptide chains forming a superhelical cable (a triple helix).
- The chains are not α-helices; glycine appears at every third residue and the motif Gly-X-Y (commonly Gly-Pro-Hyp, where Hyp is hydroxyproline) is prevalent.

Hydroxyproline and Vitamin C

Hydroxyproline is a post-translationally modified proline (4-hydroxyproline) important for stabilizing collagen's triple helix.
Hydroxylation requires the enzyme prolyl hydroxylase and vitamin C as a cofactor.
Vitamin C deficiency impairs prolyl hydroxylase activity, leading to scurvy in humans.
Humans cannot synthesize vitamin C (ascorbic acid) endogenously.

Tertiary Structure

Tertiary structure describes the spatial arrangement of amino acids that are distant in the primary sequence and the pattern of disulfide bonding.
Globular proteins (e.g., myoglobin) form compact, tightly packed structures with little internal void space.
The interior is predominantly hydrophobic amino acids, while the exterior comprises charged and polar residues.

Motifs and Domains

Motifs (supersecondary structures) are combinations of secondary structure elements that recur in many proteins.
Some proteins contain two or more similar or identical compact structures called domains.
Domains often correspond to distinct functional or structural units within a protein.

Quaternary Structure

Many proteins are composed of multiple polypeptide chains, called subunits.
Proteins with multiple subunits display quaternary structure.
Quaternary structure ranges from simple dimers (two identical subunits) to complex assemblies with many different chains.

Anfinsen’s Demonstration: Protein Folding Is Encoded in the Primary Sequence

Christian Anfinsen studied ribonuclease under denaturing and reducing conditions.
He used urea to disrupt hydrogen bonds and β-mercaptoethanol to reduce disulfide bonds, yielding a random coil with no enzymatic activity—denatured ribonuclease.
When urea and β-mercaptoethanol were slowly removed, the enzyme refolded and regained native structure and activity—renatured ribonuclease.
Conclusion: The information required for a polypeptide to fold into its functional three-dimensional structure is inherent in the amino acid sequence.
This supports the statement: Three-Dimensional Structure is Determined by Amino Acid Sequence.

Role of β-Mercaptoethanol in Disulfide Bond Chemistry

β-mercaptoethanol acts as a reducing agent, breaking disulfide bonds (R–S–S–R’ is converted to R–SH and R’–SH).
In the process, β-mercaptoethanol itself is oxidized and forms disulfide dimers (RS–SR).
This redox interplay is key to understanding the denaturation and reformation of disulfide bonds during folding experiments.

RNase Folding Experiment: Experimental States (Visual Data)

The experiment compares several states of ribonuclease:
- Native ribonuclease (fully folded and active).
- Denatured, reduced ribonuclease (unfolded with disulfide bonds broken).
- Scrambled ribonuclease (disulfide bonds re-formed in incorrect patterns under oxidizing conditions, often after exposure to urea).
The experimental setup typically involves 8 M urea as a strong chaotropic agent and β-mercaptoethanol as the reducing agent to break disulfide bonds.
Denatured reduced ribonuclease represents the unfolded state with all disulfide bonds broken.
Scrambled ribonuclease results from allowing oxidation to occur in the presence of urea, leading to incorrect disulfide bond formation.
Native ribonuclease is recovered when the denaturing agents are removed in a controlled manner, demonstrating that the primary sequence contains the information necessary for correct folding and disulfide bond formation under appropriate conditions.
Note: The schematic labels the states Native, Denatured, Reduced, and Scrambled, and uses indicators (e.g., presence or absence of disulfide links and the effect of urea and BME) to illustrate the folding pathway and the reversibility of RNase denaturation.

Primary Structure

Polypeptides are linear chains of amino acids covalently linked by peptide bonds, which form between the carboxyl group of one amino acid and the amino group of another, with the expulsion of a water molecule. Each amino acid within a protein is referred to as a residue.
Polypeptide bond directionality is crucial: the amino-terminal (N-terminal) end, bearing a free amino group, is conventionally considered the start of the chain (and typically written on the left), while the carboxyl-terminal (C-terminal) end, with a free carboxyl group, marks the conclusion (and is written on the right).
Therefore, primary structure is always written and read from the amino terminal to the carboxyl terminal (left to right), reflecting the direction of protein synthesis.
The repeating unit of the polypeptide is the backbone (main chain), consisting of the N, C $\alpha$ , and C (carbonyl) atoms shared by all amino acids, and a variable side chain (R group) that defines the unique chemical properties of each amino acid.

The backbone possesses inherent hydrogen-bonding potential due to the presence of electronegative carbonyl oxygen atoms (C=O) and hydrogen atoms bonded to the amide nitrogen (N–H) at regular intervals, facilitating secondary structure formation.
Most proteins are relatively large macromolecules, typically containing roughly $50$ to $2000$ amino acids ( $50 \le N \le 2000$ ) in a single chain.
The mean molecular weight for a typical amino acid residue is approximately $M \approx 110\ \mathrm{g\,mol^{-1}}$ , which can be used to estimate protein molecular weight.
In some proteins, additional covalent cross-links can occur via disulfide bonds. These bonds form by the oxidation of the sulfhydryl groups of two cysteine residues ( $2\, R{-}SH$ ) to form a single cystine residue ( $R{-}S{-}S{-}R$ ), playing a vital role in stabilizing tertiary and quaternary structures, particularly in extracellular proteins.
Ultimately, the precise sequence and arrangement of amino acid residues along the polypeptide chain (primary structure) is the fundamental determinant of the protein's subsequent higher-order structures (secondary, tertiary, quaternary) and therefore its overall function and properties.
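
The mass estimate mentioned above is simple arithmetic: multiply the residue count by the mean residue mass of about $110\ \mathrm{g\,mol^{-1}}$. A minimal sketch follows; the function names are hypothetical, and the result is only a rough estimate because real residue masses vary with composition.

```python
MEAN_RESIDUE_MASS = 110.0  # g/mol, the approximate mean quoted in these notes


def estimate_mass_kda(n_residues):
    """Approximate protein mass in kDa from the number of residues."""
    return n_residues * MEAN_RESIDUE_MASS / 1000.0


def estimate_residue_count(mass_kda):
    """Approximate number of residues from a protein mass in kDa."""
    return round(mass_kda * 1000.0 / MEAN_RESIDUE_MASS)


# The typical size range of 50 to 2000 residues expressed as masses.
print(estimate_mass_kda(50), "kDa")
print(estimate_mass_kda(2000), "kDa")
print(estimate_residue_count(55), "residues")
```

By this estimate, the stated range of 50 to 2000 residues corresponds to roughly 5.5 to 220 kDa.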

Bonding and Backbone Characteristics

The peptide bond, formed between the carboxyl group of one amino acid and the amino group of another, is essentially planar, meaning the six atoms involved in the bond and its immediate surroundings lie in a single plane. This planarity is a critical feature defining polypeptide backbone geometry.
Specifically, the six atoms that lie in one plane are: the alpha carbons of the preceding ( $C\alpha(i)$ ) and succeeding ( $C\alpha(i+1)$ ) residues, the carbonyl carbon (C), the carbonyl oxygen (O), the amide nitrogen (N), and the amide hydrogen (H).
The peptide bond exhibits partial double-bond character (approximately $40\%$ ) due to resonance between the carbonyl oxygen and the amide nitrogen. This delocalization of electrons restricts free rotation around the C–N peptide bond, making it quite rigid.
The peptide bond is uncharged under physiological pH conditions, which is important for maintaining the overall charge neutrality of the protein backbone, although the N- and C-termini and ionizable side chains contribute to the overall charge.

Torsion Angles and Chain Conformation

While the peptide bond itself is rigid, rotation is freely allowed about two specific bonds within each amino acid residue of the polypeptide backbone:
- The N–C $\alpha$ bond: This rotation defines the torsion angle $\phi$ (Phi).
- The C $\alpha$ –carbonyl carbon bond: This rotation defines the torsion angle $\psi$ (Psi).
The specific values of these torsion angles ( $\phi$ and $\psi$ ) for each residue determine the overall three-dimensional path and conformation of the polypeptide chain.
Not all combinations of $\phi$ and $\psi$ are sterically permitted. Certain combinations lead to atoms clashing, resulting in unfavorable steric hindrance that limits the number of accessible conformations. Amino acid side chains can also influence these permitted angles.
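
A torsion angle is the dihedral angle defined by four consecutive atoms. Under the usual convention, $\phi$ is measured over C(i−1)–N–C$\alpha$–C and $\psi$ over N–C$\alpha$–C–N(i+1). The sketch below computes a signed dihedral from four 3D points using only the standard library; the function names and the test coordinates are made up for illustration and do not come from any real structure.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal to the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal to the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def phi_psi(c_prev, n, ca, c, n_next):
    """Return (phi, psi) for one residue from five backbone atom positions."""
    return dihedral_deg(c_prev, n, ca, c), dihedral_deg(n, ca, c, n_next)


# Hypothetical planar zigzag: a fully extended chain has phi = psi = 180 degrees.
print(phi_psi((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0), (-1, 2, 0)))
```

The `atan2` form is used instead of `acos` so that the sign of the angle is retained, which matters because $\phi = -60^\circ$ and $\phi = +60^\circ$ are very different conformations.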

Ramachandran Plots

 - Ramachandran plots are graphical representations that depict the statistically allowed and disallowed regions of the $\phi$ (horizontal axis) and $\psi$ (vertical axis) torsion angles for amino acid residues in a polypeptide backbone.
 - Typical favored regions within the plot correspond to specific, recurring arrangements of the polypeptide backbone, such as common secondary structures (e.g., $\alpha$ -helices and $\beta$ -sheets). Disfavored regions represent sterically prohibited conformations for most residues, where atoms would be too close, leading to high energy.
 - The plot thereby visually illustrates that only a relatively small subset of the entire ( $\phi$ , $\psi$ ) conformational space is compatible with a viable and stable backbone conformation, guiding the understanding of protein folding principles.

Secondary Structure

 - Secondary structure refers to the localized, recurring three-dimensional arrangements formed by regularly repeating patterns of hydrogen bonds between the hydrogen atom of the peptide N–H group and the oxygen atom of the peptide C=O group of amino acids that are relatively close to each other in the primary sequence (typically within $3-7$ residues).
 - Major examples of such stable and common secondary structures include:
   - $\alpha$ -helix: A compact, helical arrangement.
   - $\beta$ -sheet: An extended, pleated arrangement.
   - Turns and loops: Shorter, less regular structures that connect other secondary elements and typically reverse the direction of the polypeptide chain.
 - In the $\alpha$ -helix, the characteristic hydrogen bonding pattern involves a hydrogen bond between the C=O group of residue $i$ and the N–H group of residue $i+4$ (i.e., four residues further along the chain, $i \rightarrow i+4$ ). Each complete turn of an $\alpha$ -helix typically contains approximately $3.6$ amino acid residues and extends about $5.4\ \mathring{\mathrm{A}}$ along the helical axis.

$\alpha$ -Helix (Overview)

 - The $\alpha$ -helix is a prevalent and stable secondary structure, characterized as a rigid, right-handed helical conformation in which the polypeptide backbone is extensively stabilized by intramolecular hydrogen bonds.
 - These hydrogen bonds form precisely between the carbonyl oxygen (C=O) of residue $i$ and the amide hydrogen (N–H) of residue $i+4$ , resulting in a regular, repeating pattern.
 - Each complete turn of the $\alpha$ -helix contains approximately $3.6$ amino acid residues and rises $0.54\ \text{nm}$ (or $5.4\ \mathring{\mathrm{A}}$ ) along the helix axis, with each residue contributing a rise of $0.15\ \text{nm}$ . The backbone hydrogen bonds run nearly parallel to the helix axis, which gives them favorable geometry.
 - The side chains (R groups) of the amino acids project outwards from the central axis of the helix, minimizing steric clashes with the backbone and neighboring side chains, and allowing them to interact with the solvent or other parts of the protein.
 - The helical geometry arises from the favorable packing of backbone atoms and the optimal formation of hydrogen bonds. However, certain residues can disrupt $\alpha$ -helices: proline introduces a kink because its cyclic side chain leaves the backbone nitrogen without an amide hydrogen to donate and restricts rotation about the N–C $\alpha$ bond, while glycine is highly flexible and tends to destabilize the helix.
 - The cumulative effect of the aligned hydrogen bonds imparts a significant dipole moment to the $\alpha$ -helix, with the N-terminus having a partial positive charge and the C-terminus a partial negative charge.
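
The helix parameters above (3.6 residues per turn, 0.15 nm rise per residue) fix the rest of the geometry by arithmetic: the pitch is $3.6 \times 0.15 = 0.54$ nm and each residue rotates about $360/3.6 = 100$ degrees around the axis. The sketch below works this out and lists the $i \rightarrow i+4$ hydrogen-bond pairs for an idealized helix; the function names are hypothetical and the helix is assumed to be perfectly regular.

```python
RESIDUES_PER_TURN = 3.6
RISE_PER_RESIDUE_NM = 0.15


def helix_pitch_nm():
    """Rise along the axis per complete turn."""
    return RESIDUES_PER_TURN * RISE_PER_RESIDUE_NM


def rotation_per_residue_deg():
    """Rotation about the helix axis contributed by each residue."""
    return 360.0 / RESIDUES_PER_TURN


def helix_length_nm(n_residues):
    """Approximate axial length of an ideal helix of n residues."""
    return n_residues * RISE_PER_RESIDUE_NM


def backbone_hbonds(n_residues):
    """(i, i+4) pairs: C=O of residue i accepts from N-H of residue i+4."""
    return [(i, i + 4) for i in range(1, n_residues - 3)]


print(helix_pitch_nm(), rotation_per_residue_deg())
print(helix_length_nm(20))
print(backbone_hbonds(6))
```

Note that the first four N–H groups and the last four C=O groups of an isolated helix have no partner within the helix, so an n-residue helix carries only n − 4 such bonds.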

$\beta$ -Sheet

The $\beta$ -sheet is another extremely common and stable form of secondary structure, distinguished by its extended, pleated conformation rather than a coiled one.
$\beta$ -strands are short, extended polypeptide segments. Multiple $\beta$ -strands align side-by-side to form a $\beta$ -sheet. These strands are hydrogen-bonded to each other, forming a rigid, interconnected network.
Hydrogen bonds link the backbone N–H and C=O groups of adjacent strands. The strands themselves can be arranged in three orientations: parallel (N- to C-termini running in the same direction), antiparallel (N- to C-termini running in opposite directions), or mixed. Antiparallel $\beta$ -sheets have more linear and thus stronger hydrogen bonds.
$\beta$ -sheets are not perfectly flat; they often adopt a characteristic twisted conformation, especially in larger proteins, which provides greater stability and helps accommodate the typically chiral nature of amino acids.

Keratin and Collagen: Structural Proteins

 - $\alpha$ -Keratin is a prominent fibrous structural protein found in protective structures such as wool, hair, skin, and nails. It provides strength and flexibility.
   - It is primarily composed of two right-handed $\alpha$ -helices that are supercoiled around each other in a left-handed fashion, forming a stable structure known as a coiled-coil. This superhelical twisting further increases its tensile strength.
   - The two $\alpha$ -helices interact predominantly via hydrophobic interactions between the nonpolar residues on their interacting surfaces, and also through ionic bonds and van der Waals interactions. The characteristic repeating pattern of hydrophobic residues allows for this tight intertwining.
   - Keratin belongs to a large superfamily of coiled-coil proteins, which includes many other important cytoskeletal and muscle proteins (e.g., intermediate filaments, myosin), highlighting a common structural motif for generating strong, pliable filaments.
 - Collagen is the most abundant protein in mammals and a crucial structural protein found in connective tissues such as skin, bone, tendons, cartilage, and teeth, where it provides high tensile strength.
   - Unlike $\alpha$ -keratin, collagen consists of three intertwined helical polypeptide chains, each of which is a left-handed helix, which then together form a right-handed superhelical cable (a triple helix). This arrangement provides great strength.
   - Individual collagen chains are not $\alpha$ -helices; they possess a distinct extended helical conformation due to their unusual amino acid composition. A key feature is the appearance of glycine at every third residue, because only glycine is small enough to fit in the crowded core of the triple helix. Furthermore, motifs like Gly-X-Y (where X is often proline and Y is often 4-hydroxyproline) are highly prevalent and essential for maintaining the triple-helical structure.
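
The Gly-X-Y rule can be checked mechanically on a one-letter sequence: in some reading frame, every third residue must be glycine (G). The sketch below is a minimal illustration; the function name and the test strings are invented for this example and are not real collagen sequences, and hydroxyproline is written as plain P because it has no standard one-letter code of its own.

```python
def has_gly_every_third(sequence, min_repeats=2):
    """True if some reading frame has G at every third position.

    min_repeats (an assumption of this sketch) requires at least that many
    glycines in the frame so that very short strings do not pass trivially.
    """
    seq = sequence.upper()
    for frame in range(3):
        positions = range(frame, len(seq), 3)
        if len(positions) >= min_repeats and all(seq[i] == "G" for i in positions):
            return True
    return False


print(has_gly_every_third("GPPGPPGPP"))   # glycine at positions 0, 3, 6
print(has_gly_every_third("GPPAPPGPP"))   # alanine breaks the repeat
```

Checking all three frames allows for a fragment that does not start on a glycine, such as `PGPPGPPGP`.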

Hydroxyproline and Vitamin C

 - Hydroxyproline, specifically 4-hydroxyproline, is a crucial post-translationally modified amino acid derived from proline. Its presence is vital for the structural integrity and stability of collagen's triple helix through the formation of crucial hydrogen bonds between the chains.
 - The enzymatic hydroxylation of proline residues to 4-hydroxyproline is catalyzed by the enzyme prolyl hydroxylase. A critical requirement for this enzyme's activity is the presence of vitamin C (ascorbic acid) as a co-factor, which keeps the iron atom in the enzyme's active site in its reduced state ( $Fe^{2+}$ ).
 - A deficiency in vitamin C severely impairs the activity of prolyl hydroxylase, leading to insufficient hydroxylation of proline residues. This results in the synthesis of unstable collagen that cannot properly form strong triple helices, manifesting as the debilitating symptoms of scurvy in humans, characterized by fragile blood vessels, poor wound healing, and weakened connective tissues.
 - Humans, unlike most other mammals, lack the enzyme gulonolactone oxidase, which is necessary for the endogenous synthesis of vitamin C. Therefore, vitamin C (ascorbic acid) must be obtained regularly through dietary intake, making it an essential nutrient.

Tertiary Structure

 - Tertiary structure describes the overall three-dimensional spatial arrangement and folding of an entire single polypeptide chain, encompassing how secondary structure elements (helices, sheets) pack together, as well as the spatial relationships of amino acid residues that are far apart in the primary sequence.
 - This level of structure is stabilized by various non-covalent interactions and, importantly, by covalent disulfide bonds. Key interactions include:
   - Hydrophobic interactions: Nonpolar amino acids tend to cluster in the interior of the protein, away from the aqueous environment.
   - Ionic bonds (salt bridges): Electrostatic attractions between oppositely charged amino acid side chains (e.g., lysine and aspartate).
   - Hydrogen bonds: Between polar side chains or between polar side chains and the polypeptide backbone.
   - Van der Waals forces: Weak, transient attractive forces between all atoms.
   - Disulfide bonds: Covalent bonds between two cysteine residues. These are particularly strong stabilizers and are common in secreted or extracellular proteins.
 - Globular proteins, a primary class of proteins, typically fold into compact, tightly packed structures with minimal internal void space, which contributes to their stability and defined shape.
 - A recurring theme in soluble globular proteins is that their interior is predominantly composed of hydrophobic amino acid residues, sequestered from water. Conversely, the exterior surface is largely made up of charged and polar residues, allowing for favorable interactions with the surrounding aqueous cellular environment.

Motifs and Domains

 - Motifs, also known as supersecondary structures (or folds), are recurring combinations of several secondary structure elements (e.g., $\alpha$ -helices and $\beta$ -sheets) that form a distinct, recognizable structural unit that often has a specific function. Examples include the helix-loop-helix, the $\beta$ -hairpin, and the Rossmann fold.
 Some proteins are large enough to contain two or more distinct, independently folding compact structures called domains. These domains often correspond to distinct functional or structural units within a single polypeptide chain, and they can sometimes fold, function, or even evolve somewhat independently.
 Domains are typically connected by flexible linker regions and can contribute to different aspects of a protein's overall function, such as binding, catalytic activity, or regulatory interactions.

Quaternary Structure

 Many functional proteins are not composed of a single polypeptide chain but are rather assemblies of multiple polypeptide chains, referred to as subunits. These proteins are known as oligomeric proteins.
 Proteins with multiple subunits display quaternary structure, which describes the spatial arrangement of these individual polypeptide subunits relative to one another.
 Quaternary structures can range widely in complexity, from simple dimers (composed of two often identical subunits) to complex assemblies with many different types of chains (e.g., hemoglobin, with four subunits, or viral capsids, with hundreds).
 The subunits are typically held together by non-covalent interactions, including hydrophobic interactions, hydrogen bonds, and ionic bonds, at their interfaces. Disulfide bonds can also sometimes link subunits.
 This level of organization is crucial for many protein functions, including cooperative binding, allosteric regulation, and the formation of large enzymatic complexes.

Anfinsen’s Demonstration: Protein Folding Is Encoded in the Primary Sequence

 Christian Anfinsen conducted groundbreaking experiments on the enzyme ribonuclease A (RNase A), a monomeric protein with four disulfide bonds, to demonstrate that the information required for protein folding is intrinsic to its amino acid sequence.
 He denatured ribonuclease by treating it with the chaotropic agent urea (8 M), which disrupts non-covalent interactions (hydrogen bonds, hydrophobic interactions), and with $\beta$ -mercaptoethanol (BME), which reduces and breaks its four disulfide bonds. This treatment resulted in a completely unfolded, random coil polypeptide chain that had lost all enzymatic activity: denatured, reduced ribonuclease.
 Crucially, when urea and $\beta$ -mercaptoethanol were slowly removed (for example by dialysis) and the free sulfhydryl groups were allowed to re-oxidize, the denatured enzyme spontaneously refolded. It regained its native three-dimensional structure and, remarkably, its enzymatic activity. This molecule was termed renatured ribonuclease.
 Conclusion: Anfinsen's findings showed that the three-dimensional structure necessary for a polypeptide's function is encoded within its primary amino acid sequence. This suggests that protein folding is a thermodynamic process directed towards the most stable conformation.
 This experiment strongly supports the fundamental principle: Three-Dimensional Structure is Determined by Amino Acid Sequence.

Role of $\beta$ -Mercaptoethanol in Disulfide Bond Chemistry

 $\beta$ -mercaptoethanol (BME) is a commonly used reducing agent in protein chemistry. Its primary role is to cleave disulfide bonds (R–S–S–R’) by reducing them back into two free sulfhydryl (thiol) groups (R–SH and R’–SH).
 The mechanism is a thiol-disulfide exchange reaction: the sulfhydryl group of BME donates electrons to break the disulfide bond in the protein. In doing so, BME itself becomes oxidized, and two BME molecules end up joined as a disulfide-linked dimer ( $RS{-}SR$ ). A large excess of BME drives the equilibrium towards reduction of the protein's disulfide bonds.
 This redox interplay is absolutely key to understanding the denaturation and reformation of disulfide bonds during folding experiments, as seen in Anfinsen's work, allowing researchers to manipulate the protein's disulfide status.

RNase Folding Experiment: Experimental States (Visual Data)

 The Anfinsen experiment typically compares several distinct states of ribonuclease to illustrate the principles of protein folding and stability:
 - Native ribonuclease: This is the fully folded, biologically active enzyme with its characteristic tertiary structure, including four correctly formed disulfide bonds. It serves as the baseline for activity and structure.
 - Denatured, reduced ribonuclease: This state is achieved by treating native RNase with 8 M urea (a strong chaotropic agent that disrupts non-covalent interactions and unfolds the protein) and $\beta$ -mercaptoethanol (a reducing agent that breaks all four disulfide bonds). In this state, the protein is an unfolded, random coil with no enzymatic activity and no disulfide links.
 - Scrambled ribonuclease: This state results when the denatured, reduced RNase is allowed to re-oxidize (i.e., disulfide bonds reform) while still in the presence of 8 M urea. Because the polypeptide is unfolded in urea, the disulfide bonds form randomly, resulting in many incorrect pairings. This scrambled protein has very little, if any, enzymatic activity, demonstrating that merely having disulfide bonds is not enough; they must be correctly formed.
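
Why random pairing leaves so little activity can be seen by counting. Four disulfide bonds involve eight cysteines, and the number of ways to pair $2n$ cysteines completely is $(2n-1)(2n-3)\cdots 3 \cdot 1$, which for eight cysteines is $7 \cdot 5 \cdot 3 \cdot 1 = 105$. Only one of those pairings is the native one. The sketch below does the arithmetic; the function name is hypothetical, and treating every pairing as equally likely is an idealizing assumption.

```python
def disulfide_pairings(n_cysteines):
    """Number of ways to pair all cysteines into disulfide bonds."""
    if n_cysteines < 0 or n_cysteines % 2 != 0:
        raise ValueError("need an even, non-negative number of cysteines")
    count = 1
    # The first cysteine has (n-1) possible partners, the next unpaired one
    # has (n-3), and so on down to 1.
    for partners in range(n_cysteines - 1, 0, -2):
        count *= partners
    return count


total = disulfide_pairings(8)  # four disulfide bonds -> eight cysteines
print(total)
print(f"fraction native if pairing were random: {1 / total:.2%}")
```

Under that assumption, roughly 1% of randomly re-oxidized molecules would carry the native pairing, which is in line with the very low activity of the scrambled state described above.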
 
 The experimental setup typically involves the use of 8 M urea as a strong chaotropic agent to fully unfold the protein and $\beta$ -mercaptoethanol as the reducing agent to break existing disulfide bonds.
 The denatured, reduced ribonuclease specifically represents the completely unfolded state where all stabilizing non-covalent interactions are disrupted, and all disulfide bonds are broken.
 The scrambled ribonuclease state is critical for showing that the correct pathway for disulfide bond formation is essential for native function, and that random formation in an unfolded state leads to a non-functional product.
 Native ribonuclease is recovered when the denaturing agents (urea and BME) are removed in a controlled manner, so that the chain is free to refold while its disulfide bonds re-form. With urea gone, the non-covalent interactions specified by the primary sequence guide the chain toward its native conformation, and the correct disulfide pairings form within it. This demonstrates that the primary sequence contains the information needed for proper folding and disulfide bond formation under appropriate conditions. When disulfide bonds instead form while the chain is still unfolded in urea, they are largely incorrect.
 Primary Structure
 Polypeptides are linear chains of amino acids covalently linked by peptide bonds, which form between the carboxyl group of one amino acid and the amino group of another, with the expulsion of a water molecule. Each amino acid within a protein is referred to as a residue.
 Polypeptide bond directionality is crucial: the amino-terminal (N-terminal) end, bearing a free amino group, is conventionally considered the start of the chain (and typically written on the left), while the carboxyl-terminal (C-terminal) end, with a free carboxyl group, marks the conclusion (and is written on the right).
 Therefore, primary structure is always written and read from the amino terminal to the carboxyl terminal (left to right), reflecting the direction of protein synthesis.
 The repeating unit of the polypeptide is the backbone (main chain), consisting of the N, C $\alpha$ , and C (carbonyl) atoms shared by all amino acids, and a variable side chain (R group) that defines the unique chemical properties of each amino acid.
 The backbone possesses inherent hydrogen-bonding potential due to the presence of electronegative carbonyl oxygen atoms (C=O) and hydrogen atoms bonded to the amide nitrogen (N–H) at regular intervals, facilitating secondary structure formation.
 Most proteins are relatively large macromolecules, typically containing roughly $50$ to $2000$ amino acids ( $50 \le N \le 2000$ ) in a single chain.
 The mean molecular weight for a typical amino acid residue is approximately $M \approx 110\ \mathrm{g\,mol^{-1}}$ , which can be used to estimate protein molecular weight.
 In some proteins, additional covalent cross-links can occur via disulfide bonds. These bonds form by the oxidation of the sulfhydryl groups of two cysteine residues ( $2\, R{-}SH$ ) to form a single cystine residue ( $R{-}S{-}S{-}R$ ), playing a vital role in stabilizing tertiary and quaternary structures, particularly in extracellular proteins.
 Ultimately, the precise sequence and arrangement of amino acid residues along the polypeptide chain (primary structure) is the fundamental determinant of the protein's subsequent higher-order structures (secondary, tertiary, quaternary) and therefore its overall function and properties.
 Bonding and Backbone Characteristics
 The peptide bond, formed between the carboxyl group of one amino acid and the amino group of another, is essentially planar, meaning the six atoms involved in the bond and its immediate surroundings lie in a single plane. This planarity is a critical feature defining polypeptide backbone geometry.
 Specifically, the six atoms that lie in one plane are: the alpha carbons of the preceding ( $C\alpha(i)$ ) and succeeding ( $C\alpha(i+1)$ ) residues, the carbonyl carbon (C), the carbonyl oxygen (O), the amide nitrogen (N), and the amide hydrogen (H).
 The peptide bond exhibits partial double-bond character (approximately $40\%$ ) due to resonance between the carbonyl oxygen and the amide nitrogen. This delocalization of electrons restricts free rotation around the C–N peptide bond, making it quite rigid.
 The peptide bond is uncharged under physiological pH conditions, which is important for maintaining the overall charge neutrality of the protein backbone, although the N- and C-termini and ionizable side chains contribute to the overall charge.
 Torsion Angles and Chain Conformation
 While the peptide bond itself is rigid, rotation is freely allowed about two specific bonds within each amino acid residue of the polypeptide backbone:
 The N–C $\alpha$ bond: This rotation defines the torsion angle $\phi$ (Phi).
 The C $\alpha$ –carbonyl carbon bond: This rotation defines the torsion angle $\psi$ (Psi).
 The specific values of these torsion angles ( $\phi$ and $\psi$ ) for each residue determine the overall three-dimensional path and conformation of the polypeptide chain.
 Not all combinations of $\phi$ and $\psi$ are sterically permitted. Certain combinations lead to atoms clashing, resulting in unfavorable steric hindrance that limits the number of accessible conformations. Amino acid side chains can also influence these permitted angles.
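As a hedged illustration of how a torsion angle is obtained from atomic coordinates, the sketch below computes the signed dihedral angle defined by four points. By the usual convention, $\phi$ of residue $i$ uses the atoms C(i-1), N(i), C $\alpha$ (i), C(i), and $\psi$ uses N(i), C $\alpha$ (i), C(i), N(i+1). The function name `dihedral` and the test coordinates are illustrative assumptions, not part of these notes.

```python
import math

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees for four points given as (x, y, z) tuples.

    The angle is the rotation about the p1->p2 bond, positive when the
    p0-p1 bond must turn clockwise (viewed from p1 towards p2) to eclipse p2-p3.
    """
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1 = cross(b1, b2)          # normal to the plane of p0, p1, p2
    n2 = cross(b2, b3)          # normal to the plane of p1, p2, p3
    x = dot(n1, n2)
    y = dot(b2, cross(n1, n2)) / math.sqrt(dot(b2, b2))
    return math.degrees(math.atan2(y, x))

# Hypothetical coordinates: the central bond lies along the z axis.
quarter_turn = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
```

Applying the same function to the backbone atoms of every residue would give one ( $\phi$ , $\psi$ ) pair per residue, which is the input for a Ramachandran plot.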
 Ramachandran Plots
 Ramachandran plots are graphical representations that depict the sterically allowed and disallowed regions of the $\phi$ (horizontal axis) and $\psi$ (vertical axis) torsion angles for amino acid residues in a polypeptide backbone.
 Typical favored regions within the plot correspond to specific, recurring arrangements of the polypeptide backbone, such as common secondary structures (e.g., $\alpha$ -helices and $\beta$ -sheets). Disfavored regions represent sterically prohibited conformations for most residues, where atoms would be too close, leading to high energy.
 The plot thereby visually illustrates that only a relatively small subset of the entire ( $\phi$ , $\psi$ ) conformational space is compatible with a viable and stable backbone conformation, guiding the understanding of protein folding principles.
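To make the idea of favored regions concrete, here is a minimal sketch that assigns a ( $\phi$ , $\psi$ ) pair to a coarse region of the plot. The rectangular limits are rough, illustrative assumptions chosen around commonly quoted values (right-handed $\alpha$ -helix near $-57^\circ, -47^\circ$ ; $\beta$ -strand near $-120^\circ, +130^\circ$ ); real allowed regions have irregular outlines, and the function name `classify_phi_psi` is hypothetical.

```python
def classify_phi_psi(phi, psi):
    """Return a coarse Ramachandran region label for angles given in degrees.

    The box limits below are illustrative assumptions, not measured boundaries.
    """
    if -100 <= phi <= -30 and -80 <= psi <= -5:
        return "right-handed alpha-helix region"
    if -180 <= phi <= -45 and 90 <= psi <= 180:
        return "beta-sheet region"
    if 30 <= phi <= 90 and 0 <= psi <= 90:
        return "left-handed helix region"
    return "outside the coarse favored boxes"

for angles in [(-57, -47), (-120, 130), (60, 45), (60, -120)]:
    print(angles, "->", classify_phi_psi(*angles))
```

A pair that falls outside all three boxes is not necessarily forbidden; glycine in particular can occupy regions that are closed to residues with bulkier side chains.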
 Secondary Structure
 Secondary structure refers to the localized, recurring three-dimensional arrangements formed by regularly repeating patterns of hydrogen bonds between the hydrogen atom of a peptide N–H group and the oxygen atom of a peptide C=O group. In helices and turns the bonded residues are close to each other in the primary sequence (typically within 3 to 7 residues), whereas the strands of a $\beta$ -sheet may lie far apart in the sequence.
 Major examples of such stable and common secondary structures include:
 $\alpha$ -helix: A compact, helical arrangement.
 $\beta$ -sheet: An extended, pleated arrangement.
 Turns and loops: Shorter, less regular structures that connect other secondary elements and typically reverse the direction of the polypeptide chain.
 In the $\alpha$ -helix, the characteristic hydrogen bonding pattern links the C=O group of residue $i$ to the N–H group of residue $i+4$ (i.e., four residues further along the chain, $i \rightarrow i+4$ ). Each complete turn of an $\alpha$ -helix contains approximately $3.6$ amino acid residues and extends about $5.4\ \mathring{A}$ along the helical axis.
 $\alpha$ -Helix (Overview)
 The $\alpha$ -helix is a prevalent and stable secondary structure, characterized as a rigid, right-handed helical conformation in which the polypeptide backbone is extensively stabilized by intramolecular hydrogen bonds.
 These hydrogen bonds form precisely between the carbonyl oxygen (C=O) of residue $i$ and the amide hydrogen (N–H) of residue $i+4$ , resulting in a regular, repeating pattern.
 Each complete turn of the $\alpha$ -helix contains approximately $3.6$ amino acid residues and rises $0.54\ \text{nm}$ (or $5.4\ \mathring{A}$ ) along the helix axis, so each residue contributes a rise of $0.15\ \text{nm}$ . The backbone hydrogen bonds run nearly parallel to the helix axis, which gives them close to optimal geometry.
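The helix dimensions quoted above are related by simple arithmetic, sketched here under the idealized assumption of exactly 3.6 residues per turn and a pitch of 0.54 nm. The names `helix_length_nm` and the 20-residue example are illustrative only.

```python
RESIDUES_PER_TURN = 3.6   # idealized alpha-helix
PITCH_NM = 0.54           # rise along the axis per complete turn

# Rise and rotation contributed by a single residue.
rise_per_residue_nm = PITCH_NM / RESIDUES_PER_TURN       # about 0.15 nm
rotation_per_residue_deg = 360.0 / RESIDUES_PER_TURN     # about 100 degrees

def helix_length_nm(n_residues):
    """Approximate axial length of an ideal alpha-helix with n_residues residues."""
    return n_residues * rise_per_residue_nm

print(f"rise per residue: {rise_per_residue_nm:.2f} nm")
print(f"rotation per residue: {rotation_per_residue_deg:.0f} degrees")
print(f"20-residue helix: {helix_length_nm(20):.1f} nm")
```

On these numbers, a helix of about 20 residues spans roughly 3 nm along its axis.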
 The side chains (R groups) of the amino acids project outwards from the central axis of the helix, minimizing steric clashes with the backbone and neighboring side chains, and allowing them to interact with the solvent or other parts of the protein.
 The helical geometry arises from the favorable packing of backbone atoms and the optimal formation of hydrogen bonds. However, certain residues can disrupt $\alpha$ -helices: proline introduces a kink because its cyclic side chain is bonded to the backbone nitrogen, which leaves no amide hydrogen for hydrogen bonding and restricts rotation about the N–C $\alpha$ bond, while glycine is so conformationally flexible that it tends to destabilize the helix.
 The cumulative effect of the aligned hydrogen bonds imparts a significant dipole moment to the $\alpha$ -helix, with the N-terminus having a partial positive charge and the C-terminus a partial negative charge.
 $\beta$ -Sheet
 The $\beta$ -sheet is another extremely common and stable form of secondary structure, distinguished by its extended, pleated conformation rather than a coiled one.
 $\beta$ -strands are short, extended polypeptide segments. Multiple $\beta$ -strands align side-by-side to form a $\beta$ -sheet. These strands are hydrogen-bonded to each other, forming a rigid, interconnected network.
 Hydrogen bonds link the backbone N–H and C=O groups of adjacent strands. The strands themselves can be arranged in three orientations: parallel (N- to C-termini running in the same direction), antiparallel (N- to C-termini running in opposite directions), or mixed. Antiparallel $\beta$ -sheets have more linear and thus stronger hydrogen bonds.
 The $\beta$ -sheets are not perfectly flat; they often adopt a characteristic twisted conformation, especially in larger proteins, which provides greater stability and helps accommodate the typically chiral nature of amino acids.
 Keratin and Collagen: Structural Proteins
 $\alpha$ -Keratin is a prominent fibrous structural protein found in extracellular protective structures such as wool, hair, skin, and nails. It provides strength and flexibility.
 It is primarily composed of two right-handed $\alpha$ -helices that are supercoiled around each other in a left-handed fashion, forming a stable structure known as a coiled-coil. This superhelical twisting further increases its tensile strength.
 The two $\alpha$ -helices interact predominantly via hydrophobic interactions between the nonpolar residues on their interacting surfaces, and also through ionic bonds and van der Waals interactions. The characteristic repeating pattern of hydrophobic residues allows for this tight intertwining.
 Keratin belongs to a large superfamily of coiled-coil proteins, which includes many other important cytoskeletal and muscle proteins (e.g., intermediate filaments, myosin), highlighting a common structural motif for generating strong, pliable filaments.
 Collagen is the most abundant protein in mammals and a crucial structural protein found in connective tissues such as skin, bone, tendons, cartilage, and teeth, where it provides high tensile strength.
 Unlike $\alpha$ -keratin, collagen consists of three intertwined helical polypeptide chains, each of which is a left-handed helix, which then together form a right-handed superhelical cable (a triple helix). This unique arrangement provides immense strength.
 Individual collagen chains are not $\alpha$ -helices; they possess a distinct extended helical conformation due to their unique amino acid composition. A key feature is that glycine occupies every third position, because only its single-hydrogen side chain fits in the crowded core of the triple helix. The resulting repeat is written Gly-X-Y (where X is often proline and Y is often 4-hydroxyproline) and is essential for maintaining the triple-helical structure.
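The glycine-at-every-third-position rule is easy to check on a one-letter sequence. The sketch below looks for a reading frame in which every third residue is glycine; the function name `glycine_repeat_frame` and the example sequences are hypothetical, and the check ignores the X and Y positions entirely.

```python
def glycine_repeat_frame(sequence):
    """Return the frame (0, 1, or 2) in which every third residue is Gly, else None.

    `sequence` is a string of one-letter amino acid codes.
    """
    seq = sequence.upper()
    for frame in range(3):
        every_third = seq[frame::3]
        if every_third and all(residue == "G" for residue in every_third):
            return frame
    return None

# Hypothetical fragments, not taken from a real collagen entry.
collagen_like = "GPPGPAGPKGPP"
unrelated = "ACDEFGHIK"
frame_found = glycine_repeat_frame(collagen_like)
frame_missing = glycine_repeat_frame(unrelated)
```

A real collagen chain would also need the proline and hydroxyproline content described above, so passing this check is necessary but not sufficient for a collagen-like sequence.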
 Hydroxyproline and Vitamin C
 Hydroxyproline, specifically 4-hydroxyproline, is a crucial post-translationally modified amino acid derived from proline. Its presence is vital for the structural integrity and stability of collagen's triple helix through the formation of crucial hydrogen bonds between the chains.
 The enzymatic hydroxylation of proline residues to 4-hydroxyproline is catalyzed by the enzyme prolyl hydroxylase. A critical requirement for this enzyme's activity is the presence of vitamin C (ascorbic acid) as a co-factor, which keeps the iron atom in the enzyme's active site in its reduced state ( $Fe^{2+}$ ).
 A deficiency in vitamin C severely impairs the activity of prolyl hydroxylase, leading to insufficient hydroxylation of proline residues. This results in the synthesis of unstable collagen that cannot properly form strong triple helices, manifesting as the debilitating symptoms of scurvy in humans, characterized by fragile blood vessels, poor wound healing, and weakened connective tissues.
 Humans, unlike most other mammals, lack the enzyme gulonolactone oxidase, which is necessary for the endogenous synthesis of vitamin C. Therefore, vitamin C (ascorbic acid) must be obtained regularly through dietary intake, making it an essential nutrient.
 Tertiary Structure
 Tertiary structure describes the overall three-dimensional spatial arrangement and folding of an entire single polypeptide chain, encompassing how secondary structure elements (helices, sheets) pack together, as well as the spatial relationships of amino acid residues that are far apart in the primary sequence.
 This level of structure is stabilized by various non-covalent interactions and, importantly, by covalent disulfide bonds. Key interactions include:
 Hydrophobic interactions: Nonpolar amino acids tend to cluster in the interior of the protein, away from the aqueous environment.
 Ionic bonds (salt bridges): Electrostatic attractions between oppositely charged amino acid side chains (e.g., lysine and aspartate).
 Hydrogen bonds: Between polar side chains or between polar side chains and the polypeptide backbone.
 Van der Waals forces: Weak, transient attractive forces between all atoms.
 Disulfide bonds: Covalent bonds between two cysteine residues. These are particularly strong stabilizers and are common in secreted or extracellular proteins.
 Globular proteins, a primary class of proteins, typically fold into compact, tightly packed structures with minimal internal void space, which contributes to their stability and defined shape.
 A recurring theme in soluble globular proteins is that their interior is predominantly composed of hydrophobic amino acid residues, sequestered from water. Conversely, the exterior surface is largely made up of charged and polar residues, allowing for favorable interactions with the surrounding aqueous cellular environment.
 Motifs and Domains
 Motifs, also known as supersecondary structures (or folds), are recurring combinations of several secondary structure elements (e.g., $\alpha$ -helices and $\beta$ -sheets) that form a distinct, recognizable structural unit that often has a specific function. Examples include the helix-loop-helix, the $\beta$ -hairpin, and the Rossmann fold.
 Some proteins are large enough to contain two or more distinct, independently folding compact structures called domains. These domains often correspond to distinct functional or structural units within a single polypeptide chain, and they can sometimes fold, function, or even evolve somewhat independently.
 Domains are typically connected by flexible linker regions and can contribute to different aspects of a protein's overall function, such as binding, catalytic activity, or regulatory interactions.
 Quaternary Structure
 Many functional proteins are not composed of a single polypeptide chain but are rather assemblies of multiple polypeptide chains, referred to as subunits. These proteins are known as oligomeric proteins.
 Proteins with multiple subunits display quaternary structure, which describes the spatial arrangement of these individual polypeptide subunits relative to one another.
 Quaternary structures can range widely in complexity, from simple dimers (composed of two often identical subunits) to complex assemblies with many different types of chains (e.g., hemoglobin, with four subunits, or viral capsids, with hundreds).
 The subunits are typically held together by non-covalent interactions, including hydrophobic interactions, hydrogen bonds, and ionic bonds, at their interfaces. Disulfide bonds can also sometimes link subunits.
 This level of organization is crucial for many protein functions, including cooperative binding, allosteric regulation, and the formation of large enzymatic complexes.
 Anfinsen’s Demonstration: Protein Folding Is Encoded in the Primary Sequence
 Christian Anfinsen conducted groundbreaking experiments on the enzyme ribonuclease A (RNase A), a monomeric protein with four disulfide bonds, to demonstrate that the information required for protein folding is intrinsic to its amino acid sequence.
 He denatured ribonuclease by treating it with strong chaotropic agents like urea (8 M) to disrupt non-covalent bonds (hydrogen bonds, hydrophobic interactions) and $\beta$ -mercaptoethanol (BME) to reduce and break its four disulfide bonds. This treatment resulted in a completely unfolded, random coil polypeptide chain that had lost all enzymatic activity—denatured, reduced ribonuclease.
 Crucially, when urea and $\beta$ -mercaptoethanol were both removed by dialysis and the reduced protein was left to reoxidize in air, the denatured enzyme spontaneously refolded. With the urea gone, the chain could adopt its native fold, so the sulfhydryl groups paired correctly as the disulfide bonds reformed. The protein regained its native three-dimensional structure and, remarkably, essentially full enzymatic activity. This molecule was termed renatured ribonuclease.
 Conclusion: Anfinsen's findings showed that the information specifying the elaborate three-dimensional structure necessary for a polypeptide's function is contained in its primary amino acid sequence. This suggests that protein folding is a thermodynamically driven process directed towards the most stable conformation.
 This experiment strongly supports the fundamental principle: Three-Dimensional Structure is Determined by Amino Acid Sequence.
 Role of $\beta$ -Mercaptoethanol in Disulfide Bond Chemistry
 $\beta$ -mercaptoethanol (BME) is a commonly used reducing agent in protein chemistry. Its primary role is to cleave disulfide bonds (R–S–S–R') by reducing them back into two free sulfhydryl (thiol) groups (R–SH and R'–SH).
 The mechanism is a thiol-disulfide exchange reaction: the sulfhydryl group of one BME molecule attacks the protein disulfide, releasing one cysteine thiol and forming a mixed disulfide between BME and the other cysteine. A second BME molecule then attacks the mixed disulfide, freeing the second cysteine thiol and leaving an oxidized BME dimer ( $RS{-}SR$ ). Because BME carries only one thiol group, it cannot form an intramolecular disulfide, so it is used in large excess to drive the equilibrium towards reduction of the protein's disulfide bonds.
 This redox interplay is absolutely key to understanding the denaturation and reformation of disulfide bonds during folding experiments, as seen in Anfinsen's work, allowing researchers to manipulate the protein's disulfide status.
 RNase Folding Experiment: Experimental States (Visual Data)
 The Anfinsen experiment typically compares several distinct states of ribonuclease to illustrate the principles of protein folding and stability:
 Native ribonuclease: This is the fully folded, biologically active enzyme with its characteristic tertiary structure, including four correctly formed disulfide bonds. It serves as the baseline for activity and structure.
 Denatured, reduced ribonuclease: This state is achieved by treating native RNase with 8 M urea (a strong chaotropic agent that disrupts non-covalent interactions and unfolds the protein) and $\beta$ -mercaptoethanol (a reducing agent that breaks all four disulfide bonds). In this state, the protein is an unfolded, random coil with no enzymatic activity and no disulfide links.
 Scrambled ribonuclease: This state results when the denatured, reduced RNase is allowed to re-oxidize (i.e., disulfide bonds reform) while still in the presence of 8 M urea. Because the polypeptide is unfolded in urea, the disulfide bonds form randomly, resulting in many incorrect pairings. This scrambled protein has very little, if any, enzymatic activity, demonstrating that merely having disulfide bonds is not enough; they must be correctly formed.
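The low activity of scrambled ribonuclease can be rationalized with a short count, sketched here under the simplifying assumption that all pairings of the eight cysteines are equally likely. The function name `disulfide_pairings` is illustrative.

```python
def disulfide_pairings(n_cysteines):
    """Number of ways to pair every cysteine with exactly one partner.

    For n cysteines this is (n - 1) * (n - 3) * ... * 1, because the first
    cysteine has n - 1 possible partners, the next unpaired one n - 3, and so on.
    """
    if n_cysteines < 0 or n_cysteines % 2:
        raise ValueError("need a non-negative, even number of cysteines")
    count = 1
    for partners in range(n_cysteines - 1, 0, -2):
        count *= partners
    return count

total = disulfide_pairings(8)      # 7 * 5 * 3 * 1
native_fraction = 1 / total        # only one pairing is the native one
print(f"{total} pairings; native fraction {native_fraction:.1%}")
```

With eight cysteines there are 105 possible sets of four disulfide bonds and only one is native, so random pairing would be expected to leave roughly 1% of molecules correctly linked.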
 The experimental setup typically involves the use of 8 M urea as a strong chaotropic agent to fully unfold the protein and $\beta$ -mercaptoethanol as the reducing agent to break existing disulfide bonds.
 The denatured, reduced ribonuclease specifically represents the completely unfolded state where all stabilizing non-covalent interactions are disrupted, and all disulfide bonds are broken.
 The scrambled ribonuclease state is critical for showing that the correct pathway for disulfide bond formation is essential for native function, and that random formation in an unfolded state leads to a non-functional product.
 Native ribonuclease is recovered when urea and BME are removed together (for example, by dialysis) and the reduced protein is allowed to reoxidize in air. Once the urea is gone, the polypeptide chain can first establish its correct non-covalent interactions (guided by the primary sequence) and then form disulfide bonds between the cysteines that the fold brings together, demonstrating that the primary sequence contains all the necessary information for proper folding and disulfide bond formation under appropriate conditions. If the disulfide bonds instead form while the chain is still unfolded in urea, they may be incorrect, as in scrambled ribonuclease.
