Protein Structure and Function

The Big Picture: Molecular Model of Hemoglobin

This chapter is part of a larger discussion; refer to pages 138-139 for context.
The central question is: What type of molecule was responsible for the origin of life?
Early Earth simulations, like those by Stanley Miller, repeatedly produced amino acids, suggesting that amino acids were abundant during chemical evolution.
Amino acids are the building blocks of proteins.
Proteins are vital and versatile components of today's cells.

Protein Structure and Function

Proteins are the most abundant and versatile macromolecules in life.
Protein structure determines protein function.
- Structure is organized into four levels: primary, secondary, tertiary, and quaternary.
Proteins have diverse roles in living cells.
Proteins are composed of 20 amino acids with unique side chains.
Amino acids polymerize to form proteins.

Amino Acids and Their Polymerization

Modern cells produce tens of thousands of distinct proteins.
Most are composed of just 20 different building blocks: amino acids.
All 20 amino acids share a common core structure.

Structure of Amino Acids

Carbon atoms have a valence of four, forming up to four covalent bonds.
In all 20 amino acids, a central carbon atom (α-carbon) bonds covalently to:
- H: a hydrogen atom
- $NH_2$: an amino functional group
- COOH: a carboxyl functional group
- A distinctive “R-group” (side chain)
The combination of amino and carboxyl functional groups is key to their behavior.
In water (pH 7), amino acids ionize.
- The amino group acts as a base, attracting a proton to form $NH_3^+$.
- The carboxyl group acts as an acid, losing a proton to form COO-.
Charges on these functional groups are important because:
- They help amino acids stay in solution.
- They affect the amino acid’s chemical reactivity.

Nature of Side Chains

The R-group (side chain) makes each of the 20 amino acids unique.
R-groups vary from a single hydrogen atom to large structures containing carbon atoms linked into rings.
Properties of amino acids vary because their R-groups vary.
Amino acids have a three-letter code and a single-letter code.
- The three-letter code usually uses the first three letters of the name (e.g., Ala for alanine).
- Single-letter code was devised by Margaret Oakley Dayhoff.
  - In some cases, the first letter is used (A for alanine).
  - In other cases, it may be the second letter (R for arginine), a phonetically similar letter (F for phenylalanine), or an easy to remember mnemonic (K for lysine).
Dayhoff’s system enabled early computers to analyze protein sequences.
She was a founder of bioinformatics.
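
The two naming systems described above can be sketched as a simple lookup table. The mapping below uses the standard three-letter and one-letter codes; the names `THREE_TO_ONE` and `to_one_letter` are illustrative choices, not part of these notes.

```python
# Standard three-letter -> one-letter amino acid codes (20 amino acids).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(residues):
    """Convert a list of three-letter codes into a one-letter string.

    Hypothetical helper for illustration; input case is normalized
    so that "ALA", "ala", and "Ala" are all accepted.
    """
    return "".join(THREE_TO_ONE[r.capitalize()] for r in residues)


# The four examples mentioned in the text: A, R, F, and K.
print(to_one_letter(["Ala", "Arg", "Phe", "Lys"]))  # ARFK
```

A compact one-letter string like this is the kind of representation that makes sequences easy for a computer to store and compare.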

Functional Groups Affect Reactivity

Several side chains in amino acids contain carboxyl, sulfhydryl, hydroxyl, or amino functional groups that can participate in chemical reactions under the right conditions.
Amino acids with a sulfhydryl group (SH) can form disulfide (S-S) bonds, linking different parts of large proteins.
- Proteins in curly hair contain many disulfide cross-links; those in straight hair, far fewer.
Some amino acids contain side chains devoid of functional groups (solely carbon and hydrogen atoms).
- Their influence on protein function depends primarily on their size and shape rather than reactivity.

Polarity and Charge of R-Groups Affect Solubility

The nature of the R-group affects the solubility of an amino acid in the aqueous interior of the cell.
- Polar and electrically charged R-groups interact readily with water and are hydrophilic.
- Nonpolar R-groups lack charged or highly electronegative atoms and are hydrophobic.
  - Hydrophobic R-groups tend to coalesce in aqueous solution.

Grouping Amino Acid R-Groups

Amino acid R-groups can be grouped into charged (acidic and basic), uncharged polar, and nonpolar types.
Determining the type of amino acid involves answering these questions:
1. Does the R-group have a negative charge? If so, it is acidic.
2. Does the R-group have a positive charge? If so, it is basic.
3. If the R-group is uncharged, does it have an oxygen atom? If so, it is uncharged polar.
If the R-group does not have a negative charge, a positive charge, or an oxygen atom, it is nonpolar.
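
The three questions above form a small decision procedure, which can be sketched in code. This is a minimal sketch of the simplified rule as stated in these notes; the function name and its boolean parameters are assumptions made for illustration, and the caller must supply the chemical facts about each side chain.

```python
def classify_r_group(negative_charge, positive_charge, has_oxygen):
    """Apply the three questions in order and return the R-group type.

    Parameters are hypothetical flags describing the side chain:
    whether it carries a negative charge, a positive charge, or
    contains an oxygen atom.
    """
    if negative_charge:        # Question 1
        return "acidic"
    if positive_charge:        # Question 2
        return "basic"
    if has_oxygen:             # Question 3 (only reached if uncharged)
        return "uncharged polar"
    return "nonpolar"          # none of the above


# Example side chains described by their features:
print(classify_r_group(True, False, True))    # aspartate -> acidic
print(classify_r_group(False, True, False))   # lysine    -> basic
print(classify_r_group(False, False, True))   # serine    -> uncharged polar
print(classify_r_group(False, False, False))  # alanine   -> nonpolar
```

The order of the checks matters: an acidic side chain also contains oxygen, so the charge questions must be asked before the oxygen question.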

Linking Amino Acids to Form Proteins

Amino acids link to one another to form proteins.
Proteins are macromolecules (large molecules made up of smaller molecular subunits).
A monomer is a molecular subunit used to build a macromolecule.
A polymer results when a large number of monomers are bonded together.
Polymerization is the process of linking monomers together.
Amino acids are the monomers that polymerize to form proteins.
Nucleic acids and carbohydrates are also polymers.
The theory of chemical evolution states that monomers polymerized to form macromolecules.
According to the second law of thermodynamics, a pool of free monomers would not spontaneously self-assemble into a polymer.
- Polymerization decreases disorder (entropy).
Linking monomers requires an input of energy to offset the reduction in entropy and allow the reaction to become spontaneous.
Early Earth had chemical energy and was constantly bombarded with photons and lightning.
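
One way to make this energy argument concrete is the Gibbs free-energy relation. This is standard thermodynamics rather than something stated in these notes, and the symbols ($\Delta G$ for free-energy change, $\Delta H$ for the change in heat content, $T$ for temperature, $\Delta S$ for entropy change) are introduced here only for illustration.

```latex
\Delta G = \Delta H - T\,\Delta S,
\qquad \text{a reaction is spontaneous when } \Delta G < 0 .
% Polymerization lowers entropy, so \Delta S < 0 and the term -T\,\Delta S is positive.
% The reaction can proceed only if an energy input makes the overall \Delta G negative.
```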

Polymerization of Proteins in Early Earth

Monomers polymerize through condensation reactions (dehydration reactions), which result in the loss of a water molecule.
Hydrolysis breaks polymers apart by adding a water molecule.
In the prebiotic soup model, condensation and hydrolysis represent the forward and reverse reactions of a chemical equilibrium:
- Monomer 1 + Monomer 2 ⇌ Monomer 1-Monomer 2 + H2O
Hydrolysis dominates because it increases entropy and is energetically favorable.
Polymerization would occur only if there were a very high concentration of amino acids to push the reaction toward condensation.
Even under concentrated conditions, a polymer is unlikely to have grown much beyond a short chain.
Recent experiments suggest several ways amino acids could have polymerized early in chemical evolution:
- Mixing free amino acids with a source of chemical energy and tiny mineral particles generates stable polymers; macromolecules adsorb to a mineral surface, protecting them from hydrolysis.
- In hot, metal-rich environments of undersea volcanoes, amino acids have been observed to form and polymerize.
- Amino acids have joined into polymers in cooler water if an energy-rich carbon- and sulfur-containing gas is present.

Peptide Bond

Amino acids polymerize when a bond forms between the carboxyl group of one amino acid and the amino group of another.
The C-N covalent bond is called a peptide bond.
The carboxyl group is converted to a carbonyl functional group (C=O).
The amino group becomes simply N-H in the resulting polymer.
When amino acids are linked by peptide bonds into a chain, they are referred to as “residues”.

Key Points About the Peptide-Bonded Backbone

1.  R-group orientation: Side chains extend out from the backbone, interacting with each other and with water.
2.  Directionality: There is an amino group ($NH_3^+$) on one end (N-terminus) and a carboxyl group (COO-) on the other (C-terminus).
    - Biologists write amino acid residue sequences from the N-terminus to the C-terminus.
3.  Flexibility: The single bonds on either side of the peptide bond can rotate, making the structure flexible.

An oligopeptide contains fewer than 50 amino acids.
Polypeptides contain 50 or more amino acids.
The term “protein” refers to the complete, often functional, form of the molecule.
Some proteins consist of a single polypeptide; others contain two or more.
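
The bookkeeping in this section can be sketched in a few lines: each peptide bond forms by condensation and releases one water molecule, so a linear chain of n residues has n - 1 peptide bonds. The function below is a hypothetical helper that assumes a single unbranched chain written N-terminus to C-terminus in one-letter codes.

```python
def describe_chain(sequence):
    """Summarize a linear chain written N-terminus -> C-terminus.

    Hypothetical helper: assumes one unbranched chain, one letter
    per residue.
    """
    n = len(sequence)
    bonds = max(n - 1, 0)  # one peptide bond between each adjacent pair
    return {
        "residues": n,
        "peptide_bonds": bonds,
        "waters_released": bonds,  # one H2O lost per condensation reaction
        # Fewer than 50 residues -> oligopeptide; 50 or more -> polypeptide.
        "category": "oligopeptide" if n < 50 else "polypeptide",
        "n_terminus": sequence[0] if n else None,
        "c_terminus": sequence[-1] if n else None,
    }


info = describe_chain("ARFK")
print(info["peptide_bonds"], info["category"])  # 3 oligopeptide
```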

What Do Proteins Look Like?

Structure gives rise to function.
Proteins have unparalleled diversity in functional roles.
The variability in protein size and shape, and in the chemical properties of their amino acid residues, is responsible for the diverse functions that proteins perform in cells.
Proteins that provide structural support (e.g., collagen) often form long, cable-like fibers.
The shapes of some proteins correlate clearly with their functions:
- TATA box–binding protein has a groove where a molecule of DNA fits.
- Porin has a hole that forms a pore.
Each protein is made by cells to perform specific tasks.
The diversity of protein size and shape can be categorized into four basic levels of organization.

Primary Structure

Each protein has a unique sequence of amino acids.
Frederick Sanger and co-workers established this during the 1940s and 1950s by determining the amino acid sequence of insulin.
Biochemists refer to the unique sequence of amino acids in a protein as its primary structure.
With 20 types of amino acids available and chain lengths of up to tens of thousands of amino acid residues, the number of primary structures that are possible is practically limitless.
The order and type of amino acids affect chemical reactivity and solubility.
The order of the R-groups present in a polypeptide will affect that molecule’s properties and function.
Even a single change in the sequence of amino acids can cause striking changes in the way the protein as a whole behaves (e.g. hemoglobin and sickle-cell disease).
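
A single-residue change of this kind can be located by comparing two sequences position by position. The sketch below uses the first eight residues of the hemoglobin β chain as commonly cited (normal versus sickle-cell); these fragments and the function name are not taken from these notes and are included only as an illustration.

```python
def substitutions(reference, variant):
    """Return (position, reference_residue, variant_residue) for each mismatch.

    Positions are 1-based, counted from the N-terminus. Hypothetical
    helper; only handles equal-length sequences (no insertions/deletions).
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length")
    return [
        (pos, ref, var)
        for pos, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


# Commonly cited N-terminal fragments of the beta chain (assumption):
normal = "VHLTPEEK"
sickle = "VHLTPVEK"
print(substitutions(normal, sickle))  # [(6, 'E', 'V')]
```

Here a charged residue (glutamate, E) is replaced by a nonpolar one (valine, V), which illustrates how a change in one R-group can alter a protein's behavior.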

Secondary Structure

Secondary structure is created by interactions between functional groups in the backbone.
Hydrogen bonds can form between the carbonyl (C=O) group of one residue and the N-H group of another residue within the same molecule.
Two main types of structures can form:
1. α-helix: The polypeptide’s backbone is coiled.
2. β-pleated sheet: Segments of a peptide chain bend 180° and then fold in the same plane.
Which secondary structure forms is determined by the geometry and properties of the amino acids in the sequence.
Certain amino acids are more likely to be involved in α-helices than in β-pleated sheets.
Proline is rarely found in α-helices due to its unusual R-group.
A large number of hydrogen bonds increases the stability of the molecule as a whole and helps define its shape.

Tertiary Structure

A protein’s distinctive three-dimensional shape, or tertiary structure, results from interactions between residues that are brought together as the chain bends and folds in space.
There are five types of interactions involving R-groups:
1. Hydrogen bonding: Forms between polar side chains.
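2. Hydrophobic interactions: Water molecules interact with hydrophilic polar side chains and force hydrophobic nonpolar side chains to coalesce in the protein’s interior.
3. van der Waals interactions: Weak electrical attractions between closely packed hydrophobic side chains may stabilize the structure.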
4. Covalent bonding: Can form between the side chains of two cysteines through a reaction between the sulfhydryl groups.
5. Ionic bonding: May form between groups that have full and opposing charges.

Quaternary Structure

The combining of polypeptides, then referred to as subunits, gives some proteins quaternary structure.
The subunits are held together by the same types of bonds and interactions found in the tertiary level of structure.
A protein with quaternary structure can consist of just two subunits that are identical (e.g., the Cro protein found in bacteriophage λ).
The quaternary structure of a protein may also include polypeptides that are distinct in primary, secondary, and tertiary structures (e.g., hemoglobin, which consists of two identical copies of an α subunit and two identical copies of a β subunit).
Cells also contain macromolecular machines: groups of multiple proteins that assemble to carry out a particular function.

Folding and Function

If one of the polypeptides in hemoglobin were synthesized from individual amino acids and placed in an aqueous solution, it would spontaneously fold into the shape of the tertiary structure.
This may seem to conflict with the second law of thermodynamics: an unfolded protein has many more ways to move about, so it has much higher entropy than the folded version.
Folding nevertheless tends to be spontaneous, because the chemical bonds and interactions that form release enough energy to overcome this decrease in entropy and also increase the entropy of the surrounding environment.

Normal Folding Is Crucial to Function

Christian Anfinsen studied a protein called ribonuclease that cleaves ribonucleic acid (RNA) polymers.
He found that ribonuclease could be unfolded, or denatured, by treating it with compounds that break hydrogen bonds and disulfide bonds.
The denatured ribonuclease was unable to function normally—it could no longer break apart nucleic acids.
When the chemical denaturing agents were removed, ribonuclease refolded spontaneously and began to function normally again.
These experiments confirmed that the primary sequence contains all the information required for folding and that folding is essential for protein function.

Protein Shape Is Flexible

Although each protein has a characteristic folded shape that is necessary for its function, most proteins maintain a flexible and dynamic shape when they are not actively performing that function.
Over half of the proteins that have been analyzed to date have disordered regions lacking any apparent structure when they are in an inactive state.

Protein Folding Is Often Regulated

Since the function of a protein is dependent on its shape, controlling when or where it is folded into its active shape will regulate the protein’s activity.
Proteins involved in sending signals within cells, for example, are often regulated in this way.

Folding Can Be “Infectious”

Certain normal proteins can be induced to fold into infectious, disease-causing agents called prions.
The normal proteins are known as prion proteins, or PrP for short.
The infectious prion proteins are known as PrP*, or simply prions.
PrP and PrP* do not differ in their primary structure; it is only their shapes that are radically different.
When an infectious prion (PrP*) comes in contact with a normal prion protein (PrP), the two bind and the normal protein changes shape to become another infectious prion; these infectious prions cause the death of brain cells.

Protein Functions Are as Diverse as Protein Structures

Catalysis: Many proteins are specialized to catalyze, or speed up, chemical reactions.
Defence: Proteins called antibodies attack and destroy viruses and bacteria that cause disease.
Movement: Motor proteins and contractile proteins are responsible for moving the cell itself.
Signalling: Proteins are involved in carrying and receiving signals from cell to cell inside the body.
Structure: Structural proteins make up body components such as fingernails and hair.
Transport: Some proteins allow particular molecules to enter and exit cells, while others carry molecules throughout the body.

Why Are Enzymes Good Catalysts?

Catalyzed reactions involve one or more reactants called substrates.
The initial hypothesis for how enzymes work was proposed by Emil Fischer: the “lock-and-key” model.

Explain How a Huge Diversity of Polymers Can Arise from a Small Number of Monomers

A vast diversity of polymers can arise from a small number of monomers through several mechanisms:

Combinatorial Diversity: Even with a limited set of monomers, the number of possible polymer sequences is enormous. For example, proteins are constructed from 20 different amino acids. The number of possible sequences for a polypeptide of $n$ amino acids is $20^n$. Even for a relatively small protein of 100 amino acids, the number of possible sequences is $20^{100}$, which is an astronomical number.
Arrangement and Order: The specific arrangement and order of monomers in a polymer chain contribute significantly to its diversity. Different arrangements lead to different properties and functions of the resulting polymer. The primary structure of a protein, which is the sequence of amino acids, dictates its overall structure and function.
Structural Variation: Polymers can form diverse structures based on their monomer composition and sequence, leading to different secondary, tertiary, and quaternary structures in proteins. These higher-order structures determine the functional properties of the polymer.
Chemical Modifications: After polymerization, monomers within a polymer can undergo chemical modifications, such as glycosylation, phosphorylation, or methylation. These modifications further diversify the properties and functions of the polymer.
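
The combinatorial count of $20^n$ described above is easy to check numerically, since Python integers have arbitrary precision. The function name and its default argument are illustrative assumptions.

```python
def possible_sequences(n, alphabet_size=20):
    """Number of distinct linear sequences of length n.

    Each of the n positions can hold any of `alphabet_size` monomers,
    so the count is alphabet_size ** n.
    """
    return alphabet_size ** n


print(possible_sequences(2))               # 400 possible dipeptides
count = possible_sequences(100)
print(len(str(count)))                     # 131 decimal digits
```

A number with 131 digits is roughly $10^{130}$, which gives a sense of scale for "astronomical".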

Describe the Structure of an Amino Acid

Amino acids share a common core structure. In all 20 amino acids, a central carbon atom (α-carbon) bonds covalently to:

H: a hydrogen atom
$NH_2$: an amino functional group
COOH: a carboxyl functional group
A distinctive “R-group” (side chain)

In water (pH 7), amino acids ionize:

The amino group acts as a base, attracting a proton to form $NH_3^+$.
The carboxyl group acts as an acid, losing a proton to form COO-.

Recognize Whether an R Group Is Charged, Polar, or Nonpolar, and Describe How This Impacts the Role of That R Group in Protein Folding

The R-groups determine the properties of amino acids, affecting their solubility and interactions within a protein.

Polar R-groups: These contain atoms with partial charges due to electronegativity differences (e.g., carbon with oxygen). Polar R-groups are hydrophilic and tend to reside on the exterior of proteins, interacting with the aqueous environment.
Nonpolar R-groups: These are composed mostly of carbon and hydrogen atoms. Nonpolar R-groups are hydrophobic and tend to cluster in the interior of proteins, away from water. This hydrophobic effect drives protein folding, stabilizing the structure.
Charged R-groups (Acidic and Basic): Acidic R-groups have a negative charge (COO-) and basic R-groups have a positive charge ($NH_3^+$). These charged R-groups are hydrophilic and typically found on the protein surface, where they can interact with water and other charged molecules.

The polarity and charge of R-groups affect the solubility of an amino acid in the aqueous interior of the cell.

Polar and electrically charged R-groups interact readily with water and are hydrophilic.
Nonpolar R-groups lack charged or highly electronegative atoms and are hydrophobic.

Explain How Amino Acids Are Linked Together to Form Polypeptides

Amino acids are linked together through peptide bonds to form polypeptides.

Amino acids polymerize when a bond forms between the carboxyl group of one amino acid and the amino group of another.
The C-N covalent bond is called a peptide bond.
The carboxyl group is converted to a carbonyl functional group (C=O).
The amino group becomes simply N-H in the resulting polymer.
When amino acids are linked by peptide bonds into a chain, they are referred to as “residues”.
Monomers polymerize through condensation reactions (dehydration reactions), which result in the loss of a water molecule.
Hydrolysis breaks polymers apart by adding a water molecule.

Describe the Four Levels of Protein Structure, Including the Interactions Involved in Each

Primary Structure: The unique sequence of amino acids in a protein.
- Determined by covalent peptide bonds between amino acids.
Secondary Structure: Localized folding patterns, such as α-helices and β-pleated sheets, stabilized by hydrogen bonds between carbonyl (C=O) and N-H groups in the polypeptide backbone.
- α-helix: The polypeptide’s backbone is coiled.
- β-pleated sheet: Segments of a peptide chain bend 180° and then fold in the same plane.
Tertiary Structure: The overall three-dimensional shape of a single polypeptide chain, resulting from interactions between R-groups.
- Hydrogen bonding: forms between polar side chains.
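- Hydrophobic interactions: water molecules interact with hydrophilic polar side chains and force hydrophobic nonpolar side chains to coalesce in the protein’s interior.
- van der Waals interactions: weak electrical attractions between closely packed hydrophobic side chains may stabilize the structure.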
- Covalent bonding: can form between the side chains of two cysteines through a reaction between the sulfhydryl groups.
- Ionic bonding: may form between groups that have full and opposing charges.
Quaternary Structure: The arrangement of multiple polypeptide subunits in a multi-subunit protein.
- Held together by the same types of bonds and interactions found in the tertiary level of structure.

Explain How Proteins Can Become Denatured

Denaturation is the process by which a protein loses its native conformation, resulting in the loss of its biological activity. Proteins can be denatured by:

Heat: Increased temperature can disrupt hydrogen bonds and hydrophobic interactions.
pH: Extremes of pH can alter the ionization state of amino acid side chains, disrupting ionic bonds and hydrogen bonds.
Chemical Denaturants: Substances like urea and guanidine hydrochloride can disrupt hydrogen bonds and hydrophobic interactions.
Mechanical Agitation: Vigorous mixing can introduce air bubbles, leading to protein unfolding at air-water interfaces.

Describe the Structure of a Nucleotide

A nucleotide consists of three components:

Nitrogenous Base: A heterocyclic ring structure containing nitrogen atoms. The nitrogenous base can be a purine (adenine or guanine) or a pyrimidine (cytosine, thymine, or uracil).
Pentose Sugar: A five-carbon sugar molecule. In DNA, the sugar is deoxyribose; in RNA, the sugar is ribose.
Phosphate Group(s): One to three phosphate groups attached to the 5' carbon of the pentose sugar.

Distinguish Between Pyrimidines and Purines, and List the Nitrogenous Bases That Fall into Each Category

Purines: These have a double-ring structure.
- Adenine (A)
- Guanine (G)
Pyrimidines: These have a single-ring structure.
- Cytosine (C)
- Thymine (T) (only in DNA)
- Uracil (U) (only in RNA)
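
The two categories listed above, and the fact that thymine and uracil each appear in only one type of nucleic acid, can be expressed as a short lookup. This is a minimal sketch; the names `base_class` and `nucleic_acid_type` are hypothetical.

```python
PURINES = {"A", "G"}            # double-ring bases
PYRIMIDINES = {"C", "T", "U"}   # single-ring bases


def base_class(base):
    """Return 'purine' or 'pyrimidine' for a one-letter base code."""
    base = base.upper()
    if base in PURINES:
        return "purine"
    if base in PYRIMIDINES:
        return "pyrimidine"
    raise ValueError(f"unknown base: {base!r}")


def nucleic_acid_type(sequence):
    """Guess DNA vs RNA from whether the sequence contains T or U."""
    bases = set(sequence.upper())
    if "T" in bases and "U" in bases:
        raise ValueError("sequence mixes T and U")
    if "U" in bases:
        return "RNA"
    if "T" in bases:
        return "DNA"
    return "undetermined"  # only A, G, C present: could be either


print(base_class("G"), nucleic_acid_type("AUGC"))  # purine RNA
```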

Distinguish Between Ribose and Deoxyribose

Ribose: A pentose sugar with a hydroxyl group (-OH) on the 2' carbon.
Deoxyribose: A pentose sugar with a hydrogen atom (H) on the 2' carbon (i.e., it lacks the oxygen atom at the 2' position).

The absence of the 2' hydroxyl group in deoxyribose makes DNA more stable than RNA.

Distinguish Between the 5’ End and the 3’ End of a Nucleotide/Polynucleotide

5' End: The end of a nucleotide or polynucleotide chain with a phosphate group attached to the 5' carbon of the sugar.
3' End: The end of a nucleotide or polynucleotide chain with a hydroxyl group (-OH) attached to the 3' carbon of the sugar.

Polynucleotides are synthesized by adding new nucleotides to the 3' end, forming a phosphodiester bond between the 3' -OH of the existing nucleotide and the 5' phosphate of the incoming nucleotide.

Explain How Nucleotides Are Linked Together to Form Polynucleotides

Nucleotides are linked together through phosphodiester bonds to form polynucleotides.

The phosphate group on the 5' carbon of one nucleotide forms a covalent bond with the 3' hydroxyl group of another nucleotide.
This bond involves a condensation reaction, where a water molecule is removed.
The resulting linkage is a phosphodiester bond, creating a sugar-phosphate backbone with the nitrogenous bases extending from it.

Describe Complementary Base Pairing, and Explain Why Only Certain Bases Pair with One Another

Complementary base pairing is the specific pairing of nitrogenous bases in nucleic acids:

Adenine (A) pairs with Thymine (T) in DNA (or Uracil (U) in RNA) via two hydrogen bonds.
Guanine (G) pairs with Cytosine (C) via three hydrogen bonds.

The specificity of base pairing is due to the arrangement of hydrogen bond donors and acceptors on the nitrogenous bases. The correct alignment of these groups allows for stable hydrogen bond formation, while incorrect pairings do not provide sufficient stability.
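
The pairing rules and hydrogen-bond counts above can be sketched as two small lookups for DNA. The function names are illustrative assumptions, and the complement is returned position by position; the partner strand in a real double helix runs in the opposite (antiparallel) direction, so reading it 5' to 3' would mean reversing the result.

```python
# DNA pairing partners and hydrogen bonds per base pair.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complement(strand):
    """Return the pairing partner of each base, in the same order."""
    return "".join(DNA_PAIRS[base] for base in strand.upper())


def hydrogen_bonds(strand):
    """Total hydrogen bonds holding this strand to its complement."""
    return sum(H_BONDS[base] for base in strand.upper())


print(complement("ATGC"))      # TACG
print(hydrogen_bonds("ATGC"))  # 10  (2 + 2 + 3 + 3)
```

Because G-C pairs contribute three hydrogen bonds and A-T pairs only two, a G-C-rich stretch has more hydrogen bonds than an A-T-rich stretch of the same length.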

Differentiate Between DNA and RNA with Respect to General Function, Structure of Nucleotides, Secondary Structure, and Tertiary Structure

Feature	DNA	RNA
General Function	Stores genetic information; template for replication and transcription.	Involved in gene expression (translation); carries genetic information in some viruses.
Structure of Nucleotides	Deoxyribose sugar, phosphate group, and nitrogenous bases (A, T, C, G).	Ribose sugar, phosphate group, and nitrogenous bases (A, U, C, G).
Secondary Structure	Typically a double helix, with two complementary strands held together by hydrogen bonds between base pairs.	Can form various secondary structures, such as hairpin loops, stem-loops, and internal loops, through intramolecular base pairing.
Tertiary Structure	DNA is organized into chromosomes, which are further compacted through supercoiling and association with histone proteins.	RNA molecules fold into complex three-dimensional structures, often stabilized by metal ions and protein interactions. Examples include tRNA (transfer RNA) and rRNA (ribosomal RNA), which have specific