DNA Structure, Base Pairing, X-ray Diffraction, and Intro to Proteins

DNA Base Pairing and Structure

Base pairing rules in DNA:
- Adenine (A) pairs with Thymine (T) via two hydrogen bonds: $\text{A--T base pair} \quad (2\;\text{H-bonds})$
- Cytosine (C) pairs with Guanine (G) via three hydrogen bonds: $\text{G--C base pair} \quad (3\;\text{H-bonds})$
- Thymine is a pyrimidine; Adenine is a purine; Cytosine is a pyrimidine; Guanine is a purine.
- Complementary base pairing requires one purine and one pyrimidine in each pair, maintaining uniform helix width.
- In RNA, Uracil (U) replaces Thymine and pairs with Adenine; however, this transcript focuses on DNA base pairing.
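The hydrogen-bond counts above can be turned into a quick tally for a whole duplex. The following is a minimal sketch, not something from the lecture; the function name `count_hydrogen_bonds` and the lookup table `H_BONDS` are hypothetical, and the sequence is the illustrative one used later in these notes.

```python
# Hydrogen bonds contributed by the base pair that each base takes part in,
# as stated in the notes: A-T pairs have 2, G-C pairs have 3.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def count_hydrogen_bonds(strand):
    """Total base-pair hydrogen bonds in a duplex, given one strand's sequence."""
    strand = strand.upper()
    unknown = set(strand) - set(H_BONDS)
    if unknown:
        raise ValueError(f"Unexpected characters in DNA sequence: {sorted(unknown)}")
    return sum(H_BONDS[base] for base in strand)


print(count_hydrogen_bonds("ATGCCG"))  # 2 + 2 + 3 + 3 + 3 + 3 = 16
```

Only one strand is needed as input, because the partner of every base is fixed by the pairing rules.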
Chargaff’s principle (base composition in double-stranded DNA):
- The percentage of purines equals the percentage of pyrimidines: $\%\text{Purines} = \%\text{Pyrimidines}$
- More specifically (classic form): $\%A = \%T\quad\text{and}\quad\%G = \%C.$
- From any one base percentage, you can infer the other three (e.g., if you know $\%A$, you can deduce $\%T$, $\%G$, and $\%C$).
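The inference follows from two facts: $\%A = \%T$ and $\%G = \%C$, and the four percentages sum to 100. So $\%G = \%C = (100 - 2\cdot\%A)/2 = 50 - \%A$. A minimal sketch of that arithmetic, assuming double-stranded DNA (the function name `base_composition_from_a` is hypothetical):

```python
def base_composition_from_a(percent_a):
    """Apply Chargaff's rules to get all four base percentages from %A.

    Assumes double-stranded DNA, so %A = %T and %G = %C, and the
    four percentages add up to 100.
    """
    if not 0 <= percent_a <= 50:
        raise ValueError("%A must lie between 0 and 50 in double-stranded DNA")
    percent_g = 50.0 - percent_a
    return {"A": percent_a, "T": percent_a, "G": percent_g, "C": percent_g}


print(base_composition_from_a(30.0))
# {'A': 30.0, 'T': 30.0, 'G': 20.0, 'C': 20.0}
```

The same reasoning works starting from any of the four bases; only the starting variable changes.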
Structural implications for double-stranded DNA:
- Each base pair consists of a purine-pyrimidine pair, which ensures uniform width.
- The two strands are antiparallel (5' to 3' on one strand and 3' to 5' on the other).
- The bases face inward toward the interior of the helix; the sugar–phosphate backbone is on the outside.
- The diameter is about $d = 2\ \text{nm}$, and there are approximately $n_{\text{bp/turn}} = 10$ base pairs per turn.
- The helix is stabilized by hydrogen bonding between base pairs and base stacking interactions.
Directionality and complementarity in sequence:
- If one strand runs 5' to 3', the complementary strand runs 3' to 5'.
- Illustrative example (not taken from the transcript): if one strand is 5'--ATGCCG--3', the complementary strand is 3'--TACGGC--5', which reads 5'--CGGCAT--3' when written in the conventional 5' to 3' direction.
- Individual bases pair according to rules: A pairs with T; G pairs with C; and in RNA, A pairs with U.
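The complementarity and directionality rules above can be sketched in a few lines. This is an illustrative helper, not part of the lecture; the names `COMPLEMENT`, `complement`, and `reverse_complement` are hypothetical.

```python
# DNA pairing rules from the notes: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Pair each base in place; the result reads 3'->5' if the input reads 5'->3'."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


def reverse_complement(strand):
    """Complementary strand rewritten in the conventional 5'->3' direction."""
    return complement(strand)[::-1]


top = "ATGCCG"  # read 5'->3'
print(complement(top))          # TACGGC (read 3'->5')
print(reverse_complement(top))  # CGGCAT (read 5'->3')
```

Both outputs describe the same physical strand; they differ only in the direction in which it is written, which is why stating the 5' and 3' ends matters.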
DNA physical features discussed in the lecture:
- The edges of the base pairs are exposed in the grooves of the helix, where they can interact with proteins and other molecules.
- Major groove vs minor groove: the major groove is wide and more accessible for molecular interaction; the minor groove is narrower.
- The sugar-phosphate backbone forms the exterior scaffold; bases are largely internal.
X-ray diffraction and historical context (Photo 51):
- DNA structure was elucidated with X-ray diffraction data from DNA samples.
- The diffraction pattern helped reveal the double helical structure, inward-facing bases, and outward-facing sugar-phosphate backbone.
- Key features revealed by the data:
  - Double helix confirmation.
  - Bases facing inward; backbone facing outward.
  - Diameter $d = 2\ \text{nm}$.
  - Number of base pairs per turn: $n_{\text{bp/turn}} = 10$.
  - Antiparallel strand orientation.
- Photo 51 (1952) was produced by Rosalind Franklin and her student Raymond Gosling; this image was pivotal evidence for the double-helix model.
- Watson and Crick later proposed the canonical DNA model based on this data plus Chargaff's rules and other evidence.
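- Ethical note: Franklin did not share in the 1962 Nobel Prize awarded to Watson, Crick, and Wilkins (she had died in 1958, and the prize is not awarded posthumously), and her contribution went largely uncredited at the time; contemporary discussions emphasize giving her due credit.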
- The Watson–Crick model proposed base pairing rules that match Chargaff’s data and explain the uniform width of the helix.
RNA vs DNA context (brief):
- RNA is typically single-stranded and can fold back on itself to form intramolecular base pairing.
- RNA bases include Uracil (U) instead of Thymine (T).
- Intramolecular base pairing in RNA can lead to hairpin structures and complex three-dimensional shapes used in diverse functions; more detail next week.
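To make the idea of intramolecular pairing concrete, the sketch below checks how many bases at the two ends of a single RNA strand could pair with each other if the strand folded back on itself, which is the stem of a simple hairpin. It is a deliberately simplified illustration, not the lecture's method: the function name `terminal_stem_length` is hypothetical, only A-U and G-C pairs are considered, and real RNA folding depends on much more than this.

```python
# RNA pairing rules from the notes: A pairs with U, G pairs with C.
RNA_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def terminal_stem_length(rna):
    """Count consecutive base pairs formed between the 5' end and the 3' end.

    Walks inward from both ends, mimicking a strand folded back on itself,
    and stops at the first pair of bases that cannot pair.
    """
    rna = rna.upper()
    i, j = 0, len(rna) - 1
    length = 0
    while i < j and (rna[i], rna[j]) in RNA_PAIRS:
        length += 1
        i += 1
        j -= 1
    return length


print(terminal_stem_length("GGGAAAACCC"))  # 3
```

In the example, the three G bases at the 5' end pair with the three C bases at the 3' end, and the unpaired A bases in the middle form the loop.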

Proteins: Roles and Basic Biochemistry

Proteins as cellular workhorses:
- Proteins perform a wide variety of functions in the cell and can be organized into functional categories such as:
  - Enzymes (catalysts)
  - Structural proteins (stability and movement)
  - Regulatory proteins (gene expression, signaling)
  - Transport and motor proteins (movement across membranes, intracellular transport)
  - Storage proteins (molecule storage)
- The lecture emphasizes a “cell factory” view where proteins are the workers.
General structure of proteins: amino acids and polypeptides
- Proteins are polymers built from amino acids linked together by covalent bonds (peptide bonds).
- The monomers are amino acids, connected by condensation (dehydration) reactions that release water.
- A generic amino acid structure:
  - Central carbon (the alpha carbon, $\alpha\text{-C}$)
  - An amino group (–NH$_2$) attached to the alpha carbon
  - A carboxyl group (–COOH) attached to the alpha carbon
  - A side chain (the R group) attached to the alpha carbon, which defines the identity and properties of the amino acid
  - A hydrogen attached to the alpha carbon
- The alpha carbon forms four covalent bonds (to the amino group, the carboxyl group, the R group, and the hydrogen), giving it a tetrahedral geometry.
- The side chain (R group) varies among amino acids; its chemical properties determine how the amino acid interacts with other molecules and drive protein folding and function.
Classification of amino acid side chains by chemical properties (as discussed in the slides):
- Hydrophilic, charged (ionic) side chains: e.g., arginine (Arg, one-letter code R), which is positively charged, and aspartic acid (Asp, one-letter code D), which is negatively charged. The example in the transcript shows aspartic acid with its negatively charged carboxylate group (COO⁻).
- Hydrophilic but uncharged (polar) side chains: e.g., threonine (Thr) with a hydroxyl group, capable of hydrogen bonding with water.
- Hydrophobic (nonpolar) side chains: predominantly hydrocarbon-rich groups; tend to be buried inside proteins away from water.
- Special cases:
  - Cysteine (Cys) has a sulfur-containing side chain that can form a disulfide bond (–S–S–) with another cysteine, which can stabilize protein structure.
  - Glycine (Gly) has a single hydrogen as its side chain; it is the smallest amino acid and adds flexibility to protein structure.
  - Proline (Pro) has a side chain that bonds covalently back to its own amino group, forming a ring that creates a rigid kink and unique conformational properties.
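The four categories above can be captured as a simple lookup. The notes name only a few examples (Arg, Asp, Thr, Cys, Gly, Pro); the assignment of the remaining amino acids below follows one common textbook grouping and is an assumption, since sources differ on borderline cases such as histidine and tyrosine. The names `SIDE_CHAIN_CLASS`, `CLASS_OF`, and `classify` are hypothetical.

```python
# One-letter codes grouped by side-chain chemistry.
# ASSUMPTION: beyond the examples named in the notes (R, D, T, C, G, P),
# this grouping follows a common textbook convention; other sources
# place some residues (e.g., H, Y) differently.
SIDE_CHAIN_CLASS = {
    "charged": "RHKDE",
    "polar uncharged": "STNQY",
    "hydrophobic": "AILMFWV",
    "special": "CGP",
}

# Invert the table so each amino acid maps to its category.
CLASS_OF = {
    aa: category
    for category, letters in SIDE_CHAIN_CLASS.items()
    for aa in letters
}


def classify(sequence):
    """Return the side-chain category of each residue in a one-letter sequence."""
    return [CLASS_OF[aa] for aa in sequence.upper()]


print(classify("RDTCGP"))
# ['charged', 'charged', 'polar uncharged', 'special', 'special', 'special']
```

A table like this is a starting point for questions such as which stretches of a sequence are likely to be buried away from water.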
Nomenclature and numbering of amino acids in a protein
- Each amino acid is typically represented by a three-letter code and a one-letter code (the latter is widely used in sequence notation).
- Example provided in the transcript: aspartic acid has the three-letter code Asp and the one-letter code D.
- On a polypeptide, the chain has two distinct ends:
  - N-terminus (the end with the amino group, –NH$_2$) – the start of the polypeptide chain.
  - C-terminus (the end with the carboxyl group, –COOH) – the end of the chain.
- Amino acids are covalently linked in a linear, unbranched chain (polypeptide); carbohydrates can branch, but polypeptides are unbranched.
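As an illustration of the two notations and of chain directionality, the sketch below converts a one-letter sequence into three-letter codes and labels the two ends. The table holds the standard codes for the 20 common amino acids; the names `THREE_LETTER` and `to_three_letter` are hypothetical, and the example sequence is made up.

```python
# Standard one-letter -> three-letter codes for the 20 common amino acids.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def to_three_letter(sequence):
    """Write a one-letter sequence in three-letter codes, N-terminus first."""
    residues = [THREE_LETTER[aa] for aa in sequence.upper()]
    # The free amino group marks the N-terminus; the free carboxyl group marks the C-terminus.
    return "H2N-" + "-".join(residues) + "-COOH"


print(to_three_letter("DRT"))  # H2N-Asp-Arg-Thr-COOH
```

Because the chain is linear and unbranched, a flat string read from left to right is enough to describe it.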
The peptide bond and polypeptide chain (an example of how amino acids link)
- A condensation reaction links the amino group of one amino acid to the carboxyl group of the adjacent amino acid, releasing a molecule of water (H$_2$O).
- The illustration described in the transcript labels the N-terminus and C-terminus for two adjacent amino acids, showing:
  - The N-terminus has the free amino group (–NH$_2$) of the first amino acid.
  - The C-terminus has the free carboxyl group (–COOH) of the second amino acid.
  - Each side chain (R) is what distinguishes one amino acid from the other.
- The order of amino acids in a protein is critical to its structure and function; sequences are conventionally written from N-terminus to C-terminus.
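Since each condensation reaction joins two neighbors and releases one water molecule, a linear chain of $n$ amino acids contains $n - 1$ peptide bonds and releases $n - 1$ water molecules as it forms. A minimal sketch of that bookkeeping (the function name `condensation_summary` is hypothetical):

```python
def condensation_summary(n_residues):
    """Peptide bonds formed and water molecules released for a linear chain."""
    if n_residues < 1:
        raise ValueError("A polypeptide needs at least one amino acid")
    # Each bond joins two adjacent residues, so a chain of n has n - 1 bonds,
    # and each bond formation releases exactly one H2O.
    bonds = n_residues - 1
    return {"peptide_bonds": bonds, "water_released": bonds}


print(condensation_summary(5))
# {'peptide_bonds': 4, 'water_released': 4}
```

The count relies on the chain being unbranched, which holds for polypeptides as described above.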
Key takeaways on protein structure (in context of this lecture)
- Proteins are polymers of amino acids linked by peptide bonds via condensation reactions.
- The chemical nature of side chains drives folding, structure, and interactions with other molecules.
- The polypeptide chain is unbranched and has directionality from N-terminus to C-terminus.
- The study of amino acids includes understanding their three-letter and one-letter codes and recognizing how side chain chemistry dictates properties like hydrophilicity, charge, and potential for hydrogen bonding.
Connections to broader topics and real-world relevance
- DNA structure and base pairing underpin genetic information storage and replication; RNA structure relates to transcription and translation, and intramolecular base pairing in RNA enables diverse RNA structures with catalytic or regulatory roles.
- Understanding amino acid properties is foundational for predicting protein folding, function, enzyme activity, and interactions with other biomolecules.
- The ethical and historical context of Rosalind Franklin’s contribution to the discovery of DNA’s structure highlights the importance of credit and recognition in scientific progress.
Quick reference formulas and constants from the lecture
- A–T base pair hydrogen bonds: 2
- G–C base pair hydrogen bonds: 3
- DNA diameter: $d = 2\ \text{nm}$
- Base pairs per turn: $n_{\text{bp/turn}} = 10$
- Chargaff’s rule (general form): $\%A = \%T, \quad \%G = \%C$
- Complementarity rules: $\text{A} \leftrightarrow \text{T}, \quad \text{G} \leftrightarrow \text{C}$
Note on future topics mentioned
- Upcoming sessions will explore RNA structure and the variety of RNA types (e.g., rRNA, tRNA, mRNA), including their three-dimensional structures, in greater depth.