DNA Structure, Base Pairing, X-ray Diffraction, and Intro to Proteins

DNA Base Pairing and Structure

  • Base pairing rules in DNA:

    • Adenine (A) pairs with Thymine (T) via two hydrogen bonds (A–T base pair, 2 H-bonds).
    • Cytosine (C) pairs with Guanine (G) via three hydrogen bonds (G–C base pair, 3 H-bonds).
    • Thymine is a pyrimidine; Adenine is a purine; Cytosine is a pyrimidine; Guanine is a purine.
    • Complementary base pairing requires one purine and one pyrimidine in each pair, maintaining uniform helix width.
    • In RNA, Uracil (U) replaces Thymine and pairs with Adenine; however, this transcript focuses on DNA base pairing.
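The hydrogen-bond counts above can be turned into a small calculation. The following is a minimal sketch, assuming a fully paired duplex with only Watson-Crick pairs; the names `H_BONDS` and `count_hydrogen_bonds` are illustrative and do not come from the lecture.

```python
# Hydrogen bonds contributed by each base when paired with its partner,
# per the rules above: A-T pairs have 2, G-C pairs have 3.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def count_hydrogen_bonds(strand: str) -> int:
    """Total inter-strand H-bonds when `strand` is paired with its complement."""
    strand = strand.upper()
    unknown = set(strand) - set(H_BONDS)
    if unknown:
        raise ValueError(f"not a DNA base: {sorted(unknown)}")
    return sum(H_BONDS[base] for base in strand)


print(count_hydrogen_bonds("ATGCCG"))  # 16 (two A-T pairs, four G-C pairs)
```

A GC-rich duplex therefore has more hydrogen bonds per unit length than an AT-rich one of the same size.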
  • Chargaff’s principle (base composition in double-stranded DNA):

    • The percentage of purines equals the percentage of pyrimidines: $\%\text{Purines} = \%\text{Pyrimidines}$
    • More specifically (classic form): $\%A = \%T$ and $\%G = \%C$.
    • From any given base percentage, you can infer the others (e.g., if you know $\%A$, you can deduce $\%T$, $\%G$, and $\%C$).
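As a worked illustration of this inference, here is a minimal sketch assuming double-stranded DNA in which the four percentages sum to 100. The function name `infer_composition` is hypothetical.

```python
def infer_composition(percent_a: float) -> dict[str, float]:
    """Infer %T, %G and %C from %A using Chargaff's rules.

    Assumes double-stranded DNA, so %A = %T, %G = %C,
    and all four percentages sum to 100.
    """
    if not 0 <= percent_a <= 50:
        raise ValueError("%A must lie between 0 and 50 in double-stranded DNA")
    # Whatever is not A or T is split evenly between G and C.
    percent_g = (100 - 2 * percent_a) / 2
    return {"A": percent_a, "T": percent_a, "G": percent_g, "C": percent_g}


print(infer_composition(30.0))
# {'A': 30.0, 'T': 30.0, 'G': 20.0, 'C': 20.0}
```

The same arithmetic works starting from any of the four bases, since each one fixes its partner and the remainder is split evenly between the other pair.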
  • Structural implications for double-stranded DNA:

    • Each base pair consists of a purine-pyrimidine pair, which ensures uniform width.
    • The two strands are antiparallel (5' to 3' on one strand and 3' to 5' on the other).
    • The bases face inward toward the interior of the helix; the sugar–phosphate backbone is on the outside.
    • The diameter is about $d = 2\ \text{nm}$, and there are approximately $n_{\text{bp/turn}} = 10$ base pairs per turn.
    • The helix is stabilized by hydrogen bonding between base pairs and base stacking interactions.
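The figure of about 10 base pairs per turn gives a quick estimate of how many helical turns a duplex of a given length contains. This is a rough sketch using only the approximate value quoted above; `BP_PER_TURN` and `helical_turns` are illustrative names.

```python
# Approximate value from the notes above; real DNA deviates slightly.
BP_PER_TURN = 10


def helical_turns(n_base_pairs: int) -> float:
    """Estimate the number of full helical turns in a duplex of n base pairs."""
    if n_base_pairs < 0:
        raise ValueError("number of base pairs cannot be negative")
    return n_base_pairs / BP_PER_TURN


print(helical_turns(150))  # 15.0
```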
  • Directionality and complementarity in sequence:

    • If one strand runs 5' to 3', the complementary strand runs 3' to 5'.
    • Illustrative example (not taken from the transcript): if one strand is 5'-ATGCCG-3', the complementary strand is 3'-TACGGC-5', which reads 5'-CGGCAT-3' when written in the conventional 5' to 3' orientation.
    • Individual bases pair according to rules: A pairs with T; G pairs with C; and in RNA, A pairs with U.
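The pairing and antiparallel rules above can be applied mechanically to derive a partner strand. The sketch below assumes an uppercase or lowercase DNA string containing only A, T, G, and C; the function names are illustrative.

```python
# Translation table implementing A<->T and G<->C.
COMPLEMENT = str.maketrans("ATGC", "TACG")


def complement_3_to_5(strand_5_to_3: str) -> str:
    """Partner strand, base by base, read 3' to 5' (aligned under the input)."""
    return strand_5_to_3.upper().translate(COMPLEMENT)


def reverse_complement(strand_5_to_3: str) -> str:
    """Partner strand rewritten in the conventional 5' to 3' orientation."""
    return complement_3_to_5(strand_5_to_3)[::-1]


strand = "ATGCCG"
print(complement_3_to_5(strand))   # TACGGC  (3' to 5')
print(reverse_complement(strand))  # CGGCAT  (5' to 3')
```

Both outputs describe the same physical strand; only the direction in which it is written differs.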
  • DNA physical features discussed in the lecture:

    • The edges of the base pairs that are exposed in the grooves can interact with proteins and other molecules.
    • Major groove vs minor groove: the major groove is wide and more accessible for molecular interaction; the minor groove is narrower.
    • The sugar-phosphate backbone forms the exterior scaffold; bases are largely internal.
  • X-ray diffraction and historical context (Photo 51):

    • DNA structure was elucidated with X-ray diffraction data from DNA samples.
    • The diffraction pattern helped reveal the double helical structure, inward-facing bases, and outward-facing sugar-phosphate backbone.
    • Key features revealed by the data:
      • Double helix confirmation.
      • Bases facing inward; backbone facing outward.
      • Diameter $d = 2\ \text{nm}$.
      • Number of base pairs per turn: $n_{\text{bp/turn}} = 10$.
      • Antiparallel strand orientation.
    • Photo 51 (1952) was produced by Rosalind Franklin; this data was pivotal in confirming the double-helix model.
    • Watson and Crick later proposed the canonical DNA model based on this data plus Chargaff's rules and other evidence.
    • Ethical note: Franklin's contribution was not fully acknowledged at the time, and she was not among the recipients of the 1962 Nobel Prize shared by Watson, Crick, and Wilkins (she had died in 1958); contemporary discussions emphasize giving her due credit.
    • The Watson–Crick model proposed base pairing rules that match Chargaff’s data and explain the uniform width of the helix.
  • RNA vs DNA context (brief):

    • RNA is typically single-stranded and can fold back on itself to form intramolecular base pairing.
    • RNA bases include Uracil (U) instead of Thymine (T).
    • Intramolecular base pairing in RNA can lead to hairpin structures and complex three-dimensional shapes used in diverse functions; more detail next week.

Proteins: Roles and Basic Biochemistry

  • Proteins as cellular workhorses:

    • Proteins perform a wide variety of functions in the cell and can be organized into functional categories such as:
      • Enzymes (catalysts)
      • Structural proteins (stability and movement)
      • Regulatory proteins (gene expression, signaling)
      • Transport and motor proteins (movement across membranes, intracellular transport)
      • Storage proteins (molecule storage)
    • The lecture emphasizes a “cell factory” view where proteins are the workers.
  • General structure of proteins: amino acids and polypeptides

    • Proteins are polymers built from amino acids linked together by covalent bonds (peptide bonds).
    • The monomers are amino acids, connected by condensation (dehydration) reactions that release water.
    • A generic amino acid structure:
      • Central carbon (the alpha carbon, $\alpha$-C)
      • An amino group (–NH$_2$) attached to the alpha carbon
      • A carboxyl group (–COOH) attached to the alpha carbon
      • A side chain (the R group) attached to the alpha carbon, which defines the identity and properties of the amino acid
      • A hydrogen attached to the alpha carbon
    • The alpha carbon forms four covalent bonds (to the amino group, the carboxyl group, the R group, and a hydrogen), giving it a tetrahedral geometry.
    • The side chain (R group) varies among amino acids; it determines each amino acid's chemical properties and interactions with other molecules, and drives protein folding and function.
  • Classification of amino acid side chains by chemical properties (as discussed in the slides):

    • Hydrophilic, charged side chains: e.g., arginine (Arg, one-letter code R; positively charged) and aspartic acid (Asp, one-letter code D). The example in the transcript shows aspartic acid (Asp) with a negatively charged carboxylate group (COO⁻).
    • Hydrophilic but uncharged (polar) side chains: e.g., threonine (Thr) with a hydroxyl group, capable of hydrogen bonding with water.
    • Hydrophobic (nonpolar) side chains: predominantly hydrocarbon-rich groups; tend to be buried inside proteins away from water.
    • Special cases:
      • Cysteine (Cys) has a sulfur-containing side chain that can form a disulfide bond (–S–S–) with another cysteine, which can stabilize protein structure.
      • Glycine (Gly) has a single hydrogen as its side chain; it is the smallest amino acid and adds flexibility to protein structure.
      • Proline (Pro) has a side chain that bonds back to its own backbone amino group, forming a ring that creates a rigid kink and unique conformational properties.
  • Nomenclature and numbering of amino acids in a protein

    • Each amino acid is typically represented by a three-letter code and a one-letter code (the latter is widely used in sequence notation).
    • Example provided in the transcript: aspartic acid has the three-letter code Asp and the one-letter code D.
    • On a polypeptide, the chain has two distinct ends:
      • N-terminus (the end with the free amino group, –NH$_2$): the start of the polypeptide chain.
      • C-terminus (the end with the free carboxyl group, –COOH): the end of the chain.
    • Amino acids are covalently linked into a linear polypeptide chain; unlike carbohydrates, which can branch, polypeptides are unbranched.
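To illustrate the two naming schemes, the sketch below converts three-letter codes to one-letter sequence notation, written N-terminus first. The table is deliberately partial: it covers only the six amino acids named in these notes, and `THREE_TO_ONE` and `to_one_letter` are illustrative names.

```python
# Partial lookup table: only the amino acids mentioned in these notes.
THREE_TO_ONE = {
    "Arg": "R",  # arginine
    "Asp": "D",  # aspartic acid
    "Thr": "T",  # threonine
    "Cys": "C",  # cysteine
    "Gly": "G",  # glycine
    "Pro": "P",  # proline
}


def to_one_letter(residues: list[str]) -> str:
    """Convert three-letter codes (N-terminus first) to a one-letter string."""
    # capitalize() normalizes spellings such as "ASP" or "asp" to "Asp".
    return "".join(THREE_TO_ONE[code.capitalize()] for code in residues)


print(to_one_letter(["Gly", "ASP", "Thr", "Cys"]))  # GDTC
```

A code missing from the table raises `KeyError`, which is the intended behavior for this partial sketch.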
  • The peptide bond and polypeptide chain (an example of how amino acids link)

    • A condensation reaction links the amino group of one amino acid to the carboxyl group of the adjacent amino acid, releasing a molecule of water (H$_2$O).
    • The illustration described in the transcript labels the N-terminus and C-terminus for two adjacent amino acids, showing:
      • The N-terminus has the amino group (–NH$_2$) of the first amino acid.
      • The C-terminus has the carboxyl group (–COOH) of the second amino acid.
      • Each side chain (R) is the group that distinguishes one amino acid from another.
    • The order of amino acids in a protein is critical to its structure and function; sequences are conventionally written from N-terminus to C-terminus.
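Because each peptide bond forms by one condensation reaction, a linear, unbranched chain of n amino acids contains n − 1 peptide bonds and releases n − 1 water molecules. The following is a minimal sketch of that bookkeeping, assuming a one-letter sequence written N-terminus first; `describe_chain` is a hypothetical helper.

```python
def describe_chain(sequence: str) -> dict[str, object]:
    """Summarize a linear polypeptide given as one-letter codes, N to C."""
    if not sequence:
        raise ValueError("sequence must contain at least one residue")
    n_residues = len(sequence)
    return {
        "n_terminus": sequence[0],       # residue with the free amino group
        "c_terminus": sequence[-1],      # residue with the free carboxyl group
        "peptide_bonds": n_residues - 1,
        "waters_released": n_residues - 1,  # one H2O per condensation
    }


print(describe_chain("GDTC"))
# {'n_terminus': 'G', 'c_terminus': 'C', 'peptide_bonds': 3, 'waters_released': 3}
```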
  • Key takeaways on protein structure (in context of this lecture)

    • Proteins are polymers of amino acids linked by peptide bonds via condensation reactions.
    • The chemical nature of side chains drives folding, structure, and interactions with other molecules.
    • The polypeptide chain is unbranched and has directionality from N-terminus to C-terminus.
    • The study of amino acids includes understanding their three-letter and one-letter codes and recognizing how side chain chemistry dictates properties like hydrophilicity, charge, and potential for hydrogen bonding.
  • Connections to broader topics and real-world relevance

    • DNA structure and base pairing underpin genetic information storage and replication; RNA structure relates to transcription and translation, and intramolecular base pairing in RNA enables diverse RNA structures with catalytic or regulatory roles.
    • Understanding amino acid properties is foundational for predicting protein folding, function, enzyme activity, and interactions with other biomolecules.
    • The ethical and historical context of Rosalind Franklin’s contribution to the discovery of DNA’s structure highlights the importance of credit and recognition in scientific progress.
  • Quick reference formulas and constants from the lecture

    • A–T base pair hydrogen bonds: 2
    • G–C base pair hydrogen bonds: 3
    • DNA diameter: $d = 2\ \text{nm}$
    • Base pairs per turn: $n_{\text{bp/turn}} = 10$
    • Chargaff’s rule (general form): $\%A = \%T, \quad \%G = \%C$
    • Complementarity rules: $\text{A} \leftrightarrow \text{T}, \quad \text{G} \leftrightarrow \text{C}$
  • Note on future topics mentioned

    • Upcoming sessions will explore RNA structure, the variety of RNA types (e.g., rRNA, tRNA, mRNA), and their three-dimensional structures in greater depth.