DNA and the Gene: Synthesis and Repair — Comprehensive Notes

Evidence and Historical Context: What Carries Genetic Information?

Early question: which component of chromosomes carries inheritance, DNA or proteins?
- Chromosomes contain both DNA and proteins; scientists debated which holds genetic info.
Proteins, built from 20 different amino acids, appeared more diverse than nucleic acids (only four bases) and seemed capable of complex tasks.
DNA composition: nitrogenous bases (A, T, C, G), deoxyribose sugar, and phosphate groups; initially seemed too simple to encode traits.
Key historic problem: identify the molecular basis of heredity and the mechanism by which genetic information is stored and transmitted.

Griffith, Avery, and the Discovery of DNA as Genetic Material

Griffith (1928) studied Streptococcus pneumoniae with two strains: S (virulent, capsulated) and R (nonvirulent, noncapsulated).
- S strain killed mice; R strain did not.
- Mice injected with heat-killed S plus live R died, and live S cells were recovered from them.
- Conclusion: some substance transformed R into virulent S; this “transformation” implied a heritable material.
Oswald Avery and collaborators (Colin MacLeod and Maclyn McCarty) extended Griffith’s results (published 1944):
- Treated “transforming material” with proteases, RNases, or DNases before mixing with R strain.
- Transformation was blocked only by DNase; protease- and RNase-treated material still transformed R cells.
- Conclusion: DNA is the transforming material responsible for heredity in this system.
Hershey–Chase experiment (1952) with bacteriophages (phage T2):
- Two labels: radioactive sulfur to label proteins (since proteins contain sulfur but DNA does not) and radioactive phosphorus to label DNA (DNA contains phosphorus but proteins do not).
- After infection of E. coli, only the phosphorus-labeled material entered the bacterial cells and directed production of new phages.
- Conclusion: DNA, not protein, is the genetic material that transfers information in the phage life cycle.
Summary of impact: Together these experiments established that DNA is the material responsible for heredity across diverse organisms and systems, setting the stage for understanding DNA structure and replication.

The DNA Molecule: Structure and Early Clues

DNA is a polymer of nucleotides: each nucleotide comprises a sugar (deoxyribose), a phosphate group, and a nitrogenous base (A, T, C, G).
Nucleotides are linked by phosphodiester bonds to form the sugar–phosphate backbone, with one base attached to each sugar.
Chargaff’s rules (1940s):
- The amount of adenine roughly equals thymine; the amount of cytosine roughly equals guanine within a given species.
- Overall base composition varies between species.
- Represented as:
  $\%\text{A} = \%\text{T}, \quad \%\text{G} = \%\text{C}$
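Chargaff’s rules can be checked with a few lines of code. The sketch below is only an illustration: the helper names `base_percentages` and `complement` and the example sequence are invented for this purpose. Because the rules describe double-stranded DNA, the code builds both strands before counting; a single strand on its own need not show equal A and T.

```python
from collections import Counter


def complement(seq: str) -> str:
    """Return the base-by-base complement of a DNA sequence (A<->T, G<->C)."""
    return seq.upper().translate(str.maketrans("ATGC", "TACG"))


def base_percentages(seq: str) -> dict[str, float]:
    """Return the percentage of A, T, G, and C among the bases of `seq`."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ATGC")
    if total == 0:
        raise ValueError("sequence contains no A, T, G, or C")
    return {b: 100.0 * counts[b] / total for b in "ATGC"}


# Hypothetical top strand; the duplex is the top strand plus its partner.
top = "ATGCGGCTTAACGT"
duplex = top + complement(top)
pct = base_percentages(duplex)
print(pct)  # %A equals %T and %G equals %C for the duplex
```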
Rosalind Franklin’s X-ray crystallography data contributed critical information about DNA width and regular helical geometry, informing the correct double-helix structure.
Watson and Crick (with crucial input from Franklin’s data) proposed the double-helix model: two antiparallel strands with bases on the inside and a sugar-phosphate backbone on the outside.
Complementary base pairing established by the model: $A \leftrightarrow T, \quad G \leftrightarrow C$ with hydrogen bonding, providing a mechanism for accurate replication.
Key terminology:
- Nucleotides vs nucleosides: nucleotides contain a base, sugar, and phosphate; nucleosides contain base + sugar.
- Antiparallel strands: one strand runs 5′→3′ and the other runs 3′→5′.
- Double helix: two strands coiled around each other, bases paired inside, backbone outside.
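Antiparallel orientation and complementary pairing together mean that one strand fully determines the other. As a minimal sketch (the function name `reverse_complement` is an illustrative choice, not part of the notes), the partner strand is obtained by complementing each base and then reversing so that the result is also written 5′→3′:

```python
# Translation table for Watson-Crick pairing: A<->T, G<->C.
COMPLEMENT = str.maketrans("ATGC", "TACG")


def reverse_complement(strand_5to3: str) -> str:
    """Return the partner strand, also written in the 5'->3' direction."""
    return strand_5to3.upper().translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGCCG"))  # CGGCAT
```

Applying the function twice returns the original strand, which mirrors the fact that each strand can serve as a template for the other.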

Structure in Depth: The Double Helix and Its Consequences

The double helix has nitrogenous bases on the inside and a sugar–phosphate backbone on the outside.
Complementary base pairing explained biological stability and replication fidelity:
- $A \text{ pairs with } T \, (2\text{ hydrogen bonds})$
- $G \text{ pairs with } C \, (3\text{ hydrogen bonds})$
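The hydrogen-bond counts above imply that GC-rich duplexes are held together by more hydrogen bonds than AT-rich duplexes of the same length. A small sketch that tallies them, assuming a perfectly paired duplex and using the hypothetical helper name `hydrogen_bonds`:

```python
# Hydrogen bonds contributed by the base pair that each base forms.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds in a fully paired duplex, given one strand.

    Raises KeyError if the strand contains a character other than A, T, G, C.
    """
    return sum(H_BONDS[base] for base in strand.upper())


print(hydrogen_bonds("AAAA"))  # 8
print(hydrogen_bonds("GGGG"))  # 12
```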
Nucleotides and their components:
- Nucleotides: $\text{nitrogenous base} + \text{sugar (deoxyribose)} + \text{phosphate group}$
- Nucleosides: $\text{nitrogenous base} + \text{sugar}$
Watson–Crick model vs Pauling’s early triple-helix idea:
- Early model by Linus Pauling proposed a three-stranded structure with phosphate groups inside (model a, later deemed incorrect: model b is the correct double helix).
Rosalind Franklin’s contributions included providing measurements that supported the width and regularity of the DNA helix.

From Structure to Function: DNA Replication and the Semiconservative Model

After establishing the structure, researchers sought to understand function, particularly replication.
Watson–Crick hypothesized a semiconservative mode of replication:
- Each daughter double helix contains one parental strand and one newly synthesized strand.
- Other proposed models were: conservative (parents rejoin after duplication) and dispersive (each strand contains a mixture of old and new DNA).
Meselson–Stahl experiment (1958) provided crucial support for semiconservative replication:
- Grew E. coli in heavy nitrogen source $^{15}\mathrm{N}$ to label parental DNA.
- Transferred to light nitrogen source $^{14}\mathrm{N}$ and allowed replication.
- Used density gradient centrifugation to separate DNA by density.
- Results after successive generations supported semiconservative replication: after generation 1 a hybrid (intermediate density) DNA band appeared; after generation 2, both light (14N) and hybrid bands were present, consistent with semiconservative replication.
Predicted density patterns for each replication model, starting from fully heavy DNA at Generation 0:
- Semiconservative: Generation 1 = a single hybrid band; Generation 2 = light and hybrid bands in equal amounts.
- Conservative: Generation 1 = separate heavy and light bands; no hybrid band at any generation.
- Dispersive: Generation 1 = a single intermediate band (indistinguishable from semiconservative); Generation 2 = a single band of lower intermediate density, with no distinct light band.
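The semiconservative prediction can be worked out by counting strands. Starting from one fully heavy duplex, the two original heavy strands are never destroyed, so after any number of generations in light medium exactly two duplexes are hybrid and the rest are light. The sketch below encodes that reasoning; the function name and the dictionary keys are illustrative choices.

```python
from fractions import Fraction


def semiconservative_bands(generation: int) -> dict[str, Fraction]:
    """Fraction of duplexes in each density band after `generation` rounds
    of replication in 14N medium, starting from fully 15N-labeled DNA."""
    if generation < 0:
        raise ValueError("generation must be non-negative")
    if generation == 0:
        return {"heavy": Fraction(1), "hybrid": Fraction(0), "light": Fraction(0)}
    total_duplexes = 2 ** generation
    # Each of the two original heavy strands sits in exactly one duplex.
    hybrid = Fraction(2, total_duplexes)
    return {"heavy": Fraction(0), "hybrid": hybrid, "light": 1 - hybrid}


for g in range(4):
    print(g, semiconservative_bands(g))
```

Generation 1 comes out entirely hybrid and Generation 2 half hybrid, half light, matching the observed bands.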

The DNA Replication Machinery: How It Works

DNA replication is carried out by a large protein complex often called the replisome.
- The replisome may be stationary; parental DNA is reeled in while daughter DNA is extruded.
- Key components include: helicase, SSBPs (single-strand DNA-binding proteins), topoisomerase, primase, DNA polymerase III, DNA polymerase I, sliding clamp, and DNA ligase.
Origins of replication and replication forks:
- Replication begins at origins where strands separate and form a replication fork.
- Replication proceeds in both directions from origins.
Enzymes and roles (simplified):
- Helicase: opens the double helix by breaking hydrogen bonds.
- SSBPs: stabilize single-stranded DNA to prevent reannealing.
- Topoisomerase: relieves supercoiling/twisting ahead of the fork.
- Primase: makes RNA primers to provide a 3′-OH for DNA synthesis.
- DNA polymerase III: extends new DNA strands; the main polymerase for synthesis.
- DNA polymerase I: removes RNA primers and replaces with DNA.
- DNA ligase: seals nicks between Okazaki fragments on the lagging strand.
- Sliding clamp: tethers DNA polymerase III to DNA, increasing processivity.
Leading vs lagging strand synthesis:
- Leading strand: synthesized continuously in the 5′→3′ direction toward the fork.
- Lagging strand: synthesized discontinuously as Okazaki fragments in the 5′→3′ direction away from the fork.
Synthesis specifics (as shown in the figures):
- Primase lays down RNA primers on both strands.
- DNA polymerase III extends primers to synthesize new DNA.
- RNA primers are replaced with DNA by DNA polymerase I.
- DNA ligase joins adjacent fragments to form a continuous strand.
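Discontinuous synthesis on the lagging strand can be sketched with strings. This is a deliberately simplified model under stated assumptions: RNA primers are ignored, every fragment has the same hypothetical length, and the function names are invented for illustration. The template is written 5′→3′, so the region nearest the origin is exposed first, and each later fragment ends up 5′ of the earlier ones in the finished strand.

```python
COMPLEMENT = str.maketrans("ATGC", "TACG")


def reverse_complement(strand_5to3: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return strand_5to3.upper().translate(COMPLEMENT)[::-1]


def okazaki_fragments(template_5to3: str, fragment_len: int) -> list[str]:
    """Synthesize fragments in the order the fork exposes the template.

    Each fragment is made 5'->3' against one chunk of the template.
    """
    if fragment_len <= 0:
        raise ValueError("fragment_len must be positive")
    chunks = [template_5to3[i:i + fragment_len]
              for i in range(0, len(template_5to3), fragment_len)]
    return [reverse_complement(chunk) for chunk in chunks]


def ligate(fragments: list[str]) -> str:
    """Join fragments into one strand (the role played by DNA ligase).

    Later fragments lie 5' of earlier ones, so the order is reversed.
    """
    return "".join(reversed(fragments))


template = "ATGCCGTAAGCT"
fragments = okazaki_fragments(template, 4)
print(fragments)
print(ligate(fragments) == reverse_complement(template))  # True
```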
The replisome may include a coordinated assembly where polymerases are organized to efficiently replicate both strands.
The concept of the replication fork moving in two directions from each origin explains rapid genome duplication.

Replicating the Ends of Linear Chromosomes: Telomeres and Telomerase

A key problem for linear eukaryotic chromosomes: the replication machinery cannot fully copy chromosome ends, so the newly made strand is left short at its 5′ end and the chromosome shortens progressively with each round of replication.
- This end-replication problem occurs because, once the final RNA primer on the lagging strand is removed, no upstream 3′-OH is available for DNA polymerase to fill the gap.
- Prokaryotic genomes (circular) do not face this issue.
Telomeres: repetitive DNA at chromosome ends that protects genes from erosion.
- Human telomeric repeat sequence: $\text{TTAGGG}$ (repeated many times).
Telomerase: a ribonucleoprotein enzyme that extends the 3′ end of the parental DNA using its own RNA template as a guide for adding repeats.
- Telomerase is especially active in germ cells and certain stem cells; somatic cells generally have low or no telomerase activity.
Telomere replication process (summary):
1) One parental strand end remains unreplicated after RNA primer removal.
2) Telomerase binds to the 3′ end and extends it using its own RNA template to add TTAGGG repeats.
3) Telomerase shifts and repeats extension multiple times, creating an extended single-stranded overhang.
4) Standard DNA synthesis on the extended template fills in the complementary strand, producing a protected double-stranded region and preventing chromosome shortening.
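Steps 2 to 4 can be illustrated with string operations. This is a toy model, not a description of the enzyme’s kinetics: the function names, the starting sequence, and the number of repeats are all assumptions made for the example, and the fill-in step is shown as complete even though a short 3′ overhang remains in the cell.

```python
TELOMERE_REPEAT = "TTAGGG"  # human telomeric repeat, written 5'->3'
COMPLEMENT = str.maketrans("ATGC", "TACG")


def reverse_complement(strand_5to3: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return strand_5to3.upper().translate(COMPLEMENT)[::-1]


def telomerase_extend(strand_5to3: str, repeats: int) -> str:
    """Add `repeats` copies of TTAGGG to the 3' end of the strand."""
    if repeats < 0:
        raise ValueError("repeats must be non-negative")
    return strand_5to3 + TELOMERE_REPEAT * repeats


def fill_in(extended_5to3: str) -> str:
    """Synthesize the complementary strand on the extended template."""
    return reverse_complement(extended_5to3)


extended = telomerase_extend("ACGT", 2)
print(extended)           # ACGTTTAGGGTTAGGG
print(fill_in(extended))  # CCCTAACCCTAAACGT
```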
Implications: telomerase activity influences cellular aging and has implications in cancer biology where telomerase can be reactivated to enable limitless replication.

DNA Repair and the Maintenance of Genetic Integrity

DNA polymerases have proofreading activity, correcting misincorporated nucleotides during replication.
If errors escape proofreading, cells employ various repair pathways:
- Mismatch repair: enzymes identify and fix mispaired bases after DNA replication.
- Nucleotide damage from environmental/physical agents (e.g., UV irradiation) or spontaneous changes can create lesions.
UV damage example: thymine dimers form when two adjacent thymines covalently bond, distorting the DNA helix.
Nucleotide excision repair (NER) overview:
1) Damage detection by a protein complex recognizing DNA irregularity.
2) Nicks are created on both sides of the damaged segment.
3) DNA helicase unwinds and removes the damaged segment.
4) DNA polymerase fills in the resulting gap using the undamaged strand as a template.
5) DNA ligase seals the final nick, restoring integrity.
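The five steps map onto a short string-based sketch. Everything specific here is an assumption for illustration: damaged bases are marked with `X`, the `flank` parameter stands in for how far the nicks sit from the lesion, and the function names are invented. The intact partner strand supplies the information needed to rebuild the excised segment.

```python
COMPLEMENT = str.maketrans("ATGC", "TACG")


def reverse_complement(strand_5to3: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return strand_5to3.upper().translate(COMPLEMENT)[::-1]


def nucleotide_excision_repair(damaged: str, partner_5to3: str,
                               flank: int = 2) -> str:
    """Repair a strand whose damaged bases are marked with 'X'."""
    if len(damaged) != len(partner_5to3):
        raise ValueError("strands must have equal length")
    # 1) Detect the lesion.
    lesion = [i for i, base in enumerate(damaged) if base == "X"]
    if not lesion:
        return damaged
    # 2) Nick on both sides of the damage.
    start = max(0, lesion[0] - flank)
    end = min(len(damaged), lesion[-1] + 1 + flank)
    # 3) Remove the damaged segment; 4) fill the gap from the intact strand.
    patch = reverse_complement(partner_5to3)[start:end]
    # 5) Ligate the patch to the flanking DNA.
    return damaged[:start] + patch + damaged[end:]


intact = "GATCTTAGCA"
partner = reverse_complement(intact)
damaged = "GATCXXAGCA"  # adjacent thymines replaced by a dimer marker
print(nucleotide_excision_repair(damaged, partner))  # GATCTTAGCA
```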
The net effect of these repair systems is to minimize mutations, though errors are not entirely eliminated, which is central to genetic variation and evolution.
Evolutionary significance of errors:
- Error rates are reduced but not zero after proofreading and repair.
- Mutations introduce variation that natural selection can act upon, driving adaptation and speciation over time.

From DNA to Protein: The Central Dogma and Gene Expression

In eukaryotes, DNA resides mainly in the nucleus and encodes information used to synthesize proteins.
Messenger RNA (mRNA) carries genetic information from DNA to ribosomes in the cytoplasm where translation occurs.
Translation occurs on ribosomes to produce proteins; this links DNA to phenotype and cellular function.
Gene manipulation and cloning in the lab allow in vitro replication and study of DNA sequences outside cells (e.g., PCR, cloning).

In Vitro DNA Amplification: Polymerase Chain Reaction (PCR)

PCR purpose: amplify a specific DNA segment to produce many copies for analysis.
Basic setup: a modern PCR machine cycles through three steps repeatedly to amplify the target sequence:
- Denaturation (heating, separates strands)
- Annealing (cooling, primers bind to target sequences)
- Extension (DNA polymerase extends from primers to synthesize new DNA)
Key components:
- DNA template containing the target sequence
- Two primers that flank the target region and provide starting points for synthesis (primers conceptually analogous to the RNA primers used by primase in vivo)
- DNA polymerase (often a heat-stable enzyme such as Taq polymerase)
- Deoxynucleotide triphosphates (dNTPs)
- Buffer and ions to optimize enzyme activity
How a single PCR cycle works (Cycle 1 as example):
- Denaturation: separate double-stranded DNA into two single strands.
- Annealing: primers bind (hydrogen bonding) to their complementary sequences on the template strands.
- Extension: Taq polymerase extends from the 3′ ends of primers, synthesizing new strands in the 5′→3′ direction.
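How two primers define the amplified region can be shown with a small sketch. It assumes an idealized template in which each primer site occurs exactly once; the function names, the template, and the primer length are hypothetical. The forward primer matches the start of the target on the top strand, and the reverse primer is the reverse complement of the end of the target, so both primers are written 5′→3′ and extend toward each other.

```python
COMPLEMENT = str.maketrans("ATGC", "TACG")


def reverse_complement(strand_5to3: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return strand_5to3.upper().translate(COMPLEMENT)[::-1]


def design_primers(template_5to3: str, start: int, end: int,
                   primer_len: int) -> tuple[str, str]:
    """Return (forward, reverse) primers flanking template[start:end]."""
    target = template_5to3[start:end]
    if len(target) < 2 * primer_len:
        raise ValueError("target too short for the requested primer length")
    forward = target[:primer_len]
    reverse = reverse_complement(target[-primer_len:])
    return forward, reverse


def amplicon(template_5to3: str, forward: str, reverse: str):
    """Return the top-strand sequence bounded by the two primers, or None."""
    i = template_5to3.find(forward)
    # The reverse primer pairs with the top strand where the top strand
    # reads as the primer's reverse complement.
    j = template_5to3.find(reverse_complement(reverse))
    if i == -1 or j == -1 or j < i:
        return None
    return template_5to3[i:j + len(reverse)]


template = "TTACGGATCCAGTCAAGCTTGCATGA"
fwd, rev = design_primers(template, 3, 22, 6)
print(fwd, rev)                      # CGGATC GCAAGC
print(amplicon(template, fwd, rev))  # CGGATCCAGTCAAGCTTGC
```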
Extent of amplification: each cycle roughly doubles the amount of target DNA, so after $n$ cycles the number of copies is approximately $N_n = N_0 \, 2^n$
- Where $N_0$ is the initial quantity of target DNA.
Cycle progression and yields (typical progression):
- Cycle 1 yields 2 molecules
- Cycle 2 yields 4 molecules
- Cycle 3 yields 8 molecules
- A standard procedure often runs about 30 cycles, potentially yielding over a billion copies of the target sequence:
  $2^{30} \approx 1.07 \times 10^9$
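The doubling arithmetic is easy to reproduce. In the sketch below, `pcr_copies` is a hypothetical helper; its `efficiency` parameter is an assumption not found in these notes, included because the notes say each cycle only "roughly" doubles the target. With `efficiency=1.0` the function reduces to $N_n = N_0 \, 2^n$.

```python
def pcr_copies(initial_copies: float, cycles: int,
               efficiency: float = 1.0) -> float:
    """Estimated copy number after `cycles` PCR cycles.

    efficiency = 1.0 means perfect doubling every cycle; lower values
    model cycles in which only a fraction of templates are copied.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


for n in (1, 2, 3, 30):
    print(n, pcr_copies(1, n))
```

At 30 cycles with perfect doubling this gives 1,073,741,824 copies from a single starting molecule; any efficiency below 1.0 yields fewer.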
Practical considerations:
- Primer design is critical to flank the desired gene/sequence and avoid off-target amplification.
- High temperatures used in denaturation can degrade enzymes unless a heat-stable polymerase is used (e.g., Taq polymerase).
Applications: cloning, sequencing, diagnostics, forensic analysis, research involving gene fragments.

The DNA Replication Complex and Replisome: A Coordinated Machine

The replication process involves a large assembly of proteins forming a DNA replication machine (replisome).
The idea that replication machinery may be stationary with DNA moving through is supported by some studies; the precise mechanism remains an active area of research.
Core components include the sliding clamp, primers, ligase, helicase, primase, SSBPs, topoisomerase, and DNA polymerases I and III. These components coordinate to ensure faithful and efficient replication of both leading and lagging strands.

Key Quantitative and Conceptual Points to Remember

DNA polymerases synthesize DNA in the 5′→3′ direction; synthesis on the lagging strand occurs discontinuously via Okazaki fragments.
Chargaff’s rules: $\%\text{A} = \%\text{T}, \ \%\text{G} = \%\text{C}$, with species-specific variation in base composition.
The experiments that shaped understanding:
- Griffith: transformation by a substance from heat-killed S to live R.
- Avery: DNA (not protein or RNA) is responsible for transformation.
- Hershey–Chase: DNA enters cells and directs viral production; confirms DNA as genetic material.
- Meselson–Stahl: semiconservative replication confirmed in E. coli.
- Franklin’s data: crucial structural information for the DNA helix.
Telomeres and telomerase address the end-replication problem; telomerase extends the 3′ end to prevent progressive chromosome shortening in germ cells and some stem cells.
DNA repair maintains genome integrity but is not perfect; mutations arise and drive evolution.
UV-induced thymine dimers exemplify the need for nucleotide excision repair to excise damaged DNA and replace it with correct nucleotides.

Quick Reference: Important Sequences and Concepts

Telomere repeat in humans: $\text{TTAGGG}$ (repeated at chromosome ends)
Base-pairing: $A \leftrightarrow T, \ G \leftrightarrow C$
Mismatch repair and nucleotide excision repair pathways safeguard genome fidelity; repair steps include detection, nicking, excision, resynthesis, and ligation.
The primary in vitro amplification technique (PCR) relies on denaturation, annealing, and extension, with cycle numbers exponentially increasing target DNA copies:
$N_n = N_0 \, 2^n$
The central dogma underpins the flow of genetic information: DNA -> RNA -> Protein; transcription produces mRNA, translation occurs at ribosomes in the cytoplasm.

Connections to Real-World Relevance

PCR is foundational to modern biology and medicine, enabling rapid DNA amplification for diagnostics, forensics, cloning, and genomic research.
Understanding telomeres and telomerase informs aging research and cancer biology, where telomerase activity is often upregulated to enable unchecked cell division.
DNA repair mechanisms explain how organisms cope with environmental mutagens (UV light, chemicals) and why exposure can lead to mutations with evolutionary consequences or disease.
The historical experiments (Griffith, Avery, Hershey–Chase, Meselson–Stahl) illustrate the scientific method: observation, hypothesis, experimentation, and validation across systems.
