Biology Part One: DNA Structure and Restriction Enzymes

DNA Structure and Why It Matters

  • DNA is a double helix with hydrogen-bonded base pairs on the inside and sugar-phosphate (phosphodiester) backbones on the outside. The “dots” in diagrams represent the hydrogen bonds between the two bases of each pair.

  • The two strands unzip temporarily during replication and transcription so the bases can be read or copied. The hydrogen bonds in the middle are therefore individually weak, while the covalent phosphodiester backbone is strong and keeps each strand intact while the two strands are separated.

  • Base pairs are formed by specific pairings: A\leftrightarrow T\quad\text{and}\quad G\leftrightarrow C. A simple memory aid is GCAT; A pairs with T and G pairs with C.

  • DNA is read in a 5′ to 3′ direction on each strand, which is important for how enzymes interact with the molecule during replication, transcription, and digestion by restriction enzymes.
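The pairing and directionality rules above can be sketched in a few lines of Python. This is only an illustration: the function names and the example strand are made up for these notes and do not come from the lecture.

```python
# Watson-Crick pairing rules from the notes: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base partner, in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand written in its own 5'->3' direction."""
    return complement(strand)[::-1]


top = "ATGCGT"  # hypothetical top strand, written 5'->3'
print("top     5'-" + top + "-3'")
print("bottom  3'-" + complement(top) + "-5'")          # as drawn under the top strand
print("bottom  5'-" + reverse_complement(top) + "-3'")  # as an enzyme reads it
```

The second and third printed lines describe the same bottom strand; only the direction in which it is written differs.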

Base Composition and Nucleotides

  • Bases can be categorized as purines or pyrimidines:

    • Purines have two rings: {A,\; G}.

    • Pyrimidines have one ring: {C,\; T,\; U} (U appears in RNA in place of T).

  • A simple way to distinguish pyrimidines from purines: the word “pyrimidine” contains a “Y” and “purine” does not. Cytosine and thymine also contain a “Y”; uracil is the exception to remember. Pyrimidines are the single-ring bases: T, C, U.

  • In RNA, thymine (T) is replaced by uracil (U). DNA uses thymine; RNA uses uracil.

  • The bases are constructed from carbon, hydrogen, nitrogen, and oxygen atoms (with hydrogen bonds forming between complementary bases).
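As a quick self-check, the purine/pyrimidine split and the T-to-U substitution can be written as a small lookup. The names below are hypothetical, and `to_rna_alphabet` only swaps letters; it does not model transcription.

```python
PURINES = {"A", "G"}            # two-ring bases
PYRIMIDINES = {"C", "T", "U"}   # one-ring bases


def base_class(base: str) -> str:
    """Classify a single base letter as 'purine' or 'pyrimidine'."""
    base = base.upper()
    if base in PURINES:
        return "purine"
    if base in PYRIMIDINES:
        return "pyrimidine"
    raise ValueError(f"not a standard base: {base!r}")


def to_rna_alphabet(dna: str) -> str:
    """Rewrite a DNA sequence with U in place of T (alphabet swap only)."""
    return dna.upper().replace("T", "U")


for letter in "AGCTU":
    print(letter, base_class(letter))
print(to_rna_alphabet("GATTACA"))  # GAUUACA
```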

Genome Basics and Variation

  • The human genome contains about 3\times 10^{9} base pairs per haploid set (≈ 3 billion bp), or about 6\times 10^{9} bp in a diploid cell.

  • Sequence identity between unrelated humans is very high: over 99% (a fraction greater than 0.99).

  • Small genetic differences between individuals often come from single nucleotide polymorphisms (SNPs). A SNP is a single-base change at one position in the genome (e.g., an A in one person where another person has a C).

    • SNPs can alter how a gene functions or is regulated and can lead to phenotypic differences among people.

    • A SNP can change a restriction enzyme recognition site, thereby altering how enzymes cut a given DNA sequence.

  • Conceptual example: if one version of a sequence is CTAAGTA and another version carries a single-nucleotide substitution (e.g., C replaced by G, or A replaced by T), an enzyme recognition site can be gained or lost, leading to different cutting patterns.
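To make the gain or loss of a site concrete, here is a minimal sketch. The sequences, the SNP positions, and the helper names are all hypothetical; GAATTC is the EcoRI site used later in these notes. Because that site is palindromic, searching the top strand alone is enough.

```python
ECORI_SITE = "GAATTC"


def find_sites(seq: str, site: str = ECORI_SITE) -> list[int]:
    """Return the 0-based start of every occurrence of `site` in `seq`."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def apply_snp(seq: str, position: int, new_base: str) -> str:
    """Return a copy of `seq` with one base replaced at `position`."""
    return seq[:position] + new_base + seq[position + 1:]


reference = "TTCGGAATTCAGGT"            # hypothetical allele with one site
variant = apply_snp(reference, 6, "C")  # A -> C inside the site
print(find_sites(reference))            # [4]
print(find_sites(variant))              # []  (site lost)

near_site = "CCGAGTTCAA"                # hypothetical allele, one base off
gained = apply_snp(near_site, 4, "A")   # G -> A creates GAATTC
print(find_sites(near_site), find_sites(gained))  # [] [2]
```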

Restriction Enzymes and Palindromic Recognition

  • Restriction enzymes cut the phosphodiester bonds of the DNA backbone at specific recognition sites.

  • Two main types of cuts:

    • Blunt ends: cuts straight across the DNA, yielding two fragments with no overhangs.

    • Sticky ends: cuts produce overhangs (unpaired nucleotides) at the ends, making it easier for fragments to rebind if complementary overhangs align.

  • The example enzyme used in the experiment is EcoRI, which comes from E. coli (the transcript renders the spoken name as “E. coli R1”). It makes sticky-end cuts and recognizes a specific palindromic sequence.

  • Palindromic recognition sites: the sequence read 5′ to 3′ on the top strand is identical to the sequence read 5′ to 3′ on the bottom strand. Equivalently, the site is its own reverse complement. This palindromic property is what restriction enzymes such as EcoRI look for.

    • Example discussed: GAATTC is palindromic because the top strand reads 5′-GAATTC-3′ and the bottom strand, written beneath it, reads 3′-CTTAAG-5′; read 5′ to 3′, the bottom strand is also GAATTC. The enzyme cuts between the G and the A on each strand, leaving single-stranded 5′-AATT overhangs (sticky ends).

  • A recognition site is cut only if it is intact: a SNP within the site disrupts the palindrome and prevents cutting at that position.

  • Specific enzyme behavior mentioned in the transcript:

    • EcoRI (“E. coli R1” in the transcript) cuts at its recognition site GAATTC and leaves sticky ends. The site reads the same in the 5′→3′ direction on both strands.

    • BamHI is also noted as an enzyme that produces sticky ends (the transcript mentions “BAM also does sticky end cuts”); its standard recognition site is GGATCC.

  • Some sequences contain the same recognition site more than once, and the enzyme cuts at every intact copy (as in the example with GAATTC appearing twice in a fragment, which gives two cuts and, for a linear piece of DNA, three fragments).
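The cutting behaviour described above can be sketched as a toy digest. It assumes linear DNA, models only the top strand, and uses a made-up input sequence; `digest` and `cut_offset` are illustrative names, not part of any library.

```python
SITE = "GAATTC"   # EcoRI recognition site
CUT_OFFSET = 1    # EcoRI cuts between the G and the first A: G^AATTC


def digest(seq: str, site: str = SITE, cut_offset: int = CUT_OFFSET) -> list[str]:
    """Cut a linear top strand at every copy of `site`; return the pieces."""
    seq = seq.upper()
    fragments = []
    previous_cut = 0
    start = seq.find(site)
    while start != -1:
        cut = start + cut_offset
        fragments.append(seq[previous_cut:cut])
        previous_cut = cut
        start = seq.find(site, start + 1)
    fragments.append(seq[previous_cut:])  # whatever remains after the last cut
    return fragments


sample = "AAGAATTCCCGGGAATTCTT"  # hypothetical fragment with two EcoRI sites
for piece in digest(sample):
    print(len(piece), piece)
```

Two sites give two cuts and three pieces. Every piece downstream of a cut starts with AATT on the top strand, which is the single-stranded overhang that makes the end sticky.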

Restriction Fragment Length Polymorphisms (RFLP)

  • RFLP stands for Restriction Fragment Length Polymorphism. The lecture used “Restriction Fragment Linked Polymorphisms”, but “Length” is the standard term. The concept is the same either way: differences in restriction fragment lengths arise from SNPs that affect restriction sites.

  • How RFLP works conceptually:

    • Digest DNA with restriction enzymes.

    • Compare the resulting fragment lengths via gel electrophoresis. Differences indicate polymorphisms (SNPs) that alter restriction sites.

  • Visualization and interpretation:

    • The DNA fragments are separated by size on a gel; shorter fragments migrate farther toward the bottom of the gel.

    • Fragment size correlates with the number of nucleotides in the fragment; larger fragments stay higher on the gel, while smaller ones run farther down.

    • The width or intensity of a band reflects how much DNA it contains; an unusually bright or thick band can mean several fragments of similar size are running together.

  • Example of fragment patterns in a hypothetical comparison (as described in the transcript):

    • Genome 1: yields three fragments after digestion.

    • Genome 2: only one site is cut, so digestion yields fewer fragments (two, if the DNA is linear), because only a single recognition site is present or retained.

    • Genome 4: two SNPs disrupt the restriction sites at both locations, so no cuts occur there and the DNA stays as one uncut fragment.

  • How a single nucleotide change affects cutting:

    • If a SNP occurs within a recognition site, the site may no longer be recognized, leading to a failure to cut at that site.

    • This change shifts the fragment lengths observed on a gel, enabling differentiation between individuals or samples.

  • Important conceptual note: Even a single nucleotide change can have large effects on the restriction pattern, which underpins the use of restriction enzymes in genetic fingerprinting and other analyses.
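The comparison above can be imitated with a short sketch that lists fragment lengths the way a gel would show them, longest first (nearest the wells). Everything here is hypothetical: the three alleles are built from filler bases around zero, one, or two EcoRI sites, and the DNA is assumed to be linear and fully digested.

```python
def fragment_lengths(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> list[int]:
    """Fragment lengths from a complete digest of linear DNA, longest first."""
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    edges = [0] + cuts + [len(seq)]
    lengths = [right - left for left, right in zip(edges, edges[1:])]
    return sorted(lengths, reverse=True)


INTACT = "GAATTC"
BROKEN = "GAGTTC"  # same site with one SNP (A -> G), no longer recognized

# Hypothetical alleles: poly-C filler stands in for the DNA between sites.
alleles = {
    "both sites intact": "C" * 20 + INTACT + "C" * 50 + INTACT + "C" * 30,
    "one site lost":     "C" * 20 + INTACT + "C" * 50 + BROKEN + "C" * 30,
    "both sites lost":   "C" * 20 + BROKEN + "C" * 50 + BROKEN + "C" * 30,
}

for label, seq in alleles.items():
    print(f"{label:18s} {fragment_lengths(seq)}")
# both sites intact  [56, 35, 21]
# one site lost      [91, 21]
# both sites lost    [112]
```

Each lost site merges two neighbouring fragments into one longer band, which is the difference an RFLP gel makes visible.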

Palindromes, Reading Direction, and Practical Exam Points

  • When evaluating restriction sites, apply the palindrome concept carefully: the site must read the same in the 5′→3′ direction on both the top and bottom strands.

  • In practical questions, you’ll be asked which end to start reading from when determining cuts and whether a site is present on both strands in the correct orientation to yield a cut.

  • Typical exam-style questions you might encounter based on this material:

    • Identify whether a given restriction enzyme will cut a sequence that contains a specific SNP.

    • Determine how many fragments would result from digestion of a sample with a restriction enzyme, given the presence or absence of recognition sites due to SNPs.

    • Explain why sticky-end cuts versus blunt-end cuts matter for fragment ligation and RFLP patterns.
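For practice questions of this kind, one way to check a candidate site is to compare it with its own reverse complement. The sketch below is a study aid under that definition; the helper names are made up, and GACTTC is a hypothetical SNP-altered version of the EcoRI site.

```python
# Translation table for complementing DNA letters (A<->T, C<->G).
COMPLEMENT_TABLE = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read in its own 5'->3' direction."""
    return seq.upper().translate(COMPLEMENT_TABLE)[::-1]


def is_palindromic_site(seq: str) -> bool:
    """True if the site reads the same 5'->3' on both strands."""
    return seq.upper() == reverse_complement(seq)


for candidate in ["GAATTC", "GGATCC", "GACTTC"]:
    print(candidate, reverse_complement(candidate), is_palindromic_site(candidate))
# GAATTC GAATTC True
# GGATCC GGATCC True
# GACTTC GAAGTC False
```

The last line shows why a single base change inside a site matters: the two strands no longer read the same, so the enzyme has nothing to recognize.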

Connections to Foundational Concepts and Real-World Relevance

  • The double-helix structure, base pairing, and the need to unzip DNA during replication/transcription relate to the central dogma of biology (DNA -> RNA -> Protein).

  • Purines vs pyrimidines underpin base composition and DNA/RNA structure and stability (e.g., the number of rings affects geometry and pairing dynamics).

  • SNPs explain why genetically similar individuals can look different and have different disease risks or traits; they also drive personal variations in DNA-based tests and forensic analyses.

  • Restriction enzymes are essential tools in molecular biology for cloning, genetic mapping, and diagnostic tests; understanding their recognition sites and cutting patterns is foundational for lab work.

  • RFLP and gel electrophoresis are classic techniques for comparing DNA samples, mapping genomes, and identifying polymorphisms. They came before many modern genotyping approaches, such as PCR-based assays and sequencing.

Key Terms (glossary distilled from the notes)

  • DNA double helix, hydrogen bonds, phosphodiester backbone

  • Base pairing: A\leftrightarrow T, G\leftrightarrow C

  • Purines: {A,\; G}; Pyrimidines: {C,\; T,\; U}

  • One-ring vs two-ring bases; mnemonic: pyrimidines contain a “Y”; purines do not

  • Genome: ~3\times 10^{9} bp per haploid set (~6\times 10^{9} bp diploid); >99% identity among humans

  • SNP: single nucleotide polymorphism; single-base change

  • Restriction enzymes: cut DNA at recognition sites; can produce blunt or sticky ends

  • Palindromic recognition sites: read 5′→3′ on both strands; example GAATTC (EcoRI) as a palindromic site

  • Sticky ends vs blunt ends

  • EcoRI (E. coli restriction enzyme) and BamHI (both producing sticky ends in the discussed context)

  • RFLP: Restriction Fragment Length Polymorphism (“Linked” in the lecture); use of restriction patterns to distinguish DNA samples

  • Gel electrophoresis: band migration correlates with fragment size; longer fragments stay higher, shorter fragments travel farther

Notation and LaTeX references used in these notes

  • Base pairing: A\leftrightarrow T\quad\text{and}\quad G\leftrightarrow C

  • Genome size and identity: 3 \times 10^{9}\ \text{bp} per haploid set (6 \times 10^{9}\ \text{bp} diploid), >0.99 identity

  • Palindromic recognition concept: a site like GAATTC reads the same 5′→3′ on the top strand as the reverse complement on the bottom strand

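  • Directionality: the two strands are antiparallel, so when the top strand is written 5′→3′ from left to right, the bottom strand beneath it runs 3′→5′; for enzyme recognition, each strand is still read in its own 5′→3′ direction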

  • Note on terminology: RFLP stands for Restriction Fragment Length Polymorphism (though the lecture text used “Restriction Fragment Linked Polymorphisms”)
