Molecular Biology
MODULE 2 - MOLECULAR BIOLOGY
DNA and the genetic code
DNA replication
Transcription
RNA processing
Translation
Regulation of gene expression
Practical applications of molecular biology
DNA AND THE GENETIC CODE; DNA REPLICATION
Reading
Campbell: The molecular basis of inheritance
Lecture Topics
Early evidence that DNA is the carrier of genetic information
Brief review of DNA structure
Biochemistry of DNA replication
Early evidence that genes encode proteins
THE GENETIC BLUEPRINT
Contains the "instructions" or "parts list" for creating a cell
3 fundamental properties:
stores information
can be accurately copied & transmitted to progeny
has the capacity to change (mutate)
Chemical nature of the genetic blueprint?
EARLY EVIDENCE THAT DNA CARRIES GENETIC INFORMATION
Classic experiment of Frederick Griffith (1928)
Studied pneumonia caused by a bacterium: Streptococcus pneumoniae
Two classes of bacteria:
smooth (S) = virulent strain (kills mice)
rough (R) = avirulent (doesn’t kill mice)
both traits inherited
S strain makes protective polysaccharide capsule
R strain lacks capsule - killed by mouse immune system
TRANSFORMATION
Genetic information for virulence transferred from dead S cells to live R cells
Frederick Griffith found that heat-treatment didn't destroy the "transforming" activity.
Other scientists (Avery, MacLeod and McCarty, 1944) showed that the transforming activity was bacterial DNA, not proteins or other molecules
EXPERIMENT BY ALFRED HERSHEY AND MARTHA CHASE (1952)
Studied bacteriophage T2, a virus that infects E. coli
Virus injects genetic blueprint into cell and uses cell’s biosynthetic machinery to make more viruses
Question: Which molecule carries the genetic blueprint for making a phage: DNA or protein?
"WARING BLENDER" EXPERIMENT OF HERSHEY AND CHASE
Batch 1: Radioactive sulfur (35S) in phage protein
Labeled phages infect cells.
Agitation frees outside phage parts from cells.
Centrifuged cells form a pellet.
Radioactivity (phage protein) found in liquid.
Batch 2: Radioactive phosphorus (32P) in phage DNA
Labeled phages infect cells.
Agitation frees outside phage parts from cells.
Centrifuged cells form a pellet.
Radioactivity (phage DNA) found in pellet.
Conclusion: DNA - not protein - carries genetic information
DNA AS A CARRIER OF GENETIC INFORMATION
DNA is a polymer of only 4 building blocks (A, C, G and T)
How does this simple molecule carry genetic information?
DNA STRUCTURE DETERMINATION
James Watson, Francis Crick, Maurice Wilkins and Rosalind Franklin (first three shared Nobel prize in 1962)
Next big advance (1953): DNA structure determined using X-ray crystallography
Two anti-parallel strands of DNA form a double helix.
Hydrophilic phosphate groups and deoxyribose are exposed.
N-bases face inward and interact via H-bonds (A with T; G with C)
DNA STRANDS
Each DNA strand has 5’ and 3’ ends
FUNCTION OF DNA AS THE GENETIC BLUEPRINT
How does it store information? Sequence of bases.
How does it mutate? Change sequence of bases.
How is it accurately copied (replicated) prior to cell division? Two DNA strands are complementary. The sequence of each strand contains the information needed to make a perfect copy of the other strand.
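The complementarity argument above can be sketched in a few lines of Python. This is only an illustration: the function name and the example sequence are made up, and the pairing rules are the ones stated in these notes (A with T, G with C).

```python
# Base-pairing rules from the notes: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complementary_strand(strand_5_to_3: str) -> str:
    """Return the complementary strand, also written 5' to 3'."""
    # Pair every base, reading backwards because the strands are anti-parallel.
    return "".join(PAIRS[base] for base in reversed(strand_5_to_3))

strand = "ATGGCTTAC"                  # hypothetical example sequence
partner = complementary_strand(strand)
print(partner)

# Copying the copy regenerates the original: each strand holds the
# information needed to rebuild the other.
print(complementary_strand(partner) == strand)
```

Because the mapping is one-to-one, either strand is enough to reconstruct the full double helix.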
DNA REPLICATION
DNA replication is semi-conservative
BIOCHEMISTRY OF DNA REPLICATION
Very accurate and rapid (50-500 bases per second)
DNA synthesized by DNA polymerase
6 billion base pairs in human DNA, copied in a few hours
<1 mistake per billion bases!
Must ensure that DNA is copied once (and only once) per cell division
ORIGIN OF REPLICATION
Process begins at origin of replication
DNA REPLICATION DETAILS
Complementary DNA strands pulled apart by DNA helicase
DNA replication is bi-directional: two replication forks per bubble
Prokaryotic chromosomes have one origin
Eukaryotic chromosomes have many origins
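A rough calculation suggests why eukaryotic chromosomes need many origins. The genome size and polymerase speeds below are the figures given in these notes. The 8-hour copying window is an assumption standing in for "a few hours", and treating the genome as one continuous molecule is a simplification.

```python
import math

GENOME_BP = 6_000_000_000   # from the notes: 6 billion base pairs
RATES = (50, 500)           # from the notes: bases per second
COPY_HOURS = 8              # assumption: "a few hours" taken as 8 hours

def single_fork_days(genome_bp: int, rate: int) -> float:
    """Days needed if one replication fork copied the whole genome."""
    return genome_bp / rate / 86_400

def forks_needed(genome_bp: int, rate: int, hours: int) -> int:
    """Minimum number of forks working in parallel to finish in time."""
    return math.ceil(genome_bp / (rate * hours * 3600))

for rate in RATES:
    forks = forks_needed(GENOME_BP, rate, COPY_HOURS)
    # Each replication bubble has two forks.
    bubbles = math.ceil(forks / 2)
    print(rate, round(single_fork_days(GENOME_BP, rate)), forks, bubbles)
```

Even at the fastest rate, a single fork would need months, so the genome can only be copied in hours if hundreds to thousands of forks run at once.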
PROTEINS INVOLVED IN DNA REPLICATION
Complementary DNA strands pulled apart by DNA helicase
Single-stranded DNA-binding proteins keep helix from reforming
Topoisomerases keep DNA from getting tangled up as it is unwound
DNA POLYMERASE
DNA polymerase only adds nucleotides to the 3'OH of the growing strand
Cannot initiate DNA synthesis without a "primer": nucleic acid containing a 3'OH to which next nucleotide can be added
TEMPLATE, PRIMER AND COMPLEMENTARY STRAND
Newly synthesized DNA is complementary to the template strand
Must understand the terms "primer", "template strand" and "complementary strand" and how they are related to the 5' and 3' ends of nucleic acid polymers!
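One way to keep the three terms straight is to model primer extension directly. The sketch below is a toy model: `extend_primer` and the sequences are hypothetical, and a DNA primer is used for simplicity even though cells start replication with an RNA primer.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def extend_primer(template_3_to_5: str, primer_5_to_3: str) -> str:
    """Extend a primer along a template and return the new strand, 5' to 3'.

    The template is written 3' to 5' so that it lines up, base for base,
    with the complementary strand written 5' to 3'.
    """
    n = len(primer_5_to_3)
    # The primer must base-pair with the 3' end of the template.
    expected = "".join(PAIRS[base] for base in template_3_to_5[:n])
    if primer_5_to_3 != expected:
        raise ValueError("primer is not complementary to the template's 3' end")
    new_strand = list(primer_5_to_3)
    for base in template_3_to_5[n:]:
        # Every nucleotide is added to the free 3'OH of the growing strand,
        # so the new strand grows only in the 5' to 3' direction.
        new_strand.append(PAIRS[base])
    return "".join(new_strand)

template = "TACGGATC"   # hypothetical template, written 3' to 5'
primer = "ATG"          # hypothetical primer, written 5' to 3'
print(extend_primer(template, primer))
```

The returned strand is the complementary strand: it begins with the primer at its 5' end and ends with a free 3'OH opposite the 5' end of the template.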
DNA POLYMERASES
DNA polymerases need a template and a primer to synthesize DNA
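High-energy phosphoanhydride bond of the incoming nucleoside triphosphate is cleaved (releasing pyrophosphate) as the nucleotide is added to the 3'OH at the end of the growing chain, forming a new phosphodiester bond. This cleavage provides the energy that drives the reaction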
PRIMER NEED
If DNA polymerase needs a primer (3'OH), how does DNA synthesis begin at the origin?
"Primase" makes a short RNA primer to start DNA synthesis.
LEADING STRAND SYNTHESIS
Allows the synthesis of the "leading strand" of DNA
LAGGING STRAND SYNTHESIS
How is the other ("lagging") strand of DNA synthesized?
Primase makes one RNA primer for the leading strand
Must make additional primers for lagging strand as helix opens
Priming occurs once for leading strand, many times for lagging strand
Lagging strand is made in short pieces of DNA = "Okazaki fragments" (hundreds of nucleotides long)
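The difference in priming between the two strands can be made concrete with a small estimate. The stretch of template (100,000 nucleotides) and the fragment length (200 nucleotides) are illustrative assumptions, chosen only to match "hundreds of nucleotides".

```python
import math

def priming_events(template_length: int, fragment_length: int) -> tuple[int, int]:
    """Return (leading, lagging) primer counts for one replication fork."""
    leading = 1  # one primer, then continuous synthesis
    # One primer per Okazaki fragment on the lagging strand.
    lagging = math.ceil(template_length / fragment_length)
    return leading, lagging

# Assumed values, not taken from the notes.
print(priming_events(100_000, 200))
```

Under these assumptions the lagging strand needs hundreds of primers where the leading strand needs one, which is why primer removal and fragment joining matter so much for the lagging strand.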
LEADING AND LAGGING STRANDS
Overview showing the leading and lagging strands during DNA replication
COMPLETING LAGGING STRAND SYNTHESIS
Primers must be removed and Okazaki fragments connected to complete the synthesis of the lagging strand
Additional enzymes that carry out these steps: DNA polymerase I and DNA ligase
ENZYMES INVOLVED IN DNA REPLICATION
Primase: synthesizes RNA primer, primes once for leading strand; many times for lagging strand
DNA polymerase III: adds nucleotides to 3’OH of RNA primer
DNA polymerase I replaces the RNA primers with DNA
DNA ligase joins the Okazaki fragments to form a continuous DNA strand.
REPLACING PRIMERS
Primers must be replaced with DNA and Okazaki fragments connected to complete the synthesis of the lagging strand
PROTEINS WORKING TOGETHER
Illustrated representation of the Primase, DNA Pol III, DNA Pol I, and DNA ligase working together during synthesis of leading and lagging strands during DNA replication.
ACCURACY OF DNA REPLICATION
6 billion base pairs in human DNA, copied in a few hours
Errors cause mutations!
Replication is very accurate: <1 mistake per billion bases!
DNA polymerase “proofreads” its work and corrects errors as soon as they are made
Multiple mechanisms exist to repair environmental damage to DNA
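The two figures on this slide combine into an upper bound on replication errors per genome copy. The short calculation below uses only the numbers from the notes; the variable names are illustrative.

```python
GENOME_BP = 6_000_000_000          # from the notes: 6 billion base pairs
BASES_PER_MISTAKE = 1_000_000_000  # from the notes: <1 mistake per billion bases

# Upper bound on replication mistakes each time the genome is copied.
max_mistakes_per_copy = GENOME_BP / BASES_PER_MISTAKE
print(max_mistakes_per_copy)
```

So each round of replication introduces at most a handful of mistakes across the whole genome.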
KEY CONCEPTS ABOUT THE BIOCHEMISTRY OF DNA REPLICATION
Understand replication origins, bubbles and forks
Be able to draw nucleotides, the DNA polymer and a DNA double helix
Understand how nucleotides are added to the growing polymer
Understand the similarities and differences between the synthesis of the leading and lagging strands
Understand the functions of the following proteins: DNA helicase, topoisomerase, ssDNA-binding proteins, primase, DNA Pol III, DNA Pol I, DNA ligase
Be able to draw out the replication process at the level of detail presented in lecture
MOLECULAR BIOLOGY AND GENETICS
A gene is a unit of heredity that is passed from parent to offspring
Each gene specifies one or more traits of an organism
Each gene corresponds to a specific region of DNA
EARLY EVIDENCE SUGGESTING THAT GENES ENCODE PROTEINS
1909: Archibald Garrod published Inborn Errors of Metabolism
some inherited disorders caused by defects in an enzyme
alkaptonuria: alkapton accumulates due to lack of an enzyme, causing patients to have black urine
albinism: lack of the enzyme that makes melanin, a skin pigment
proposed that defect in a "gene" is associated with defect in an enzyme
1940's: George Beadle and Edward Tatum studied three different Neurospora strains that are unable to grow without arginine
Each strain lacked one of three enzymes required for arginine biosynthesis
Led them to propose the "one gene – one enzyme" hypothesis
GENE TO PROTEIN
How does a gene (DNA) direct the synthesis of a protein?
Genetic information of DNA stored in sequence of bases.
Sequence of bases in DNA specifies primary sequence (amino acid sequence) of protein
Cell must convert base sequence --> amino acid sequence
Conversion is not direct! Occurs through RNA intermediate = mRNA (messenger RNA)
GENETIC LANGUAGE
DNA and RNA share the same "genetic language" = sequence of bases
Sequence of nucleotides in a gene (DNA) is "transcribed" into sequence of nucleotides in an RNA molecule
Sequence of nucleotides in an RNA is "translated" into sequence of amino acids in a polypeptide
CENTRAL DOGMA
DNA --(transcription, by RNA polymerase)--> RNA --(translation, by ribosomes)--> protein
Exceptions to central dogma are rare
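The first arrow of the central dogma can be sketched as a simple base-by-base mapping. The assumption here is that the RNA is built complementary to one DNA strand (the template), with U taking the place of T. The function name and the example gene fragment are hypothetical.

```python
# DNA template base -> RNA base (U replaces T in RNA).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (5' to 3') for a DNA template written 3' to 5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)

template = "TACAGATTGCTTATT"   # hypothetical template strand, 3' to 5'
print(transcribe(template))
```

The RNA carries the same information as the gene in the same "language" of bases; only translation changes the language to amino acids.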
NUCLEOTIDE SEQUENCE
4 nucleotides (A, C, G, T or U) must specify 20 amino acids
16 possible doublets (AT, TG, etc.)
64 possible triplets (ATG, CCC, etc.)
Need at least 3 bases to encode all amino acids
Three bases that specify an amino acid = a CODON
The codons for each amino acid determined in the 1960's
3000 bases of DNA encode a protein with 1000 amino acids.
1 Mb (megabase) = 1000 kb (kilobase) = 1,000,000 bases.
How does a nucleotide sequence (DNA or RNA) specify the amino acid sequence of a protein?
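The counting argument for why a codon needs three bases can be checked by enumerating every possible "word" of a given length. This is a minimal sketch; `count_words` is an illustrative helper.

```python
from itertools import product

BASES = "ACGU"
AMINO_ACID_COUNT = 20

def count_words(length: int) -> int:
    """Number of distinct base sequences of the given length."""
    return len(list(product(BASES, repeat=length)))

for length in (1, 2, 3):
    n = count_words(length)
    # Is this word length enough to give every amino acid its own word?
    print(length, n, n >= AMINO_ACID_COUNT)
```

Singlets and doublets give too few combinations, while triplets give more than enough, which leaves room for several codons per amino acid.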
GENETIC CODE
The genetic code is redundant: different codons can specify the same amino acid.
e.g. both GUU and GUC encode valine
The genetic code is not ambiguous: no codon specifies more than one amino acid.
can predict exact protein sequence from nucleotide sequence
The genetic code is universal: same in bacteria and humans.
very important for biotechnology industry.
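Redundancy and non-ambiguity can both be seen by storing the standard genetic code as a lookup table. The compact construction below lists the one-letter amino acid codes in U, C, A, G order, with `*` marking stop codons. The layout and helper name are illustrative choices, not part of the lecture.

```python
BASES = "UCAG"
# One-letter amino acid codes for all 64 codons, ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# A dictionary maps each codon to exactly one entry: the code is not ambiguous.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def codons_for(amino_acid: str) -> list[str]:
    """All codons that specify the given amino acid (redundancy)."""
    return sorted(codon for codon, aa in CODON_TABLE.items() if aa == amino_acid)

print(codons_for("V"))   # valine
print(codons_for("M"))   # methionine
print(codons_for("*"))   # stop codons
```

Looking up a codon always gives one answer, while looking up an amino acid can give several codons, which is exactly the asymmetry described above.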
READING FRAMES
In theory, any base sequence can be read in three different "frames"
Special codons must determine where "translation" begins and ends!
AUG = start (initiation) codon
UAA, UAG, UGA = stop (termination) codons; don't encode amino acids.
The following RNA would be translated as shown:
5' UUUUAUGUCUAACGAAUAAAAUAA 3'
M S N E (AUG UCU AAC GAA, followed by the stop codon UAA)
A stretch of bases uninterrupted by termination codons is called an "open reading frame" or ORF.
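The reading-frame rules can be tried out on the example RNA from this slide. The sketch below assumes the standard genetic code and a simplified rule that translation starts at the first AUG in the sequence. `read_frame` and `translate` are illustrative names.

```python
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]   # "*" marks a stop codon
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def read_frame(rna: str, offset: int) -> list[str]:
    """Split the RNA into complete codons starting at offset 0, 1 or 2."""
    return [rna[i:i + 3] for i in range(offset, len(rna) - 2, 3)]

def translate(rna: str) -> str:
    """Translate from the first AUG up to the first in-frame stop codon."""
    start = rna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for codon in read_frame(rna, start):
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

rna = "UUUUAUGUCUAACGAAUAAAAUAA"
for offset in range(3):
    print(offset, read_frame(rna, offset))
print(translate(rna))
```

Only one of the three frames contains the AUG as a codon. In that frame the third codon is AAC, which the standard table assigns to asparagine (N).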
CODON TABLE
A codon table will be provided to use on exam
MESSENGER RNA (mRNA)
Anatomy of a typical messenger RNA (mRNA)
5'UTR (untranslated region)
AUG (start codon)
ORF (open reading frame)
STOP (stop codon)
3'UTR (untranslated region)
Translation of the ORF produces the protein, from its N-terminus to its C-terminus
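The anatomy listed above can be recovered from a sequence by locating the start codon and the first in-frame stop codon. This is a simplified sketch: it assumes translation begins at the first AUG, and here the "ORF" entry includes the start codon. The function name and the example sequence are illustrative.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def mrna_parts(mrna: str) -> dict[str, str]:
    """Split an mRNA (5' to 3') into 5'UTR, ORF, stop codon and 3'UTR."""
    start = mrna.find("AUG")
    if start == -1:
        raise ValueError("no start codon found")
    # Step through codons in the frame set by the start codon.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return {
                "5'UTR": mrna[:start],
                "ORF": mrna[start:i],      # start codon through last sense codon
                "stop": mrna[i:i + 3],
                "3'UTR": mrna[i + 3:],
            }
    raise ValueError("no in-frame stop codon found")

print(mrna_parts("UUUUAUGUCUAACGAAUAAAAUAA"))
```

The UAA inside the 3'UTR of this example is never read as a codon, because the ribosome has already terminated at the first in-frame stop.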