Molecular Biology (BIO 99)
Lecture 1:
Operon: a cluster of genes involved in a single pathway, transcribed from one promoter so that their expression is tightly co-regulated
Gene cluster and promoter at minimum
Promoter sequences
-35 box and -10 box and purine start site are most common in E. coli
-35 box - TG box
Major binding site for sigma 70 (E. coli sigma factor)
Consensus = TTGACA
-10 box - Pribnow box
AT rich → easier to unwind because A-T pairs have two hydrogen bonds instead of the three in G-C pairs
Consensus → TATAAT
Major binding site
Purine start site (A or G)
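A minimal Python sketch of how the two consensus boxes above could be located in a sequence. The function name `find_sigma70_promoters`, the 16-18 bp spacer window, and the mismatch cutoff are illustrative assumptions, not values from the lecture.

```python
# Consensus hexamers from the notes (coding strand, 5'->3').
BOX_35 = "TTGACA"
BOX_10 = "TATAAT"

def mismatches(a, b):
    """Count positions where two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def find_sigma70_promoters(seq, max_mismatch=2, spacers=(16, 17, 18)):
    """Return (pos_35, pos_10, total_mismatches) for candidate promoters.

    Spacer lengths and the mismatch cutoff are assumptions for illustration.
    Positions are 0-based indices of the first base of each box.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 6 + 1):
        m35 = mismatches(seq[i:i + 6], BOX_35)
        if m35 > max_mismatch:
            continue
        for gap in spacers:
            j = i + 6 + gap
            if j + 6 > len(seq):
                continue
            m10 = mismatches(seq[j:j + 6], BOX_10)
            if m10 <= max_mismatch:
                hits.append((i, j, m35 + m10))
    return hits

# Made-up test sequence: perfect boxes separated by a 17 bp spacer.
demo = "GG" + BOX_35 + "A" * 17 + BOX_10 + "GCGCGCA"
print(find_sigma70_promoters(demo))  # [(2, 25, 0)]
```

Real promoters rarely match the consensus perfectly, which is why the sketch tolerates a few mismatches per box.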
DNA footprinting
The empty region (gap in the band ladder) on the gel marks where the protein was bound to the promoter
How does it happen? → a specific binding protein recognizes the promoter site → protein binds to the promoter region → DNase (the nuclease used in footprinting) cannot cut where the protein is bound, so no fragments ending in this region are produced → electrophoresis lets us visualize the region that DNase could not cut
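The footprinting logic can be sketched in a few lines of Python. This is a toy model under stated assumptions: the DNA is end-labeled, DNase is assumed to cut at every unprotected position, and the names `dnase_ladder` and `footprint` are hypothetical.

```python
def dnase_ladder(length, protected):
    """Fragment lengths seen on the gel for end-labeled DNA.

    protected: (start, end) 1-based inclusive positions covered by protein.
    A cut after position p yields a labeled fragment of length p.
    """
    start, end = protected
    return [pos for pos in range(1, length) if not (start <= pos <= end)]

def footprint(length, ladder):
    """Band positions missing from the ladder, i.e. the protected region."""
    present = set(ladder)
    return [pos for pos in range(1, length) if pos not in present]

ladder = dnase_ladder(30, (12, 18))
print(footprint(30, ladder))  # [12, 13, 14, 15, 16, 17, 18]
```

Comparing the ladder with and without protein is what reveals the gap on a real gel.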
RNA polymerase
1 in prokaryotes
3’ -OH attacks the alpha phosphate of the incoming 5’ triphosphate (nucleophilic attack)
2 Mg2+ cofactors
No need for RNA primer (unlike DNA polymerase) (will be a T/F question)
Coding Strand (sense) same as mRNA
Template strand is read by RNA pol to make mRNA and is complementary to the mRNA sequence
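A short Python sketch of the coding/template relationship described above. The function names are illustrative, and the example sequence is made up.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def template_strand(coding):
    """Template strand written 5'->3' (reverse complement of the coding strand)."""
    return "".join(COMPLEMENT[b] for b in reversed(coding.upper()))

def transcribe(coding):
    """mRNA has the coding-strand sequence with U in place of T."""
    return coding.upper().replace("T", "U")

coding = "ATGGCATAA"
print(template_strand(coding))  # TTATGCCAT
print(transcribe(coding))       # AUGGCAUAA
```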
RNA poly subunits
Sigma - recognizes promoter sequences
ESSENTIAL for promoter recognition
Does not bind to promoter DNA on its own
Recognizes promoters for housekeeping genes
Beta prime - Binds DNA
Beta - Binds NTPs and interacts with sigma, polymerizes RNA (Polymerase activity)
Alpha - essential for assembly and elongation
Omega - functionality and stability
One more thing to note is (alpha)2(beta)(beta’) is the “core enzyme”
The core enzyme + sigma factor is the “holoenzyme”
Initiation of transcription
Phase 1, binding - interaction between promoter and RNA pol
Formation of closed cmplx where DNA is not unwound → but then unwinds at the -10 to +2/+3
Phase 2, initiation - transcription initiation/promoter clearance.
8/9 nucleotides initially synthesized → sigma subunit released → pol leaves promoter and elongates RNA
The “scrunching” model: DNA is pulled into RNA polymerase
Rifampicin binds (beta) subunit & blocks initiation
Transcription Elongation
Highest speed among the 3: [DNA replication (is correct)], RNA transcription, protein translation?
Direction of transcription is 5’-3’ → positive supercoils build up ahead of the polymerase (toward the 3’ end, where DNA is being unwound) and negative supercoils build up behind it (toward the 5’ end, where DNA rewinds)
RNA polymerase is relatively accurate (1 error / 10,000 nt)
Errors are okay, half life of mRNA is short, messages are degraded
Slower in GC rich areas (three hydrogen bonds per G-C pair)
Topoisomerases relieve supercoiling
Higher stability during elongation than initiation cmplx
14bp melted to form transcription bubble
8-9 nucleotides within bubble paired with RNA chain
Double stranded DNA opens up in front of the bubble and closes up as RNA polymerase moves along → transcription bubble extends from -12 to +2
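To put the error rate quoted above (1 error per 10,000 nt) in perspective, here is a small Python calculation. It assumes errors are independent at each position, which is a simplification, and the 1,000 nt transcript length is only an example.

```python
ERROR_RATE = 1 / 10_000  # errors per nucleotide, from the notes

def expected_errors(length_nt, rate=ERROR_RATE):
    """Average number of misincorporated nucleotides in one transcript."""
    return length_nt * rate

def prob_error_free(length_nt, rate=ERROR_RATE):
    """Chance that a transcript of this length has no errors at all."""
    return (1 - rate) ** length_nt

print(expected_errors(1_000))
print(round(prob_error_free(1_000), 3))
```

Under these assumptions most copies of a typical bacterial message are error-free, which fits the note that occasional errors are tolerable.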
Core RNA polymerase
N-terminal domains of the alpha subunits allow them to form dimers
Alpha subunit N-terminal domains bind to the Beta/Beta’ subunits
Beta/Beta’ subunits interact extensively with one another and together form an internal channel and the catalytic site
Beta subunit has polymerase activity
RNA polymerase does not move at a steady rate
Temporarily delayed at pause sites
Pausing may lead to arrest/termination
Arrest is important in proofreading
GreB helps rescue an arrested complex (stalled)
Transcription Termination
Rho-independent
GC-rich inverted repeat allows RNA to form stem loop -> reaches to within 7-9 nucleotides of the 3’ end of the RNA
U-rich stretch immediately after stem loop causes pausing and release
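A rough Python sketch of how a Rho-independent terminator could be spotted in an RNA sequence: a GC-rich inverted repeat followed directly by a run of U. The stem length, loop sizes, GC cutoff, and U-run length are all assumptions chosen for illustration, and the demo sequence is invented.

```python
RNA_COMP = {"A": "U", "U": "A", "G": "C", "C": "G"}

def revcomp_rna(s):
    """Reverse complement of an RNA string."""
    return "".join(RNA_COMP[b] for b in reversed(s))

def gc_fraction(s):
    return sum(b in "GC" for b in s) / len(s)

def find_intrinsic_terminators(rna, stem=6, loops=(3, 4, 5, 6), min_gc=0.6, u_run=4):
    """Return (stem_start, stem_end) for hairpins followed by a U-run.

    All thresholds are illustrative assumptions, not measured values.
    """
    rna = rna.upper()
    hits = []
    for i in range(len(rna) - stem + 1):
        left = rna[i:i + stem]
        if gc_fraction(left) < min_gc:
            continue
        for loop in loops:
            j = i + stem + loop
            right = rna[j:j + stem]
            tail = rna[j + stem:j + stem + u_run]
            if right == revcomp_rna(left) and tail == "U" * u_run:
                hits.append((i, j + stem))
    return hits

# Made-up RNA: stem GCCGCC, loop GAAA, stem GGCGGC, then a U-rich stretch.
demo = "CC" + "GCCGCC" + "GAAA" + "GGCGGC" + "UUUUUU" + "GA"
print(find_intrinsic_terminators(demo))  # [(2, 18)]
```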
Rho-Dependent termination
Hexameric(6 rho factors) ATP dependent helicase, Rho-factor
Rho-factor releases RNA from RNA-DNA hybrid
Binds to RNA to disrupt RNA-DNA interactions
TRANSCRIPTION VS DNA REPLICATION
Similarities
DNA is a template
Synthesis of Complementary strand
Same mechanism of phosphodiester bond formation
Differences
Transcription is selective
1 strand is used as a template for transcription
!!!!Transcription does not require a primer!!!! (most important)
Transcription is more error prone (no exonuclease activity)
Lecture 2:
Why regulate gene expression?
Environmental factors: food sources change
Developmental/Differentiation
Cell specialization
Regulation
Controlling transcription initiation is most common → most efficient to regulate at beginning of pathway
Recall → sigma factor → specific to promoter sequences → sigma factor itself is transcribed to enhance gene expression in certain genes → regulates how much of a gene is transcribed
DNA binding proteins
regulators → bind to specific sites that affect how RNA polymerase binds
Blocking sites → inhibitor
Increased affinity for polymerase → activator
Some systems are regulated by both negative and positive controllers
Modes for negative control
Effector causes dissociation of repressor from DNA
Effector causes binding of repressor to DNA
Negative control bc it inhibits transcription (repressor)
Modes for positive control
Effector causes dissociation of activator from DNA, inhibiting transcription
Effector causes binding of activator to DNA, inducing transcription
Positive control bc activates transcription (activator)
The Lac Operon: physiological background
Lactose degraded by beta-galactosidase (Beta-gal)
Degrades but also isomerizes into allolactose
Galactoside permease for uptake of lactose
Operon Advantages/Disadvantages
Advantages:
Coordinate regulation of multiple genes using a single cis acting DNA site
Disadvantages
Individual regulation of transcription cannot occur
Lac Operon
Both positive and negative regulators
Bound repressor inhibits transcription
Absence of lactose: repressor binds
Bound activator facilitates transcription
Presence of lactose: inducer prevents repressor binding; activator (CAP-cAMP) binds when glucose is absent
Encodes 4 proteins: 1 regulatory, 3 enzymes
Operon in order
Promoter I (PI)
LacI: regulatory
PromoterL (PL)
OperatorL: regulator binding site (OL)
LacZ: B-gal (if knocked out no growth)
LacY: permease-transport (if knocked out no growth)
LacA: galactoside transacetylase
if the knockouts are combined then there will be growth bc of the diffusible protein products
Activity increases when lactose is exclusively present with no glucose
No lactose: repressor (I) active, binds operator, inhibits RNAP to make Z,Y,A
Allolactose is a side product and an inducer (IPTG is also an inducer); it de-represses the operon by inactivating the repressor
Lactose present: inducer made → allolactose → binding of inducer to repressor → repressor does not bind operator
3 operators
-82 (O3) and +412 (O2) are auxiliary, +11 is main operator (O1)
Lac repressor always binds O1 together with either O2 or O3
4 LacI monomers are necessary to inhibit → one dimer at O1 and one dimer at O2/O3 → the two dimers associate into a tetramer → DNA loop
Lac operator is a palindrome → lacI is a dimeric protein and palindromic sequences are recognized by dimeric proteins
Lac repression involves DNA looping
Genetic perspective on lac operon:
Create lac- mutants → lose a function or create a mechanism?
Complementation analysis
Lecture 3:
Cis vs Trans gene regulation
Cis acting elements → DNA sequences in vicinity of genes
Trans acting factors → diffusible protein factors that bind to DNA sequences
Note that cis/trans do not illustrate the bond formation
Regulatory mutants
i- → mutant nonfunctional repressor protein
I^S → super repressor: cannot be inactivated by inducer
O^C → operator constitutive: cannot be bound by repressor
p- → nonfunctional promoter
CAP protein & Cap site → glucose sensitive switch
Levels of extracellular glucose and intracellular cAMP → inversely related
High glucose levels outside cell create low cAMP levels within cell
CAP/cAMP = gene activator
CAP = CRP = “cAMP receptor protein”
Bind DNA when activated by cAMP
Binds lac operon promoter for lacZYA
Note that Lac Operon is ONLY active under Lactose=present and glucose=absent conditions
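The combined negative (repressor) and positive (CAP-cAMP) control of lacZYA can be sketched as a small truth table in Python. The function name and keyword arguments are hypothetical. The "low (basal)" level when both sugars are present is a common textbook refinement and is an assumption here; the notes simply treat the operon as active only with lactose present and glucose absent.

```python
def lac_operon_expression(lactose, glucose, repressor_functional=True,
                          operator_functional=True):
    """Qualitative lacZYA expression level.

    repressor_functional=False models an i- mutant;
    operator_functional=False models an O^C mutant.
    """
    repressor_bound = repressor_functional and operator_functional and not lactose
    cap_bound = not glucose  # low glucose -> high cAMP -> CAP-cAMP binds
    if repressor_bound:
        return "off"
    return "high" if cap_bound else "low (basal)"

for lactose in (False, True):
    for glucose in (False, True):
        print(f"lactose={lactose!s:5} glucose={glucose!s:5} ->",
              lac_operon_expression(lactose, glucose))
```

Setting `repressor_functional=False` or `operator_functional=False` gives constitutive expression, matching the i- and O^C mutants listed earlier in this lecture.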
Catabolic vs Anabolic operons
Lac operon encodes catabolic enzymes
Catabolic operons generally regulated through induction
Trp operon encodes anabolic enzymes
Anabolic operons generally regulated through repression
Operons involved in AA synthesis are tightly regulated
AAs are expensive
Aa levels are low → then expressed
Repressed when abundant
Tryptophan Operon:
2 modes of regulation
Trp repression
Transcription attenuation
Operon Order
trpP (promoter)
trpO (operator)
trpL
trpE
trpD
trpC
Repression:
trpR requires trp as a corepressor → tryptophan levels low → no corepressor → aporepressor stays inactive → no binding to operator → no repression → Expression of trp operon
trpR + trp(levels are high) → trp binds to aporepressor → conformational change in aporepressor → binds operator → inhibits expression
!!!!Attenuation!!!!
Fine-tuning of synthesis
Low trp → full mRNA made
High trp → leader sequence made (trpL) exclusively
4 important sequences in leader sequence
Alternative base pairing between them results in different results
1-2 pairing (which allows 3-4 pairing); 3-4 pairing forms the terminator stem loop
2-3 leads to antiterminator stem loop (prevents 3-4)
If 3-4 pair: structure forms → attenuator → acts as a transcription terminator
2-3 pair → 3-4 cannot pair → attenuator not formed → 2-3 pair itself does not stop transcription
Ribosome stalls at sequence 1 when trp is low → leads to 2-3 binding
When trp is high the ribosome passes through sequence 1 quickly and translates the leader peptide completely → covers sequence 2 → leads to 3-4 binding
When leader region is completely translated → 2-3 reforms after passing through stem loop created by 2-3
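The two layers of trp operon control (repression and attenuation) can be summarized as a simple decision function. This is a qualitative sketch with a hypothetical function name; it treats tryptophan as simply "high" or "low", whereas the real system responds gradually.

```python
def trp_operon(trp_high):
    """Return the state of both control layers for a given tryptophan level."""
    # Layer 1: repression. Trp is the corepressor that activates the aporepressor.
    repressor_on_operator = trp_high
    # Layer 2: attenuation. With low trp the ribosome stalls in sequence 1.
    ribosome_stalls_at_1 = not trp_high
    hairpin = "2-3 antiterminator" if ribosome_stalls_at_1 else "3-4 terminator"
    if trp_high:
        output = "initiation repressed; transcripts that start end after trpL"
    else:
        output = "full-length trp mRNA"
    return {
        "repressor_bound": repressor_on_operator,
        "hairpin": hairpin,
        "output": output,
    }

print(trp_operon(trp_high=False))
print(trp_operon(trp_high=True))
```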
Lecture 4:
Prokaryotic DNA vs Eukaryotic DNA:
Information density: prokaryotes greater than eukaryotes
More genes/bps in prokaryotes
Eukaryotes have repeating DNAs → prokaryotes have unique DNA
Genome association w/ protein (Chromatin)
Prokaryotic genomes loosely associated
Eukaryotic genomes tightly associated with histones and chromosomal proteins: chromatin
1% of human genome encodes for proteins
As organism complexity increases the genes per base pair decreases
Eukaryotes are incredibly complex organisms with very large non-coding regions
Renaturation kinetics/Reassociation kinetics
Help us understand why DNA is repetitive
Eukaryotes need to package data
Compacting DNA vs reliable access to DNA → chromatin is solution
Proteins include specialized structural proteins/enzymes
30 nm chromatin fiber organized in loops that can be individually opened (looped domains)
Sequences will have plateaus depending on their repetitiveness in kinetics chart
Very repetitive sequences will anneal fastest
Very unique sequences will anneal slowest
Prokaryotes have 1 jump in transition
Eukaryotes may have multiple jumps
(3 classes/3 rises, 4 rises/4 classes, etc.)
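The shape of a reassociation (C0t) curve can be sketched with the ideal second-order model C/C0 = 1 / (1 + k·C0t), which is the standard textbook form and is not written out in these notes. The three-class genome below, including its fractions and rate constants, is entirely hypothetical.

```python
def fraction_single_stranded(c0t, k):
    """Ideal second-order reassociation: C/C0 = 1 / (1 + k * C0t)."""
    return 1.0 / (1.0 + k * c0t)

def mixed_genome_curve(c0t, components):
    """components: list of (fraction_of_genome, k) pairs; fractions sum to 1."""
    return sum(f * fraction_single_stranded(c0t, k) for f, k in components)

# Hypothetical genome: highly repetitive, moderately repetitive, unique DNA.
# A larger k means faster annealing (more copies of the sequence).
genome = [(0.25, 1e3), (0.30, 1e0), (0.45, 1e-3)]

for exponent in range(-4, 5):
    c0t = 10.0 ** exponent
    remaining = mixed_genome_curve(c0t, genome)
    print(f"C0t=1e{exponent:+d}  single-stranded={remaining:.3f}")
```

Each component reaches half-reassociation at C0t = 1/k, so three well-separated rate constants produce three separate transitions in the curve, while a single-class (prokaryote-like) genome would show only one.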
RNA polymerase in Eukaryotes and Prokaryotes
3 eukaryotic RNA pol vs 1 prokaryotic RNA pol
Eukaryotic RNA pols: rRNA, mRNA, and tRNA synthesizers
Eukaryotic RNA polymerase requires multiple transcription factors while prokaryotic require one or two at most
Eukaryotes have post-transcriptional processing that involves capping, splicing, and polyadenylating mRNA
Prokaryotic DNA generally does not contain introns
Eukaryotic RNA pol requires factors to modify chromatin/looped domains to access genes
Bacterial genes are not packaged like eukaryotic DNA so it can be easily accessed by RNA polymerase
Eukaryotic RNA polymerases
Polymerase:
1 → resides in the nucleolus (innermost portion of nucleus) → transcribes pre-rRNA typically
2 → Nucleoplasm → pre-mRNA and some snRNA
3 → Nucleoplasm → tRNA, some rRNA, some snRNA(spliceosome), and signal recognition RNA 7SL RNA
Eukaryotic core promoter motifs
TATA box
TATAXAX consensus sequence
25-30 bp upstream of transcription start site
Initiator element (Inr)
Overlaps with transcription initiation site
DPE
Downstream promoter element
Extends from about +28 to +34
TFIIB recognition element (BRE)
Typically the order seen is the BRE → TATA Box → Inr → DPE
General transcription factors (GTFs) form the preinitiation complex (PIC)
DAB F Pol EH (District Attorney Beats Four Policemen Eating Hamburgers)
[TFII (prefix for each)] D→A→B→F-Pol(joins together)→E→H
^^assembly order^^
TF = transcription factor
II = RNAP II
TFIID
Made up of TBP(TATA-binding protein) and TAFs (TBP-associated factors)
Very large
First to bind
Binds DNA in the minor groove
TFIIA
Recognizes core promoter
TFIIB
Recognizes core promoter
TFIIF-Pol
F targets the Polymerase to the promoter
TFIIE
Modulator of helicase
Binds after Pol/TFIIF binds to preinitiation complex
2 different subunits/both needed to stimulate transcription
TFIIH
Helicase
9 subunits
DNA helicase activity/ATPase activity
Kinase activity → phosphorylation of CTD of the large subunit of RNAP
CTD protein Kinase
Xeroderma pigmentosum
Mutations in TFIIH subunits lead to extreme light sensitivity and ultimately cancer
May lead to death in childhood
Cockayne syndrome can arise from mutations in the same TFIIH subunits
REVIEW OF EUKARYOTIC TRANSCRIPTION INITIATION
Low density of coding information
Large amounts of introns
3 RNA polymerases
Pol I and III transcribe rRNA and small RNAs and have unique promoter requirements
Pol II transcribes mRNA which leads to gene expression
RNA pol II promoters assemble GTFs (General transcription factors)
Preinitiation complex(PIC) consists of more than 30 individual proteins
PIC does initiate transcription at very low activity
CTD (Carboxyl terminal domain associated with 3’ end of reading frame for mRNA) of large subunit of RNAP helps transition from initiation to elongation
CTD contains many tandem repeats of the heptapeptide Tyr-Ser-Pro-Thr-Ser-Pro-Ser
Number of repeats ranges from 26 in yeast to 52 in humans
Repeats can be phosphorylated
Phosphorylation state differs at different stages of transcription
Ser 2 and Ser 5 are important
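A small Python sketch of the CTD repeat structure. It builds an idealized CTD from perfect copies of the consensus heptapeptide; real CTDs contain some non-consensus repeats, so this is a simplification, and the function names are illustrative.

```python
HEPTAD = "YSPTSPS"  # Tyr-Ser-Pro-Thr-Ser-Pro-Ser, one-letter codes

def ctd(n_repeats):
    """Idealized CTD made of n perfect consensus repeats."""
    return HEPTAD * n_repeats

def serine_positions(n_repeats, which):
    """1-based positions in the CTD of Ser2 or Ser5 from every repeat."""
    return [7 * r + which for r in range(n_repeats)]

human = ctd(52)
print(len(human))                  # 364
print(serine_positions(52, 5)[:3]) # [5, 12, 19]
```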
Transcription Elongation
Largest subunits of RNA pol I, II, and III as well as E. coli RNA polymerase have many homologies
Pol II is unique in having a CTD with heptapeptide repeats
CTD is indispensable → deletion mutants are lethal in yeast
RNA pol II CTDs have to be unphosphorylated for the PIC to form
TFIIH has to phosphorylate RNA pol II CTDs for RNA elongation to occur
Serine 2 and serine 5 are very important in CTD
RNA pol II is phosphorylated at ser5 to initiate elongation
RNA pol II is phosphorylated at ser2 after bp +50 during elongation
RNA pol II
O = phosphorylated → elongation ensues
A = unphosphorylated → no elongation
When it is phosphorylated DAB is then kicked off
2 proteins regulate elongation as well
NELF → Negative elongation factor
DSIF → DRB-Sensitivity-Inducing-Factor
DRB is an inhibitor of CDK9 which is a component of the positive transcription elongation factor (P-TEFb)
ATP then is used to knock off NELF and DSIF
then elongation can be induced by the P-TEFb
Transcription Elongation factors
Fork loop 1
prevents premature unwinding
rudder
prevents the DNA from rebinding to the mRNA
lid
wedge and guide the incoming DNA
bridge helix
acts as a ratchet (circular motion in one way)
more than 100 proteins associated with transcription of RNA polymerase II
Lecture 5:
mRNA processing (3 steps)
5’ Cap needs to be added
Splicing the mRNA
Cleavage/Addition of the Poly-A tail
5’ Cap (pay attention to enzymes below)
Function
Protects mRNA from nucleases
Distinguishes mRNA from other types of RNA
mRNA export
Translation initiation (Ribosome binds to the 5’ cap)
How its made
RNA 5’-triphosphatase Step 1
H2O → Pi (inorganic phosphate)
The terminal (gamma) phosphate is removed from the 5’ end and now we have a diphosphate at the end
Guanylyltransferase Step 2
GTP → PPi
Guanosine is added
N7G-methyltransferase Step 3
S-adenosylmethionine → S-adenosylhomocysteine
Guanine is methylated
Capping enzyme is recruited by the CTD of RNA pol II
CTD must be phosphorylated on Ser-5 (negatively charged attracts positive enzyme)
CE = Capping enzyme
Bifunctional
RNA triphosphatase and guanylyltransferase
mRNA Splicing
Function
Allows many proteins to be produced by 1 gene
mRNA export
Translation importance (nonsense mediated decay)
Introns are removed
Alternative Splicing
pre-mRNA can be spliced differently to produce many different mRNAs
Errors may lead to muscular dystrophy
4 elements for splicing
GU sequence at the 5’ splice site (start of the intron)
AG sequence at the 3’ splice site (end of the intron)
Branching nucleotide
Adenine
Most important
Pyrimidine rich tract
Splicing steps
2’-OH of adenine is the nucleophile
The Branching Nucleotide’s 2’-OH attacks the 5’ GU
3’-OH of the freed 5’ exon (released in the previous step) attacks the 3’ splice site just after the AG
Exons are joined and the intron is released as a lariat
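The net result of the two splicing steps (intron out, exons joined) can be sketched in Python. This only models the sequence outcome and checks the GU...AG boundaries; it does not model the branch point chemistry. The function name, coordinates, and example sequence are made up.

```python
def splice(pre_mrna, introns):
    """Remove introns given as (start, end) half-open index pairs.

    Each intron must begin with GU and end with AG, as in the notes.
    """
    mature = []
    cursor = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}-{end} lacks GU...AG boundaries")
        mature.append(pre_mrna[cursor:start])
        cursor = end
    mature.append(pre_mrna[cursor:])
    return "".join(mature)

exon1 = "AUGGCC"
intron = "GUAAGU" + "ACUAAC" + "UUUUCCC" + "AG"  # 5' site, branch A, Py tract, 3' site
exon2 = "GGAUAA"
pre = exon1 + intron + exon2
print(splice(pre, [(len(exon1), len(exon1) + len(intron))]))  # AUGGCCGGAUAA
```

Passing different intron lists for the same pre-mRNA is a simple way to picture alternative splicing.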
Spliceosome
Promotes the splicing and is made up of snRNPs (snRNA + proteins)
Transcription Termination
Cleavage at poly(A) site
Addition of the poly(A) tail at 3’ end
Transcription termination downstream from cleavage site
Functions of poly(A) tail
Protects from exonuclease activity
Important for transport of mRNA
Important for translation
Allows for isolation of mRNA in a lab
Antitermination model for termination vs Torpedo model of termination
Antitermination is less supported than torpedo model
(likely incorrect) Antitermination proposed that there was a antiterminator that attached to the RNAP II and prevented the release of the enzyme until hitting terminator factors that would elicit the release of the RNAP from the DNA
(likely correct) RNAP stalls and slows down after the poly(A) site → transcript is cleaved → Rat1/Xrn2 (5’→3’ exonuclease) attacks the uncapped RNA still attached to RNAP downstream of the cleavage site → degrades that RNA until it catches RNAP and ends transcription (torpedo model) (enzymes involved are important)
mRNA export requires GTP hydrolysis
mRNA out needs Exportin and Ran + 1 GTP
mRNA in needs importin and Ran + 1 GTP
Eukaryotic Transcription Regulation
Sequence specific transcription factors
Promoter elements
Eukaryotic promoters are OFF in the absence of regulatory factors
Bacterial promoters have a basal/low transcription rate
Enhancers
Activating and repressing mechanisms
Mediators
Chromatin remodeling
Allows transcription → loops of DNA are unpacked in order for transcription to initiate
Histones: basic proteins that package and order eukaryotic DNA into units called nucleosomes
Euchromatin → loosely packed DNA (more expressed)
Heterochromatin → tightly packed chromatin
Transcribing a gene →
HATs encourage transcription by decreasing the affinity of histones for DNA → leading to DNA less tightly bound to histones
HDACs increase the affinity of histones for DNA → more tightly bound to each other
Eukaryotic promoters must be activated → RNA polymerase have little to no affinity for promoters without additional factors
Combinatorial control: specific combination of transcription factors must be bound at the promoter in order to express a specific gene → large eukaryotic promoters can bind many transcription factors
General TFs bind at core promoter
Transcriptional activators that bind DNA and co-activators which bind the activators bind at regulatory sequences both upstream and downstream
Cis vs Trans activating elements (not to be confused with the chemical terminology of cis- and trans- conformations)
Cis-responsive elements are elements within the DNA sequence that are not the promoter but bind transcription factors
Transcription factors or (trans-acting regulators): are proteins that bind cis-responsive elements
Transcription factors
bind to enhancers(DNA) and mediators(proteins)
Contribute to the chemical modification of the PIC
Stimulate elongation of RNAPII
Act on the level of chromatin
DNA binding and activation/repression domains:
Transcription factors have a two-domain (modular) composition
One domain recognizes a specific DNA motif or “DNA binding domain”
Zinc-finger sequence motif
Binds to major groove
Complex formation between 4 cysteine/histidine residues and a zinc ion
Helix-turn-helix motif
Binds to major groove
Form dimers (if one of the proteins is mutated it will not function properly)
2 helices → one recognizes and fits into the major groove
Leucine Zipper
Leucines are spaced 7 residues apart
Dimers as well
Second domain affects transcription activation
Dimerization allows many more combinations of transcription factors, e.g. 8 transcription factors that pair as dimers give 8 × 8 = 64 combinations
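One way to read the "64 combinations" figure is as pairings of 8 factors that act as dimers; that reading is an assumption about the intended arithmetic. The Python sketch below counts pairings both ways, with hypothetical factor names.

```python
from itertools import combinations_with_replacement, product

factors = [f"TF{i}" for i in range(1, 9)]  # 8 hypothetical monomers

# Ordered pairings: (TF1, TF2) counted separately from (TF2, TF1).
ordered = list(product(factors, repeat=2))

# Distinct dimers: homodimers plus heterodimers, order ignored.
distinct = list(combinations_with_replacement(factors, 2))

print(len(ordered))   # 64
print(len(distinct))  # 36
```

Either way, a handful of monomers yields far more dimer species than there are genes encoding them, which is the point of combinatorial control.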
Eukaryotic promoters:
Sigma factor helps bind RNAP to binding sites in prokaryotes
eukaryotic promoters are
Sequences bound by PIC (Core promoter)
binding sites for transcriptional activators (regulatory promoters)
Interacts with a specific target sequence which is sometimes close to the transcription start site
Transcription is controlled by a promoter and an enhancer
The enhancer may be several kbp away from the promoter (VERY FAR)
Enhancer
Location and orientation of enhancer is independent and functions at a distance
Enhancers can be upstream or downstream and still provide the same function
DNA looping occurs between regions bound to activator proteins
An activator binds the enhancer through its DNA-binding domain and then contacts the PIC, which is bound to the core promoter (TATA, Inr, DPE)
Enhancers and Silencers
Activators(protein) bind to enhancer sequences
Determines which genes are switched on and increase the speed of transcription
Repressors(protein) bind to silencer sequences
Interferes with the functioning of activators and slows transcription
May either interfere with the binding site of the activator (the enhancer sequence) (competitive DNA binding)
May bind to the activator and “mask the activation surface” that contacts the PIC
May bind to the GTFs in the PIC and “directly interfere with the binding of the activator” (bound at an enhancer sequence) to the PIC
Enhancers(cis element) need a mediator(trans element)
Specifically interacts with the CTD of the large subunit of RNA pol II
CTD attracts the attachment of many additional factors including
Termination factors, splicing factors, elongation factors, mediators
CTD acts as an assembly line for tools needed for promoter clearance and RNA processing which is coupled to the elongation process
mRNA might get fed through this line of factors which are bound to the CTD
Binding of an activator to an enhancer recruits RNAP II through mediator/RNAP complex
When many enhancers aim to promote transcription their effect is synergistic
I.e. 1 activator may produce a synergistic effect of 1 unit
4 activators may produce a synergistic effect of 500 units
Insulators block activation by enhancers, or block repression by silencers.
Lecture 6:
Chromatin packaging hierarchy
Nucleosome forms (DNA + Histones)
30 nm fiber (coiled nucleosomes)
Nuclear scaffolding w/ looped domains
Metaphase chromosome (mitotic)
Histone core is an octamer (2 each of H2A, H2B, H3, H4)
146 bps wrap around a histone octamer (just under 2 turns)
Histones contain many basic amino acids (Lysine & Arginine)
Histones as a result are positively charged
DNA is negatively charged
Strong electrostatic interactions
DNA packaging + transcription
Heterochromatin vs Euchromatin
Euchromatin = histones on the outside of the plane which allows for transcription
Heterochromatin = histones are tightly wrapped and adjacent limiting transcription
How to alter DNA packaging
Chromatin Remodeling
Chromatin remodeling complex exposes promoter that may be bound to nucleosome
Brings out the promoter to the string
Histone tail modifications
Histone tails can be modified to create different changes to the histone itself
Histone modifications influence transcriptional activity
Acetylation
HATs (histone acetyltransferase)
Lys residues at N-terminus
Enhances transcription by destabilizing nucleosomes
Deacetylation
HDACs (histone deacetylases)
Stabilizes compact chromatin structures
Represses transcription
Methylation
Lys and Arg residues
Repression
Phosphorylation
Ser residues
Activation
Ubiquitination
Lys residues
Activation
Remodeling and Modification usually work together
Chromatin Remodeling Complex is SWI/SNF
Methylation
Chromodomain
Allows proteins to bind methylated histones
Chromoshadow domain
Allows HP1 to bind to other HP1 proteins
Epigenetic code
Covalent modifications of histones and DNA may alter gene expression and can be read by the cell much like the genetic code
Transcription Regulation and Cancer
Cofactors
Binds transcription factors without making DNA contact
Can be super-activators (coactivator)
Can repress the activity of the TFs (corepressor)
E2F → constitutive transcription factor in the context of certain cellular genes
Binds the cofactor RB and loses activation function
Cell cycle is regulated by transcription factors and co-repressors
S phase = replication of chromosomal DNA
G1 phase = no replication of chromosomal DNA
G1-S transition: express proteins needed for replication of DNA like DNA polymerase
G1 co-repressor(RB) blocks E2F function
Regulated phosphorylation of RB leads to its removal and reactivation of E2F → leading to S phase
In many cancers the RB gene is mutated and cells permanently go from G1 to S without stalling
RB mutations may lead to cancer
RB originally found mutated in cancers of the retina
Some of the most frequent mutations leading to cancer
Transcription regulation is central to carcinogenesis
Normal cells have cdk2 to phosphorylate RB and remove it from E2F to have typical DNA replication
Cdk2 is negatively controlled by a different factor (p21CIP)
p21 transcription is activated by p53
p53 availability determines E2F repression by RB
p53 is a tumor suppressor → loss of function/mutation would potentially lead to uncontrolled tumor proliferation
p53 available(up) → p21 up → cdk2 down → RB up → E2F down → G1 does not go to S phase (inverse is true for p53 mutation)
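The up/down chain above can be checked by propagating signs through the pathway. This Python sketch encodes each step as activating (+1) or inhibiting (-1); it is a qualitative toy model, and "RB up" here means active, unphosphorylated RB bound to E2F.

```python
# (source, target, sign): +1 = activates, -1 = inhibits, as listed in the notes.
CASCADE = [
    ("p53", "p21", +1),      # p53 activates p21 transcription
    ("p21", "cdk2", -1),     # p21 negatively controls cdk2
    ("cdk2", "RB", -1),      # cdk2 phosphorylates and removes RB
    ("RB", "E2F", -1),       # RB blocks E2F
    ("E2F", "S_phase", +1),  # E2F drives the G1-S transition
]

def propagate(p53_available):
    """Return +1 (up) or -1 (down) for every node in the pathway."""
    state = {"p53": +1 if p53_available else -1}
    for source, target, sign in CASCADE:
        state[target] = state[source] * sign
    return state

print(propagate(True))
print(propagate(False))
```

With p53 lost, every sign flips and the model ends with S phase "up", consistent with p53 acting as a tumor suppressor.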
Many viruses target p53 and RB and cause cancer (papillomaviruses, polyomaviruses, and adenoviruses)
siRNA
mRNA + tRNA and →
small stretches of RNA which complements part of mRNA (siRNA)
Double stranded RNA is responsible for lowering RNA levels
RNAi → RNA interference
Long ds-RNA (double stranded)
RISC (big complex that contains Dicer and Argonaute) → Dicer cleaves long ds-RNA into siRNA
Protein breaks strands into single strands
Delivers to RNA
Transcript is degraded
Small RNAs / RNA interference
Previously seen that small RNAs exist in the cell
Problem → small RNAs found in the cell siRNA or degradation products of other larger RNAs
Are they natural?
Yes → are they degradation products or siRNA tho
Northern Blot → sequence of small RNA
Sequence found in genomes are very similar sizes
NOT random degradation
Endogenous RNA → microRNAs(miRNA)
21-23 nucleotides
Primary miRNA (pri-miRNA) transcribed by RNA pol II
Different parts of RNA interference
RISC → RNA interference is done by this complex
Dicer → cleaves the DS RNA
Argonaute → cleaves between siRNA and mRNA
Drosha processes microRNA
pre-miRNA then is loaded onto RISC
Processed by RISC
siRNA vs miRNA
siRNA is artificial (Gene silencing)
miRNA is encoded by genome (RNAi)
miRNA leads to translation inhibition
siRNA leads to translation inhibition and/or degradation
CRISPR-Cas
Clustered Regularly Interspaced Short Palindromic Repeats
Used by bacteria to fight viruses
3 steps
Acquisition - cas locus → binds to virus DNA → GGG sequence → cuts 20 bases upstream
Expression - bait the DNA from the virus
Interference - Cas protein bind to CRISPR DNA
Similar to siRNA but the difference is it involves DNA
CRISPR and Eukaryotic cells → possible to create specific genome modifications
Required are Cas9(nuclease), Gene specific CRISPR RNA (crRNA), and tracr RNA → links crRNA to Cas9
3 Genome editing techniques
Zinc finger (motif)
Binds DNA and cuts with endonuclease (FokI)
Issue is that zinc finger has to be remade each time
TALEN (helix-turn-helix)
Binds DNA and cuts with endonuclease (FokI)
Issue is 9 proteins needed to be bound
Cas9/CRISPR
gRNA binds DNA, cuts by an externally added endonuclease (Cas9)
Allows highly specific binding and is linked to tracrRNA
BEST strategy
gRNA??? → guide RNA → crispr RNA
RNAi inhibits an RNA, CRISPR/Cas9 will inhibit DNA (major difference)
Human collections generated by Cas9/CRISPR include genes essential for cancer cells and genes important for resistance to chemotherapy drugs
Lecture 7:
Translation by ribosome
Proteins differ from nucleic acids:
20 amino acids vs 4 nucleotides
Large variety of functional groups
Accelerate a multitude of chemical reactions
well-defined tertiary structure (shapes)
4 nucleotides ^ 3 codon positions → 64 possible codons
tRNA
Transfer RNA
Ribosome
Factory to make proteins
How is code read?
Unpunctuated code → deletions of 3 nucleotides would restore the reading frame
The two wrong proposals were overlapping code and punctuated code
Deletions
1-2 nucleotides = frameshift mutation
3 nucleotides = removal of 1 aa
Insertion of 3 nts → insertion of 1 aa
Change of 1 nt results in a missense or nonsense mutation
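The effect of deletions on the reading frame can be seen by splitting a sequence into triplets before and after the deletion. The short ORF below is invented for illustration, and the helper names are hypothetical.

```python
def codons(seq):
    """Split a sequence into complete triplets, reading from position 0."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

def delete(seq, start, n):
    """Remove n nucleotides beginning at 0-based index start."""
    return seq[:start] + seq[start + n:]

orf = "AUGGCUGAAUUCCGAUAA"
print(codons(orf))                # original frame
print(codons(delete(orf, 3, 1)))  # 1 nt deleted: every downstream codon changes
print(codons(delete(orf, 3, 3)))  # 3 nt deleted: one codon lost, frame preserved
```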
Filter experiment
Synthetic mRNA that codes for certain amino acids added to solution
Filter binding assay
Filter contains mRNA and only the corresponding tRNA would attach to filter → aa that is radioactive would then be stuck on filter
There are variants to the genetic code
UGA = Trp, AUA = Met, AGA = Stop in mitochondria
Genetic code
Non-overlapping, no spacers
Almost universal
Highly degenerate = many aas are specified by two or more codons
Unambiguous = codons specify ONE aa
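The properties listed above (64 codons, degenerate, unambiguous, almost universal) can be checked against the standard codon table. The Python sketch below builds the table from the conventional UCAG ordering; the mitochondrial overrides are the three variants given in these notes, and the variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order (first, second, third base); '*' = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Mitochondrial variants from the notes.
MITO = dict(STANDARD, UGA="W", AUA="M", AGA="*")

degeneracy = Counter(STANDARD.values())
sense = sum(1 for aa in STANDARD.values() if aa != "*")

print(len(STANDARD))                   # 64 codons in total
print(sense, degeneracy["*"])          # 61 sense codons, 3 stops
print(degeneracy["L"], degeneracy["W"])  # Leu has 6 codons, Trp has 1
print(STANDARD["UGA"], MITO["UGA"])    # stop in the standard code, Trp in mitochondria
```

Because the table is a dictionary, each codon maps to exactly one value (unambiguous), while several codons share the same amino acid (degenerate).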
Crick’s adaptor hypothesis
tRNA
AA – Adaptor(tRNA) – Nucleic acid
tRNA cloverleaf secondary structure
Amino-acid arm - conserved CCAOH → attaches to AA
Anticodon arm - read antiparallel to mRNA
D-arm
Extra arm
TψC arm
7-15% of tRNAs contain modified nucleosides
Post Transcriptionally modified
Adenosine at the 5’ position of the anticodon changes to I (Inosine) → can base pair with A, U, C → “Wobble theory” / third base wobble
tRNA ala found to bind to GCA, GCC, and GCU → Inosine pairs with the 3rd base of the codon
Inosine is highly ambiguous
Allows for more conservation of energy
32 tRNAs for 61 sense codons (3 are stop)
Non-canonical base pairing in one base provides weaker interaction between anticodon and codon → higher rate of protein synthesis
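The wobble rules can be written as a lookup table and used to list the codons one anticodon can read. The pairing table follows Crick's wobble rules (only the inosine row is stated in these notes); the function name is hypothetical.

```python
# 5' anticodon base -> codon 3rd-position bases it can pair with (wobble rules).
WOBBLE = {"I": "AUC", "G": "CU", "U": "AG", "C": "G", "A": "U"}
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

def codons_read(anticodon):
    """Codons (5'->3') recognized by an anticodon written 5'->3'.

    Pairing is antiparallel: anticodon base 3 pairs with codon base 1,
    base 2 with base 2, and base 1 (the wobble position) with codon base 3.
    """
    a1, a2, a3 = anticodon
    prefix = WATSON_CRICK[a3] + WATSON_CRICK[a2]
    return sorted(prefix + third for third in WOBBLE[a1])

print(codons_read("IGC"))  # ['GCA', 'GCC', 'GCU']  (tRNA-Ala)
print(codons_read("CAU"))  # ['AUG']
```

This is how fewer tRNAs than sense codons can still read the whole code.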
tRNA folds into a L shaped 3D structure
Functional sites include the anticodon and the amino acid attachment site and they are maximally separated
The anticodon stem and acceptor stem form double helices
Aminoacyl-tRNA synthetase
Matches tRNA vs AA
AA attacks ATP → becomes adenylated
Aminoacyl-AMP formed → 2’ (or 3’) OH of the tRNA’s terminal adenosine attacks the carbonyl C which releases AMP (which is a great leaving group)
How do aminoacyl-tRNA synthetases know they have the correct AA attached?
AA too large → does not fit
AA too small → fits → proofreading site within enzyme → removed through hydrolysis
Lecture 8
Ribosome History
Who won the ribosome race?
Tom Steitz
Who first crystallized the ribosome?
Ada Yonath
Ribosome
4 binding sites
Large subunit
E - Exit
P - Peptidyl
A - Aminoacyl
Small subunit
mRNA binding site
2/3 of the ribosome is RNA and 1/3 is protein
Proteins are just there to stabilize the RNA
Translation
aa + ATP + tRNA → aa-tRNA + AMP (hydrolysis of ATP = loaded aminoacyl tRNA)
Bacterial initiation utilizes fMet (formyl-Methionine) and initiator tRNA
fMet is loaded by 2 enzymes → Methionyl-tRNA synthetase & Methionyl-tRNA formyltransferase
Difference between fMet and Met tRNA is that there is a mismatch between AC which provides a kink !!!DIFFERENCE IS IMPORTANT!!!
fMet-tRNA to AUG (loads to P site) → regular methionine would load to A site
fMet-tRNA binds to AUG
Prokaryotic translation
3 initiation factors and GTP hydrolysis
30S subunit (small subunit)
IF-1 Bound to the amino site
IF-3 bound to nowhere (in relation to APE) (E site bound during initiation)
Order bound is:
IF-1 → IF-3 → mRNA → then IF-2 binds (explained below)
Shine-Dalgarno (SD u can write SD so it’s prolly gonna be on the exam) sequence complementary to the 16S rRNA
16S rRNA is within the 30S subunit (small)
Made up of other proteins as well
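A minimal Python sketch of how a Shine-Dalgarno site upstream of a start codon could be found. The consensus `AGGAGG` is the commonly cited SD core; the 5-10 nt spacing window, the function names, and the example mRNA are assumptions for illustration.

```python
SD = "AGGAGG"  # commonly cited Shine-Dalgarno consensus (mRNA, 5'->3')
RNA_COMP = {"A": "U", "U": "A", "G": "C", "C": "G"}

def revcomp_rna(s):
    """Reverse complement; gives the pairing partner on the 16S rRNA, 5'->3'."""
    return "".join(RNA_COMP[b] for b in reversed(s))

def find_sd_start_sites(mrna, min_gap=5, max_gap=10):
    """Return (sd_start, aug_start) pairs where SD sits min_gap..max_gap nt upstream of AUG."""
    hits = []
    for aug in range(len(mrna) - 2):
        if mrna[aug:aug + 3] != "AUG":
            continue
        for gap in range(min_gap, max_gap + 1):
            sd_start = aug - gap - len(SD)
            if sd_start >= 0 and mrna[sd_start:sd_start + len(SD)] == SD:
                hits.append((sd_start, aug))
    return hits

demo = "GCU" + SD + "UAACAUU" + "AUGGCUUAA"
print(find_sd_start_sites(demo))  # [(3, 16)]
print(revcomp_rna(SD))            # CCUCCU, the complementary rRNA stretch
```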
fMet-tRNA is accompanied by the IF-2 that binds to IF-1 which is within the A site
So what happens in order now is after the IF-1 binds to the A site and IF-3 binds to the E site → mRNA is now fed through the ribosome in which the SD sequence complements to the 16S rRNA → and the IF-1 is now within the A site → IF-2 binds to the IF-1 within the A site and carries the fMet-tRNA along with it.
After all this happens the 50S subunit binds to the 30S subunit making the 70S ribosome and knocks the IFs off the small subunit
Something to note here is that fMet still remains on the tRNA that now resides within the P site
To clarify once more:
IF-3 is not in the A or P site (E site during initiation)
IF-1 is in the A site
IF-2 is on top of IF-1 (bc it binds later)
fMet-tRNA is in the P site (comes in with IF-2)
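The SD / 16S rRNA pairing described above can be sketched in code. This is only a rough illustration: the consensus AGGAGG, the 5-10 nt spacer window, the function name `find_sd_start`, and the toy mRNA are all assumptions that are not from the lecture.

```python
# Sketch: locate a Shine-Dalgarno (SD) motif and the start codon it positions.
# Assumptions (not from the lecture): SD consensus AGGAGG, spacer of 5-10 nt.

SD_CONSENSUS = "AGGAGG"  # pairs with a CCUCCU stretch near the 3' end of 16S rRNA


def reverse_complement(rna: str) -> str:
    """Return the RNA strand that would base-pair with `rna`, written 5' -> 3'."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna))


def find_sd_start(mrna: str, min_spacer: int = 5, max_spacer: int = 10):
    """Return (sd_index, aug_index) for the first SD motif followed by an AUG
    within the allowed spacer window, or None if no such pair exists."""
    sd_index = mrna.find(SD_CONSENSUS)
    while sd_index != -1:
        sd_end = sd_index + len(SD_CONSENSUS)
        for spacer in range(min_spacer, max_spacer + 1):
            aug_index = sd_end + spacer
            if mrna[aug_index:aug_index + 3] == "AUG":
                return sd_index, aug_index
        sd_index = mrna.find(SD_CONSENSUS, sd_index + 1)
    return None


toy_mrna = "GGCUAGGAGGUAACAUAUGGCUAAAUAA"  # hypothetical sequence
print(reverse_complement(SD_CONSENSUS))  # the rRNA stretch that pairs with the SD
print(find_sd_start(toy_mrna))
```

The AUG found this way is the one that ends up in the P site with fMet-tRNA; an AUG with no SD upstream would be read as an internal Met.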
Riboswitch inhibition → feedback inhibition
If an end product M is present → M binds the mRNA → loop formed at AUG → ribosome falls off → translation is OFF
Eukaryotic Translation Initiation
Many eIFs → no fMet
Eukaryotic initiator Met-tRNA has unique features
12 different eIFs
Eukaryotic translation steps
Small subunit complexes with eIF1, eIF1A, eIF3
Initiator Met-tRNA bound to eIF2 joins the complex, along with eIF5B-GTP
mRNA 5’ cap binds to eIF4F
mRNA-eIF4F binds to the preinitiation complex
The Kozak sequence (the sequence context surrounding the start AUG) establishes the reading frame
polysome formation involves 5’ cap and poly(A) tail
Initiator tRNA anticodon binds to the mRNA’s AUG through base pairing once scanning reaches it
Scans until AUG
Translation initiation complete
The location of each eIF is more important, though
eIF3 is not bound to any site (it sits on the opposite side of the ribosome from the A, P, and E sites)
eIF1 is bound to E site
eIF2 is bound EXCLUSIVELY to the Met-tRNA (does not touch the ribosome)
eIF1A is bound to the A site
eIF5B is bound to eIF1A which is bound to the A site
Met-tRNA is bound to the P site
eIF4F binds to the mRNA’s 5’ cap → which then binds to the ribosomal subunit (does not bind to a site)
eIF2 mutation leads to Huntington’s disease
Elongation
Elongation factors are needed for chain elongation
Steps
1. aminoacyl-tRNA binds to A site
2. a new peptide bond is formed and the tRNA moves over to the P site
3. ejection of spent tRNA from the E site
Ribosome catalytic base is an Adenine
prokaryotes:
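fMet-tRNA binds to the P site and EF-Tu-GTP binds with the second aminoacyl-tRNA at the A site → GTP hydrolysis → EF-Tu-GDP falls off and forms the EF-Ts/EF-Tu complex (EF-Ts swaps the GDP for GTP) → EF-Tu-GTP then binds another aminoacyl-tRNA to continue the process
Peptide bond is formed → aa chain is at the A site
EF-G-GTP binds the A site → GTP hydrolysis → translocation (the tRNA carrying the chain moves from the A site to the P site)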
Translation requires GTP hydrolysis
Puromycin binds to the A site (mimics the 3’ end of an aminoacyl-tRNA → premature chain termination)
Transcription is from 5’ to 3’; ribosomes also move 5’ → 3’ along the mRNA
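The elongation cycle above (delivery to the A site, peptide bond formation, move to the P site) can be sketched as a loop over codons. This is a simplified illustration: `elongate` is a hypothetical helper and the codon table is a tiny subset chosen for the example, not the full genetic code.

```python
# Sketch: follow the growing chain through repeated elongation cycles.
# Illustrative only: CODON_TABLE is a tiny subset of the genetic code,
# so any codon outside it raises KeyError.

CODON_TABLE = {"GCU": "Ala", "AAA": "Lys", "UUU": "Phe"}
STOP_CODONS = {"UAA", "UAG", "UGA"}


def elongate(mrna: str) -> list:
    """Read `mrna` 5' -> 3' in frame from its first codon and return the chain."""
    codons = [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]
    if not codons or codons[0] != "AUG":
        raise ValueError("reading frame must start with AUG")
    p_site = ["fMet"]  # initiator tRNA starts in the P site
    for codon in codons[1:]:
        if codon in STOP_CODONS:
            break  # stop codon in the A site: release factors take over
        a_site = [CODON_TABLE[codon]]  # 1. aminoacyl-tRNA delivered to the A site
        a_site = p_site + a_site       # 2. peptide bond: chain moves onto the A-site tRNA
        p_site = a_site                # 3. translocation A -> P; spent tRNA leaves via E
    return p_site


print("-".join(elongate("AUGGCUAAAUUUUAA")))
```

Each pass through the loop corresponds to one EF-Tu delivery and one EF-G translocation, which is why elongation costs GTP per amino acid added.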
Termination
Protein is made → stop codon reached
Stop codon enters the A site → binds a release factor (RF1 or RF2)
RF binds to the A SITE
GTP hydrolysis is involved in termination
RF hydrolyzes the bond between the polypeptide and the P-site tRNA → polypeptide released
RFs
RF1 recognizes UAG or UAA stop codon
RF2 recognizes UGA or UAA stop codon
RF3 - stimulates the rate of peptide release by RF1/RF2 but does not act independently
difference is not tested
eRF1 (human translation release factor)
looks like a tRNA w/ an L shape
eRF1, RF1, and RF2 all bind to the A site
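The bacterial stop codon assignments above fit in a small lookup table. A minimal sketch; the names `RELEASE_FACTORS` and `release_factors_for` are illustrative only.

```python
# Sketch: which bacterial release factor(s) recognize a codon in the A site.
# Mapping taken from the notes: RF1 reads UAG/UAA, RF2 reads UGA/UAA.

RELEASE_FACTORS = {
    "UAG": ["RF1"],
    "UAA": ["RF1", "RF2"],
    "UGA": ["RF2"],
}


def release_factors_for(codon: str) -> list:
    """Return the release factors for `codon`, or an empty list for a sense codon."""
    return RELEASE_FACTORS.get(codon.upper(), [])


for codon in ("UAG", "UAA", "UGA", "AUG"):
    print(codon, release_factors_for(codon))
```

UAA is the only stop codon read by both factors, which is an easy way to remember the table.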
tmRNA
compare it to tRNAAla
ribosome rescue pathway (method 1) → for ribosomes stalled on an mRNA with no stop codon
tmRNA binds to A site
Ala on tmRNA attacks the polypeptide chain
frees the ribosome from the polypeptide chain
Ribosome Recycling (method 2)
RRF binds !!!A site!!! (just pay attention to the A site basically w/ termination)
EF-G-GTP binds to A site
pushes RRF to P site
Then IF-3 comes in to the E site and kicks RRF and EF-G off
Lecture 7:
Translation by ribosome
Proteins differ from nucleic acids:
20 amino acids vs 4 nucleic acids
Large variety of functional groups
Accelerate a multitude of chemical reactions
well-defined tertiary structure (shapes)
4 nucleic acids ^ 3 codon slots → 64 combinations of nucleic acids
tRNA
Transfer RNA
Ribosome
Factory to make
How is code read?
Unpunctuated code → deletions of 3 nucleotides would restore the reading frame
The two wrong proposals were overlapping code and punctuated code
Deletions
1-2 frameshift mutation
3 nucleotides = removal of 1 aa
Insertion of 3 nts → insertion of 1 aa
Change of 1 nt results in a missense or nonsense mutation
Filter experiment
Synthetic mRNA that codes for certain amino acids added to solution
Filter binding assay
Filter contains mRNA and only the corresponding tRNA would attach to filter → aa that is radioactive would then be stuck on filter
There are variants to the genetic code
UGA = Trp, AUA = Met, AGA = Stop in mitochondria
Genetic code
Non-overlapping, no spacers
Almost universal
Highly degenerate = many aas are specified by two or more codons
Unambiguous = codons specify ONE aa
Crick’s adaptor hypothesis
tRNA
AA – Adaptor(tRNA) – Nucleic acid
tRNA cloverleaf secondary structure
Amino-acid arm - conserved CCA-OH at the 3’ end → attaches to AA
Anticodon arm - read antiparallel to mRNA
D-arm
Extra arm
TψC arm
7-15% of the nucleosides in tRNAs are modified
Post Transcriptionally modified
Adenosine at the 5’ position of the anticodon changes to I (Inosine) → can base pair with A, U, C → “Wobble theory” / third base wobble
tRNA ala found to bind to GCA, GCC, and GCU → Inosine pairs with the 3rd base of the codon
Inosine is highly ambiguous
Allows for more conservation of energy
32 tRNAs for 61 sense codons (3 are stop)
Non-canonical base pairing in one base provides weaker interaction between anticodon and codon → higher rate of protein synthesis
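The tRNA-Ala example can be sketched by pairing an anticodon against codons antiparallel. The inosine rule (pairs with U, C, A) is from the notes above; the other wobble entries are Crick's standard wobble rules and are an addition here, as are the function and variable names.

```python
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# 5' anticodon base -> codon 3rd-position bases it can pair with.
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "UCA"}

def codons_read(anticodon):
    """anticodon is written 5'->3'; returned codons are written 5'->3'.

    Pairing is antiparallel: anticodon base 3 pairs with codon base 1,
    base 2 with base 2, and the 5' (wobble) base with codon base 3.
    """
    wobble_base, middle, last = anticodon
    first = WATSON_CRICK[last]
    second = WATSON_CRICK[middle]
    return [first + second + third for third in WOBBLE[wobble_base]]

print(codons_read("IGC"))  # ['GCU', 'GCC', 'GCA'] -> the three Ala codons above
print(codons_read("CAU"))  # ['AUG'] -> no wobble with C at the 5' position
```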
tRNA folds into a L shaped 3D structure
Active sites are the anticodon and the amino acid attachment site, and they are maximally separated
The anticodon stem and acceptor stem form double helices
Aminoacyl-tRNA synthetase
Matches tRNA vs AA
AA attacks ATP → becomes adenylated
Aminoacyl-AMP formed → 2’ (or 3’) OH of tRNA attacks the carbonyl C, which releases AMP (which is a great leaving group)
How do aminoacyl-tRNA synthetases know they have the correct AA attached?
AA too large → does not fit
AA too small → fits → proofreading site within enzyme → removed through hydrolysis
Lecture 8:
Ribosome History
Who won the ribosome race?
Tom Steitz
Who first crystallized the ribosome?
Ada Yonath
Ribosome
4 binding sites
Large subunit
E - Exit
P - Peptidyl
A - Aminoacyl
Small subunit
mRNA binding site
2/3 of the ribosome is RNA and 1/3 is protein
Proteins are just there to stabilize the RNA
Translation
aa + ATP + tRNA → aa-tRNA + AMP + PPi (hydrolysis of ATP = loaded aminoacyl tRNA)
Bacterial initiation utilizes fMet (formyl-Methionine) and initiator tRNA
fMet is loaded by 2 enzymes → Methionyl-tRNA synthetase & Methionyl-tRNA formyltransferase
Difference between fMet and Met tRNA is that there is an A-C mismatch in the fMet tRNA, which provides a kink !!!DIFFERENCE IS IMPORTANT!!!
fMet-tRNA binds to AUG (loads to P site) → regular methionine would load to A site
Prokaryotic translation
3 initiation factors and GTP hydrolysis
30S subunit (small subunit)
IF-1 Bound to the amino site
IF-3 is not bound to the A or P site (binds the E site during initiation)
Order bound is:
IF-1 → IF-3 → mRNA → then IF-2 binds (explained below)
Shine-Dalgarno sequence (SD; the abbreviation is accepted, so it will probably be on the exam) is complementary to the 16S rRNA
16S rRNA is within the 30S subunit (small)
The 30S subunit is made up of proteins as well
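SD/16S rRNA complementarity can be sketched as a reverse-complement search. Assumptions, not from the lecture: the anti-SD stretch CCUCCU near the 3' end of the 16S rRNA, an exact-match search (real SD sequences vary), and the made-up mRNA below.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Sequence that base pairs antiparallel with rna, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))

ANTI_SD = "CCUCCU"                # assumed stretch of the 16S rRNA 3' end
SD = reverse_complement(ANTI_SD)  # what the mRNA must carry: AGGAGG

def find_sd_and_start(mrna):
    """Return (SD index, start codon index), or None if either is missing."""
    sd_pos = mrna.find(SD)
    if sd_pos == -1:
        return None
    # The start codon is the first AUG downstream of the SD sequence.
    start = mrna.find("AUG", sd_pos + len(SD))
    if start == -1:
        return None
    return sd_pos, start

mrna = "GGCAUAGGAGGUAACAUAUGGCUAAAUAA"  # hypothetical bacterial mRNA
print(SD)
print(find_sd_and_start(mrna))
```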
fMet-tRNA is accompanied by IF-2, which binds to IF-1 in the A site
So the order is: IF-1 binds to the A site and IF-3 binds to the E site → mRNA is fed through the ribosome and the SD sequence base pairs with the 16S rRNA → IF-2 binds to the IF-1 in the A site and carries the fMet-tRNA along with it.
After all this happens the 50S subunit binds to the 30S subunit making the 70S ribosome and knocks the IFs off the small subunit
Something to note here is that fMet still remains on the tRNA that now resides within the P site
To clarify once more:
IF-3 is not in the A or P site (it sits at the E site during initiation)
IF-1 is in the A site
IF-2 is on top of IF-1 (bc it binds later)
fMet-tRNA is in the P site (comes in with IF-2)
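The binding order can be written out as a small state walk-through. This is only a bookkeeping sketch of the notes above; the function name and the dict layout are made up.

```python
def assemble_initiation_complex():
    """Track what sits in each site of the 30S subunit, step by step."""
    sites = {"E": None, "P": None, "A": None}
    steps = []

    sites["A"] = "IF-1"                      # IF-1 binds the A site
    steps.append(("IF-1 binds", dict(sites)))

    sites["E"] = "IF-3"                      # IF-3 binds the E site
    steps.append(("IF-3 binds", dict(sites)))

    # mRNA binds: SD sequence pairs with the 16S rRNA (no site change)
    steps.append(("mRNA binds", dict(sites)))

    sites["A"] = "IF-1 + IF-2"               # IF-2 sits on top of IF-1
    sites["P"] = "fMet-tRNA"                 # and brings fMet-tRNA to the P site
    steps.append(("IF-2 + fMet-tRNA bind", dict(sites)))

    sites["A"] = None                        # 50S joins, IFs are knocked off
    sites["E"] = None
    steps.append(("50S joins", dict(sites)))
    return steps

steps = assemble_initiation_complex()
for name, state in steps:
    print(name, state)
```

After the last step only the fMet-tRNA in the P site remains, which leaves the A site open for the second aminoacyl-tRNA.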
Riboswitch inhibition → feedback inhibition
If an end product M is present → loop formed at AUG → ribosome falls off → translation is OFF
Eukaryotic Translation Initiation
Many eIFs → no fMet
Eukaryotic initiator Met-tRNA has unique features
12 different IFs
Eukaryotic translation steps
Small subunit complexes with eIF1, eIF1A, eIF3
Initiator Met-tRNA w/ eIF2 binds to the complex w/ eIF5B-GTP
mRNA 5’ cap binds to eIF4F
mRNA-eIF4F binds to the preinitiation complex
The Kozak Sequence establishes the reading frame → binds to eIF4F
polysome formation involves 5’ cap and poly(A) tail
rRNA binds to mRNA through base pairing
Scans till AUG
Translation Initiation complete
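"Scans till AUG" can be sketched as a search from the 5' end. The Kozak check is an addition not in the notes: it uses the commonly cited consensus (purine at -3, G at +4 relative to the A of AUG). The mRNA and function names are hypothetical.

```python
def scan_for_start(mrna):
    """Index of the first AUG when scanning from the 5' end, or -1."""
    return mrna.find("AUG")

def kozak_context(mrna, start):
    """Check the two key Kozak positions around the AUG at index start."""
    # -3 position: three bases before the A of AUG should be a purine (A/G).
    purine_at_minus_3 = start >= 3 and mrna[start - 3] in "AG"
    # +4 position: the base right after AUG should be G.
    g_at_plus_4 = start + 3 < len(mrna) and mrna[start + 3] == "G"
    return purine_at_minus_3, g_at_plus_4

mrna = "GCGCCACCAUGGCUUAA"  # hypothetical 5' end of a eukaryotic mRNA
start = scan_for_start(mrna)
print(start, kozak_context(mrna, start))
```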
Locations of the eIFs are more important though
eIF3 is bound to nowhere (on the opposite side of the ribosome of the APE site)
eIF1 is bound to E site
eIF2 is bound EXCLUSIVELY to the Met-tRNA (does not touch the ribosome)
eIF1A is bound to the A site
eIF5B is bound to eIF1A which is bound to the A site
Met-tRNA is bound to the P site
eIF4F binds to the mRNA’s 5’ cap → which then binds to the ribosomal subunit (does not bind to a site)
eIF2 mutation leads to Huntington’s disease
Elongation
Elongation factors are needed for chain elongation
Steps
1. aminoacyl-tRNA binds to A site
2. a new peptide bond is formed and the tRNA moves over to the P site
3. ejection of spent tRNA from the E site
Ribosome catalytic base is an Adenine
prokaryotes:
fMet-tRNA binds to the P site and GTP-Tu (EF-Tu-GTP) binds with the second aminoacyl-tRNA at the A site → GTP hydrolysis → Tu falls off and forms Ts-Tu complex → GTP then binds to another Tu to continue the process
Peptide bond is formed → aa chain is at the A site
EF-G-GTP binds the A site
Translation requires GTP hydrolysis
Puromycin binds to A-site
Transcription is from 5’ to 3’, and ribosomes also move 5’ → 3’ along the mRNA
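The elongation cycle can be sketched as tRNAs moving through the A, P, and E sites. In this sketch the peptide bond forms while the new tRNA is still in the A site (so the aa chain is at the A site), then translocation moves it to P. The tiny decode table, codons, and names are made up for illustration, and the elongation factors are not modeled.

```python
def elongate(codons, decode):
    """Run one elongation cycle per codon; return (peptide, ejected tRNAs)."""
    # After initiation, fMet-tRNA sits in the P site carrying the chain.
    e_site, p_site, a_site = None, ("tRNA-fMet", ["fMet"]), None
    ejected = []
    for codon in codons:
        aa = decode[codon]
        # 1. aminoacyl-tRNA binds the A site
        a_site = ("tRNA-" + aa, [aa])
        # 2. peptide bond: the chain moves from the P-site tRNA onto the A-site tRNA
        a_site = (a_site[0], p_site[1] + a_site[1])
        p_site = (p_site[0], [])
        #    translocation: A -> P, P -> E
        e_site, p_site, a_site = p_site, a_site, None
        # 3. spent tRNA is ejected from the E site
        ejected.append(e_site[0])
        e_site = None
    return p_site[1], ejected

decode = {"GCU": "Ala", "AAA": "Lys"}  # hypothetical mini codon table
peptide, ejected = elongate(["GCU", "AAA"], decode)
print(peptide)   # ['fMet', 'Ala', 'Lys']
print(ejected)   # ['tRNA-fMet', 'tRNA-Ala']
```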
Termination
Protein is made → Stop codon
A site scans for stop codon → binds RF2 (release factor)
binds to A SITE
GTP hydrolysis is involved in termination
hydrolyzes polypeptide
RFs
RF1 recognizes UAG or UAA stop codon
RF2 recognizes UGA or UAA stop codon
RF3 - stimulates the rate of peptide release by RF1/RF2 but does not act independently
difference is not tested
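Which release factor reads which stop codon can be kept as a simple lookup. A minimal sketch of the RF1/RF2 rules above; the names are illustrative.

```python
# Stop codon -> bacterial release factors that recognize it.
STOP_CODON_RFS = {
    "UAG": {"RF1"},
    "UGA": {"RF2"},
    "UAA": {"RF1", "RF2"},  # read by both
}

def release_factors_for(codon):
    """Return the sorted RFs for a codon; empty list for a sense codon."""
    return sorted(STOP_CODON_RFS.get(codon, set()))

for codon in ["UAG", "UGA", "UAA", "AUG"]:
    print(codon, release_factors_for(codon))
```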
eRF1 (human translation release factor)
looks like a tRNA w/ an L shape
eRF1, RF1, and RF2 all bind to the A site
tmRNA
compare it to tRNAAla
ribosome rescue pathway (method 1)
tmRNA binds to A site
Ala on tmRNA attacks the polypeptide chain
frees the ribosome from the polypeptide chain
Ribosome Recycling (method 2)
RRF binds !!!A site!!! (just pay attention to A site basically w/ termination)
EF-G-GTP binds to A site
pushes RRF to P site
Then IF-3 comes in to the E site and kicks RRF and EF-G off