Oct 29th, 5th rick roy lecture
Cis-Regulatory Elements
Sequences that control gene transcription
Act on genes located on the same double-stranded DNA molecule as the element itself
Linker-scanning mutational analysis
Technique used to locate cis-regulatory sites by systematically mutating short segments of a control region
A reporter gene reads out the effect of each mutation, which identifies the important sequences

What is EMSA? What is it based on?
Electrophoretic Mobility Shift Assays
Based on mixing a sample that contains proteins with a specific DNA fragment to see whether any protein in the sample binds the DNA sequence provided.
How do we label the DNA substrate so that we can see it in EMSA? What forms?
Make a radiolabelled double-stranded DNA probe that corresponds to the sequence you're interested in
For example, take 3 subregions that were found earlier by linker scanning (we know their sequence) and design PCR primers to generate a double-stranded DNA probe
Combine the probe with a mixture of proteins to see if any will bind
a. These proteins can be obtained from a nuclear extract (all proteins present within the nucleus, so DNA-binding transcription factors should be present as well)
b. If a protein in the extract binds the double-stranded probe, a stable DNA-protein complex is formed

Why do we run the reaction mixture of the probe and nuclear extract on a non-denaturing polyacrylamide gel? How does the run look?
Because we do not want the association between the protein and the DNA sequence to be disrupted
Free DNA moves faster down the gel
DNA bound to protein moves slower, causing a mobility shift (the complex is larger than the free probe, so it migrates differently)
No shift means no binding occurred
A shift tells us that the cis-acting regulatory element we identified is bound by proteins in the sample (ex. nuclear extract mixed with substrate)

Reading the EMSA (ON lane, free probe & bound probe): what does it tell us?
ON lane is the positive control - proteins present in the nucleus that interact with the DNA regulatory element
Free probe - most of the substrate doesn't bind to anything
Bound probe - mobility shift in lanes 7, 8 & 9
The same shift size across lanes suggests it's the same protein interacting with the DNA

How do we then identify the proteins that bound to the cis-regulatory elements in EMSA?
Proteins in the nuclear extract that interact with the DNA substrate (probe) are eluted
These are the proteins responsible for binding to the CIS regulatory element
The proteins are then further purified/isolated
What are In Cell Assays used for and when?
Used once you have identified proteins that you THINK contribute to the activation of transcription to make sure they actually do
How do we do In Cell Assays?
Clone the cDNA sequence encoding the protein (which may be a DNA-binding transcription factor)
Insert it into an expression vector
- Once the vector gets into the cell, the gene will be expressed and the protein will be translated and functional
Clone the cis element into another vector
This will be the region we THINK the protein binds to, based on the gel shift analysis
It will drive the transcription of the reporter gene (it sits upstream in a promoter region that contains a TATA box, etc)
Co-transfect both vectors into a cell
If the protein identified interacts/binds with the CIS regulatory element within the cell, it should activate the transcription of the reporter

How do DNA binding transcriptional factors work?
They contain alpha-helical regions called recognition helices
These interact with the major groove of DNA (non-covalent binding)

What does it mean when something has a “Modular Structure”?
Multiple distinct domains that perform different functions
Most transcription factors are modular
Modular Structure Example (GAL4)
GAL4 is a DNA-binding transcription factor that activates galactose metabolism genes in yeast
It binds to specific upstream activating sequences (UAS), and when it binds, it turns on transcription from that promoter (in this experiment, a lacZ reporter)

How did we discover that transcription factors had modular structures?
Deletion mutations were made at the N- and C-termini of the GAL4 protein, creating many variants
What happened when we mutated the N-terminus in GAL4?
When the very end of the N-terminal region of the GAL4 DNA-binding transcription factor was removed, it lost all binding to the UAS. This suggests that the DNA-binding region is at the very end of the N-terminus (first 50 amino acids)
If GAL4 doesn't bind to the UAS, transcription from the promoter does not occur and there is no beta-galactosidase activity.
What happened when we mutated the C-terminus in GAL4?
If you eliminate a specific sequence at the C-terminus, you no longer have transcriptional activation, BUT you can still bind to DNA (in this case, the UAS). Still, there is no beta galactosidase activity without the transcription of the reporter gene
This suggests there is a completely different function in that transcription factor that is independent of DNA binding
What happens when internal regions of GAL4 are removed?
You can still activate transcription pretty effectively!
This suggests that as long as you can bring GAL4 to the UAS and you have the necessary region at the C-terminus, you will activate transcription!
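
The logic of the GAL4 deletion experiments can be written out as a small truth table. This is only a sketch for studying: the class and function names are made up for illustration, and it assumes the reporter is on whenever both the N-terminal DNA-binding region and the C-terminal activation region are present, as the cards above describe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gal4Variant:
    """Hypothetical model of a GAL4 deletion mutant (not real data)."""
    has_dna_binding_domain: bool      # N-terminal region
    has_activation_domain: bool       # C-terminal region
    has_internal_region: bool = True  # region between the two domains


def binds_uas(variant: Gal4Variant) -> bool:
    # Binding to the UAS depends only on the N-terminal DNA-binding domain.
    return variant.has_dna_binding_domain


def reporter_active(variant: Gal4Variant) -> bool:
    # Assumption from the notes: activation needs UAS binding plus the
    # C-terminal activation domain; the internal region is dispensable.
    return binds_uas(variant) and variant.has_activation_domain


variants = {
    "full length": Gal4Variant(True, True),
    "N-terminal deletion": Gal4Variant(False, True),
    "C-terminal deletion": Gal4Variant(True, False),
    "internal deletion": Gal4Variant(True, True, has_internal_region=False),
}

for name, variant in variants.items():
    print(f"{name}: binds UAS={binds_uas(variant)}, "
          f"reporter active={reporter_active(variant)}")
```

Only the full-length and internal-deletion variants come out with an active reporter, which matches the conclusion that DNA binding and activation are separate functions.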

Transcription factors are modular proteins that have distinct domains. 3 main domains:
DNA binding domain: recognizes specific DNA sequences
Activation domain: interacts with other proteins to activate transcription
Flexible Linker region: connection between domains
Other domains can handle: transcription repression, chromatin remodeling, nuclear import and protein interaction

What was found when you introduced a UAS into a mammalian cell? (modular nature across organisms)
If you introduce a UAS with a reporter gene into a mammalian cell system, you can take a mammalian transcription factor, attach a GAL4 DNA-binding domain, transfect that fusion into the mammalian cells and activate the reporter driven by the UAS
Because the domains are modular, they can be swapped across species and activation still works
Homeodomain Protein Class
Contains alpha helices that bind to the major groove in DNA
Specifies positional information during development (ex. puts everything where it should be - legs go where legs go, etc)
Highly conserved across organisms
Mutations cause body part transformations (Antennapedia - legs where antennae should be)

Zinc Finger DNA Protein Class
Interact with DNA through a specific domain that forms a finger-like structure (protrusion)
Finger structure is generated by an interaction between zinc and cysteine/histidine residues
Ex. C2H2 → 2 cysteines, 2 histidines
C4 → 4 cysteines; 2 finger units which bind to DNA as homo- or heterodimers (ex. steroid hormone receptors)
C6 → 6 cysteines bind to 2 zinc ions (like GAL4)

Leucine Zipper Proteins
Have leucine-rich hydrophobic domains (bZIP proteins)
Bind to the major groove of DNA as homo- or heterodimers
Every seventh position in the C-terminal region contains leucine or another hydrophobic amino acid; these residues let two helices form a coiled coil, which is required for dimerization (2 proteins joining together)
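
The "every seventh position" rule can be checked on a sequence with a few lines of code. This is a rough sketch, not a real motif finder: the hydrophobic residue set and both toy sequences are assumptions made up for illustration, and the function only tests one fixed starting position.

```python
# Assumed set of hydrophobic one-letter amino acid codes (illustrative only).
HYDROPHOBIC = set("LIVMFAW")


def heptad_residues(seq: str, start: int = 0) -> str:
    """Return the residues found at every seventh position from `start`."""
    return seq[start::7]


def has_heptad_repeat(seq: str, start: int = 0, min_repeats: int = 4) -> bool:
    """True if at least `min_repeats` heptad positions are all hydrophobic."""
    residues = heptad_residues(seq, start)
    return len(residues) >= min_repeats and all(r in HYDROPHOBIC for r in residues)


# Toy sequences, not real proteins: leucine vs glycine at each heptad position.
toy_zipper = "LEKQARS" * 4
toy_control = "GEKQARS" * 4

print(heptad_residues(toy_zipper), has_heptad_repeat(toy_zipper))
print(heptad_residues(toy_control), has_heptad_repeat(toy_control))
```

The toy zipper passes because a leucine sits at every seventh position, while the control fails because glycine is not in the assumed hydrophobic set.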

Helix-Loop-Helix Proteins
Similar to bZIP/Leucine Zippers but have a loop separating the two helices
In the C-terminal part of the DNA-binding domain, the amino acids are arranged so that hydrophobic ones appear at just the right spacing to form an amphipathic alpha helix.

Cooperative binding
When two DNA-binding transcription factors bind together, the complex is more stable than when just one binds, which increases transcriptional efficiency
Cooperative binding example
NFAT and AP1
2 transcription factors that individually activate transcription of IL-2 very poorly
But when you put them together, they bind to each other and stabilize the formation of a complex that activates transcription very efficiently

Diversified Gene Regulation
The combination of transcription factor binding sites in promoters leads to a diversity of transcriptional responses
Achieved through the ability of factors to form homo- and heterodimers (many possible combinations)
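
To see how quickly "many possible combinations" grows, the dimers can be counted directly. This is a simplified sketch: the factor names are placeholders, and it assumes every pair of monomers can dimerize, which real transcription factors do not necessarily do.

```python
from itertools import combinations_with_replacement


def possible_dimers(monomers):
    """List every unordered pair of monomers, including a monomer with itself."""
    return list(combinations_with_replacement(monomers, 2))


# Placeholder names for three hypothetical transcription factors.
factors = ["A", "B", "C"]
dimers = possible_dimers(factors)

homodimers = [d for d in dimers if d[0] == d[1]]
heterodimers = [d for d in dimers if d[0] != d[1]]

print(f"{len(factors)} monomers -> {len(dimers)} possible dimers")
print("homodimers:", homodimers)
print("heterodimers:", heterodimers)
```

With n monomers this gives n homodimers plus n(n-1)/2 heterodimers, so 3 factors already allow 6 different dimers under this assumption.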

Chromatin Immunoprecipitation (ChIP-seq)
Instead of just looking for a protein, you are looking for a protein bound to DNA
You take the chromatin (the DNA and its chromatin coat) that you purified from nuclei, then chop it up into small bits
Incubate that mixture with an antibody that recognizes the protein you are interested in
The antibody will recognize that protein and, if it's bound to a DNA sequence in the context of chromatin, will immunoprecipitate the whole complex
Can identify which sequences are being bound
How can you determine which genes are bound using Chromatin Immunoprecipitation (ChIP-seq)?
Caudal is an example
Through ChIP-seq we can identify which sequences are being bound
Cdx2 (a Caudal homolog) interacts with all the sites in the diagram
Many of them are homeobox genes (ex. Hoxc5)
Caudal affects the expression of other Hox genes
We can use statistics to understand the sequences bound by Caudal

ChIP followed by PCR to figure out if a protein is binding to a specific sequence
We want to know if protein X is binding to a region in a gene
A ChIP is performed, followed by a PCR with primers for that region to test if the sequence was precipitated
This tells us if the protein interacted with the sequence or not