When we finish talking about prokaryotes. So, okay, this whole process of transcription is carried out by a single enzyme called RNA polymerase. Very similar to DNA polymerase. Remember DNA polymerase built DNA. Well, RNA polymerase is going to build RNA. It makes sense. And again, my figures are going to show RNA polymerase as a little oval blob or whatever, but nope, it's a super complicated protein with a massive quaternary structure. Whatever. But yeah, this is RNA polymerase.
So, we could talk about the process of transcription as carried out by RNA polymerase as three distinct phases:
Initiation: Initiation begins when the RNA polymerase, yep, here's the orangish tan blob, binds to the DNA. Not just anywhere on the DNA. The RNA polymerase is going to bind to a specific region of the DNA called the promoter or promoter region. The promoter is right before where you're going to start transcription. The gene that you are transcribing is going to be right after this promoter. The whole job of the promoter is to tell RNA polymerase where it's supposed to bind so it can start things. You don't want to start this gene in the middle. You don't want to start three quarters of the way through. You want to start at the beginning. And the only way RNA pol knows where the beginning is is if it has some sort of indication of that, and that's the promoter. You'll notice in this figure, the TTGACG and whatever, don't memorize these exact sequences, but it is important to note that the promoter does have a very specific sequence. This RNA polymerase knows what a promoter is and where the promoter is and how to bind to it because of the sequence that the promoter has. If the promoter did not have the correct sequence, RNA polymerase would not bind to it. So that's an important part of this. So yes, initiation. RNA polymerase binds to the promoter. Promoter is defined in the key terms as a region of DNA with a specific sequence. Again, not a very long definition, but it is a very specific sequence. Importantly, the promoter is upstream, or right before, where transcription will begin.
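If it helps to see the promoter idea as code, here is a minimal sketch. It is a toy model, not how a real RNA polymerase recognizes a promoter: it assumes the promoter is one exact string and that transcription starts on the very next base. The function name find_transcription_start and the DNA strings are made up for illustration, and TTGACG is just the sequence read off the figure.

```python
from typing import Optional

def find_transcription_start(dna: str, promoter: str) -> Optional[int]:
    """Return the index of the first base after the promoter, or None.

    Toy assumption: the promoter must match exactly, and transcription
    begins immediately downstream of it.
    """
    position = dna.find(promoter)
    if position == -1:
        # No correct promoter sequence, so nothing for the polymerase to bind.
        return None
    return position + len(promoter)

# Hypothetical stretch of DNA: some junk, the promoter, then the gene.
dna = "CCATTGACGATGCTAAG"
start = find_transcription_start(dna, "TTGACG")
print(start, dna[start:])  # 9 ATGCTAAG

# Same DNA with one promoter base changed: the promoter is not recognized.
mutated = "CCATTGTCGATGCTAAG"
print(find_transcription_start(mutated, "TTGACG"))  # None
```

The second call is the point of the example: change the promoter sequence and there is no start site at all.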
Comparison to DNA Replication: Now, in some ways, we can compare the process of transcription to the process of DNA replication. In DNA replication, we needed to create a single-stranded DNA template so we could build DNA complementary to the single-stranded DNA, and we used helicase to do that. We're doing a similar thing here in transcription. We need single-stranded DNA in order to build RNA complementary to it. The difference is we are not going to be transcribing the entire chromosome. So we do not need to unzip the entire chromosome like we did with DNA replication. Some unzipping needs to occur, but not this sort of large-scale thing. The unzipping is going to happen in what is called a transcription bubble. So here we can, this is a good figure to summarize the whole process of transcription. Here we've got our DNA double helix, DNA double helix. Here's RNA polymerase, the orange bean or the purple bean right here. And yeah, you can see in this region right here, it has been unzipped. There is single-stranded DNA, and there is single-stranded DNA. This little bubble where it's been unzipped is called, appropriately, the transcription bubble. So we're still in initiation. RNA polymerase binds to the promoter, and it forms the transcription bubble. The key terms define a transcription bubble as a region of locally unwound DNA that allows for transcription. Again, it's unwound, it's unzipped, it's single-stranded, but it's just this local region right here.
Elongation: Okay, now that we've started things with initiation, our second phase is called elongation. This is where we're going to actually start building RNA. The way that RNA polymerase is going to build RNA is in the 5' to 3' direction. Oh no, not this again. So yes, getting flashbacks to the last chapter, last exam maybe. This enzyme, RNA polymerase, has the same limitation that DNA polymerase had in that it is only able to construct its thing in the 5' to 3' direction. We can see this here. We've labeled the strands of DNA, the RNA, and yes, this arrow is showing the direction of unwinding, the direction of this transcription bubble. The promoter must have been up here somewhere. The bubble is moving along, and here is the RNA being constructed 5' to 3' in that direction.
Now, in the last chapter, this became a nightmare for us because of the leading strand, lagging strand stuff. Luckily for us, if we look at this figure, essentially this is just quote-unquote leading strand, 5' to 3', toward the site of unwinding, toward where the transcription bubble is moving. There is another template available here. There's more single-stranded DNA available here. It is not transcribed. So there is no lagging strand. In DNA replication, we had to replicate both; in transcription, we're just transcribing one of these. So crisis averted. This turns out not to be that complicated, but it is still important to note that it is building in that direction.
Now, there are two very important terms when it comes to talking about the DNA as it is being read by RNA polymerase to build RNA. You can see them labeled here, coding strand and template strand, but what do those mean? Well, the RNA sequence that is being built is complementary to the template strand of DNA. So real quick side note, because we are building an RNA-DNA hybrid here. If you look here, this is RNA-DNA. Yeah, this is double-stranded DNA-RNA. They're hydrogen bonding with one another. Yeah, double-stranded DNA-RNA hybrid.
The rules of base pairing, when it comes to DNA base pairing with RNA, are:
A C in the DNA will pair with a G in the RNA.
A G in the DNA will pair with a C in the RNA.
A T in the DNA pairs with an A in the RNA.
An A in the DNA pairs with, oh, not a T in the RNA. Remember, RNA does not have thymine. It has uracil instead. An A in the DNA pairs with a U in the RNA.
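The four rules above fit in a small lookup table. This is just a sketch of the rules as listed; the names DNA_TO_RNA and pair_with_rna are made up for illustration.

```python
# DNA base on the template -> RNA base that pairs with it (the four rules above).
DNA_TO_RNA = {"C": "G", "G": "C", "T": "A", "A": "U"}

def pair_with_rna(template: str) -> str:
    """Return the RNA bases that pair, position by position, with a DNA template."""
    return "".join(DNA_TO_RNA[base] for base in template)

print(pair_with_rna("CGTA"))  # GCAU
```

Notice there is no T anywhere on the RNA side of the table. That is the uracil rule.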
So this is the one little difference between the rules of base pairing we learned for DNA to DNA in the last chapter. Again, I know I introduced this way earlier in the quarter, but I'm bringing it up again. RNA uses uracil instead of thymine. So as we're constructing this RNA, it's going to partner up with the DNA according to these rules of complementarity. So, like I said, the RNA sequence that we build is going to be complementary to the template strand of DNA. Here is the template strand, the one down here, and you can see it's being built complementary to this. The other strand is called the coding strand. And the RNA sequence that we are going to build, the piece of RNA that we are building, is going to be the same as the coding strand.
Now, these two statements might be a little confusing, so I want to give an example. Do not memorize these exact sequences. They're just examples. Let's say we have a coding strand up here, and this is the sequence: A-T-G-C-T-A-A-G. Whatever, I just made it up. We know what the template strand sequence is going to be like. The template strand is complementary to the coding strand. These are just two strands of DNA. DNA is always complementary to itself in its two strands. A pairs with T, T pairs with A, and then C, G, A, T, T, C. So the template reads T-A-C-G-A-T-T-C. Easy, it's DNA to DNA. So what about the RNA? Well, as I said, the RNA sequence that we build is going to be complementary to the template strand. So reading the template, again, it's called the template. It's what we read. It's what's used. T in the template is going to pair with A in the RNA. A in the template is going to partner with U in the RNA. C to G, G to C, A to U, T to A, T to A, C to G. So the RNA reads A-U-G-C-U-A-A-G. This is what I mean when I say the RNA sequence that we build is complementary to the template strand of DNA.
But what about the other statement here? The RNA sequence is the same as the coding strand? Remember the coding strand is the one that we did not read off of. The coding strand is the one that's just sitting here not being touched at all. Well, let's compare the RNA sequence to the coding strand sequence. They're not complementary, they are the same. A, A, the same. U, T, okay, they're not exactly the same. Remember DNA uses thymine and RNA uses uracil instead. But they are equivalent, they are the same. G, G, C, C, U, T, again that thing again, A, A, A, A, G, G. It's the same sequence. The T's and U's are different, but other than that the coding strand is the same as the RNA strand.
So again, this is a great example to return to if you get lost in these statements. I'm trying to make these as pithy and succinct as I can. But these are the two important statements here to understand what these two terms mean, coding strand and template strand:
The RNA sequence is complementary to the template.
The RNA sequence is the same as the coding strand of DNA.
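Both statements can be checked against the made-up example sequence from above. This is only a sketch: the strands are written lined up position by position, the way they are read out in the example, and the variable names are made up for illustration.

```python
# DNA-to-DNA pairing, and template-DNA-to-RNA pairing (U instead of T).
DNA_PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}
RNA_PAIR = {"A": "U", "T": "A", "C": "G", "G": "C"}

coding = "ATGCTAAG"  # the made-up coding strand from the example

# The template strand is complementary to the coding strand (DNA to DNA).
template = "".join(DNA_PAIR[base] for base in coding)

# The RNA is built complementary to the template strand.
rna = "".join(RNA_PAIR[base] for base in template)

print(template)  # TACGATTC
print(rna)       # AUGCUAAG

# Statement 2: the RNA is the same as the coding strand, with U in place of T.
print(rna == coding.replace("T", "U"))  # True
```

The code never reads the coding strand to build the RNA. It only reads the template, and the match with the coding strand falls out on its own.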
Okay, now the other thing that's happening here is as we are building RNA complementary to the template, the transcription bubble is moving along. It is going to unzip more of this DNA, and it's going to zip back up again as the template, as the transcription bubble sort of passes by. So yeah, elongation. The DNA rewinds as the transcription bubble passes by. And this is going to continue on until we make it to our third phase called termination, where things are going to end.
There are a couple of different ways that this can happen. Let's start by talking about something called rho-dependent termination. So I didn't show it earlier because I was trying to keep things simple, but in a lot of transcription, there's an extra protein called rho factor that binds to the RNA early on, and it moves along in the same direction as the transcription bubble. The rho factor is sort of like chasing down the RNA polymerase, chasing down this transcription bubble. But it's not going to catch the RNA polymerase because they're moving at about the same speed.
However, if this RNA polymerase hits a bunch of G's in a row, it's just something about guanine in specific and having a bunch of those at the same time, G, G, G, G, G, G, G, hitting a bunch of G's all together is going to cause RNA polymerase to stall a little bit, to slow down a little bit. That is going to allow the rho factor protein to catch up to it, and that is going to terminate the whole thing. It's going to cause the RNA to be released. It's going to cause the RNA polymerase to release. The DNA is going to zip back up again. Everything is going to be done. You're finished. And it always terminates at the same place because this is going to happen right after RNA polymerase hits a G, G, G, G, G, G sequence in the DNA.
So to summarize, in termination:
The rho protein follows behind the transcription bubble.
At a sequence of many G's in a row, the RNA polymerase stalls, rho catches up to it, and ends transcription.
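As a rough sketch of why this kind of termination always lands in the same place, you can treat it as finding the first run of G's. This is a toy model only: the run length of 6, the DNA string, and the name termination_point are assumptions for illustration. Nothing here models rho itself or the stalling.

```python
from typing import Optional

def termination_point(dna: str, run_length: int = 6) -> Optional[int]:
    """Return the index just past the first run of G's, or None if there is none.

    Toy assumption: transcription ends right after the polymerase
    reads through `run_length` G's in a row.
    """
    position = dna.find("G" * run_length)
    if position == -1:
        return None  # no G run, so no stall in this toy model
    return position + run_length

dna = "ATGCTAACGGGGGGTTCA"  # hypothetical sequence with a run of six G's
end = termination_point(dna)
print(end, dna[:end])  # 14 ATGCTAACGGGGGG
```

Run it as many times as you like on the same DNA and you get the same end point, which is the "always terminates at the same place" idea.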
I mentioned there were two common ways in which transcription can be ended. Rho was just one of these. The other way that transcription can end is what is called rho-independent termination. This does not involve the rho protein. This involves something called an RNA hairpin. So we're talking about, yeah, if you don't have long hair, maybe you're not familiar with these. This is a type of hairpin. And yeah, this is named after this hair accessory. Because yeah, this is RNA and then a loop and then more RNA. So RNA is not normally double-stranded. But if you have a sequence of RNA that is CGGGCGA and then a short while later a sequence of RNA that is exactly complementary to that, yeah, RNA will form a double-stranded structure if it has this kind of opportunity. And as it turns out, the formation of this kind of hairpin in the RNA is enough to yank the RNA out of, you know, RNA polymerase is not shown here, but we see the little transcription bubble. So RNA polymerase would be here building this RNA. The creation of this RNA hairpin is going to be enough to yank this out of RNA polymerase and cause the whole thing to terminate and everything releases and you have your RNA.
So still under termination, a hairpin structure forms in the RNA that causes transcription to end. That's all I want to say about rho-independent termination. You get a hairpin in the RNA that makes transcription stop. Both of these, rho and the hairpin, are ways that we can cause termination during transcription.
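Here is a small sketch of what "a sequence and then its complement a short while later" looks like in an actual RNA string. It leans on one detail: because the strand folds back on itself, the second half has to be the complement read in reverse. The function names, the minimum loop size of 3, and the example RNA are all assumptions for illustration. Only the CGGGCGA stem comes from the example above.

```python
RNA_PAIR = {"A": "U", "U": "A", "C": "G", "G": "C"}

def reverse_complement(rna: str) -> str:
    """Complement each base, reading the strand backwards."""
    return "".join(RNA_PAIR[base] for base in reversed(rna))

def has_hairpin(rna: str, stem: str, min_loop: int = 3) -> bool:
    """True if `stem` is followed, after a loop, by its reverse complement."""
    start = rna.find(stem)
    if start == -1:
        return False
    # The partner sequence must come after the stem plus a small loop.
    search_from = start + len(stem) + min_loop
    return rna.find(reverse_complement(stem), search_from) != -1

stem = "CGGGCGA"
print(reverse_complement(stem))  # UCGCCCG

# Hypothetical transcript: stem, a 4-base loop, then the partner sequence.
rna = "AA" + stem + "GAAA" + "UCGCCCG" + "UU"
print(has_hairpin(rna, stem))  # True
print(has_hairpin("AACGGGCGAGAAAUUUUUUUUU", stem))  # False
```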
What about us? What about eukaryotes? I told you I would do this compare and contrast because technically all of this was prokaryotes. So what about eukaryotic transcription? Luckily for us, it's fundamentally the same as prokaryotes. You have the same three stages. We've got a promoter, we've got a transcription bubble. The basics of everything I just talked about are the same. There are a few small differences though. One of them is that eukaryotes, more complicated as always, actually have three different versions of the RNA polymerase enzyme:
RNA pol 1
RNA pol 2
RNA pol 3
Don't memorize this table, but it's just kind of showing you that these different polymerases build different types of things and are active in different parts of the cell. And yeah, just have slightly different jobs. Prokaryotes only have a single RNA polymerase, but eukaryotes, we've got different ones that do different things. So yeah, fundamentally the same. But, as I list the differences:
Three different RNA polymerases
Four different types of RNA
The other major difference has to do with processing. So, here's an example of a sort of raw transcript from a eukaryote. You have the beginning, the middle, the end. Within this piece of RNA, there are a couple of different types of region. There are exons and introns. So this is the raw product of transcription. This is what we build straight fresh out of RNA polymerase. Before this transcript is ready to go on to translation and do its job, to be a blueprint, to tell the cell how to build a protein, you need to cut out the introns, remove the introns, and glue together the exons. You'll notice how, yeah, this is our raw transcript. This is our final product. The introns have been removed. The exons have been put together. And again, this is a eukaryote thing. Prokaryotes don't do this nonsense. Only we do.
Introns are spliced out, and exons are kept in. Introns are defined in the key terms as an intervening sequence. That's what int is supposed to be short for, intervening. You could think of it as in the way, because ultimately these things need to be removed. Introns, intervening sequence that is spliced out of RNA during processing. That's what the key terms say. And exons are kept in. The key terms definition of exon is an expressed sequence. That's what the X is supposed to be short for, expressed. An expressed sequence that is kept in the RNA after processing. So, introns removed, exons kept in.
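Splicing is easy to picture as code: drop the introns, glue the exons together in order. This is only a sketch. The sequences are made up, the labels are given to us rather than discovered, and the name splice is hypothetical. A real cell has to find the intron boundaries itself, which is not modeled here.

```python
def splice(raw_transcript):
    """Remove introns and join the exons in their original order."""
    return "".join(seq for kind, seq in raw_transcript if kind == "exon")

# Hypothetical raw transcript, already labeled as exon and intron regions.
raw_transcript = [
    ("exon", "AUGGCU"),
    ("intron", "GUAAGUUUCAG"),
    ("exon", "GAUCCA"),
    ("intron", "GUAUGAAACAG"),
    ("exon", "UUGUAA"),
]

print(splice(raw_transcript))  # AUGGCUGAUCCAUUGUAA
```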
Why, though? This seems like a tremendous and stupid waste. Why do the introns even exist in the DNA if you're just going to remove them? What's the purpose? Well, the answer to why these things exist is, I think, unfortunately, or maybe fortunately, well outside the scope of an introductory biology class. There are fascinating examples of introns being left in and some of them being taken out in order to get multiple different RNAs out of a single starting RNA. There are examples of changing the order of exons and how they're spliced back together in order to get multiple proteins out of a single gene. None of that is worth going into in this course.
So yeah, I'm sorry that I'm not really answering the question of why this happens. But for here, you need to know that this is something eukaryotes do that prokaryotes do not do. We've got introns that we remove, we've got exons that we keep in.
There's one last difference between eukaryotic transcription and prokaryotic transcription, and that has to do with the size of these cells. So yeah, this is approximately to scale. The prokaryotic cell is pretty small. So if you build an RNA transcript, for that transcript to get where it needs to go in the cell is pretty easy. It doesn't have to travel very far.
However, in eukaryotes, transcription is taking place in the nucleus, and that piece of RNA needs to go, well, lots of different places. Maybe it needs to go here to the rough ER. Maybe it needs to go out here to the cytoplasm. It has to travel across the cell, across cell membranes sometimes, and it has to make quite a lengthy journey.
So because of this lengthy journey that eukaryotic transcripts need to make, our transcripts need to be processed. So yeah, here's our exon, exon, exon. And we add some extra things to the 5' end and some extra stuff to the 3' end to just extend its shelf life so it can make it to where it needs to go and to mark it for transport so it can be sent to where it needs to go.
So let me summarize this. RNA, again, this is a eukaryote thing, RNA is processed in order to extend stability and to facilitate export from the nucleus, to get it out of the nucleus where it's made to wherever it needs to go and have the shelf life to do that.
So again, the fundamentals are the same between prokaryotes and eukaryotes, but we have multiple RNA polys, we've got introns, exons, we've got processing of these RNAs. Other than that, it's pretty much the same between eukaryotes and prokaryotes.
Okay, we have finished talking about transcription. Before we move on to translation, there are a few more things to talk about, but I'll have to talk about those in the next recorded lecture. So this is typically where I finish up with things. For this one, this is the end of recorded lecture 4.1. We'll talk more about genes and proteins and finally get to translation in the next one.