Transcription and mRNA processing | Biomolecules | MCAT | Khan Academy
What we're going to do in this video is a little bit of a deep dive on transcription. Just as a bit of a review, we touched on it in the video on replication, transcription, and translation. Transcription in everyday language just means to rewrite something, or to rewrite some information in another form, and that's essentially what's happening here. Transcription is when we take the information encoded in a gene in DNA and encode essentially that same information in mRNA.
So, transcription is going from DNA to messenger RNA. In this video, we'll focus on genes that code for proteins. This first step is the transcription, the DNA to messenger RNA, and then in a future video we'll dig a little bit deeper into translation. We will translate that information into an actual protein.
These diagrams give a little bit of an overview of it. It's a little bit simpler in bacteria. You have the DNA just floating around in the cytosol, and that's where transcription takes place. You start with that DNA, that protein coding gene in the DNA, and then from that, you produce the messenger RNA, which you see in that purple color right over here.
Then that messenger RNA can associate with a ribosome, and that's the translation process to actually produce the polypeptide, to produce the protein. In eukaryotic cells, which we're going to get into in a little bit more depth in this video, transcription, the DNA to mRNA step, happens inside the nucleus.
There are essentially two steps here. You go from DNA to what we would call pre-mRNA. Let me write that down: pre-mRNA, which is depicted right over there. Then it needs to be processed to turn into what we would call mRNA, which then can leave the nucleus to be translated into a protein.
So now that we have that overview, let's dig a little bit deeper into this and understand the different actors and understand if we're talking about a eukaryotic cell what type of processing might actually go on. Right over here, we are going to start with the protein coding gene inside of the DNA, right over here, and the primary actor that's not the DNA or the mRNA here is going to be RNA polymerase.
It's the enzyme that builds the nucleotide sequence that will become the messenger RNA. This RNA polymerase needs to know where to start, and the way it knows where to start is it attaches to a sequence of the DNA known as a promoter. Every gene is going to have a promoter associated with it, especially if we're talking about eukaryotic cells. Sometimes, you might have a promoter associated with a collection of genes as well.
But in general, if you've got a gene, you're going to have a promoter, and that's how the RNA polymerase knows to attach right over there. Once it attaches, well then it is able to separate the strands. It separates the strands, and it's pretty interesting because when we went deep into replication, you saw all of these actors, the helicase and whatever else.
But this RNA polymerase complex is actually quite capable. Not only does it separate the strands, but it's also able to build the RNA, and it does that the same way we saw when we studied DNA polymerase: in only one direction. It can only add more nucleotides on the three-prime end, so it builds the RNA in the five-prime to three-prime direction.
Notice this arrow here; we're extending it on the three-prime end of the RNA. As you can see here, when it does this, it's only interacting with one side of the DNA, I guess you could say, building RNA that is complementary to that one strand. But let's think about this a little bit.
We could call the side that it is interacting with the template strand. That side of the DNA is acting as the template for forming the RNA. If you think about the information that the RNA is actually going to encode, it's going to contain the same information as the coding strand of DNA, the other strand of DNA.
This nucleotide right over here in the RNA is going to be complementary to this one on the template strand, just as this nucleotide on the coding strand was complementary to it. You can see it in a little bit more depth if we actually were to add the nucleotides.
So this is the template strand; if you have a thymine on the template, you would have adenine on the RNA. And look, on the coding strand of DNA, the one up here, you would also have an adenine. The coding strand and the RNA essentially end up being the same sequence. The one difference is that you won't find thymine in the RNA; instead, you will find a similar nitrogenous base, and that is uracil.
But uracil plays the role of thymine, so you're essentially coding the same information. Once again, this bottom strand is acting as a template, but the resulting RNA that gets coded essentially has the same information that we had in the coding strand.
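To make the template and coding strand idea concrete, here is a minimal Python sketch. The sequence and the function names are made up for illustration, and the template is assumed to be written three-prime to five-prime so that the RNA reads five-prime to three-prime from left to right.

```python
# Base-pairing rules. The template sequence below is hypothetical.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}      # template base -> RNA base
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}  # template base -> coding-strand base


def transcribe(template_3_to_5: str) -> str:
    """Build the RNA (read 5'->3') that pairs with a template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def coding_strand(template_3_to_5: str) -> str:
    """Return the DNA coding strand (read 5'->3') that pairs with the template."""
    return "".join(DNA_COMPLEMENT[base] for base in template_3_to_5)


template = "TACGGATTC"            # hypothetical template strand, written 3'->5'
rna = transcribe(template)        # complementary to the template
coding = coding_strand(template)  # the other DNA strand

# The RNA matches the coding strand except that U stands in for T.
same_information = coding.replace("T", "U") == rna
```

With this made-up template, `same_information` comes out true, which is the point of the passage above: the template strand is what gets read, but the RNA carries the coding strand's sequence with uracil in place of thymine.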
Just to get an appreciation for what this looks like—and I would even put "looks" in quotations—I even did a little quote thing with my fingers when I said that. It’s hard to really visualize what these things look like, but you can see here that the RNA polymerase complex, and this is for a specific organism, can be very, very complex and involved. It’s fascinating how these things interact.
Every time you're studying biology and someone like me is going to give you these nice, clean narratives of how these enzymes interact with the different macromolecules like the DNA or the RNA, you should always remember that this is amazing. These are molecules interacting with each other, bouncing into each other; it's happening incredibly fast inside of the cell.
You should be in awe of this; it's happening in all of your cells as we speak. So this is pretty incredible stuff. The next thing you have to think about is: when does this thing actually stop? It stops once we get to this area, which we've labeled as a terminator.
So let me write that down: this area is a terminator. There are multiple ways the terminator can signal to the RNA polymerase that, "Hey, it's time to stop." More particularly, it somehow creates a structural change so that the polymerase just lets go. One mechanism that's depicted right over here, and this is typical or can happen in bacteria, is that the RNA being made forms a hairpin.
It has to have the right complementary base pairs right over here to form this hairpin, and this hairpin, along with the things around the hairpin, essentially makes it hard for the polymerase to keep on going. The complex kind of changes a little bit, and so it lets go, or at least that's how people believe it works.
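As a rough illustration of what "the right complementary base pairs" means, the Python sketch below scans an RNA for a stretch whose two arms could pair up into a hairpin stem. This is a toy under stated assumptions: the sequence, the fixed stem and loop lengths, and the function names are all invented for the example, and real terminators involve more than a perfect stem.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the sequence that would pair with `rna` when folded back on itself."""
    return "".join(RNA_PAIR[base] for base in reversed(rna))


def find_hairpin(rna: str, stem_len: int = 4, loop_len: int = 4):
    """Return the start index of the first perfect stem-loop, or None if there is none."""
    span = 2 * stem_len + loop_len
    for i in range(len(rna) - span + 1):
        left_arm = rna[i:i + stem_len]
        right_start = i + stem_len + loop_len
        right_arm = rna[right_start:right_start + stem_len]
        # The arms can pair only if the right arm is the reverse complement of the left.
        if right_arm == reverse_complement(left_arm):
            return i
    return None


# Hypothetical transcript: stem GCCG, loop GAAA, stem CGGC, then a run of U.
transcript = "AAGCCGGAAACGGCUUUU"
hairpin_start = find_hairpin(transcript)
```

Here `hairpin_start` is the index where the left arm of the stem begins, so the search only succeeds when the transcript really does contain two stretches that can base-pair with each other.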
There are other ways the terminator can act; it might be sequences that parts of the polymerase complex recognize, which cause a conformational change so that the RNA polymerase lets go. Now, if we're talking about a prokaryote, we're done; what we would have formed would be our messenger RNA, which can then go to a ribosome and be translated into a protein.
But if we're talking about a eukaryote, then we have to do a little bit of processing. If this is a prokaryote right over here, this would be our mRNA; if this is a eukaryote, then this is our pre-mRNA, which now has to be processed.
You might say, "Well, how is that going to be processed?" Well, there are a couple of things that are going to be done. Some things are going to be added at the beginning and the end of the mRNA. At the beginning you have the five-prime cap; this is a modified guanine, which is going to help in the translation process by helping the ribosome attach.
Then you have this poly-A tail, and it's called a poly-A tail because it has a bunch of adenines at the end right over here. These not only help in the translation process, but they also help make sure that the information is more robust, making it less likely that the ends of the mRNA will become damaged.
Now, the other thing that needs to be processed, and this is one of those fascinating things in evolutionary biology, is that in this pre-mRNA sequence, you're going to have parts of the sequence which we currently consider to be "nonsense" sequences, known as introns.
I'm going to put it in quotes because, in general, in evolution it's seldom that things have absolutely no purpose. But these are not coding for the protein that is going to be coded by our initial gene, and so these are actually processed out; they are spliced out.
I'm not going to go into all the details of the actors that cause the splicing, but as part of this eukaryotic processing, you add the cap, you add the tail, and then you splice out the introns. Once you've spliced out the introns, all you have left are the exons.
So the exons get connected together, and that is the result. In a eukaryote, you will have this mature mRNA, which is what we saw right over here. It can then, let me underline that in a color, migrate out of the nucleus to a ribosome where it can be translated.
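The three processing steps for a eukaryote can be sketched in a few lines of Python. This is only an illustration: the sequence and intron position are made up, the text label "m7G-" is just a stand-in for the modified guanine cap, and the tail length is an arbitrary parameter chosen for the example.

```python
def process_pre_mrna(pre_mrna: str, introns: list[tuple[int, int]], tail_len: int = 10) -> str:
    """Splice out introns, then add a 5' cap marker and a poly-A tail.

    `introns` holds (start, end) index ranges, end-exclusive, to remove.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # keep the final exon
    spliced = "".join(exons)
    # "m7G-" is only a text marker for the cap, not real sequence.
    return "m7G-" + spliced + "A" * tail_len


# Hypothetical pre-mRNA: exon, intron, exon.
exon_1, intron, exon_2 = "AUGGCC", "GUAAGUUUUCAG", "UUCUAA"
pre_mrna = exon_1 + intron + exon_2
intron_range = (len(exon_1), len(exon_1) + len(intron))

mature = process_pre_mrna(pre_mrna, [intron_range], tail_len=5)
```

The result, `mature`, is the cap marker, the two exons joined together, and five adenines, which mirrors the order in the video: add the cap, add the tail, splice out the introns, and what is left is the exons.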