The Phases and Mechanisms of Transcription in Prokaryotes and Eukaryotes

Fundamental Principles of Transcription

Transcription is the biological process of synthesizing an RNA molecule using a DNA sequence as the primary template, facilitated by the enzyme RNA polymerase. This synthesis proceeds strictly in the 5' to 3' direction. The RNA transcript is complementary to one specific strand of the DNA known as the template strand, non-coding strand, or antisense strand. Conversely, the other DNA strand is referred to as the non-template, coding, or sense strand, as its sequence matches the RNA transcript (with the exception that DNA contains Thymine and RNA contains Uracil).
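The strand relationships described above can be illustrated with a minimal Python sketch. The function names and the example sequence are hypothetical, and the template is assumed to be written 3' to 5' so that the resulting RNA reads 5' to 3'.

```python
# Base added to the RNA opposite each base of the DNA template strand.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Base found on the coding strand opposite each base of the template strand.
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (5'->3') complementary to a template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())

def coding_strand(template_3_to_5: str) -> str:
    """Return the coding (non-template) DNA strand, written 5'->3'."""
    return "".join(DNA_TO_DNA[base] for base in template_3_to_5.upper())

template = "TACGGCATT"            # hypothetical template strand, 3'->5'
rna = transcribe(template)        # "AUGCCGUAA"
coding = coding_strand(template)  # "ATGCCGTAA"
```

Replacing T with U in `coding` gives exactly `rna`, which is why the non-template strand is called the coding strand.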

Components of the Transcription Unit

A transcription unit is a specific stretch of DNA that serves as the template for a single RNA molecule, extending from a promoter to a terminator. The gene itself contains all technical elements required for transcription and subsequent translation. The promoter is a localized sequence where RNA polymerase binds to initiate the process. It includes the first base pair to be transcribed, known as the transcription start point or transcription start site (TSP or TSS, designated as the +1 position), as well as specific surrounding base pairs. Regions upstream of the gene (relative to the transcription direction) are typically involved in regulation, while regions downstream contain the coding sequence.
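The +1 numbering can be made concrete with a short Python sketch. It assumes the gene lies on the forward strand and that positions are 0-based string indices; the function names and the index values are hypothetical. By the usual convention there is no position 0, so the base immediately upstream of the TSS is -1.

```python
def relative_to_tss(index: int, tss_index: int) -> int:
    """Convert a 0-based sequence index into a TSS-relative position.

    The TSS itself is +1 and the base just upstream of it is -1;
    position 0 is never returned.
    """
    offset = index - tss_index
    return offset + 1 if offset >= 0 else offset

def side_of_tss(position: int) -> str:
    """Classify a TSS-relative position as upstream or transcribed."""
    return "upstream" if position < 0 else "transcribed"

tss_index = 35                            # hypothetical index of the first transcribed base
first_base = relative_to_tss(35, tss_index)   # +1, the TSS itself
just_before = relative_to_tss(34, tss_index)  # -1, immediately upstream
ten_before = relative_to_tss(25, tss_index)   # -10, upstream of the TSS
```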

Untranslated Regions and Open Reading Frames

Within the transcript, there are regions that are transcribed but not translated into protein. The region between the +1 site and the start codon (AUG, coding for Methionine) is known as the 5' UTR (untranslated region). The region located between the stop codon (UAG, UAA, or UGA) and the terminator sequence is the 3' UTR. The actual genetic sequence to be translated is the Open Reading Frame (ORF), which consists of a continuous stretch of codons starting with a start codon and ending with a stop codon.
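As a rough illustration of these three regions, the following Python sketch splits an mRNA string into 5' UTR, ORF, and 3' UTR. It makes the simplifying assumption that the first AUG in the transcript is the start codon; the function name and the example sequence are hypothetical.

```python
STOP_CODONS = {"UAG", "UAA", "UGA"}

def split_transcript(mrna: str):
    """Split an mRNA (5'->3') into (5' UTR, ORF, 3' UTR).

    Uses the first AUG as the start codon and reads codon by codon until
    the first in-frame stop codon. Returns None if no complete ORF exists.
    """
    start = mrna.find("AUG")
    if start == -1:
        return None
    for i in range(start + 3, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            end = i + 3  # the stop codon is counted as part of the ORF
            return mrna[:start], mrna[start:end], mrna[end:]
    return None

mrna = "GGAGCAUGCCGAAAUAAGCUUC"   # hypothetical transcript
utr5, orf, utr3 = split_transcript(mrna)
# utr5 == "GGAGC", orf == "AUGCCGAAAUAA", utr3 == "GCUUC"
```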

Prokaryotic and Eukaryotic RNA Organization

In prokaryotes, transcripts can be either monocistronic or polycistronic. Monocistronic RNA encodes a single gene or cistron. Polycistronic RNA, which is common in prokaryotic operons, encodes multiple genes under the control of a single promoter. This organization allows for a coordinated cellular response, where all genes for a specific signaling pathway can be activated simultaneously. However, a drawback is that these genes cannot be regulated independently as they share the same binding factors. Eukaryotic RNA is almost exclusively monocistronic. Eukaryotic mRNA features a 5' $\text{m}^7\text{Gppp}$ cap and a 3' poly(A) tail, which protect the molecule from exonuclease degradation. DNA does not require these protections due to chromatin folding, base stacking, and base-pairing interactions that render it inaccessible to many enzymes.
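The difference between monocistronic and polycistronic transcripts can be sketched in Python by counting the non-overlapping open reading frames in an mRNA string. This is a simplified illustration: it treats every AUG followed by an in-frame stop codon as a cistron, and the function name and sequences are hypothetical.

```python
from typing import List

STOP_CODONS = {"UAG", "UAA", "UGA"}

def find_cistrons(mrna: str) -> List[str]:
    """Return non-overlapping ORFs found left to right in an mRNA (5'->3')."""
    orfs = []
    pos = 0
    while True:
        start = mrna.find("AUG", pos)
        if start == -1:
            break
        end = None
        for i in range(start + 3, len(mrna) - 2, 3):
            if mrna[i:i + 3] in STOP_CODONS:
                end = i + 3
                break
        if end is None:
            pos = start + 1      # no in-frame stop: try the next AUG
        else:
            orfs.append(mrna[start:end])
            pos = end            # resume the search after this ORF
    return orfs

monocistronic = "GGAUGAAAUAAGG"
polycistronic = "GGAUGAAAUAAGGAGGAUGCCCUGAGG"
# find_cistrons(monocistronic) -> ["AUGAAAUAA"]
# find_cistrons(polycistronic) -> ["AUGAAAUAA", "AUGCCCUGA"]
```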

Structural Divergence in Prokaryotic and Eukaryotic Transcription

Bacterial transcription occurs directly on the DNA template within the cytoplasm. This environment allows for the unique coupling of transcription and translation; ribosomes can begin translating the 5' end of an mRNA molecule before the RNA polymerase has even finished transcribing the 3' end. In contrast, eukaryotic transcription takes place on a chromatin template in the nucleus. This template must be in an open, accessible form, making the process slower than in prokaryotes. Eukaryotic RNA must undergo significant post-transcriptional processing, including 5' capping, splicing (intron removal), editing, and polyadenylation, before being transported to the cytoplasm for translation.

RNA Polymerases and Subunits

Prokaryotic organisms utilize a single RNA polymerase to transcribe all classes of RNA (mRNA, rRNA, and tRNA). The bacterial core enzyme has the composition $\alpha_2\beta\beta'\omega$, representing two copies of the α subunit and one each of β, β', and ω. When the sigma (σ) subunit joins this core, it forms the holoenzyme ($\alpha_2\beta\beta'\omega\sigma$). The sigma factor is essential for the initiation of transcription because it specifically recognizes the promoter sequence; the core enzyme alone cannot do so.

Eukaryotes possess three specialized RNA polymerases:

  1. RNA Pol I: Synthesizes large ribosomal RNAs (large rRNA), accounting for more than 50% of total cellular RNA. These include the 28S, 18S, and 5.8S rRNAs (compared to the prokaryotic 23S and 16S).
  2. RNA Pol II: Synthesizes messenger RNA (mRNA) and some small nuclear RNAs (snRNA). This is considered the most critical polymerase for gene expression regulation.
  3. RNA Pol III: Synthesizes transfer RNA (tRNA) and small ribosomal RNAs such as the 5S rRNA.

Unlike the bacterial holoenzyme, eukaryotic polymerases cannot recognize promoter sequences on their own. They require a large suite of General Transcription Factors (GTFs) to bind to the promoter first to form the basal transcriptional apparatus. This system prevents transcription from occurring in locations where it is not required.

The Transcription Cycle Phase I: Initiation and Isomerization

Transcription begins with the initial binding of RNA polymerase to the promoter to form the closed complex. In this state, the DNA remains double-stranded, and the enzyme is bound to one face of the helix. The transition from a closed complex to an open complex is known as isomerization. During this stage, the DNA strands separate around the start site, creating a transcription bubble and freeing the template strand. This process is referred to as promoter melting. Notably, isomerization does not require energy from ATP hydrolysis; it is a spontaneous conformational change into a more energetically favorable and stable form.

The Transcription Cycle Phase II: Abortive Synthesis and Promoter Escape

The initial transcribing complex brings the first two ribonucleotides into the active site and joins them via de novo synthesis. The early stage of transcription is inefficient and often results in abortive synthesis, where the enzyme releases short transcripts of fewer than 10 nucleotides. Once the enzyme successfully synthesizes a transcript longer than 10 nucleotides, it is said to have escaped the promoter. Promoter escape involves breaking all interactions between the holoenzyme and the promoter, as well as the interactions between the polymerase core and the σ subunit. Following escape, a stable ternary complex is formed, consisting of the enzyme, the DNA, and the RNA transcript.
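Abortive cycling followed by promoter escape can be sketched as a small simulation in Python. This is a toy model, not a measured one: `abort_probability` is a hypothetical per-nucleotide chance of releasing the short transcript, and the sketch assumes that abortive products are 2 to 9 nucleotides long and that escape is reached once the transcript exceeds 10 nucleotides.

```python
import random

ESCAPE_LENGTH = 10  # transcripts longer than this count as promoter escape

def initiate(abort_probability: float, rng: random.Random):
    """Simulate abortive synthesis until the polymerase escapes the promoter.

    Returns (lengths of released abortive transcripts, length at escape).
    """
    aborted = []
    while True:
        length = 2  # the first two ribonucleotides are joined de novo
        while length <= ESCAPE_LENGTH:
            # Only transcripts shorter than ESCAPE_LENGTH can be released.
            if length < ESCAPE_LENGTH and rng.random() < abort_probability:
                aborted.append(length)  # short transcript released; restart
                break
            length += 1
        else:
            # Inner loop finished without a break: transcript exceeds 10 nt.
            return aborted, length

rng = random.Random(0)                      # seeded for reproducibility
aborted, escaped_at = initiate(0.3, rng)    # 0.3 is an arbitrary illustrative value
```

Each entry in `aborted` is the length of one released short transcript, and `escaped_at` is the length at which the stable ternary complex would take over.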

The Transcription Cycle Phase III: Elongation and Proofreading

Once the polymerase reaches the elongation phase, it functions as a highly efficient, processive molecular motor. For every nucleotide added to the growing chain, the enzyme steps forward one position. The elongating polymerase unwinds the DNA in front of it and reanneals the strands behind it, while simultaneously dissociating the growing RNA chain from the template.

Transcription is generally less accurate than replication, with an error rate of approximately 1 error in $10^4$ nucleotides compared to 1 in $10^7$ for DNA replication. This is acceptable because RNA transcripts are transient and do not serve as permanent templates for the entire genome. The elongating polymerase performs specific proofreading functions. Most proofreading does not occur during initiation; errors in the first 10 nucleotides are often tolerable because this region is frequently part of the 5' UTR and will not be translated.
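To get a feel for what these error rates mean for a single molecule, the following Python sketch computes the expected number of errors and the chance of an error-free copy. It assumes errors occur independently at each nucleotide, and the 3000-nucleotide length is a hypothetical example rather than a value from the text.

```python
import math

TRANSCRIPTION_ERROR_RATE = 1e-4  # errors per nucleotide, as quoted above
REPLICATION_ERROR_RATE = 1e-7    # errors per nucleotide, as quoted above

def expected_errors(length_nt: int, rate: float) -> float:
    """Expected number of errors in a copy of the given length."""
    return length_nt * rate

def prob_error_free(length_nt: int, rate: float) -> float:
    """Probability that a copy contains no errors (independent errors assumed)."""
    return (1.0 - rate) ** length_nt

length = 3000  # hypothetical length in nucleotides
rna_errors = expected_errors(length, TRANSCRIPTION_ERROR_RATE)   # about 0.3
dna_errors = expected_errors(length, REPLICATION_ERROR_RATE)     # about 0.0003
rna_clean = prob_error_free(length, TRANSCRIPTION_ERROR_RATE)    # close to exp(-0.3)
```

Under these assumptions roughly three out of four transcripts of this length would be error-free, and a faulty transcript is soon degraded and replaced.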

Comparisons with DNA Replication

Several key differences exist between transcription and replication:

  1. Transcription uses ribonucleotides (containing Ribose and Uracil) instead of deoxyribonucleotides.
  2. RNA polymerase can initiate synthesis de novo and does not require a primer.
  3. The resulting RNA product does not remain base-paired to the DNA template.
  4. Multiple RNA polymerase molecules can transcribe a single gene simultaneously, allowing a cell to generate large numbers of transcripts in a very short duration.
  5. Replication copies the entire genome, whereas transcription only copies specific portions of the genome.