HL IB Biology Protein Synthesis - Comprehensive Notes

Transcription in Protein Synthesis

Protein synthesis occurs in two stages:
- Transcription: DNA is transcribed, producing mRNA.
  - mRNA carries DNA information from the nucleus to the cytoplasm.
  - Requires RNA polymerase.
- Translation: mRNA is translated, producing an amino acid sequence.

Transcription Process

Occurs in the nucleus.
DNA molecule unwinds:
- Hydrogen bonds between complementary base pairs break.
- Exposes the gene to be transcribed (the gene for a specific polypeptide).
mRNA is created:
- A complementary copy of the gene's code is made by building a single-stranded mRNA molecule.
- Free RNA nucleotides pair with complementary bases on the DNA template strand via hydrogen bonds.
- RNA polymerase bonds the sugar-phosphate groups of RNA nucleotides to form the sugar-phosphate backbone of the mRNA molecule.
Completion and release:
- When the gene is transcribed, hydrogen bonds between mRNA and DNA strands break, and the DNA molecule reforms.
- mRNA leaves the nucleus via a nuclear pore, carrying the genetic message to another part of the cell.
- DNA is too large to fit through nuclear pores, so it cannot leave the nucleus.

DNA Templates

DNA polymerase is for DNA replication; RNA polymerase is for transcription.

Hydrogen Bonding & Complementary Base Pairing

RNA nucleotides pair with exposed bases on one DNA strand.
RNA has a complementary base sequence to the DNA strand and binds using hydrogen bonds.
Adenine (A) in DNA pairs with uracil (U) in RNA because thymine (T) does not exist in RNA.
Example:
- DNA template strand: TAC GGA AGA CTT GGG
- RNA transcript: AUG CCU UCU GAA CCC
Coding strand: the DNA strand that is not transcribed; its base sequence matches the mRNA, with T in place of U.
Template strand: the strand used to create the mRNA molecule.
mRNA transcript: the template strand is the one that is transcribed to form the mRNA molecule, which is later translated into an amino acid chain.
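
The pairing rules above can be sketched in a few lines of Python. This is a minimal illustration rather than a model of RNA polymerase; the names DNA_TO_RNA and transcribe are made up for this example.

```python
# Pairing rules for building mRNA from a DNA template strand.
# A in DNA pairs with U in RNA because RNA has no thymine.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template_strand: str) -> str:
    """Return the mRNA complementary to a DNA template strand.

    Characters that are not DNA bases (such as the spaces between
    triplets) are passed through unchanged so the codons stay readable.
    """
    return "".join(DNA_TO_RNA.get(base, base) for base in template_strand)


print(transcribe("TAC GGA AGA CTT GGG"))  # AUG CCU UCU GAA CCC
```

The output matches the RNA transcript given in the example above.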

DNA Templates

DNA is stable due to:
- Hydrogen bonds between DNA bases.
- Strong phosphodiester bonds between adjacent nucleotides.
Single DNA strands serve as reliable templates for transcription over generations.
Genetic sequence is conserved in non-dividing somatic cells like neurons and muscle cells.

Transcription & Gene Expression

Approximately 20,000 protein-coding genes in the human genome.
Not every protein is needed in every cell.
Gene expression:
- Cells switch genes on or off based on requirements.
- Expressed genes are 'switched on' and undergo transcription and translation.
- Non-expressed genes are 'switched off' or silenced, and do not undergo transcription and/or translation.
Transcription is the key stage in gene expression control.

Translation in Protein Synthesis

Synthesis of Polypeptides

Translation uses the genetic code from mRNA to synthesize a polypeptide.
A polypeptide: a sequence of amino acids covalently bonded together.
The order of amino acids is based on the genetic code in mRNA.
Occurs in the cytoplasm.
mRNA template comes from transcription.

Roles of RNA & Ribosomes in Translation

mRNA attaches to a ribosome after leaving the nucleus.
Ribosome:
- Complex structure made of large and small subunits.
- Made of proteins and ribosomal RNA (rRNA).
- Has binding sites for molecules involved in translation:
  - mRNA binds to the small subunit.
  - Two tRNA molecules bind to the large subunit simultaneously.
Translation depends on complementary base pairing between codons on mRNA and anticodons on tRNA.
tRNA (transfer RNA):
- Free molecules in the cytoplasm.
- Bind with specific amino acids.
- Bring amino acids to the mRNA on the ribosome.
- The anticodon on tRNA pairs with a complementary codon on mRNA.

Codons & Anticodons

Codons: sequences of three mRNA bases that code for a specific amino acid.
A triplet is a sequence of three DNA bases that codes for a specific amino acid
Anticodon: a sequence of three tRNA bases complementary to a codon.
tRNA carries the appropriate amino acid to the ribosome.
Stop codons: signal the end of translation.

Structure of tRNA

The anticodon is located at the bottom of the tRNA molecule.
It consists of three exposed RNA bases.

Analogy

Transcription and translation are like converting between languages.
Transcription: converting text from English to French (similar alphabets with slight differences).
Translation: converting text from a Western language to Japanese (different alphabets).
Complementary base pairing:
- Adenine (A) pairs with Uracil (U).
- Cytosine (C) pairs with Guanine (G).
Example:
- mRNA codon: CAG
- tRNA anticodon: GUC.
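
Codon-anticodon pairing can be sketched the same way. The function name anticodon_for is an illustrative assumption, and the anticodon is written base by base opposite the codon, as in the example above, without tracking strand direction.

```python
# RNA-RNA pairing rules used between an mRNA codon and a tRNA anticodon.
RNA_PAIRS = {"A": "U", "U": "A", "C": "G", "G": "C"}


def anticodon_for(codon: str) -> str:
    """Return the anticodon bases that pair with each base of the codon."""
    return "".join(RNA_PAIRS[base] for base in codon)


print(anticodon_for("CAG"))  # GUC
```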

The Genetic Code

Features of the Genetic Code

The sequence of DNA nucleotide bases within a gene is read as a triplet code.
Each triplet codes for one amino acid.
There are 20 different amino acids that cells use to make up different proteins
Examples (DNA triplets on the template strand):
- CAG codes for valine.
- TTC codes for lysine.
- GAC codes for leucine.
- CCG codes for glycine.
Some triplets code for start (TAC - methionine) and stop signals.
The cell reads the DNA correctly and produces the correct sequences of amino acids (and therefore the correct protein molecules) that it requires to function properly
Non-overlapping: each base is read only once, as part of a single codon.
Degenerate: multiple codons can code for the same amino acid because there are four bases, so there are 64 different codons (4^3 = 64), yet there are only 20 amino acids.
Universal: almost every organism uses the same code.
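
The degeneracy argument can be checked by counting. The sketch below builds the standard mRNA codon table from a compact string of one-letter amino acid codes ("*" marks a stop codon); the variable names are chosen for this example only.

```python
from collections import Counter
from itertools import product

# Standard genetic code, listed in U, C, A, G order for each codon position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Pair every three-base combination with its amino acid (or stop signal).
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Count how many codons code for each amino acid.
codons_per_amino_acid = Counter(CODON_TABLE.values())

print(len(CODON_TABLE))                # 64 codons (4^3)
print(len(codons_per_amino_acid) - 1)  # 20 amino acids (excluding stop)
print(codons_per_amino_acid["L"])      # 6 codons for leucine
print(codons_per_amino_acid["M"])      # 1 codon for methionine
```

With 64 codons and only 20 amino acids, most amino acids are coded for by more than one codon.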

Deducing Amino Acid Sequences

It is possible to determine the sequence of amino acids coded for in the polypeptide by observing the genetic code in the mRNA.

Worked Example

DNA coding strand sequence: TTC GAG CAT TAC GCC
Step 1: Work out the template sequence using A-T and C-G base pairing rules
- AAG CTC GTA ATG CGG
Step 2: Work out the mRNA codons, complementary to the template strand
- UUC GAG CAU UAC GCC
Step 3: Use an mRNA codons and amino acids table to work out the first amino acid
- First base in codon = U, second base = U, third base = C
- Looking in the top-left box of the table, this amino acid is Phe
Step 4: Repeat for the remaining 4 codons
- GAG = Glu
- CAU = His
- UAC = Tyr
- GCC = Ala
The final sequence of amino acids is Phe-Glu-His-Tyr-Ala
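
The steps of this worked example can be reproduced in code. The sketch below assumes the standard codon table and uses illustrative names (deduce_amino_acids, THREE_LETTER); it expects the coding strand as space-separated triplets.

```python
from itertools import product

# Standard genetic code in compact form ("*" = stop), U, C, A, G order.
BASES = "UCAG"
ONE_LETTER = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
THREE_LETTER = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
    "*": "Stop",
}
CODON_TABLE = {
    "".join(codon): THREE_LETTER[letter]
    for codon, letter in zip(product(BASES, repeat=3), ONE_LETTER)
}

DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}   # DNA-DNA pairing
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}  # DNA-RNA pairing


def deduce_amino_acids(coding_strand: str):
    """Return (template triplets, mRNA codons, amino acid sequence)."""
    triplets = coding_strand.split()
    # Step 1: template strand is complementary to the coding strand.
    template = ["".join(DNA_PAIRS[b] for b in t) for t in triplets]
    # Step 2: mRNA codons are complementary to the template strand.
    codons = ["".join(DNA_TO_RNA[b] for b in t) for t in template]
    # Steps 3 and 4: look each codon up in the table.
    amino_acids = "-".join(CODON_TABLE[c] for c in codons)
    return template, codons, amino_acids


template, codons, amino_acids = deduce_amino_acids("TTC GAG CAT TAC GCC")
print(" ".join(template))  # AAG CTC GTA ATG CGG
print(" ".join(codons))    # UUC GAG CAU UAC GCC
print(amino_acids)         # Phe-Glu-His-Tyr-Ala
```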

Elongation of the Polypeptide Chain

Two tRNA molecules fit onto the ribosome at a time, bringing amino acids side by side.
The ribosome moves along the mRNA molecule, one codon at a time.
A peptide bond is then formed (by condensation) between the two amino acids.
The formation of a peptide bond between amino acids is an anabolic reaction
Requires energy, in the form of ATP
ATP is provided by the mitochondria.
This process continues until a ‘stop’ codon on the mRNA molecule is reached
This acts as a signal for translation to stop and at this point the amino acid chain coded for by the mRNA molecule is complete
This amino acid chain is then released from the ribosome and forms the final polypeptide
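
Reading codons until a stop codon is reached can be sketched as a simple loop. This is an illustration under the assumption of the standard codon table; the function name translate and the example mRNA are made up, and one-letter amino acid codes are used to keep the output short.

```python
from itertools import product

# Standard genetic code in compact form ("*" = stop), U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str):
    """Read the mRNA one codon at a time until a stop codon is reached.

    Returns the amino acid chain (one-letter codes) and the number of
    peptide bonds formed, which is one fewer than the number of amino acids.
    """
    chain = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: translation ends here
            break
        chain.append(amino_acid)
    peptide_bonds = max(len(chain) - 1, 0)
    return "".join(chain), peptide_bonds


# Bases after the stop codon (UAA) are not translated.
print(translate("AUGUUCGAGCAUUAAGCC"))  # ('MFEH', 3)
```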

Protein Structure & Mutations

Gene mutation: a change in the sequence of bases in a DNA molecule; this may result in a new allele
Mutations occur all the time and occur randomly
Mutations are copying errors that take place when DNA is replicated during S phase of interphase
Mutations in a gene can lead to a change in the polypeptide for which the gene codes
Most mutations are harmful or neutral (have no effect) but some can be beneficial
Inheritance of mutations:
- Mutations present in normal body cells are not inherited; they are eliminated once the affected cells die
- Mutations within gametes are inherited by offspring, so can lead to heritable genetic conditions
Point mutations are mutations where one base in the DNA sequence is altered; this can result in a changed amino acid at this location

Sickle Cell Disease

Sickle cell disease is a genetic disorder caused by a single point mutation within the gene that codes for the beta-globin polypeptide in haemoglobin (Hb)
Most humans have the allele Hb^A
The mutation results in a new allele Hb^S
Within the haemoglobin gene a point mutation changes the DNA triplet GAG to GTG on the coding strand
The resulting DNA triplet (CAC) on the template strand is transcribed into the mRNA codon GUG, instead of GAG
During translation the amino acid valine (Val) replaces the original amino acid glutamic acid (Glu)
This occurs at the sixth position of the polypeptide
The protein haemoglobin S is produced instead of haemoglobin A; this causes a distortion in the shape of red blood cells, resulting in a sickle shape
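
The effect of this point mutation can be traced step by step in code. The sketch below only includes the two codons involved; the function name codon_from_coding_triplet is an assumption made for this example.

```python
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}   # coding -> template
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}  # template -> mRNA

# Only the two codons needed for this example.
CODON_TO_AMINO_ACID = {"GAG": "Glu", "GUG": "Val"}


def codon_from_coding_triplet(coding_triplet: str):
    """Return (template triplet, mRNA codon) for a coding-strand triplet."""
    template = "".join(DNA_PAIRS[base] for base in coding_triplet)
    codon = "".join(DNA_TO_RNA[base] for base in template)
    return template, codon


for allele, coding_triplet in (("Hb^A", "GAG"), ("Hb^S", "GTG")):
    template, codon = codon_from_coding_triplet(coding_triplet)
    print(allele, coding_triplet, template, codon, CODON_TO_AMINO_ACID[codon])
# Hb^A GAG CTC GAG Glu
# Hb^S GTG CAC GUG Val
```

A single base change on the coding strand (A to T) is enough to swap glutamic acid for valine.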

Sickle-shaped red blood cells:
- Have a limited oxygen-carrying capacity
- Block the capillaries and limit the flow of normal red blood cells
People with sickle cell anaemia suffer from acute pain, fatigue and anaemia
There is a correlation between the global distribution of sickle cell disease and malaria
In areas with increased malaria cases there is an increased frequency of sickle cell alleles; this is thought to be due to increased resistance to the malaria parasite in individuals with the Hb^S allele
Sickled cells can block the flow of blood through the capillaries, restricting oxygen supply to the tissues

Mechanism of Transcription (HL)

Directionality of Transcription & Translation

The synthesis of mRNA occurs in three stages:
- Initiation
- Elongation
- Termination
During initiation, RNA polymerase binds near the promoter, causing the DNA strands to separate to form an open complex
During elongation, RNA polymerase moves along the template strand
RNA polymerase adds the 5’ end of the free RNA nucleotide to the 3’ end of the growing mRNA molecule
Elongation occurs in a 5’ to 3’ direction, synthesising a single strand of RNA
Termination occurs when RNA polymerase reaches a terminator sequence, which triggers the detachment of the polymerase enzyme and mRNA strand
When the mRNA is translated at the ribosome it is also read in the 5’ to 3’ direction
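
Directionality can be made explicit in a small sketch. It relies on the fact that the two strands are antiparallel, so the template is read 3' to 5' while the mRNA grows 5' to 3'. The function name and the six-base sequence are illustrative assumptions.

```python
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe_5_to_3(template_3_to_5: str) -> str:
    """Build mRNA in the 5' to 3' direction.

    The template must be given in the order RNA polymerase reads it
    (3' to 5'). Each new RNA nucleotide is appended to the right-hand
    end of the string, which represents the 3' end of the growing mRNA.
    """
    mrna = ""
    for base in template_3_to_5:
        mrna += DNA_TO_RNA[base]  # added at the 3' end
    return mrna


# A template conventionally written 5' to 3' must be reversed first.
template_5_to_3 = "AGGCAT"
template_3_to_5 = template_5_to_3[::-1]    # TACGGA
print(transcribe_5_to_3(template_3_to_5))  # AUGCCU (written 5' to 3')
```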

Initiation of Transcription

Gene expression varies in different cells
Genes are not expressed equally in every cell
Essential genes needed for the survival of an organism are expressed all the time, e.g. genes for the main enzymes in the respiratory pathways or ATP synthase
Other genes are only expressed when needed and at levels that make specific amounts of protein, e.g. the gene for rhodopsin that is only expressed in light-sensitive receptor cells of the eye
Regulatory mechanisms exist to ensure the correct genes are expressed at the correct time
These mechanisms are different between prokaryotes and eukaryotes but both employ transcription factors and other proteins that bind to specific sequences in DNA

The function of the promoter

Non-coding sequences produce functional RNA molecules like transfer RNA (tRNA) or are involved in the regulation of gene expression such as enhancers, silencers and promoters
The promoter is a non-coding sequence located near to a gene
The promoter is not itself transcribed
The promoter acts as the binding site for RNA Polymerase during the initiation of transcription
Binding of RNA Polymerase to the promoter is under the control of various regulatory proteins
Eukaryotes regulate gene expression in response to variations in their environment
Specific proteins bind to DNA to regulate transcription and ensure that only the genes required are being expressed in the correct cells, at the correct time and to the right level
This is key to how processes of cellular differentiation and development in multicellular organisms are controlled
General transcription factors are a type of transcription factor that binds directly to the promoter to help initiate transcription
This helps RNA polymerase to attach to the promoter and start transcribing the gene
In eukaryotes, several general transcription factors are needed for transcription

Non-coding DNA Sequences

DNA molecules are very long but only certain regions code for the production of polypeptides
These are called coding sequences
In humans only 1.5% of the genome contains coding sequences
The majority of a eukaryotic genome contains non-coding regions of DNA that do not code for polypeptides but have other important functions
Non-coding gene regulatory sequences are involved in the control of gene expression by enhancing or suppressing transcription
Non-coding sequences can produce functional RNA molecules like transfer RNA (tRNA) or ribosomal RNA (rRNA)
Introns are non-coding sequences of DNA found within genes of eukaryotic organisms
Different proteins can be produced from a gene depending on how introns are removed
Telomeres are regions of repeated nucleotide sequences at the end of chromosomes that provide protection during cell division
The repeated sequence facilitates binding of an RNA primer at the end of the chromosome leading to synthesis of an Okazaki fragment
Without telomeres, DNA replication could not continue to the end of the DNA molecule and chromosomes would become shorter after every cell division
Nonetheless, telomeres shorten with age due to oxidative damage within cells
Loss of telomeres during ageing can be accelerated by smoking, exposure to pollution, obesity, stress and poor diet
Antioxidants in the diet are claimed to reduce the rate of telomere shortening

Post-Transcriptional Modification (HL)

Post-Transcriptional Modification

In all kingdoms of life, gene expression can be regulated after an mRNA transcript has been produced
Post-transcriptional modification of mRNA:
- Helps prevent degradation: mRNA is single-stranded and therefore inherently unstable
- Increases the efficiency of protein synthesis
- In eukaryotes, expands the complexity of the proteome
Prokaryotic mRNA does not require any significant post-transcriptional modification as translation can occur immediately which prevents degradation of the mRNA
In eukaryotes, transcription and translation occur in separate parts of the cell, allowing for significant post-transcriptional modification to occur
In eukaryotes, the immediate product of transcription is called pre-mRNA, which needs to be modified to form mature mRNA
Three post-transcriptional events must occur:
- A methylated cap is added to the 5' end to protect against degradation by exonucleases
- A poly-A tail (long chain of adenine nucleotides) is added to the 3' end for further protection and to help the transcript exit the nucleus
- Non-coding sequences (introns) are removed and coding sequences (exons) are joined together

Alternative Splicing

Eukaryotic genes contain both coding and non-coding sequences of DNA:
- Coding sequences are called exons
- Non-coding sequences are called introns
During transcription the whole gene is transcribed including all introns and exons
Introns are not translated as they do not code for amino acids and need to be removed
Before the pre-mRNA exits the nucleus, splicing occurs, during which:
- Introns (non-coding sections) are removed
- Exons (coding sections) are joined together
The resulting mature mRNA molecule contains only exons and exits the nucleus before joining a ribosome for translation
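
Splicing can be pictured as filtering a list of labelled segments. In the sketch below the pre-mRNA is represented as (kind, sequence) pairs; the sequences are invented for illustration and the function name splice is an assumption.

```python
# A hypothetical pre-mRNA, written as labelled segments in 5' to 3' order.
PRE_MRNA = [
    ("exon", "AUGGCC"),
    ("intron", "GUAAGUUUUCAG"),
    ("exon", "UUCGAG"),
    ("intron", "GUAUGCAG"),
    ("exon", "UAA"),
]


def splice(segments) -> str:
    """Remove introns and join the remaining exons in their original order."""
    return "".join(seq for kind, seq in segments if kind == "exon")


pre_mrna_length = sum(len(seq) for _, seq in PRE_MRNA)
mature_mrna = splice(PRE_MRNA)
print(mature_mrna)                        # AUGGCCUUCGAGUAA
print(pre_mrna_length, len(mature_mrna))  # 35 15
```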

Alternative splicing

The exons (coding regions) of genes can be spliced in many different ways to produce different mature mRNA molecules through alternative splicing
A particular exon may or may not be incorporated into the final mature mRNA
Polypeptides translated from alternatively spliced mRNAs may differ in their amino acid sequence, structure and function
This means that a single eukaryotic gene can code for multiple proteins
This is part of the reason why the proteome is much bigger than the genome
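
A rough sketch of how one gene can give several mature mRNAs: the code below lists every transcript obtained by including or skipping each internal exon. It assumes, purely for illustration, that the first and last exons are always kept and that exon order never changes; the exon sequences are invented.

```python
from itertools import combinations


def alternative_transcripts(exons):
    """Return every mature mRNA made by keeping or skipping internal exons.

    Assumes at least two exons. The first and last exons are always
    included, and the kept exons stay in their original order.
    """
    first, *middle, last = exons
    transcripts = []
    for number_kept in range(len(middle) + 1):
        for kept in combinations(middle, number_kept):
            transcripts.append("".join((first, *kept, last)))
    return transcripts


exons = ["AUGGCC", "UUCGAG", "CAUUAC", "GCCUAA"]
for transcript in alternative_transcripts(exons):
    print(transcript)
# AUGGCCGCCUAA
# AUGGCCUUCGAGGCCUAA
# AUGGCCCAUUACGCCUAA
# AUGGCCUUCGAGCAUUACGCCUAA
```

With two optional exons there are four possible mature mRNAs from the same gene.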

Translation & the Proteome (HL)

Initiation of Translation

During translation, the specific sequence of messenger RNA (mRNA) is translated to produce a polypeptide chain consisting of amino acids
mRNA is a single stranded, linear, RNA molecule that transfers the information in DNA from the nucleus into the cytoplasm
Translation is categorised into three stages: initiation, elongation and termination
Translation occurs in the cytoplasm at complex molecules made of protein and RNA called ribosomes
Ribosomes have a two-subunit (large and small) structure that helps bind mRNA
Ribosomes have three tRNA binding sites termed “E” (exit), “P” (peptidyl) and “A” (aminoacyl)
- At the A site the mRNA codon joins with the tRNA anticodon
- At the P site the amino acids attached to the tRNA are joined by peptide bonds
- At the E site the tRNA exits the ribosome
Another key molecule in translation is transfer RNA (tRNA), which decodes mRNA
tRNA molecules are single stranded RNA molecules that fold to form a clover-shaped structure
The folded structure is held together by hydrogen bonds between bases at different points on the strand
tRNA molecules are the shortest of the RNA molecules, being only around 80 nucleotides in length
There is at least one type of tRNA molecule for each of the 20 amino acids involved in protein synthesis
tRNA molecules have a region that binds to a specific amino acid as well as a three-nucleotide region called an anticodon that is complementary to the codon on mRNA
The role of tRNA molecule is to carry a specific amino acid to the ribosome

Structure of tRNA

In eukaryotic cells, the mRNA molecule leaves the nucleus through the nuclear pores
Translation is initiated by the following process:
- A small ribosomal subunit attaches to the 5’ end of mRNA
- An initiator tRNA molecule carrying the amino acid methionine binds to the small ribosomal subunit
- The initiator tRNA occupies the “P” site on the ribosome
- The ribosome moves along the mRNA until it locates a start codon (AUG)
- The large ribosomal subunit binds to the small subunit
- Elongation of the polypeptide can begin
The initiator tRNA now occupies the “P” site, and the next codon on the mRNA signals for the corresponding tRNA to bind at the “A” site
The two amino acids (attached to the tRNAs) are linked with a peptide bond, forming a dipeptide
Synthesis of the peptide chain now involves a repeated cycle of events:
- In the cytoplasm, free tRNA molecules bind to their corresponding amino acids and transport them to the ribosome
- The ribosome shifts along the mRNA one codon (three bases) at a time
- The initiator tRNA in the “P” site moves to the “E” site which releases it
- The tRNA carrying the peptide chain moves from the “A” site to the “P” site
The next mRNA codon is exposed and a tRNA with the complementary anticodon binds to the unoccupied “A” site whilst its amino acid is linked to the polypeptide chain
The cyclical process is repeated as new amino acids are added to the growing chain
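
The movement of tRNAs through the three binding sites can be followed with a simple simulation. This is a simplified sketch: each tRNA is represented by the codon it pairs with, stop codons are not handled, and the function name elongation_cycle is an assumption.

```python
def elongation_cycle(codons):
    """Record the contents of the E, P and A sites at each step.

    Two snapshots are stored per cycle: one after a new tRNA binds at
    the A site, and one after the ribosome shifts along by one codon.
    """
    # The initiator tRNA starts in the P site.
    sites = {"E": None, "P": codons[0], "A": None}
    history = []
    for next_codon in codons[1:]:
        # The tRNA in the E site has left; a new tRNA binds at the A site
        # and a peptide bond forms between the amino acids at P and A.
        sites = {"E": None, "P": sites["P"], "A": next_codon}
        history.append(dict(sites))
        # The ribosome shifts one codon: P moves to E, A moves to P.
        sites = {"E": sites["P"], "P": sites["A"], "A": None}
        history.append(dict(sites))
    return history


for snapshot in elongation_cycle(["AUG", "UUC", "GAG"]):
    print(snapshot)
# {'E': None, 'P': 'AUG', 'A': 'UUC'}
# {'E': 'AUG', 'P': 'UUC', 'A': None}
# {'E': None, 'P': 'UUC', 'A': 'GAG'}
# {'E': 'UUC', 'P': 'GAG', 'A': None}
```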

Modification of Polypeptides

Once the primary structure of the polypeptide has been synthesised during translation it is often not immediately usable by the cell
The polypeptide must be modified in order to be transformed into a functional protein
Some examples of modifications include:
- Protein folding into the secondary, tertiary and quaternary structures, including the formation of disulfide bonds in the tertiary and quaternary stages
- Folding can require molecular chaperones that help to prevent incorrect folding
The formation of insulin requires polypeptide modification
When insulin is first synthesised it is in the form of a polypeptide chain 110 amino acids long called pre-proinsulin, which is attached to the wall of the endoplasmic reticulum (ER)
It is then modified by an enzyme that removes a peptide called a signal peptide from the end, detaching it from the ER and transforming it to proinsulin
From there the proinsulin folds and disulfide bonds form between different sections of the polypeptide
The proinsulin is packaged into vesicles at the Golgi apparatus
The proinsulin is then cleaved (during which a section called the C peptide is removed from the middle) resulting in two chains (A-chain and B-chain) attached together with two disulfide bonds
This is the final, mature form of insulin, ready to be secreted from the cell and used in the body

Recycling of Amino Acids

Unneeded, damaged, or misfolded proteins can be recycled in the body into usable proteins
This involves enzymes breaking the peptide bonds in these proteins, releasing the amino acids to be used in translation to synthesise new proteins
Proteases are enzymes that break down proteins in this way
This process is called proteolysis
The proteasome is a large protein complex found in eukaryotic cells that acts as the location for proteolysis in the cell
By containing the protease enzymes within an enclosed structure it prevents other useful cellular proteins being broken down by mistake
Proteins identified as being unneeded, damaged, or misfolded are tagged with a chemical called ubiquitin, which begins the process of them being broken down in the proteasome
This process is constantly taking place in the cell and is essential for sustaining a functional proteome