HL IB Biology Protein Synthesis - Comprehensive Notes

Transcription in Protein Synthesis

  • Protein synthesis occurs in two stages:
    • Transcription: DNA is transcribed, producing mRNA.
      • mRNA carries DNA information from the nucleus to the cytoplasm.
      • Requires RNA polymerase.
    • Translation: mRNA is translated, producing an amino acid sequence.

Transcription Process

  • Occurs in the nucleus.
  • DNA molecule unwinds:
    • Hydrogen bonds between complementary base pairs break.
    • Exposes the gene to be transcribed (the gene for a specific polypeptide).
  • mRNA is created:
    • A complementary copy of the gene's code is made by building a single-stranded mRNA molecule.
    • Free RNA nucleotides pair with complementary bases on the DNA template strand via hydrogen bonds.
    • RNA polymerase bonds the sugar-phosphate groups of RNA nucleotides to form the sugar-phosphate backbone of the mRNA molecule.
  • Completion and release:
    • When the gene is transcribed, hydrogen bonds between mRNA and DNA strands break, and the DNA molecule reforms.
    • mRNA leaves the nucleus via a nuclear pore, carrying the genetic message to another part of the cell.
    • DNA is too large to fit through nuclear pores, so it cannot leave the nucleus.

DNA Templates

  • DNA polymerase is for DNA replication; RNA polymerase is for transcription.

Hydrogen Bonding & Complementary Base Pairing

  • RNA nucleotides pair with exposed bases on one DNA strand.
  • RNA has a complementary base sequence to the DNA strand and binds using hydrogen bonds.
  • Adenine (A) in DNA pairs with uracil (U) in RNA because thymine (T) does not exist in RNA.
  • Example:
    • DNA template strand: TAC GGA AGA CTT GGG
    • RNA transcript: AUG CCU UCU GAA CCC
  • Coding strand: the DNA strand that is not transcribed; its base sequence matches the mRNA, with thymine (T) in place of uracil (U).
  • Template strand: the DNA strand that is transcribed; its base sequence is complementary to the mRNA.
  • mRNA transcript: the single-stranded RNA copy built on the template strand, which is later translated into an amino acid chain.
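The pairing rules above can be sketched in a few lines of Python. This is only an illustration: the names `DNA_TO_RNA` and `transcribe` are made up for this example, and the sketch pairs bases position by position as the notes do, ignoring 5'/3' strand orientation.

```python
# Pairing rules: DNA template base -> RNA base (A pairs with U, not T).
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe(template: str) -> str:
    """Return the mRNA complementary to a DNA template strand.

    Characters that are not DNA bases (such as spaces) are kept unchanged.
    """
    return "".join(DNA_TO_RNA.get(base, base) for base in template)

print(transcribe("TAC GGA AGA CTT GGG"))  # AUG CCU UCU GAA CCC
```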

DNA Templates

  • DNA is stable due to:
    • Hydrogen bonds between DNA bases.
    • Strong phosphodiester bonds between adjacent nucleotides.
  • Single DNA strands serve as reliable templates for transcription over generations.
  • Genetic sequence is conserved in non-dividing somatic cells like neurons and muscle cells.

Transcription & Gene Expression

  • Approximately 20,000 protein-coding genes in the human genome.
  • Not every protein is needed in every cell.
  • Gene expression:
    • Cells switch genes on or off based on requirements.
    • Expressed genes are 'switched on' and undergo transcription and translation.
    • Non-expressed genes are 'switched off' or silenced, and do not undergo transcription and/or translation.
  • Transcription is the key stage in gene expression control.

Translation in Protein Synthesis

Synthesis of Polypeptides

  • Translation uses the genetic code from mRNA to synthesize a polypeptide.
  • A polypeptide: a sequence of amino acids covalently bonded together.
  • The order of amino acids is based on the genetic code in mRNA.
  • Occurs in the cytoplasm.
  • mRNA template comes from transcription.

Roles of RNA & Ribosomes in Translation

  • mRNA attaches to a ribosome after leaving the nucleus.
  • Ribosome:
    • Complex structure made of large and small subunits.
    • Made of proteins and ribosomal RNA (rRNA).
    • Has binding sites for molecules involved in translation:
      • mRNA binds to the small subunit.
      • Two tRNA molecules bind to the large subunit simultaneously.
  • Translation depends on complementary base pairing between codons on mRNA and anticodons on tRNA.
  • tRNA (transfer RNA):
    • Free molecules in the cytoplasm.
    • Bind with specific amino acids.
    • Bring amino acids to the mRNA on the ribosome.
    • The anticodon on tRNA pairs with a complementary codon on mRNA.

Codons & Anticodons

  • Codons: sequences of three mRNA bases that code for a specific amino acid.
  • A triplet is a sequence of three DNA bases that codes for a specific amino acid
  • Anticodon: a sequence of three tRNA bases complementary to a codon.
  • tRNA carries the appropriate amino acid to the ribosome.
  • Stop codons: signal the end of translation.

Structure of tRNA

  • The anticodon is located at the bottom of the tRNA molecule.
  • It consists of three exposed RNA bases.

Analogy

  • Transcription and translation are like converting between languages.
  • Transcription: converting text from English to French (similar alphabets with slight differences).
  • Translation: converting text from a Western language to Japanese (different alphabets).
  • Complementary base pairing:
    • Adenine (A) pairs with Uracil (U).
    • Cytosine (C) pairs with Guanine (G).
  • Example:
    • mRNA codon: CAG
    • tRNA anticodon: GUC.
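As a quick check of the codon-anticodon example, the RNA pairing rules can be written as a lookup table. The names `RNA_PAIRS` and `anticodon` are illustrative only, and the anticodon is written position by position, as in the notes, without regard to strand orientation.

```python
# RNA-RNA pairing rules: A with U, C with G.
RNA_PAIRS = {"A": "U", "U": "A", "C": "G", "G": "C"}

def anticodon(codon: str) -> str:
    """Return the tRNA anticodon that pairs with an mRNA codon."""
    return "".join(RNA_PAIRS[base] for base in codon)

print(anticodon("CAG"))  # GUC
```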

The Genetic Code

Features of the Genetic Code

  • The sequence of DNA nucleotide bases within a gene is read as a triplet code.
  • Each triplet codes for one amino acid.
  • There are 20 different amino acids that cells use to make up different proteins
  • Examples (DNA triplets as they appear on the template strand; the matching mRNA codons are complementary):
    • CAG codes for valine.
    • TTC codes for lysine.
    • GAC codes for leucine.
    • CCG codes for glycine.
  • Some triplets code for start (TAC - methionine) and stop signals.
  • The cell reads the DNA correctly and produces the correct sequences of amino acids (and therefore the correct protein molecules) that it requires to function properly
  • Non-overlapping: each base is read only once, as part of a single codon.
  • Degenerate: multiple codons can code for the same amino acid. Four bases give 64 possible codons (4^3 = 64), yet there are only 20 amino acids.
  • Universal: almost every organism uses the same code.
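The arithmetic behind the triplet code can be checked by counting how many distinct base sequences exist for each code length. This is a small sketch; `codes_available` is a name invented for the example.

```python
from itertools import product

RNA_BASES = "UCAG"
AMINO_ACIDS = 20  # number of amino acids stated in the notes

def codes_available(length: int) -> int:
    """Count the distinct base sequences of the given length (4^length)."""
    return len(list(product(RNA_BASES, repeat=length)))

for length in (1, 2, 3):
    n = codes_available(length)
    print(length, n, "enough" if n >= AMINO_ACIDS else "too few")
# 1 4 too few
# 2 16 too few
# 3 64 enough
```

A two-base code would give only 16 combinations, too few for 20 amino acids, while three bases give 64, which is why several codons can share one amino acid.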

Deducing Amino Acid Sequences

  • It is possible to determine the sequence of amino acids coded for in the polypeptide by observing the genetic code in the mRNA.

Worked Example

  • DNA coding strand sequence: TTC GAG CAT TAC GCC
  • Step 1: Work out the template sequence using A-T and C-G base pairing rules
    • AAG CTC GTA ATG CGG
  • Step 2: Work out the mRNA codons, complementary to the template strand
    • UUC GAG CAU UAC GCC
  • Step 3: Use an mRNA codon table to work out the first amino acid
    • First base in codon = U, second base = U, third base = C
    • Looking in the top-left box of the table, this amino acid is Phe
  • Step 4: Repeat for the remaining 4 codons
    • GAG = Glu
    • CAU = His
    • UAC = Tyr
    • GCC = Ala
  • The final sequence of amino acids is Phe-Glu-His-Tyr-Ala
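The four steps of the worked example can be reproduced with a short script. This is a sketch under stated assumptions: `PARTIAL_CODON_TABLE` holds only the five codons used in this example (a full table has 64 entries), and `deduce_peptide` is an illustrative name.

```python
# DNA-DNA pairing (coding -> template) and template -> mRNA pairing.
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

# Partial codon table: only the codons needed for this worked example.
PARTIAL_CODON_TABLE = {
    "UUC": "Phe", "GAG": "Glu", "CAU": "His", "UAC": "Tyr", "GCC": "Ala",
}

def deduce_peptide(coding: str) -> str:
    """Coding strand -> template strand -> mRNA codons -> amino acids."""
    triplets = coding.split()
    template = ["".join(DNA_PAIRS[b] for b in t) for t in triplets]
    codons = ["".join(TEMPLATE_TO_RNA[b] for b in t) for t in template]
    return "-".join(PARTIAL_CODON_TABLE[c] for c in codons)

print(deduce_peptide("TTC GAG CAT TAC GCC"))  # Phe-Glu-His-Tyr-Ala
```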

Elongation of the Polypeptide Chain

  • Two tRNA molecules fit onto the ribosome at a time, bringing amino acids side by side.
  • The ribosome moves along the mRNA molecule, one codon at a time.
  • A peptide bond is then formed (by condensation) between the two amino acids.
    • The formation of a peptide bond between amino acids is an anabolic reaction.
    • It requires energy, in the form of ATP.
    • ATP is provided by the mitochondria.
  • This process continues until a ‘stop’ codon on the mRNA molecule is reached
  • This acts as a signal for translation to stop and at this point the amino acid chain coded for by the mRNA molecule is complete
  • This amino acid chain is then released from the ribosome and forms the final polypeptide
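The "read one codon at a time until a stop codon" logic can be sketched as a simple loop. The mRNA sequence below is made up for illustration, `DEMO_CODON_TABLE` covers only the codons it contains, and `translate` is an illustrative name.

```python
# The three stop codons of the standard genetic code.
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Partial table: only the codons present in the made-up example sequence.
DEMO_CODON_TABLE = {"AUG": "Met", "UUC": "Phe", "GAG": "Glu", "GCC": "Ala"}

def translate(mrna: str) -> list[str]:
    """Read the mRNA codon by codon, stopping at the first stop codon."""
    chain = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break  # translation ends; the chain is released
        chain.append(DEMO_CODON_TABLE[codon])
    return chain

print(translate("AUGUUCGAGUAAGCC"))  # ['Met', 'Phe', 'Glu']
```

The final codon (GCC) is never read because the stop codon UAA comes before it.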

Protein Structure & Mutations

  • Gene mutation: a change in the sequence of bases in a DNA molecule; this may result in a new allele
  • Mutations occur all the time and occur randomly
  • Many mutations are copying errors that take place when DNA is replicated during S phase of interphase
  • Mutations in a gene can lead to a change in the polypeptide for which the gene codes
  • Most mutations are harmful or neutral (have no effect) but some can be beneficial
  • Inheritance of mutations:
    • Mutations present in normal body cells are not inherited; they are eliminated once the affected cells die
    • Mutations within gametes are inherited by offspring, so can lead to heritable genetic conditions
  • Point mutations are mutations where one base in the DNA sequence is altered; this can result in a changed amino acid at this location

Sickle Cell Disease

  • Sickle cell disease is a genetic disorder caused by a single point mutation within the gene that codes for the beta-globin polypeptide in haemoglobin (Hb)
  • Most humans have the allele Hb^A
  • The mutation results in a new allele Hb^S
  • Within the haemoglobin gene a point mutation changes the DNA triplet GAG to GTG on the coding strand
  • The resulting DNA triplet (CAC) on the template strand is transcribed into the mRNA codon GUG, instead of GAG
  • During translation the amino acid valine (Val) replaces the original amino acid glutamic acid (Glu)
  • This occurs at the sixth position of the polypeptide
  • The protein haemoglobin S is produced instead of haemoglobin A; this causes a distortion in the shape of red blood cells, resulting in a sickle shape
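The effect of the point mutation can be traced from coding strand to amino acid with a short sketch. The two-entry `CODONS` table and the function name `amino_acid_from_coding` are assumptions made for this example only.

```python
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

# Only the two codons relevant to the sickle cell mutation.
CODONS = {"GAG": "Glu", "GUG": "Val"}

def amino_acid_from_coding(triplet: str) -> tuple[str, str, str]:
    """Return (template triplet, mRNA codon, amino acid) for a coding triplet."""
    template = "".join(DNA_PAIRS[b] for b in triplet)
    codon = "".join(TEMPLATE_TO_RNA[b] for b in template)
    return template, codon, CODONS[codon]

print(amino_acid_from_coding("GAG"))  # ('CTC', 'GAG', 'Glu')  Hb^A
print(amino_acid_from_coding("GTG"))  # ('CAC', 'GUG', 'Val')  Hb^S
```

A single base change (A to T on the coding strand) is enough to swap glutamic acid for valine.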

Sickle-shaped red blood cells:

  • Have a limited oxygen-carrying capacity
  • Block the capillaries and limit the flow of normal red blood cells, restricting oxygen supply to the tissues
  • People with sickle cell anaemia suffer from acute pain, fatigue and anaemia
  • There is a correlation between the global distribution of sickle cell disease and malaria
  • In areas with increased malaria cases there is an increased frequency of sickle cell alleles; this is thought to be due to increased resistance to the malaria parasite in individuals with the Hb^S allele

Mechanism of Transcription (HL)

Directionality of Transcription & Translation

  • The synthesis of mRNA occurs in three stages:
    • Initiation
    • Elongation
    • Termination
  • During initiation, RNA polymerase binds to the promoter, causing the DNA strands to separate to form an open complex
  • During elongation, RNA polymerase moves along the template strand
  • RNA polymerase adds the 5' end of the free RNA nucleotide to the 3' end of the growing mRNA molecule
  • Elongation occurs in a 5' to 3' direction, synthesising a single strand of RNA
  • Termination occurs when RNA polymerase reaches a terminator sequence, which triggers the detachment of the polymerase enzyme and mRNA strand
  • When the mRNA is translated at the ribosome it is also read in the 5' to 3' direction

Initiation of Transcription

  • Gene expression varies in different cells
  • Genes are not expressed equally in every cell
  • Essential genes needed for the survival of an organism are expressed all the time, e.g. genes for the main enzymes in the respiratory pathways or ATP synthase
  • Other genes are only expressed when needed and at levels that make specific amounts of protein, e.g. the gene for rhodopsin, which is only expressed in light-sensitive receptor cells of the eye
  • Regulatory mechanisms exist to ensure the correct genes are expressed at the correct time
  • These mechanisms are different between prokaryotes and eukaryotes but both employ transcription factors and other proteins that bind to specific sequences in DNA

The function of the promoter

  • Non-coding sequences produce functional RNA molecules like transfer RNA (tRNA) or are involved in the regulation of gene expression such as enhancers, silencers and promoters
  • The promoter is a non-coding sequence located near to a gene
  • The promoter is not itself transcribed
  • The promoter acts as the binding site for RNA Polymerase during the initiation of transcription
  • Binding of RNA Polymerase to the promoter is under the control of various regulatory proteins
  • Eukaryotes regulate gene expression in response to variations in their environment
  • Specific proteins bind to DNA to regulate transcription and ensure that only the genes required are being expressed in the correct cells, at the correct time and to the right level
  • This is key to how processes of cellular differentiation and development in multicellular organisms are controlled
  • General transcription factors are a type of transcription factor that binds directly to the promoter to help initiate transcription
  • This helps RNA polymerase to attach to the promoter and start transcribing the gene
  • In eukaryotes, several general transcription factors are needed for transcription

Non-coding DNA Sequences

  • DNA molecules are very long but only certain regions code for the production of polypeptides; these are called coding sequences
  • In humans only about 1.5% of the genome contains coding sequences
  • The majority of a eukaryotic genome contains non-coding regions of DNA that do not code for polypeptides but have other important functions
  • Non-coding gene regulatory sequences are involved in the control of gene expression by enhancing or suppressing transcription
  • Non-coding sequences can produce functional RNA molecules like transfer RNA (tRNA) or ribosomal RNA (rRNA)
  • Introns are non-coding sequences of DNA found within genes of eukaryotic organisms
  • Different proteins can be produced from a gene depending on how introns are removed
  • Telomeres are regions of repeated nucleotide sequences at the end of chromosomes that provide protection during cell division
  • The repeated sequence facilitates binding of an RNA primer at the end of the chromosome leading to synthesis of an Okazaki fragment
  • Without telomeres, DNA replication could not continue to the end of the DNA molecule and chromosomes would become shorter after every cell division
  • Nonetheless, telomeres shorten with age due to oxidative damage within cells
  • Loss of telomeres during ageing can be accelerated by smoking, exposure to pollution, obesity, stress and poor diet
  • Antioxidants in the diet are claimed to reduce the rate of telomere shortening

Post-Transcriptional Modification (HL)

Post-Transcriptional Modification

  • In all kingdoms of life, gene expression can be regulated after an mRNA transcript has been produced
  • Post-transcriptional modification of mRNA:
    • Helps prevent degradation; mRNA is single stranded and therefore inherently unstable
    • Increases the efficiency of protein synthesis
    • In eukaryotes, expands the complexity of the proteome
  • Prokaryotic mRNA does not require any significant post-transcriptional modification as translation can occur immediately which prevents degradation of the mRNA
  • In eukaryotes, transcription and translation occur in separate parts of the cell, allowing for significant post-transcriptional modification to occur
  • In eukaryotes, the immediate product of transcription is called pre-mRNA, which needs to be modified to form mature mRNA
  • Three post-transcriptional events must occur:
    • A methylated cap is added to the 5' end to protect against degradation by exonucleases
    • A poly-A tail (long chain of adenine nucleotides) is added to the 3' end for further protection and to help the transcript exit the nucleus
    • Non-coding sequences (introns) are removed and coding sequences (exons) are joined together

Alternative Splicing

  • Eukaryotic genes contain both coding and non-coding sequences of DNA
    • Coding sequences are called exons
    • Non-coding sequences are called introns
  • During transcription the whole gene is transcribed including all introns and exons
  • Introns are not translated as they do not code for amino acids and need to be removed
  • Before the pre-mRNA exits the nucleus, splicing occurs, during which:
    • Introns (non-coding sections) are removed
    • Exons (coding sections) are joined together
  • The resulting mature mRNA molecule contains only exons and exits the nucleus before joining a ribosome for translation

Alternative splicing

  • The exons (coding regions) of genes can be spliced in many different ways to produce different mature mRNA molecules through alternative splicing
  • A particular exon may or may not be incorporated into the final mature mRNA
  • Polypeptides translated from alternatively spliced mRNAs may differ in their amino acid sequence, structure and function
  • This means that a single eukaryotic gene can code for multiple proteins
  • This is part of the reason why the proteome is much bigger than the genome
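Alternative splicing can be pictured as choosing which exons to keep when the introns are removed. The sketch below uses a made-up pre-mRNA: the segment names and base sequences in `PRE_MRNA` are hypothetical, and `splice` is an illustrative helper, not a model of the real splicing machinery.

```python
# Hypothetical pre-mRNA: (segment name, base sequence) in order along the transcript.
PRE_MRNA = [
    ("exon1", "AUGGCC"),
    ("intron1", "GUAAGU"),
    ("exon2", "UUCGAG"),
    ("intron2", "GUGAGU"),
    ("exon3", "CAUUAA"),
]

def splice(segments, skip=()):
    """Remove all introns and join the exons, optionally skipping some exons."""
    return "".join(
        seq for name, seq in segments
        if name.startswith("exon") and name not in skip
    )

print(splice(PRE_MRNA))                  # AUGGCCUUCGAGCAUUAA
print(splice(PRE_MRNA, skip={"exon2"}))  # AUGGCCCAUUAA
```

The same gene yields two different mature mRNAs, and therefore two different polypeptides, depending on whether exon 2 is included.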

Translation & the Proteome (HL)

Initiation of Translation

  • During translation, the specific sequence of messenger RNA (mRNA) is translated to produce a polypeptide chain consisting of amino acids
  • mRNA is a single stranded, linear, RNA molecule that transfers the information in DNA from the nucleus into the cytoplasm
  • Translation is categorised into three stages: initiation, elongation and termination
  • Translation occurs in the cytoplasm at complex molecules made of protein and RNA called ribosomes
    • Ribosomes have a two-subunit (large and small) structure that helps bind mRNA
  • Ribosomes have three tRNA binding sites termed “E” (exit), “P” (peptidyl) and “A” (aminoacyl)
    • At the A site the mRNA codon joins with the tRNA anticodon
    • At the P site the amino acids attached to the tRNA are joined by peptide bonds
    • At the E site the tRNA exits the ribosome
  • Another key molecule in translation is transfer RNA (tRNA), which decodes mRNA
  • tRNA molecules are single stranded RNA molecules that fold to form a clover-leaf-shaped structure
    • The folded structure is held together by hydrogen bonds between bases at different points on the strand
  • tRNA molecules are among the shortest RNA molecules, being only around 80 nucleotides in length
  • There are at least 20 different types of tRNA molecule, at least one for each of the amino acids involved in protein synthesis
  • tRNA molecules have a region that binds to a specific amino acid as well as a three-nucleotide region called an anticodon that is complementary to the codon on mRNA
  • The role of tRNA molecule is to carry a specific amino acid to the ribosome

Structure of tRNA

  • In eukaryotic cells, the mRNA molecule leaves the nucleus through the nuclear pores
  • Translation is initiated by the following process:
    • A small ribosomal subunit attaches to the 5’ end of mRNA
    • An initiator tRNA molecule carrying the amino acid methionine binds to the small ribosomal subunit
    • The initiator tRNA occupies the “P” site on the ribosome
    • The ribosome moves along the mRNA until it locates a start codon (AUG)
    • The large ribosomal subunit binds to the small subunit
    • Elongation of the polypeptide can begin
  • The initiator tRNA currently occupies the “P” site; the next codon on the mRNA signals for the corresponding tRNA to bind at the “A” site
  • The two amino acids (attached to the tRNAs) are linked with a peptide bond, forming a dipeptide
  • Synthesis of the peptide chain now involves a repeated cycle of events:
    • In the cytoplasm, free tRNA molecules bind to their corresponding amino acids and transport them to the ribosome
    • The ribosome shifts along the mRNA one codon (three bases) at a time
    • The initiator tRNA in the “P” site moves to the “E” site which releases it
    • The tRNA carrying the peptide chain moves from the “A” site to the “P” site
  • The next mRNA codon is exposed and a tRNA with the complementary anticodon binds to the unoccupied “A” site whilst its amino acid is linked to the polypeptide chain
  • The cyclical process is repeated as new amino acids are added to the growing chain

Modification of Polypeptides

  • Once the primary structure of the polypeptide has been synthesised during translation it is often not immediately usable by the cell
  • The polypeptide must be modified in order to be transformed into a functional protein
  • Some examples of modifications include:
    • Protein folding into the secondary, tertiary and quaternary structures, including the formation of disulfide bonds in the tertiary and quaternary stages
    • Folding can require molecular chaperones that help to prevent incorrect folding
  • The formation of insulin requires polypeptide modification
  • When insulin is first synthesised it is in the form of a polypeptide chain 110 amino acids long called pre-proinsulin, which is attached to the wall of the endoplasmic reticulum (ER)
  • It is then modified by an enzyme that removes a peptide called a signal peptide from the end, detaching it from the ER and transforming it to proinsulin
  • From there the proinsulin folds and disulfide bonds form between different sections of the polypeptide
  • The proinsulin is packaged into vesicles at the Golgi apparatus
  • The proinsulin is then cleaved (during which a section called the C peptide is removed from the middle) resulting in two chains (A-chain and B-chain) attached together with two disulfide bonds
  • This is the final, mature form of insulin, ready to be secreted from the cell and used in the body

Recycling of Amino Acids

  • Unneeded, damaged, or misfolded proteins can be broken down in the body so that their amino acids can be reused
  • This involves enzymes that break the peptide bonds in these proteins, releasing the amino acids to be used in translation to synthesise new proteins
  • Proteases are enzymes that break down proteins in this way
  • This process is called proteolysis
  • The proteasome is a large protein complex found in eukaryotic cells that acts as the location for proteolysis in the cell
  • Containing the protease activity within the proteasome prevents other useful cellular proteins being broken down by mistake
  • Proteins identified as being unneeded, damaged, or misfolded are tagged with a small protein called ubiquitin, which begins the process of them being broken down in the proteasome
  • This process is constantly taking place in the cell and is essential for sustaining a functional proteome