the genetic code — the rules that specify the relationship between sequence of nucleotides and sequence of amino acids in protein
triplet code — 3 base code
codon — three bases that specify a particular amino acid
reading frame — series of codons that specify a sequence of amino acids
start codon — AUG signals that protein synthesis should begin at that point on the mRNA molecule, codes for methionine
stop codons — UAA, UAG, UGA, do not code for any amino acid
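the definitions above can be sketched in Python. This is a minimal illustration, not a real translation tool: it builds the standard codon table from a 64-letter string (one-letter amino acid codes, "*" for stop), scans for the first AUG, and reads triplets until a stop codon. The names BASES, AMINO_ACIDS, CODON_TABLE and translate are made up for this sketch, and it assumes the first AUG found is the true start codon.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")  # assumption: first AUG sets the reading frame
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break  # stop codons do not code for any amino acid
        protein.append(aa)
    return "".join(protein)

print(translate("GGAUGGCUUGGUAAGG"))  # MAW (AUG GCU UGG, then UAA stops)
```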
IMPORTANT PROPERTIES
the code is redundant — all amino acids (except methionine and tryptophan) are coded for by more than one codon
code is unambiguous — a given codon never codes for more than one amino acid
code is non-overlapping — once the ribosome reads 1st codon, reading frame is established
code is (nearly) universal — all codons specify the same amino acids in almost all organisms
code is conservative — when several codons code for the same amino acid, the first two bases are usually identical
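these properties can be checked directly against the standard codon table. The sketch below is only a quick sanity check; the variable names (codons_for, single_codon, same_prefix) are made up for illustration. Unambiguity holds by construction here, because the table is a dictionary and each codon maps to exactly one value.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group codons by the amino acid they specify (stop codons excluded).
codons_for = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        codons_for[aa].append(codon)

# Redundant: only methionine (M) and tryptophan (W) have a single codon.
single_codon = sorted(aa for aa, codons in codons_for.items() if len(codons) == 1)

# Conservative: amino acids whose codons all share the same first two bases.
same_prefix = sorted(
    aa for aa, codons in codons_for.items()
    if len(codons) > 1 and len({c[:2] for c in codons}) == 1
)

print(single_codon)      # ['M', 'W']
print(len(same_prefix))  # 15 (the exceptions are L, R and S)
```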
ribosomes — site of protein synthesis
polyribosome — two or more ribosomes simultaneously translating one mRNA
transcription/translation coupling only occurs in bacteria, because there is no nuclear envelope separating the two processes
aminoacyl tRNA — tRNA with an amino acid attached
relatively short sequences, about 75-95 nucleotides
forms stem-loop structures with itself
CCA sequence at 3’ end of each tRNA is the site for amino acid attachment
the loop opposite the attachment site carries the anticodon, which pairs with the codon in antiparallel orientation
ATP required to attach an amino acid to tRNA
aminoacyl-tRNA synthetases catalyze the addition of amino acids to tRNAs
for each unique amino acid, there is a different aminoacyl-tRNA synthetase and one or a few tRNAs
each aminoacyl-tRNA synthetase has a binding site for a particular amino acid and a particular tRNA
wobble pairing — nonstandard base pairing between nucleotide in the 3rd position of a codon and corresponding nucleotide in the anticodon of a tRNA
allows one tRNA to read more than one codon, so about 40 tRNAs can translate all 61 codons
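a rough sketch of wobble pairing in Python, using simplified textbook wobble rules (these rules are an assumption brought in for illustration and are not stated in these notes; real tRNAs carry further base modifications). The anticodon is written 5’→3’, so its first base is the wobble position and pairs with the third base of the codon. PAIR, WOBBLE and codons_read are made-up names.

```python
# Standard Watson-Crick partners in RNA.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Simplified wobble rules: first (5') anticodon base -> allowed third codon bases.
# "I" is inosine, a modified base found in some tRNA anticodons.
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "UCA"}

def codons_read(anticodon):
    """Return the codons (5'->3') that an anticodon (5'->3') can pair with."""
    wobble_base, middle, last = anticodon
    # Pairing is antiparallel: anticodon position 3 pairs with codon position 1.
    prefix = PAIR[last] + PAIR[middle]
    return [prefix + third for third in WOBBLE[wobble_base]]

print(codons_read("GAA"))  # ['UUC', 'UUU'] (both phenylalanine)
print(codons_read("IGC"))  # ['GCU', 'GCC', 'GCA'] (all alanine)
```

because one anticodon can cover two or three codons that differ only in the third base, fewer tRNAs than codons are needed.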
introns vary widely in length (tens to tens of thousands of bases)
splicing occurs once RNA polymerase has finished transcribing that particular section, and usually while the rest of the pre-mRNA is still being transcribed (RNA polymerase doesn’t have to have finished the whole transcript yet)
must be completed prior to mRNA export out of the nucleus
conserved sequences at both ends mark splice points; rest of intron sequence not directly part of removal but may play regulatory roles
carried out by combinations of small nuclear RNA molecules and proteins (snRNPs), which precisely remove intron and join ends of exons
steps
U1 snRNP binds to pre-mRNA (at 5’ splice point)
U2 snRNP binds to pre-mRNA at the branch point, closer to the 3’ splice site
U4/U6 and U5 snRNPs join
RNA is cleaved at 5’ splice site and lariat is formed
RNA is cleaved at 3’ splice site, and the two exons are joined
excised intron is degraded to nucleotides
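the net result of these steps (intron out, exons joined) can be sketched as a string operation. This ignores the snRNPs and the lariat entirely. It assumes the intron positions are already known and only checks that each intron starts with GU and ends with AG, the consensus for most introns (a detail not spelled out in these notes). The function name and the toy sequence are hypothetical.

```python
def splice(pre_mrna, introns):
    """Remove introns given as (start, end) half-open index pairs."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        # Most introns begin with GU and end with AG (the GU-AG rule).
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"no consensus splice sites at {start}-{end}")
        exons.append(pre_mrna[position:start])
        position = end
    exons.append(pre_mrna[position:])  # last exon
    return "".join(exons)

# Toy pre-mRNA: exon, intron, exon.
pre = "AUGGCU" + "GUAAGUCCCUAG" + "UGGUAA"
print(splice(pre, [(6, 18)]))  # AUGGCUUGGUAA
```

shifting the intron coordinates by even one base either fails the check or removes the wrong bases, which is the string-level version of inaccurate splicing.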
inaccurate splicing leads to defective proteins
thalassemias (defective hemoglobin) often caused by inaccurate splicing of introns
mutations in DNA carried over to RNA, and intron/exon junction not recognized by snRNPs
intron not removed
“false” splice sites then used by snRNPs, removing necessary exon sequences → defective protein
human genes assemble their coding regions in an extensive array of combinations
just because all exons are present in the primary mRNA does not mean they all have to be used
different mature mRNA = different protein
between 50% and 90% of all human genes are alternatively spliced
number of different transcripts from a given gene ranges from 2 to hundreds or thousands in humans (more in other species)
average in humans is 3-4 different transcripts per gene
accounts for 100,000 proteins from 21,000 genes
so genome size (and gene count) does not correlate well with complexity in eukaryotes
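to see how one gene can give many transcripts, here is a small counting sketch. It assumes a simplified model in which the first and last exons are always kept, every internal exon can be independently included or skipped, and exon order never changes. Real genes are more constrained than this, and the exon names E1 to E4 are hypothetical.

```python
from itertools import combinations

def isoforms(exons):
    """List exon combinations, keeping the first and last exon in every one."""
    first, *internal, last = exons
    results = []
    for k in range(len(internal) + 1):
        # combinations() preserves the original exon order.
        for kept in combinations(internal, k):
            results.append([first, *kept, last])
    return results

for transcript in isoforms(["E1", "E2", "E3", "E4"]):
    print("-".join(transcript))
# E1-E4
# E1-E2-E4
# E1-E3-E4
# E1-E2-E3-E4
```

under this model n optional internal exons give 2^n possible mature mRNAs, so 10 optional exons already allow 1,024 combinations.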
modifying 5’ end of mRNA:
RNA 5’-triphosphatase removes a phosphate group
guanylyl transferase hydrolyzes GTP, removing two phosphate groups. GMP is attached to 5’ end, and the PPi is released
methyltransferase attaches a methyl group, creating a 7-methylguanosine cap
so there are three phosphate groups, but they are not free (they bridge the cap and the first nucleotide), thereby preventing nucleases from degrading the 5’ end of mRNA
modifying 3’ end of mRNA:
poly(A) tail added to 3’ end posttranscriptionally by poly-A polymerase
basically, once RNA polymerase has transcribed past the cleavage site, the 3’ end is cleaved by a ribonuclease and poly-A polymerase adds As onto the new 3’ end, creating a poly(A) tail
function of 5’ cap:
allows recognition of “start signal” for translation
provides some stability to mRNA (protection from degradation by RNases)
function of poly-A tail
provides some (temporary) stability to mRNA (protect it from degradation by RNases)
aids in allowing multiple copies of a protein to be efficiently made from a single eukaryotic mRNA
DNA template has exon-intron-exon-intron-exon…
coding strand is 5’ → 3’
template strand is 3’ → 5’
transcription creates an initial transcript 5’ → 3’
excision, splicing the introns out
capping at 5’ end
poly-A addition at 3’ end
mature RNA: 5’-(m^7Gppp cap)—joined exons—(poly A tail)-3’
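the whole path from template strand to mature RNA can be sketched with strings. This is a toy model under stated assumptions: the template is written 3’→5’ so the transcript comes out 5’→3’, intron positions are given, the cap is just a text label, and tail_length is a made-up parameter (real poly(A) tails are much longer). The steps run one after another here only for simplicity.

```python
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5):
    """Build the 5'->3' transcript from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)

def process(pre_mrna, introns, tail_length=10):
    """Remove introns, then mark the 5' cap and add a poly(A) tail."""
    exons, position = [], 0
    for start, end in sorted(introns):  # (start, end) half-open index pairs
        exons.append(pre_mrna[position:start])
        position = end
    exons.append(pre_mrna[position:])
    # "m7Gppp" is only a label standing in for the 7-methylguanosine cap.
    return "m7Gppp-" + "".join(exons) + "-" + "A" * tail_length

template = "TACCGACAGGTCACCATT"  # hypothetical template strand, 3'->5'
pre_mrna = transcribe(template)
print(pre_mrna)                                      # AUGGCUGUCCAGUGGUAA
print(process(pre_mrna, [(6, 12)], tail_length=5))   # m7Gppp-AUGGCUUGGUAA-AAAAA
```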
in bacteria
all of this is happening in cytosol
DNA -(transcription)→ mRNA → ribosome -(translation)→ polypeptide
in eukaryotes
in nucleus: DNA -(transcription)→ pre-mRNA -(RNA processing)→ mRNA
export mRNA out the nucleus into the cytosol
mRNA → ribosome -(translation)→ polypeptide
nucleotide sequence (ACGU) of mRNA has information for making protein
but proteins are made of amino acids
3 nucleotides per amino acid is the minimum number needed to account for the 20 amino acids (2 bases give only 4^2 = 16 combinations; 3 bases give 4^3 = 64)
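the counting argument is easy to check: with 4 bases, a codon of length n has 4^n possible sequences, and the code needs at least 20. A minimal sketch (the function name and parameters are made up):

```python
def min_codon_length(n_bases=4, n_amino_acids=20):
    """Smallest codon length whose combinations cover every amino acid."""
    length = 1
    while n_bases ** length < n_amino_acids:
        length += 1
    return length

for n in (1, 2, 3):
    print(n, 4 ** n)           # 1 4, then 2 16, then 3 64
print(min_codon_length())      # 3
```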
related experiments uncovered what each of the 64 codons specified
61 specify an amino acid
3 specify the end of the protein (stop codons)
if 61 codons specify amino acids, and there are only 20 amino acids, what are the other 41 used for?
genetic code is degenerate or redundant
most amino acids can be specified by multiple codons
genetic code is unambiguous — one codon can only specify one amino acid
genetic code is non-overlapping