genetics
Introduction to Nucleic Acids
There are two types of nucleic acids: ribonucleic acid (RNA) and deoxyribonucleic acid (DNA).
- They allow organisms to reproduce their complex equipment from one generation to the next.
- DNA can reproduce itself, which is the basis for life on Earth.
- DNA contains the instructions for the cell's activities but does not carry them out itself.
- RNA is involved in translating the genetic code into a protein.
Nucleotides
Nucleic acids are made of monomers called nucleotides, each with 3 parts: a phosphate group, a pentose (5C sugar), and a nitrogenous base.
- There are 2 families of nitrogenous bases, purines and pyrimidines; they are bound to the pentose by a glycosidic bond.
- Pyrimidines are made of a 6-member ring of carbon and nitrogen → cytosine, thymine, and uracil (in RNA).
- Purines are made of a 5-member ring fused onto a pyrimidine ring → adenine and guanine.
Polynucleotides
Nucleic acids are polynucleotides, in which the nucleotide monomers are joined via dehydration synthesis.
- The phosphate of one nucleotide is joined by a phosphodiester linkage to the sugar of the next nucleotide, resulting in a repeating sugar-phosphate-sugar-phosphate backbone.
- While the backbone repeats without variation, the nitrogenous bases vary from one nucleotide to the next.
Double Helix
- The DNA molecule consists of 2 polynucleotide chains that spiral around an axis to form a double helix.
- The 2 sugar-phosphate backbones are on the outside, with the nitrogenous bases paired inside.
- The paired bases are held together by hydrogen bonds; each bond is weak, but there are so many that the structure is held intact.
Strand Orientation
- The carbons of each sugar ring are numbered 1' to 5' (read "one prime" to "five prime").
- The backbones are oriented in opposite directions, which is called antiparallel. Therefore, if one strand has a 5'-3' orientation, the other strand must have a 3'-5' orientation.
- Many proteins that work on the DNA molecule are designed to begin at one specific end.
Base Pairing and Coding
- Cytosine pairs with guanine, and adenine pairs with thymine.
- If the sequence of one chain is known, the other can be deduced; this allows precise copying of the genetic code.
- Although the bases must pair in this way, there is no limit on the sequence of pairs along the molecule, so there are countless ways of arranging base pairs.
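The pairing rules and antiparallel orientation above can be sketched in a few lines of Python. This is only an illustration: the function name and the example sequence are made up for this sketch, not taken from the notes.

```python
# Pairing rules from the notes: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complementary_strand(strand_5_to_3):
    """Return the partner strand, also written 5' to 3'."""
    # Pair every base, walking the input backwards because the
    # two strands run in opposite (antiparallel) directions.
    return "".join(PAIRS[base] for base in reversed(strand_5_to_3.upper()))

print(complementary_strand("ATGCCG"))  # CGGCAT
```

Applying the function twice returns the original sequence, which is the sense in which one strand fully determines the other.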
DNA Replication
Daughter cells receive an exact copy of the mother cell's DNA. Three models were suggested: dispersive, conservative, and semiconservative.
- Dispersive replication → would produce 2 copies of the DNA, both containing distinct regions composed of either both original strands or both new strands.
- Conservative replication → would leave the two original template strands together in a double helix and produce a copy composed of two new strands containing all of the new DNA base pairs.
- Semiconservative replication → would produce two copies that each contain one of the original strands and one new strand.
Watson and Crick suggested that the two strands separate during replication and serve as templates for assembling complementary strands (the semiconservative model). In the late 1950s, Meselson and Stahl confirmed this by following the density of DNA over successive generations:
- Parent generation = all heavy
- First generation = all intermediate
- Second generation = half intermediate + half light
Meselson-Stahl Experiment
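The heavy / intermediate / light pattern follows directly from the semiconservative model, as a small simulation can show. This is a minimal sketch: the "H" and "L" labels and the function names are assumptions made for the illustration, and it models only strand bookkeeping, not the actual density measurements.

```python
from collections import Counter

def replicate(duplexes):
    """One round of semiconservative replication in light medium."""
    next_generation = []
    for strand_a, strand_b in duplexes:
        # Each old strand is kept and paired with a newly made light strand.
        next_generation.append((strand_a, "L"))
        next_generation.append((strand_b, "L"))
    return next_generation

def density_bands(duplexes):
    """Count duplexes by how many heavy strands they contain."""
    names = {2: "heavy", 1: "intermediate", 0: "light"}
    return Counter(names[pair.count("H")] for pair in duplexes)

generation = [("H", "H")]  # parent DNA: both strands heavy
for number in range(3):
    print(number, dict(density_bands(generation)))
    generation = replicate(generation)
```

Under the conservative model the first generation would instead contain one heavy and one light duplex, which is not what was observed.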
The actual process is rather complicated: the double helix must be untwisted, and the antiparallel strands must be copied at the same time.
- It happens very quickly: nucleotides are added at a rate of about 50/second in mammals and 500/second in bacteria.
- Despite this speed, accuracy is very high, with an error rate of about 1 in a billion.
Focus on Replication
DNA replication occurs in 3 phases: Initiation → Elongation → Termination
Initiation
- Replication begins at distinct locations along the DNA molecule called origins of replication; the proteins required to begin replication bind to these locations.
- Organisms with small amounts of DNA (bacteria, viruses) may have just a single replication origin, whereas the huge DNA molecules in eukaryotic cells may have thousands.
- DNA replication may be ongoing at many sites simultaneously.
- Replication forks: places on the molecule where DNA is being copied.
Strand separation: 2 types of proteins are involved
- Helicase: unwinds the helix.
- Single-strand binding proteins: keep the separated strands apart and protect them, since the molecule is physically stressed once the strands are separated.
Topoisomerases: enzymes that untangle the DNA and relieve the stress that strand separation places on the molecule.
Elongation
Priming
- Nucleotides can only be added to an existing polynucleotide chain, so a primer is needed first.
- Primase: an enzyme that makes a short single-stranded segment of RNA (the primer) that complements the template strand (original strand).
- The RNA primer is later replaced with DNA in a subsequent step.
Making the new DNA strands
- Once the primer is in place, another group of enzymes called DNA polymerases go to work. They add new nucleotides to the growing strand in a 5' to 3' direction; new nucleotides are always added to the 3' end.
- The energy to add each nucleotide comes from the nucleotides themselves: the incoming nucleotide is a triphosphate, and a pyrophosphate is released as it joins the chain. The energy released is used to form the covalent bonds of the phosphodiester linkage.
Leading and lagging strands
- Leading strand: DNA is synthesized as one long, continuous polymer.
- Lagging strand: produced as a series of shorter segments, each made in the 5' to 3' direction. The segments are called Okazaki fragments.
- To make these fragments into a continuous strand of DNA, a couple of proteins are needed. First, one of the DNA polymerases replaces the RNA primer with DNA. Then DNA ligase, a linking enzyme, joins the 3' end of the fragment to the 5' end of the growing chain.
DNA replication fork
Proofreading: 1 in 100k error rate
- While the overall error rate in DNA replication is around 1 in a billion, this is not achieved by accurate base pairing alone.
- The error rate in base pairing is closer to 1 in 10,000.
- The polymerase checks each newly added nucleotide against the template strand.
- If the pairing is incorrect, the polymerase backs up, removes the incorrect nucleotide, and replaces it before continuing.
DNA Repair
DNA molecules are commonly subjected to many dangers (reactive chemicals, radioactive emissions, x-rays, or light), resulting in as many as 1 million mutations/cell each day.
- Direct reversal: correcting a change by performing the opposite of the process that caused the mutation initially. UV light causes a type of change that can sometimes be corrected by a photoreactivation enzyme (photolyase).
- Single-strand damage: when only one strand is damaged, repair generally makes use of the undamaged strand as a guide for the correction.
- Double-strand breaks: when both strands in the double helix are broken; these can lead to rearrangement of genes. The danger is higher if the break occurs before replication for the next mitosis, as there is no template. Sometimes the broken ends can simply be reattached; at other times, the existing homologous chromosome can be used to guide the process.
- Translesion synthesis: a process that allows the cell to overlook the damaged parts of the DNA.
If the DNA can't be fixed?
If a cell has a large amount of DNA damage, or it can no longer effectively repair the damage, it can enter one of three possible states:
- Permanent dormancy -> health decline
- Apoptosis (programmed cell death)
- Unregulated cell division, which can lead to the formation of a tumour
Gene Expression I - Focus on Transcription
Central Dogma: chain of command
DNA → RNA → Protein
Genes provide instructions in the form of RNA, which is then used to program protein synthesis.
Transcription and Translation
- Transcription: the information is transcribed from DNA to RNA, but stays in the same language (nucleic acid).
- Translation: the languages of nucleic acids and amino acids are different, so the process of decoding RNA into protein is a translation.
Nucleic to Amino
- Nucleic acid "words" must be at least 3 bases long.
- Codon: a 3-nucleotide sequence of DNA or mRNA that specifies a particular amino acid; a codon can also serve as a termination signal.
Transcription Process
- The template is read 3' → 5', and the new strand is constructed 5' → 3' (for both DNA and RNA).
- RNA nucleotides are added one at a time following the same base-pairing rules (w/ uracil replacing thymine).
- The process is performed by the enzyme RNA polymerase.
- RNA polymerase starts transcribing at the promoter sequence and stops at the terminator sequence.
Initiation, Elongation and Termination
- Initiation: RNA polymerase binds to the promoter sequence and begins to unwind the DNA. After it transcribes a certain # of base pairs, elongation begins. A 5' cap is added to the front of eukaryotic mRNA shortly after the start of transcription; it protects the mRNA from RNases.
- Elongation: as the RNA forms, it peels away from the DNA, allowing hydrogen bonds to reform between the complementary DNA strands.
- Termination: at the terminator sequence, RNA polymerase releases the RNA and departs from the gene.
- The new RNA molecule in a eukaryotic cell goes through RNA processing. At the 3' end, a poly-A tail between 100 and 250 bases long is added. The poly-A tail makes the mRNA more stable and allows mature mRNA to be exported from the nucleus.
- A single gene may be transcribed by many RNA polymerases at once, which is useful for producing many transcripts quickly.
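The base-pairing step of transcription can be sketched as a simple lookup. This is a minimal illustration that ignores promoters, terminators, and RNA processing; the function name and the template sequence are invented for the example.

```python
# RNA pairing rules: same as DNA, but uracil (U) is used instead of thymine.
RNA_PAIRS = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe(template_3_to_5):
    """Build an mRNA (written 5'->3') from a DNA template written 3'->5'."""
    # The template is read 3'->5' while the RNA grows 5'->3',
    # so the bases can be paired in the order given.
    return "".join(RNA_PAIRS[base] for base in template_3_to_5.upper())

print(transcribe("TACGGCATT"))  # AUGCCGUAA
```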
Gene Expression II - Translation
Translation: amino acids added 1 by 1 to growing polypeptide chain
Transfer RNA (tRNA): responsible for bringing the appropriate amino acid to the mRNA and the growing complex. To perform this task, it carries out 2 functions:
- Pick up right amino acid
- Recognize the right codon for that amino acid on the mRNA sequence
The tRNA molecule is about 80 bases long and contains a # of complementary regions that fold back on one another, forming double-stranded regions.
At one end there is an anticodon: a sequence of 3 bases complementary to the mRNA codon.
At the other, the 3' end, is where the amino acid can bind.
The 3D shape is approximately L-shaped.
The correct amino acid is added by an enzyme, aminoacyl-tRNA synthetase; there is a different one for each amino acid. The active site of this enzyme joins the tRNA to the amino acid w/ energy from ATP.
Ribosomes
- Coordinate the process of binding tRNA to the sequence of mRNA codons.
- Made of proteins and rRNA.
- Have a binding site for mRNA and 2 for tRNA.
- P site: holds the tRNA carrying the growing polypeptide chain.
- A site: holds the tRNA carrying the next amino acid.
Protein synthesis
The process requires a # of enzymes, as well as phosphate energy provided by GTP (which is very similar to ATP).
Initiation
- This requires GTP energy and a # of proteins called initiation factors.
- The process determines the exact starting location, which is assisted by the existence of the 5' cap. If the starting location is off by even one nucleotide in either direction, the primary structure will be almost entirely different.
- The first step is for an mRNA, the small ribosomal subunit, and the first tRNA to be bound together. The first codon is usually AUG, and as such the first amino acid is usually methionine.
- The next step is the binding of the larger ribosomal subunit, at which point the ribosome is complete and functional.
Elongation
- The next amino acid, carried by the appropriate tRNA, moves to the A site.
- Peptidyl transferase catalyzes the formation of a peptide bond between the amino acid and the growing chain. The polypeptide detaches from the tRNA in the P site and is transferred to the tRNA in the A site.
- Translocation: the tRNA in the P site detaches, and the tRNA in the A site is translocated to the P site. The codon-anticodon bond remains intact, so the mRNA moves as well. The next codon to be translated moves to the A site.
Termination
- The ribosome reaches a termination codon.
- A protein called release factor binds to the A site at the termination codon and causes peptidyl transferase to hydrolyze the polypeptide chain from the tRNA.
- At this point, the two ribosome subunits, mRNA, and tRNA all dissociate.
In prokaryotes, protein synthesis can occur anywhere in the cell. In eukaryotes, the location depends on the final destination of the protein: membrane proteins are made on ribosomes that are bound to the endoplasmic reticulum, while cytoplasmic proteins are made by free ribosomes.
It is not uncommon for several ribosomes to be working on a single mRNA at a time. These clusters are called polyribosomes.
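The codon-by-codon logic of translation (start at AUG, read triplets, stop at a termination codon) can be sketched as follows. This is a toy under stated assumptions: the codon table is deliberately partial (only a few entries of the standard genetic code), and the function name and example mRNA are invented for the illustration.

```python
# A small excerpt of the standard genetic code; a full table has 64 entries.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "CCG": "Pro", "GGC": "Gly",
    "AAA": "Lys", "UGG": "Trp",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}

def translate(mrna):
    """Return the amino acids encoded from the first AUG to the first stop codon."""
    start = mrna.find("AUG")  # locate the start codon
    if start == -1:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        # Raises KeyError for codons missing from this partial table.
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "STOP":
            break  # release: the chain is finished
        peptide.append(residue)
    return peptide

print(translate("GGAUGCCGUUUUAAGG"))  # ['Met', 'Pro', 'Phe']
```

Shifting the start position by one nucleotide changes every codon that follows, which is why the starting location matters so much.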
Central Dogma
Wobble Phenomenon
- Although there are 64 possible codon combinations, there are only about 40 types of tRNA.
- Wobble: a relaxation of the normal base-pairing rules. Sometimes the anticodon base that pairs with the third position of the mRNA codon can pair with more than one type of base.
- For example, if uracil is in this wobble position of a tRNA anticodon, it can pair with adenine or guanine.
UNIVERSALITY OF THE GENETIC CODE
- Most of the genetic code is shared by all organisms. Bacteria can translate human DNA, and vice versa.
- However, there are some exceptions. Some involve a rare species that translates a stop codon into an amino acid.
- All other exceptions have to do with mitochondria, which have their own DNA and their own machinery.
Introns and Exons
- There are large segments of DNA molecules that do not code for protein. Additionally, it was found that noncoding segments of DNA actually fall within the boundaries of genes.
- The coding regions are exons (because they are expressed).
- The noncoding regions are introns (intervening sequences).
Example → Consider the following quote from Yoda: "Wars not make one great."
- Written the way DNA is written → WARSNOTMAKEONEGREAT
- Written with introns → WARSJSJDFPPTYNOTBVDMAKEXLKPVCONENWDWVGREAT
At the end of transcription, part of RNA processing involves the removal of introns; this is sometimes called RNA splicing. So WARSJSJDFPPTYNOTBVDMAKEXLKPVCONENWDWVGREAT would become WARSNOTMAKEONEGREAT, with JSJDFPPTY, BVD, XLKPVC, and NWDWV removed.
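The Yoda example can be run as a toy splicing function. Note the simplifying assumption: here the intron sequences are supplied as a list, whereas a real spliceosome recognizes the ends of introns rather than looking up their contents. The function name is invented for the sketch.

```python
PRE_MRNA = "WARSJSJDFPPTYNOTBVDMAKEXLKPVCONENWDWVGREAT"
INTRONS = ["JSJDFPPTY", "BVD", "XLKPVC", "NWDWV"]

def splice(transcript, introns):
    """Remove each intron once and join the remaining exons."""
    for intron in introns:
        # Cutting out the intron leaves the flanking exons joined together.
        transcript = transcript.replace(intron, "", 1)
    return transcript

print(splice(PRE_MRNA, INTRONS))  # WARSNOTMAKEONEGREAT
```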
- Particles called snRNPs are important to this process.
- Several different snRNPs make up the spliceosome, which interacts w/ the ends of the introns.
- It recognizes the ends of the introns and releases them from the mRNA molecule.
- The exons are then joined by the spliceosome.
Importance of introns
- Some introns code for proteins; others can be further processed after splicing to generate noncoding RNA molecules.
Control of Gene Expression
Control at the DNA level
DNA methylation: the addition of methyl groups to bases of DNA after DNA synthesis. It is involved in long-term control of gene expression and is important to cell differentiation.
- For example, once a cell specialises and becomes a stomach cell, this process helps the cell to "remember" that it is a stomach cell without needing some kind of constant signal as a reminder.
Control at transcription level
RNA Processing
- At the end of transcription, RNA must be processed before it can function. Different patterns of splicing can cause RNA to function in different ways, giving the cell more flexibility in how it expresses genes.
Regulating RNA degradation
- The lifetime of an mRNA molecule is an important factor in controlling gene expression. In bacteria, the lifetime is short. In eukaryotic cells, mRNA can last for minutes, hours, or even weeks.
Control at translation level
- There are many proteins involved in beginning and performing translation. As such, there are many ways to control gene expression at the level of translation.
- The chemistry of the cellular environment often determines whether translation is happening and at what rate.
- Gene expression can also be controlled after translation takes place.
Gene expression in prokaryotes: operons
TRP operon
In bacteria, there is sometimes a need to synthesise the amino acid tryptophan. There are five proteins necessary, and the genes for them are all in the same sequence of DNA.
- The group of genes is called the trp operon.
- RNA polymerase transcribes these genes either all at once or not at all. The long sequence of resulting mRNA is "punctuated" with START and STOP sequences.
- A single promoter serves all the genes of the operon. A major advantage of this is that all of the proteins coded for in the operon are made at the same time.
- The expression of the operon is controlled by a segment of DNA called the operator. It overlaps with both the promoter sequence and the first gene, and it determines whether RNA polymerase can proceed with transcription.
- Transcription is stopped when a repressor binds to the operator region. The binding of the repressor physically prevents RNA polymerase from attaching to the DNA promoter sequence.
- Genes that code for the repressor are some distance away from the operon, and are called regulatory genes.
In the case of the trp operon, the repressor must be bound to tryptophan in order to bind to the operator sequence.
- If there are low levels of tryptophan in the environment, the repressor will be inactive and the trp operon will be expressed, allowing the cell to create its own tryptophan.
- If there are sufficient levels of tryptophan, then the repressor will bind with tryptophan. Once this happens, it can bind with the operator site and turn off gene expression of the trp operon
- This can sound backwards: Tryptophan activates the repressor which then turns off expression. Inactive repressors allow expression.
The repressor is always present, but not always active. To be active, it needs to bind to tryptophan, which acts as a corepressor. Only the repressor-corepressor complex can block expression.
These enzymes are an example of repressible enzymes, because their production is inhibited by what they produce.
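Because the trp logic can sound backwards, it may help to write it out as a chain of booleans. This is a deliberately simplified on/off sketch (real regulation is graded, not binary), and the function and variable names are invented for the illustration.

```python
def trp_operon_expressed(tryptophan_present):
    """Return True if the trp operon is transcribed."""
    # The repressor is always present but only active with its corepressor.
    repressor_active = tryptophan_present
    # An active repressor binds the operator and blocks RNA polymerase.
    operator_blocked = repressor_active
    return not operator_blocked

print(trp_operon_expressed(tryptophan_present=False))  # True: the cell makes its own
print(trp_operon_expressed(tryptophan_present=True))   # False: no need to make more
```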
Lac operon
- This operon codes for the enzyme pathway used by bacteria in our digestive tracts to break down lactose.
- If the bacteria are in an environment w/ low levels of lactose, these enzymes will be present in very low numbers. When lactose is introduced, 1000s of these enzymes are produced within minutes.
- In this case, the repressor is bound to the DNA by default. When lactose is present, it binds to the repressor and makes it inactive; the repressor then detaches from the DNA, allowing RNA polymerase to proceed.
- As such, lactose acts as an inducer, promoting the production of enzymes for its own breakdown.
Mutations
A mutation is a change in a cell's genomic sequence of DNA.
Point mutation: a mutation limited to a single base pair alteration.
- Base-pair substitutions: the replacement of one nucleotide and its partner w/ another pair of nucleotides. They can cause:
- No change: this is called a silent mutation. At times the mutation results in a different codon that still codes for the same amino acid (related to the wobble phenomenon).
- An insignificant change: the different amino acid may have no effect on the function of the protein.
- A crucial change: this kind of mutation is the most studied by geneticists because it is far easier to detect. It often involves a change in the amino acid sequence at a crucial location (e.g. at the active site of an enzyme). Sometimes these changes actually result in a more effective protein and confer a selective advantage on the organism → can drive evolutionary change. More commonly, the change creates an inactive protein and possibly endangers the cell.
Missense mutation: although there is a change in the code, the codon still codes for an amino acid.
Nonsense mutation: in rare cases, the codon is changed to a stop codon. Nearly all nonsense mutations result in a non-functioning protein.
Base-pair insertions or deletions
- These mutations involve the insertion or deletion of one or more nucleotide pairs.
- They tend to have far more disastrous effects on the resulting proteins than substitutions, because mRNA is read as a series of triplets during translation. Adding or removing base pairs changes the reading frame of the translating ribosome.
- Frameshift mutations: happen any time the number of nucleotides inserted or deleted is not a multiple of 3. All codons downstream from the insertion/deletion will be grouped improperly → extensive missense or nonsense.
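The effect of a frameshift is easy to see by grouping a sequence into triplets before and after a deletion. This is a small sketch with an invented mRNA sequence and helper name; it only regroups letters and does not translate them.

```python
def codons(seq):
    """Split a sequence into complete triplets, dropping any leftover bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

original = "AUGGCUAAAUGGUAA"
deleted_one = original[:4] + original[5:]    # 1 base removed: frameshift
deleted_three = original[:3] + original[6:]  # 3 bases removed: frame kept

print(codons(original))       # ['AUG', 'GCU', 'AAA', 'UGG', 'UAA']
print(codons(deleted_one))    # ['AUG', 'GUA', 'AAU', 'GGU']
print(codons(deleted_three))  # ['AUG', 'AAA', 'UGG', 'UAA']
```

After the single-base deletion every downstream codon changes and the original stop codon is no longer in frame, while deleting a whole codon only removes one triplet.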
Mutagenesis: the creation of mutations
- Some mutations occur during the regular processes of DNA replication, repair, or recombination. Such an error is called a spontaneous mutation.
- Agents that cause mutations are called mutagens. The most common physical mutagen in nature and in labs is radiation.
- Chemical mutagens fall into several categories. One is the base analogue, a chemical that is structurally similar to a normal DNA base but pairs incorrectly.
Biotechnology Crash Course
Restriction endonucleases (a.k.a. restriction enzymes)
Cut DNA at places that have a specific code, called recognition sites.
- Recognition sites: different restriction enzymes have different recognition sites.
- Restriction fragments: the pieces into which the DNA is cut.
- There are over 2500 different restriction enzymes, with over 200 recognition sites.
Sticky vs. Blunt
- Blunt ends are a clean cut, whereas sticky ends are more like a jagged cut.
- Sticky ends are sticky because the bases where hydrogen bonding occurs are exposed, making it easier to re-anneal to fragments with complementary sequences.
Nomenclature
Restriction enzymes are named after the strain of bacteria from which they were taken.
- For example, EcoRII ("eco R two") was the 2nd restriction enzyme taken from the Escherichia coli (E. coli) bacteria.
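Cutting at a recognition site can be sketched as a string search. The example below assumes the well-known enzyme EcoRI, which recognizes GAATTC and cuts between the G and the first A; EcoRI is not mentioned in these notes and is used here only as an illustration. The sketch tracks a single strand, so the sticky overhangs are not modelled, and the function name and test sequence are invented.

```python
def digest(dna, site="GAATTC", cut_offset=1):
    """Cut `dna` at every occurrence of `site`, `cut_offset` bases into the site."""
    fragments = []
    start = 0
    position = dna.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(dna[start:cut])  # piece up to the cut point
        start = cut
        position = dna.find(site, position + 1)
    fragments.append(dna[start:])  # whatever remains after the last cut
    return fragments

print(digest("AAGAATTCGGGGAATTCTT"))  # ['AAG', 'AATTCGGGG', 'AATTCTT']
```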
Methyl Groups
- In nature, DNA is methylated in order to prevent transcription.
- Useful in specialized cells → prevents transcribing genes that are not needed.
- Prevents other molecules from binding.
- In biotechnology, methylation is used to prevent cutting by restriction enzymes and to narrow down research.
DNA ligase
- In biotechnology, DNA ligase is useful because it attaches DNA fragments to one another.
- Restriction enzymes are used to cut the DNA into fragments, which leaves behind sticky or blunt ends. While these fragments may recombine due to hydrogen bonding, the recombinations are not secure until they are sealed by DNA ligase.
Plasmids
- Circular pieces of DNA found in bacteria. They are expressed and replicated separately from chromosomal DNA.
- DNA fragments can be inserted into plasmids and added to bacterial cells; the cells then express the genes on the plasmids.
- Plasmids are vectors: they are used as vehicles for transferring foreign DNA to cells.
- Competent cells: cells capable of hosting foreign DNA.
PCR
Method used to make huge numbers of DNA molecules from a small sample.
DNA is placed into a solution that includes individual deoxynucleoside triphosphates (dNTPs) and a specialised DNA polymerase enzyme that adds them to a growing polynucleotide chain.
- Denaturation: The solution is heated to about 94°C, causing the strands to separate; lasts 20 seconds to 1 minute.
- Annealing: The solution is cooled to about 60-65°C to allow single-stranded DNA primers to anneal to the template strand; lasts 20-45 seconds.
- Elongation: The solution is heated to 72°C, and the polymerase adds individual dNTPs to the primers; lasts about 2 minutes.
One cycle of these three phases doubles the amount of DNA in the sample.
PCR depends on the ability of the specialized DNA polymerase to work at high temperatures.
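Since each cycle doubles the DNA, the number of copies grows as a power of 2. The short sketch below assumes ideal doubling in every cycle, which real reactions only approximate; the function name is made up for the example.

```python
def copies_after(cycles, starting_copies=1):
    """Ideal PCR yield: the copy number doubles once per cycle."""
    return starting_copies * 2 ** cycles

for n in (1, 10, 30):
    print(n, copies_after(n))
# 1 2
# 10 1024
# 30 1073741824
```

Under this idealized assumption, about 30 cycles are enough to turn a single molecule into roughly a billion copies.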
Gel electrophoresis
- A method of separating fragments of DNA based on how long they are.
- A solution of DNA is added to one end of a rectangular slab of agarose gel.
- An electric current is passed through the gel. The fragments of DNA are negatively charged, so they are repelled by the negative electrode and move away from it, toward the positive end of the gel.
- The rate at which they move depends on how long the fragment is: longer fragments move slower, shorter fragments move faster.
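In terms of data, running a gel amounts to sorting fragments by length. The sketch below is only a rough illustration: it orders fragment lengths (in base pairs, with made-up values) from the loading end of the gel outward and says nothing about actual migration distances.

```python
def run_gel(fragment_lengths):
    """Order fragments from nearest the loading end (longest) to farthest (shortest)."""
    # Longer fragments move more slowly, so they stay closer to where they were loaded.
    return sorted(fragment_lengths, reverse=True)

print(run_gel([500, 120, 2000, 750]))  # [2000, 750, 500, 120]
```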
Restriction Fragment Length Polymorphism
- A technique that allows for comparisons between DNA samples. This is useful in genetic sequencing, assessing genetic diversity, and forensics.
- A sample of DNA is cut by restriction enzymes and run on an electrophoresis gel.
- To make sense of the smear, it is heated and transferred to a different surface.
- The new surface is treated w/ radioactive labels for particular sequences of DNA, which bind to their targets in distinct places.
- An image can then be taken and compared.
DNA sequencing (Sanger sequencing)
- A method of finding the sequence of nucleotides in a particular segment of DNA.
- The DNA to be sequenced is treated so that it becomes single-stranded.
- It is then added to each of 4 solutions; all 4 have normal dNTPs as well as DNA polymerase.
- Each solution also contains a small amount of an abnormal version of a dNTP, called a ddNTP.
- The result is a # of DNA fragments of different lengths.
- The solutions are added to an electrophoresis gel and run.
- The resulting gel can be read from longest to shortest, revealing the sequence of nucleotides in the particular fragment.
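The reason the fragments reveal the sequence is that a ddNTP stops the growing chain wherever it is incorporated, so each of the 4 solutions yields fragments ending at every position of one base. The sketch below is an idealized model (every possible fragment is assumed to form, and lengths are counted from the start of the new strand); the function names and the example sequence are invented. Reading from shortest to longest gives the new strand 5' to 3'; reading from longest to shortest, as in the notes, gives the same sequence in reverse.

```python
def sanger_fragments(new_strand):
    """Fragment lengths expected in each of the four ddNTP reactions."""
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        # A chain stopped by this base's ddNTP ends at this position.
        lanes[base].append(length)
    return lanes

def read_gel(lanes):
    """Rebuild the sequence by reading fragments from shortest to longest."""
    base_at_length = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(base_at_length[length] for length in sorted(base_at_length))

lanes = sanger_fragments("GATTACA")
print(lanes["A"])       # [2, 5, 7]
print(read_gel(lanes))  # GATTACA
```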