Nucleic Acids, Gene Expression, and Cell Division

Nucleic Acids: Storage, Transmission, and Expression of Heredity

Nucleic acids store, transmit, and help express hereditary information. There are two primary types: deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). DNA contains four nitrogenous bases, Adenine (A), Guanine (G), Cytosine (C), and Thymine (T), which fall into two structural groups: purines and pyrimidines. Purines (Adenine and Guanine) are the larger molecules, with a double-ring structure made of two fused rings. A common mnemonic for the purines is "Pure As Gold" (Purine: Adenine, Guanine). A purine attaches to the sugar through Nitrogen $9$ of the base, which bonds to Carbon $1'$ of the sugar. Pyrimidines are smaller molecules with a single-ring structure; in DNA, these are Cytosine and Thymine. RNA contains Uracil (U) instead of Thymine (T) and is typically single-stranded, whereas DNA consists of two complementary antiparallel strands that are negatively charged because of their phosphate groups.

Components and Chemical Structure of Nucleic Acids

Nucleic acids are polymers known as polynucleotides, and their individual monomers are called nucleotides. A single nucleotide is composed of three parts: a pentose sugar, a nitrogenous base, and a phosphate group. The pentose sugar differs between the two types of nucleic acid: Ribose (found in RNA) has an $-OH$ group on Carbon $2'$, whereas $2$-Deoxyribose (found in DNA) has only an $-H$ atom on Carbon $2'$. The sugar-phosphate backbone of a nucleic acid is hydrophilic and is formed by nucleotides linked together by phosphodiester bonds, which creates a repeating $\text{sugar-phosphate-sugar-phosphate}$ sequence along the strand. Each strand has a direction, named after the sugar carbons at its two ends: the $5'$ end carries a phosphate group on Carbon $5'$, and the $3'$ end carries a free $-OH$ group on Carbon $3'$. New nucleotides are added only to the $3'$ end, so a strand is synthesized in the $5' \rightarrow 3'$ direction.

Chargaff's Rule and the Discovery of the Double Helix

In $1950$, Erwin Chargaff formulated Chargaff's Rule regarding DNA composition. He found that base composition differs between species. He also observed that within any given species, the amount of Adenine equals the amount of Thymine ($A = T$) and the amount of Cytosine equals the amount of Guanine ($C = G$). Consequently, purines and pyrimidines are present in equal amounts ($A + G = T + C$), and because the four percentages must sum to $100\%$, knowing the percentage of one base determines the other three. The physical structure of DNA was elucidated in $1953$ by James Watson and Francis Crick, who drew on X-ray diffraction images produced by Rosalind Franklin. Franklin's image (Photo $51$) was made by passing X-rays through DNA fibers; its "X"-shaped diffraction pattern indicated that DNA has a helical structure, which Watson and Crick modeled as a double helix.
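The arithmetic behind Chargaff's Rule can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `chargaff_percentages` and `duplex_counts` are hypothetical, and the example assumes idealized double-stranded DNA in which every base is correctly paired.

```python
from collections import Counter

# Watson-Crick partners used to build the second strand (A-T, G-C).
COMPLEMENT = str.maketrans("ATGC", "TACG")


def chargaff_percentages(percent_a: float) -> dict[str, float]:
    """Given the percentage of Adenine, return all four base percentages.

    Uses A = T and G = C, with the four values summing to 100.
    """
    if not 0 <= percent_a <= 50:
        raise ValueError("Adenine must be between 0% and 50% of a duplex")
    percent_g = 50.0 - percent_a
    return {"A": percent_a, "T": percent_a, "G": percent_g, "C": percent_g}


def duplex_counts(strand: str) -> Counter:
    """Count bases across a strand plus its complementary strand."""
    return Counter(strand + strand.translate(COMPLEMENT))


print(chargaff_percentages(30.0))
print(duplex_counts("AAGCT"))
```

For a sample that is 30% Adenine, the function returns 30% Thymine and 20% each of Guanine and Cytosine. Note that the rule holds for the duplex as a whole; a single strand taken alone need not have equal amounts of A and T.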

Nucleoside and Nucleotide Formation

A nucleoside is formed by the combination of a nitrogenous base and a pentose sugar. A nucleotide is a nucleoside with an attached phosphate group (Base + Sugar + Phosphate). The linkage between the sugar and the base is a covalent bond known as a glycosidic bond. This bond forms through a condensation reaction (dehydration synthesis) where the $-OH$ group on the sugar's Carbon $1'$ and a Hydrogen atom from the base combine to form a water molecule ( $H_2O$ ). Once the water is removed, the covalent glycosidic bond is established. In the double helix, bases pair specifically via hydrogen bonds: Guanine pairs with Cytosine, and Adenine pairs with Thymine.
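The pairing rules (A with T, G with C) mean that one strand fully determines its partner. The following is a minimal sketch, assuming the input is a DNA strand written $5' \rightarrow 3'$ using only the letters A, T, G, and C; the function name `complementary_strand` is chosen here for illustration.

```python
# Base-pairing partners in DNA.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand_5_to_3: str) -> str:
    """Return the partner strand, also written 5'->3'.

    Because the two strands are antiparallel, the partner is read in the
    opposite direction, so the input is reversed before pairing each base.
    """
    return "".join(PAIRS[base] for base in reversed(strand_5_to_3.upper()))


print(complementary_strand("ATGC"))  # GCAT
```

Reversing the sequence is what reflects the antiparallel arrangement; pairing without reversing would give the partner strand written $3' \rightarrow 5'$ instead.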

DNA as the Transforming Genetic Material: Historical Research

Historically, genetic material was believed to be composed of either DNA or protein. The discovery of DNA's genetic role began in $1928$ with research by Frederick Griffith using the bacterium Streptococcus pneumoniae. Griffith studied two strains: the S-strain (smooth), which possesses a protective outer capsule that makes it virulent and lethal to mice; and the R-strain (rough), which lacks a capsule and is non-virulent. His experiment involved four injections: live S-strain (mouse dies), live R-strain (mouse lives), heat-killed S-strain (mouse lives, because the dead bacteria cannot cause infection), and a mixture of heat-killed S-strain plus live R-strain (mouse dies). Griffith concluded that something from the dead S-type bacteria transformed the live R-type bacteria into virulent S-type bacteria, and he called this unknown substance the "Transforming Principle." In $1944$, Oswald Avery, Colin MacLeod, and Maclyn McCarty identified the transforming substance as DNA.

The Hershey and Chase Experiment: Confirming DNA as Genetic Material

In $1952$, Alfred Hershey and Martha Chase conducted an experiment to determine whether DNA or protein was the genetic material. They used bacteriophages (specifically T2), viruses that infect E. coli bacteria. Bacteriophages have a simple structure consisting of DNA and a protein coat. DNA contains Phosphorus but no Sulfur, while proteins contain Sulfur but no Phosphorus. They labeled one batch of phages with radioactive Sulfur-$35$ ($^{35}S$) to track proteins and another batch with Phosphorus-$32$ ($^{32}P$) to track DNA. The phages were allowed to infect the bacteria. The mixtures were then agitated in a blender to knock the phage coats off the bacteria and centrifuged to separate components based on density. The heavier bacteria formed a pellet at the bottom. Radiation from $^{32}P$ was found inside the bacteria (pellet), while radiation from $^{35}S$ remained in the liquid (supernatant), showing that DNA is the material that enters the cell and carries genetic information.

DNA Replication Models and the Meselson-Stahl Experiment

Three models were considered for how DNA replicates: the conservative model (the parental double helix stays intact and an entirely new copy is made), the semiconservative model proposed by Watson and Crick (each daughter molecule keeps one parental strand), and the dispersive model (each strand is a mix of old and new segments). In $1958$, Matthew Meselson and Franklin Stahl performed an experiment that supported the semiconservative model. They grew E. coli in a medium containing heavy Nitrogen ($^{15}N$) until all DNA contained the heavy isotope, then transferred the bacteria to a medium with light Nitrogen ($^{14}N$). After the first round of replication, the centrifuged DNA formed a single band of intermediate density (hybrid DNA consisting of one $^{15}N$ strand and one $^{14}N$ strand), which ruled out the conservative model. After the second round, two bands appeared, one hybrid and one light, which ruled out the dispersive model. Only the semiconservative model, in which each new DNA molecule consists of one original (parental) strand and one newly synthesized strand, predicts both results.
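The band pattern predicted by the semiconservative model can be reproduced with a short simulation. This is a sketch under simplifying assumptions: each DNA molecule is represented as a pair of strand labels, every molecule replicates once per generation, and the function names are hypothetical.

```python
from collections import Counter


def replicate_semiconservative(molecules):
    """Each parental strand is kept and paired with a new 14N strand."""
    return [(strand, "14N") for molecule in molecules for strand in molecule]


def band(molecule):
    """Classify a molecule by how many of its strands carry 15N."""
    return {2: "heavy", 1: "hybrid", 0: "light"}[molecule.count("15N")]


def bands_after(generations):
    """Count molecules per density band after growth in 14N medium."""
    population = [("15N", "15N")]  # start with fully heavy DNA
    for _ in range(generations):
        population = replicate_semiconservative(population)
    return Counter(band(molecule) for molecule in population)


for generation in range(4):
    print(generation, dict(bands_after(generation)))
```

Generation 1 gives only hybrid molecules, and generation 2 gives equal numbers of hybrid and light molecules, matching the two observations described above. In later generations the two original heavy strands persist, so the hybrid band never disappears but becomes a shrinking fraction of the total.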

The Molecular Mechanism of DNA Replication in E. coli

In E. coli, the chromosome is a circular DNA molecule located in the nucleoid region of the cytoplasm. Replication begins at a specific site called the origin of replication and proceeds bidirectionally, meaning two replication forks move away from the origin in opposite directions. This allows the bacterium to copy its DNA before binary fission. Helicase is the enzyme responsible for unzipping the DNA by breaking the hydrogen bonds between base pairs, which creates the replication fork. As the DNA unwinds, single-strand binding proteins (SSBPs) attach to the separated strands to stabilize them and prevent them from re-annealing. Topoisomerase acts ahead of the replication fork to relieve the tension and overwinding (supercoiling) generated by the unwinding of the helix.

Enzymes and Proteins Involved in DNA Replication

DNA synthesis requires a primer because DNA polymerase cannot initiate a new strand from scratch. Primase synthesizes a short RNA primer (approximately $10$ nucleotides long). DNA Polymerase III then adds DNA nucleotides to the $3'$ end of the primer, synthesizing the new strand in the $5' \rightarrow 3'$ direction. The leading strand is synthesized continuously toward the replication fork. The lagging strand is synthesized discontinuously away from the fork in short segments called Okazaki fragments. DNA Polymerase I later removes the RNA primers and replaces them with DNA nucleotides. Finally, DNA Ligase joins the fragments together by forming the final phosphodiester bonds, acting like a glue for the DNA backbone.

Telomeres and the Protection of Linear Chromosomes

Eukaryotic cells possess linear chromosomes, which present a challenge for DNA replication at the ends. Telomeres are found at the ends of these linear chromosomes and consist of non-coding DNA. They act as a "buffer zone" to protect the essential genetic information of the chromosome from damage or loss. Because DNA polymerase cannot complete the very ends of the lagging strand, telomeres become shorter after every round of replication. If telomeres were absent, the cell would gradually lose necessary genetic information. The enzyme telomerase is used to extend and maintain telomeres, though it is not active in all cell types.

Gene Expression: The Central Dogma

Gene expression is the process by which the information in a gene is used to build a functional product, usually a protein; it is how genes determine inherited traits. A gene is a sequence of nucleotides along a DNA strand that codes for a specific polypeptide or functional RNA. The flow of genetic information inside a cell is summarized by the Central Dogma: DNA is transcribed into mRNA, and mRNA is translated into protein. During transcription, the DNA template strand (read $3' \rightarrow 5'$) is used to build a complementary RNA strand (synthesized $5' \rightarrow 3'$). The coding (non-template) strand of DNA matches the sequence of the mRNA produced, except that RNA contains Uracil in place of Thymine.
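The relationship between the template strand, the coding strand, and the mRNA can be illustrated with a short sketch. It assumes DNA sequences made only of A, T, G, and C, with the template strand supplied in the $3' \rightarrow 5'$ orientation; the function names and the six-base example sequence are hypothetical.

```python
# RNA base added opposite each DNA template base.
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Build the mRNA (5'->3') from a template strand given 3'->5'."""
    return "".join(TEMPLATE_TO_RNA[base] for base in template_3_to_5.upper())


def coding_to_mrna(coding_5_to_3: str) -> str:
    """The mRNA matches the coding strand, with U in place of T."""
    return coding_5_to_3.upper().replace("T", "U")


template = "TACGGT"  # 3'->5'
coding = "ATGCCA"    # 5'->3', complementary to the template
print(transcribe(template))    # AUGCCA
print(coding_to_mrna(coding))  # AUGCCA
```

Both routes give the same mRNA, which is why sequences are usually reported as the coding strand.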

Bacterial Transcription: Initiation, Elongation, and Termination

Transcription in bacteria involves three main stages. In Initiation, the enzyme RNA polymerase binds to the DNA at a specific sequence called the promoter. The DNA unwinds and separates at this site. In Elongation, RNA polymerase moves along the $3' \rightarrow 5'$ DNA template strand and synthesizes the RNA transcript in the $5' \rightarrow 3'$ direction. Only one strand of DNA is used as a template. Finally, in Termination, the RNA polymerase reaches a terminator sequence, causing the enzyme to dissociate from the DNA and release the completed mRNA transcript. The structure of a bacterial gene includes the promoter (where transcription starts), the RNA-coding sequence, and the terminator (the end of mRNA creation).

Eukaryotic Transcription and RNA Processing

Eukaryotes use multiple types of RNA polymerase; RNA polymerase II synthesizes pre-mRNA. Initiation in eukaryotes requires transcription factors, proteins that help RNA polymerase II bind to the promoter, often at a region called the TATA box. DNA packaging also affects gene expression: DNA is wrapped around proteins called histones to form nucleosomes. The N-terminal tails of these histones can be modified; for instance, adding acetyl groups (acetylation) loosens the chromatin into a more open form (euchromatin) and increases transcription, while tightly packed chromatin (heterochromatin) usually leaves genes unexpressed. In eukaryotic termination, RNA polymerase II transcribes a polyadenylation signal sequence, and proteins then cut the pre-mRNA free a short distance downstream of that signal. Eukaryotic pre-mRNA must undergo three major modifications (RNA processing) before leaving the nucleus: addition of a $5'$ cap (a modified Guanine) to protect against degradation and aid ribosome binding; addition of a poly-A tail (multiple Adenines) to the $3'$ end for protection and export; and RNA splicing. Splicing is carried out by spliceosomes, complexes of proteins and small nuclear RNAs (snRNAs), which remove non-coding introns and join the coding exons. Some of these RNAs act catalytically, and RNA molecules that function as enzymes are called ribozymes. Alternative splicing allows a single gene to produce different mature mRNAs by joining exons in various combinations.
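Splicing and alternative splicing can be sketched as operations on a list of labeled segments. The pre-mRNA below is an invented toy sequence, not a real gene, and the sketch deliberately ignores the $5'$ cap, the poly-A tail, and the sequence signals a spliceosome actually recognizes; the function name `splice` and its `skip_exons` parameter are hypothetical.

```python
# A toy pre-mRNA written as ordered (kind, sequence) segments.
PRE_MRNA = [
    ("exon", "AUGGCC"),
    ("intron", "GUAAGUAG"),
    ("exon", "UUUAAA"),
    ("intron", "GUCCCAG"),
    ("exon", "GGGUAA"),
]


def splice(segments, skip_exons=frozenset()):
    """Remove all introns and join the exons that are kept.

    skip_exons holds 1-based exon numbers to leave out, which models
    one simple form of alternative splicing (exon skipping).
    """
    mature = []
    exon_number = 0
    for kind, sequence in segments:
        if kind != "exon":
            continue  # introns never reach the mature mRNA
        exon_number += 1
        if exon_number not in skip_exons:
            mature.append(sequence)
    return "".join(mature)


print(splice(PRE_MRNA))                  # all three exons joined
print(splice(PRE_MRNA, skip_exons={2}))  # exon 2 skipped
```

The two calls produce two different mature mRNAs from the same pre-mRNA, which is the essence of alternative splicing.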

Translation: mRNA to Protein Synthesis

Translation is the process in which the sequence of bases in mRNA is used to synthesize a protein. It occurs at the ribosome, which is composed of ribosomal RNA (rRNA) and proteins. In eukaryotes, ribosomal subunits are assembled in the nucleolus; in prokaryotes, which lack a nucleus, they are assembled in the cytoplasm. A ribosome consists of two subunits, the small subunit and the large subunit, which join only when bound to an mRNA. The ribosome reads the mRNA bases in groups of three called codons. A polyribosome is a cluster of multiple ribosomes translating the same mRNA molecule simultaneously, which increases the rate of protein production.

The Structure and Role of the Ribosome

The ribosome contains an mRNA binding site on the small subunit and three specific binding sites for transfer RNA (tRNA) on the large subunit: the P-site (peptidyl-tRNA binding site), which holds the tRNA carrying the growing polypeptide chain; the A-site (aminoacyl-tRNA binding site), which holds the tRNA carrying the next amino acid to be added; and the E-site (exit site), where discharged tRNAs leave the ribosome. There is also an exit tunnel through which the completed polypeptide chain emerges.

The Mechanism of Translation: Initiation, Elongation, and Termination

Translation requires various protein factors and energy (provided by GTP and ATP). Initiation begins when the mRNA binds to the small ribosomal subunit near the start codon (AUG). A tRNA carrying the anticodon (UAC) and the amino acid methionine enters the P-site. The large subunit then binds to form the full initiation complex. Elongation involves three steps: a new tRNA enters the A-site, a peptide bond forms between the existing polypeptide chain and the new amino acid, and translocation occurs, where the ribosome moves one codon forward along the mRNA. Termination occurs when the ribosome reaches a stop codon (UAA, UAG, or UGA). A release factor enters the A-site, triggering the release of the finished polypeptide and the dissociation of the ribosomal subunits.
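The codon-by-codon logic of initiation, elongation, and termination can be sketched as a simple loop. This is a minimal sketch, assuming the standard genetic code, an mRNA made only of A, U, G, and C, and translation that starts at the first AUG in the sequence; it ignores tRNAs, ribosomal sites, and energy use. The function name `translate` and the example mRNA are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in UCAG order; "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO)
}


def translate(mrna: str) -> str:
    """Return the polypeptide as one-letter amino acid codes."""
    start = mrna.find("AUG")  # initiation at the first start codon
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):  # move one codon at a time
        amino = CODON_TABLE[mrna[i:i + 3]]
        if amino == "*":  # termination at UAA, UAG, or UGA
            break
        peptide.append(amino)
    return "".join(peptide)


print(translate("GGAUGGCCUUUUAAGG"))  # MAF
```

In the example, reading begins at AUG (methionine, M), continues through GCC (alanine, A) and UUU (phenylalanine, F), and stops at UAA.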

Wobbling and the Genetic Code

There are fewer types of tRNA molecules than there are codons. This is possible because of "wobble," a phenomenon in which the third base of a codon has flexible pairing requirements. While the first two bases of a codon must strictly follow base-pairing rules with the anticodon, pairing at the third position is less stringent, allowing one tRNA to recognize more than one codon.
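The redundancy that wobble takes advantage of is visible in the standard genetic code itself. The sketch below does not model tRNA anticodons or the actual wobble pairing rules; it only finds the two-base prefixes for which the third base makes no difference to the amino acid. The function name `fourfold_boxes` is hypothetical.

```python
from itertools import product

# Standard genetic code, listed in UCAG order; "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO)
}


def fourfold_boxes() -> dict[str, str]:
    """Map each two-base prefix to its amino acid when all four
    possible third bases encode that same amino acid."""
    boxes = {}
    for first, second in product(BASES, repeat=2):
        prefix = first + second
        encoded = {CODON_TABLE[prefix + third] for third in BASES}
        if len(encoded) == 1:
            boxes[prefix] = encoded.pop()
    return boxes


print(fourfold_boxes())
```

Of the 16 possible two-base prefixes, 8 specify an amino acid regardless of the third base, for example GGU, GGC, GGA, and GGG all encoding glycine (G).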

Mutations: Variations in the Genetic Sequence

A mutation is a change in the DNA sequence and is a source of genetic variation. Mutations are categorized as large-scale (affecting large chromosome regions, for example through deletions, duplications, inversions, or translocations) or small-scale (affecting one or a few nucleotides). Small-scale mutations include substitutions and indels. Substitutions can be silent (the codon changes but the amino acid stays the same), missense (the amino acid changes), or nonsense (a codon is changed into a stop codon). Indels (insertions and deletions) cause frameshift mutations whenever the number of nucleotides added or removed is not a multiple of three, because every downstream codon is then misread. Causes of mutations include replication errors by DNA polymerase and external mutagens such as UV radiation, chemicals in tobacco smoke, and industrial chemicals. A notable example is sickle-cell disease, caused by a single-nucleotide substitution in the gene encoding the $\beta$-globin polypeptide of hemoglobin; this missense mutation replaces glutamic acid with valine. The altered hemoglobin distorts red blood cells into a sickle shape, which reduces oxygen delivery and causes fatigue.
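The three kinds of substitution can be told apart by comparing the amino acid encoded before and after the change. The sketch below assumes the standard genetic code and works on single mRNA codons; the function name `classify_substitution` is hypothetical, and the sickle-cell case is represented at the mRNA level as the codon change GAG to GUG.

```python
from itertools import product

# Standard genetic code, listed in UCAG order; "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO)
}


def classify_substitution(original: str, mutant: str) -> str:
    """Classify a codon change as silent, missense, or nonsense.

    Assumes the original codon encodes an amino acid (not a stop).
    """
    before, after = CODON_TABLE[original], CODON_TABLE[mutant]
    if before == after:
        return "silent"
    if after == "*":
        return "nonsense"
    return "missense"


print(classify_substitution("GAA", "GAG"))  # silent: both glutamic acid
print(classify_substitution("GAG", "GUG"))  # missense: Glu -> Val
print(classify_substitution("UAC", "UAA"))  # nonsense: Tyr -> stop
```

The middle example corresponds to the sickle-cell substitution, where glutamic acid (E) is replaced by valine (V).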

Chromosomes: Structure, Karyotypes, and Genomes

The genome comprises all the DNA in a cell, while a chromosome is a single DNA molecule (with its associated proteins) within that genome. Before cell division, the DNA replicates and then condenses, so each chromosome consists of two sister chromatids joined at a centromere. Chromosomes have a short arm ($p$ arm) and a long arm ($q$ arm), with telomeres at the tips. They are classified by the position of the centromere as metacentric, sub-metacentric, acrocentric, or telocentric. A karyotype is an ordered display of chromosome pairs, arranged by size and centromere position, usually prepared from cells in metaphase, when chromosomes are most condensed. Humans have $23$ pairs: $22$ pairs of autosomes and $1$ pair of sex chromosomes. Homologous chromosomes in a pair carry genes for the same characters at the same loci, though they may carry different alleles (variants of a gene). Polyploidy, common in plants, occurs when an organism has more than two complete sets of chromosomes ($3n$, $4n$, etc.).

Mitosis and the Eukaryotic Cell Cycle

Mitosis is the division of the nucleus; together with cytokinesis it produces two genetically identical daughter cells and is used for growth and repair. It is distinct from meiosis, which produces gametes. The cell cycle includes Interphase and the Mitotic (M) phase. Interphase consists of $G_1$ (cell growth and organelle production), S phase (DNA replication, where chromatid number doubles while chromosome number remains the same), and $G_2$ (preparation for mitosis). The M phase includes Mitosis and Cytokinesis. The stages of mitosis are: Prophase (chromosomes condense, nucleoli disappear, and the spindle begins to form); Prometaphase (the nuclear envelope breaks down and spindle fibers attach to the kinetochores, protein structures at each centromere); Metaphase (chromosomes line up at the metaphase plate); Anaphase (cohesins are cleaved by the enzyme separase, and sister chromatids separate); and Telophase (nuclear envelopes form and nucleoli reappear). Cytokinesis follows: in animal cells, a contractile ring pinches the cell at the cleavage furrow to create two separate cells.
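The chromosome and chromatid counts through the cycle can be tabulated for a human cell. The sketch assumes a common textbook counting convention: chromosomes are counted by centromeres, an unreplicated chromosome counts as one chromatid, and separated sister chromatids each count as a chromosome from anaphase onward. The function name `chromosome_counts` is hypothetical.

```python
def chromosome_counts(diploid_number: int = 46) -> dict[str, tuple[int, int]]:
    """Map each stage to (chromosomes, chromatids) for one cell.

    Convention assumed: count chromosomes by centromeres; after the
    sister chromatids separate, each one counts as a chromosome.
    """
    n = diploid_number
    return {
        "G1": (n, n),                 # unreplicated chromosomes
        "G2": (n, 2 * n),             # after S phase: two chromatids each
        "metaphase": (n, 2 * n),      # still joined at the centromere
        "anaphase": (2 * n, 2 * n),   # sisters separated, cell undivided
        "daughter cell": (n, n),      # after cytokinesis
    }


for stage, (chromosomes, chromatids) in chromosome_counts().items():
    print(f"{stage:>13}: {chromosomes} chromosomes, {chromatids} chromatids")
```

With the default of 46, the table shows the doubling of chromatids in S phase (46 to 92) with no change in chromosome number, followed by the temporary count of 92 chromosomes in anaphase before the cell divides.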