DNA and the Molecular Structure of Chromosomes: Exhaustive Study Notes

Functions of the Genetic Material and the Role of Chromosomes

The genetic material of an organism must satisfy three primary biological functions to ensure survival and continuity. The Genotypic Function refers to the capacity for replication, allowing genetic information to be passed from one generation to the next. The Phenotypic Function involves gene expression, where the genetic code controls the growth and development of the organism. Lastly, the Evolutionary Function allows for mutation or gene modifications, which enable the organism to adapt to environmental changes over time.

Genes are physically located on chromosomes, which are complex structures containing nucleic acids and associated proteins. The two classes of nucleic acid are deoxyribonucleic acid ( $DNA$ ) and ribonucleic acid ( $RNA$ ). While $DNA$ is the primary reservoir of genetic information in most living organisms, certain viruses utilize $RNA$ as their genetic material. In eukaryotic cells, the genetic material is concentrated in the nuclear fraction (as $DNA$ ), while the cytosol contains $RNA$ and various proteins.

Experimental Evidence for DNA as the Genetic Material

Several landmark experiments provided the proof that $DNA$ serves as the carrier of genetic information. Frederick Griffith’s 1928 experiment (in vivo) used Streptococcus pneumoniae and mice. He observed two strains: type $IIIS$ (virulent) and type $IIR$ (avirulent). Griffith discovered that heat-killed $S$ cells could transform live $R$ cells into virulent $S$ cells, suggesting a "transforming principle." Later, Sia and Dawson conducted an in vitro version of this experiment using the same bacteria, proving that the transformation was an intrinsic property of the bacteria and did not require a host mouse to occur.

In 1944, Avery, MacLeod, and McCarty built upon these findings by showing that the "transforming principle" was $DNA$ . Their in vitro experiments on Streptococcus pneumoniae provided the first strong evidence that $DNA$ , rather than protein, was responsible for bacterial transformation. Following this, the Hershey-Chase experiment (1952) used the bacteriophage $T2$ (a bacterial virus) and its host bacteria. Hershey and Chase demonstrated that the $DNA$ of the virus, not its protein coat, enters the bacterial cell to direct the production of new phages, confirming that $DNA$ is the genetic material of the virus. Finally, Fraenkel-Conrat’s reconstitution experiments on Tobacco Mosaic Virus ( $TMV$ ) demonstrated that in certain plant viruses $RNA$ is the genetic material: when $RNA$ from one strain was combined with coat protein from another, the new virus particles showed the traits of the strain that supplied the $RNA$ .

Chemical Structure of DNA and RNA

$DNA$ and $RNA$ are nucleic acids composed of repeating subunits known as nucleotides. Each nucleotide consists of a phosphate group ( $PO_4^{3-}$ ), a five-carbon pentose sugar, and a cyclic nitrogen-containing base. In $RNA$ , the sugar is ribose, which possesses a hydroxyl ( $OH$ ) group at the 2' position. In $DNA$ , the sugar is $2\text{-Deoxyribose}$ , which lacks this 2' hydroxyl group. The nitrogenous bases are categorized as Purines or Pyrimidines. Purines include Adenine ( $A$ ) and Guanine ( $G$ ), which are found in both $DNA$ and $RNA$ . Pyrimidines include Cytosine ( $C$ ), found in both, Thymine ( $T$ ), found primarily in $DNA$ , and Uracil ( $U$ ), found in $RNA$ in place of Thymine.

Nucleotides are named based on their components: in $DNA$ , these include deoxythymidine monophosphate ( $dTMP$ ), deoxycytidine monophosphate ( $dCMP$ ), deoxyadenosine monophosphate ( $dAMP$ ), and deoxyguanosine monophosphate ( $dGMP$ ). In $RNA$ , they are uridine monophosphate ( $UMP$ ), cytidine monophosphate ( $CMP$ ), adenosine monophosphate ( $AMP$ ), and guanosine monophosphate ( $GMP$ ). These nucleotides are linked by phosphodiester bonds to form a polynucleotide chain.

The DNA Double Helix and Chargaff’s Rules

The structure of $DNA$ was elucidated by James Watson, Francis Crick, Rosalind Franklin, and Maurice Wilkins using data from X-ray diffraction patterns. The molecule is a double helix with two strands running in opposite directions, known as antiparallel orientation (one $5' \rightarrow 3'$ , the other $3' \rightarrow 5'$ ). The sugar-phosphate backbones are on the exterior, and the nitrogenous bases are paired on the interior. The helix is right-handed, and the predominant form in nature is $B\text{-DNA}$ , which features approximately $10\,bp$ (base pairs) per turn, a distance of $3.4\,nm$ per helical turn, and $0.34\,nm$ between individual base pairs.
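The B-form dimensions above can be turned into quick length estimates. The following is a minimal sketch that assumes ideal $B\text{-DNA}$ with exactly the figures quoted in these notes; the function names are illustrative and not part of any library.

```python
# B-DNA parameters as quoted in the notes (idealized values).
RISE_PER_BP_NM = 0.34   # axial distance between adjacent base pairs
BP_PER_TURN = 10        # approximate base pairs per helical turn


def helix_length_nm(n_bp: int) -> float:
    """Axial length of an ideal B-DNA duplex of n_bp base pairs, in nm."""
    return n_bp * RISE_PER_BP_NM


def helical_turns(n_bp: int) -> float:
    """Approximate number of helical turns in n_bp base pairs."""
    return n_bp / BP_PER_TURN


# One full turn (10 bp) should span the 3.4 nm pitch quoted above.
pitch_nm = helix_length_nm(BP_PER_TURN)
```

For example, `helix_length_nm(1000)` gives roughly 340 nm for a 1 kb fragment, and `helical_turns(1000)` gives about 100 turns.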

Erwin Chargaff established two primary rules for $DNA$ composition. Chargaff’s First Rule states that in double-stranded $DNA$ the amount of $A$ equals the amount of $T$ and the amount of $G$ equals the amount of $C$ , a regularity later explained by base pairing, in which $A$ pairs with $T$ and $G$ pairs with $C$ . His Second Rule states that these equalities ( $A \approx T$ and $G \approx C$ ) also hold approximately within each of the two strands. In all double-stranded $DNA$ samples, the molar ratios $A/T$ and $G/C$ are close to $1.00$ , and so is the purine-to-pyrimidine ratio $(A+G)/(T+C)$ ; these are universal characteristics. However, the ratio $(A+T)/(G+C)$ varies between species (e.g., $0.39$ in the $GC\text{-rich}$ Micrococcus lysodeikticus versus $1.53$ in humans), making it species-specific rather than universal.
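Chargaff-style ratios can be checked on any sequence. The sketch below builds the partner strand by base pairing and counts bases over both strands; `base_ratios` is a hypothetical helper, and it assumes the input contains only A, T, G, and C with at least one A/T and one G/C position.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def base_ratios(strand: str) -> dict:
    """Base ratios for the duplex formed by `strand` and its partner."""
    strand = strand.upper()
    partner = "".join(COMPLEMENT[b] for b in strand)
    counts = Counter(strand + partner)          # bases over both strands
    a, t, g, c = (counts[b] for b in "ATGC")
    return {
        "A/T": a / t,                           # 1.0 by base pairing
        "G/C": g / c,                           # 1.0 by base pairing
        "(A+T)/(G+C)": (a + t) / (g + c),       # depends on the sequence
    }


ratios = base_ratios("AATTG")
```

In this toy input the duplex holds 4 A, 4 T, 1 G, and 1 C, so `A/T` and `G/C` are both 1.0 while `(A+T)/(G+C)` is 4.0. A GC-rich input would give a value below 1.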

Chemical Bonds and Helical Variations

$DNA$ structure is maintained by three types of chemical bonds. Covalent bonds are strong bonds formed by electron sharing between atoms; these are found in the sugars, bases, and the phosphodiester linkages involving the $5'\text{C}$ and $3'\text{C}$ of deoxyribose. Hydrogen bonds are weak interactions between electronegative atoms and electropositive hydrogen atoms; $A$ and $T$ are held by $2$ hydrogen bonds, while $G$ and $C$ are held by $3$ hydrogen bonds. Hydrophobic "bonds" involve the association of nonpolar groups (stacked base pairs) in an aqueous environment, forming a hydrophobic core.
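The pairing rules and hydrogen-bond counts lend themselves to a short illustration. This is a sketch with hypothetical helper names, assuming a sequence written $5' \rightarrow 3'$ that contains only A, T, G, and C.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Hydrogen bonds in the base pair that contains each base.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def reverse_complement(strand: str) -> str:
    """Partner strand, also written 5'->3' (antiparallel to the input)."""
    return "".join(COMPLEMENT[b] for b in reversed(strand.upper()))


def hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds holding the duplex of `strand` together."""
    return sum(H_BONDS[b] for b in strand.upper())


partner = reverse_complement("ATGC")   # "GCAT"
bonds = hydrogen_bonds("ATGC")         # 2 + 2 + 3 + 3 = 10
```

Because each $G$ - $C$ pair contributes one more hydrogen bond than an $A$ - $T$ pair, a GC-rich duplex of the same length has a higher total.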

$DNA$ can exist in multiple helical forms besides $B\text{-DNA}$ . $A\text{-DNA}$ is a right-handed form adopted under dehydrating conditions, with $11\,bp/turn$ , a deep, narrow major groove, and a shallow minor groove. $Z\text{-DNA}$ is a left-handed, zigzag structure with $12\,bp/turn$ that is often found in $GC\text{-rich}$ regions and is implicated in gene regulation. There are also multistranded forms: Triple-stranded (Triplex) $DNA$ involves a third strand fitting into the major groove via Hoogsteen hydrogen bonds, while Quadruple-stranded (Quadruplex) $DNA$ forms from $G\text{-rich}$ sequences (G-quadruplex) or $C\text{-rich}$ sequences (I-motif). These alternative structures are significant for processes like transcription, telomere replication, and $DNA$ repair.

Topological Properties: Coiling and Supercoiling

The coiling of closed circular $DNA$ is described by the linking number ( $Lk$ ), the number of times one strand winds around the other. It is the sum of two components: $Lk = Tw + Wr$ . Twist ( $Tw$ ) is the number of helical turns of the duplex (one turn per approximately $10.5\,bp$ in relaxed $DNA$ ). Writhe ( $Wr$ ) is the number of supercoils (the turns the helix axis makes around itself). Relaxed $DNA$ has no supercoils ( $Wr = 0$ ), thus $Lk = Tw$ .

Supercoiling occurs when the Linking Difference ( $\Delta Lk = Lk - Lk_0$ ), where $Lk_0$ is the linking number of the relaxed molecule, is not zero. Negative Supercoiling ( $Lk < Lk_0$ ) indicates that the $DNA$ is underwound, which is the common state in biological cells as it facilitates the unwinding needed for enzymatic activity. Positive Supercoiling ( $Lk > Lk_0$ ) means the $DNA$ is overwound and is less common.
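The linking-number bookkeeping can be sketched in a few lines. The example assumes relaxed $DNA$ has $10.5\,bp/turn$ and takes $Lk_0$ as the twist of the relaxed circle; the $5,250\,bp$ plasmid and all function names are hypothetical, chosen only so the arithmetic is easy to follow.

```python
BP_PER_TURN = 10.5  # relaxed DNA, as quoted in the notes


def relaxed_linking_number(n_bp: int) -> float:
    """Lk0 of a relaxed closed circle (Wr = 0, so Lk0 = Tw)."""
    return n_bp / BP_PER_TURN


def linking_difference(lk: float, n_bp: int) -> float:
    """Delta Lk = Lk - Lk0."""
    return lk - relaxed_linking_number(n_bp)


def writhe(lk: float, tw: float) -> float:
    """Rearranged Lk = Tw + Wr."""
    return lk - tw


def supercoil_state(lk: float, n_bp: int) -> str:
    """Classify a closed circle by the sign of its linking difference."""
    delta = linking_difference(lk, n_bp)
    if delta < 0:
        return "negative"   # underwound
    if delta > 0:
        return "positive"   # overwound
    return "relaxed"


# Hypothetical 5,250 bp plasmid: Lk0 = 5250 / 10.5 = 500.
delta_lk = linking_difference(475, 5250)   # -25.0
state = supercoil_state(475, 5250)         # "negative"
```

If the twist of this plasmid stayed at 500 while $Lk$ dropped to 475, `writhe(475, 500)` gives $-25$ , meaning the deficit is taken up entirely as negative supercoils. In real molecules the deficit is usually shared between twist and writhe.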

Organization of Prokaryotic and Eukaryotic Chromosomes

Prokaryotic chromosomes, such as those in E. coli, are typically single, circular, double-stranded molecules located in the nucleoid. They are organized into $50$ to $100$ negatively supercoiled loops or domains. The level of supercoiling, and with it compaction, is regulated by enzymes: DNA Topoisomerase I, which relaxes negative supercoils, and Topoisomerase II ( $DNA\,gyrase$ ), which is $ATP\text{-dependent}$ and introduces them. When cells are lysed under mild conditions (low salt, polyamines), the chromosome is released as a compact, "folded" structure or isolated nucleoid.

Eukaryotic chromosomes contain giant molecules of $DNA$ that are highly condensed. Each chromosome contains a single large double helix (the unineme theory). The material of these chromosomes, chromatin, is a complex of $DNA$ , histones ( $H1$ , $H2a$ , $H2b$ , $H3$ , $H4$ ), and non-histone proteins. The fundamental unit of chromatin is the nucleosome, consisting of $146\,bp$ of $DNA$ wrapped around a histone octamer (two each of $H2a, H2b, H3$ , and $H4$ ). When the linker histone $H1$ is included, the particle covers a $166\text{-nucleotide-pair}$ length of $DNA$ .

Hierarchical Levels of DNA Compaction

$DNA$ packaging in eukaryotes occurs in stages: from the $2\,nm$ double helix to $11\,nm$ nucleosomes ("beads-on-a-string"), then into a $30\,nm$ chromatin fiber. The path of the $30\,nm$ fiber is modeled by either the Solenoid model (one-start helical stack) or the Zigzag model (two-start helical arrangement with straight linker $DNA$ ). This fiber is further organized into radial loop domains ( $25,000$ to $200,000\,bp$ ) anchored to a non-histone protein scaffold at Matrix-attachment regions ( $MARs$ ), also called Scaffold-attachment regions ( $SARs$ ). During interphase, chromatin exists as Euchromatin (less condensed, transcriptionally active) or Heterochromatin (highly condensed, transcriptionally inactive). Heterochromatin is further divided into Constitutive (permanently inactive, e.g., telomeres) and Facultative (interconvertible, e.g., Barr bodies). During metaphase, compaction reaches its peak, giving a chromosome about $1,400\,nm$ thick.
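To get a feel for the scale of compaction, the base-pair counts in these notes can be converted to extended lengths. This is a rough sketch that assumes ideal $B\text{-DNA}$ at $0.34\,nm$ per base pair; dividing by the $11\,nm$ particle size is only a crude packing estimate, and the names are illustrative.

```python
RISE_PER_BP_NM = 0.34  # B-DNA rise per base pair, from the notes


def contour_length_nm(n_bp: int) -> float:
    """Length of n_bp of fully extended B-DNA, in nanometres."""
    return n_bp * RISE_PER_BP_NM


# DNA wrapped in one nucleosome core versus the 11 nm size of the particle.
core_dna_nm = contour_length_nm(146)       # about 49.6 nm
core_packing = core_dna_nm / 11            # roughly 4.5-fold (crude estimate)

# Extended length of the smallest and largest radial loop domains.
loop_min_um = contour_length_nm(25_000) / 1000     # about 8.5 micrometres
loop_max_um = contour_length_nm(200_000) / 1000    # about 68 micrometres
```

Even a single loop domain would therefore be several micrometres long if stretched out, which is why successive levels of folding are needed to fit it into a chromosome about $1,400\,nm$ thick.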

Repeated DNA Sequences and Genomic Significance

Genomes contain repetitive $DNA$ segments, making up approximately $50\%$ of the human genome. Tandem Repeats are adjacent segments including Short Tandem Repeats ( $STRs$ , $1\text{-}6\,bp$ ), Variable Number Tandem Repeats ( $VNTRs$ , $7\text{-}100+\,bp$ ), and Satellite $DNA$ (large segments in centromeres and telomeres). Interspersed Repeats are scattered throughout the genome and include Transposable Elements (jumping genes). These sequences are vital for genomic structure, evolution, and forensic medicine (e.g., $DNA$ fingerprinting).
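Short tandem repeats can be located with a backreference pattern. The sketch below is one simple approach, not a forensic-grade caller: it assumes exact (uninterrupted) repeats, an uppercase or lowercase A/C/G/T sequence, and an arbitrary threshold of three copies; `find_strs` is a hypothetical name.

```python
import re


def find_strs(seq: str, min_unit: int = 1, max_unit: int = 6,
              min_copies: int = 3) -> list:
    """Return (start, unit, copies) for exact tandem runs of short units."""
    seq = seq.upper()
    # Lazy unit match tries the shortest repeat unit first; \1 then
    # requires at least (min_copies - 1) further adjacent copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


hits = find_strs("CCGATAGATAGATAGATATT")
```

Here `hits` contains a single entry: the unit `GATA` starting at index 2 and repeated four times. The number of copies at a given locus is what varies between individuals in $DNA$ fingerprinting.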

Telomeres and Centromeres

Telomeres are the protective caps at the ends of linear chromosomes characterized by repetitive sequences. In humans, they consist of thousands of hexanucleotide repeats of $TTAGGG$ . Telomeres protect against degradation and the loss of coding information during the "end replication problem." As somatic cells divide, telomeres shorten, acting as a biological clock. The enzyme telomerase can add these repeats to extend cell life in stem and cancer cells. Structurally, telomeres feature a $G\text{-rich}$ overhang that loops back to form a $T\text{-loop}$ (or $O\text{-loop}$ ) and is protected by the Shelterin protein complex (including $TRF1$ , $TRF2$ , $POT1$ , $TIN2$ , and $TPP1$ ).

Centromeres are constricted regions essential for chromosome segregation. They are composed of large arrays of satellite $DNA$ , specifically a $171\,bp$ tandem repeat called alpha-satellite $DNA$ in humans. The centromere serves as the attachment point for spindle fibers via the assembly of the kinetochore. Satellite $DNA$ is isolated by density-gradient ultracentrifugation (typically in cesium chloride), where it separates from the bulk $DNA$ as distinct "satellite" bands because its base composition gives it a different buoyant density (the buoyant density of bulk $DNA$ in cesium chloride is about $1.70\,g/cm^3$ ).

Unineme versus Bineme Theory

Historically, two theories competed to explain chromosome structure. The Unineme theory (Single $DNA$ Molecule) posits that an unreplicated chromosome contains exactly one $DNA$ double helix. This is supported by autoradiographic segregation patterns and is the modern accepted view. The Bineme theory proposed that a chromosome consists of two homologous $DNA$ molecules linked at the telomere to form a circle. While the bineme model addressed certain cytological observations of chromatids, it conflicted with the evidence of semi-conservative $DNA$ segregation.