Principles of Genetics

Introduction to Genetics
  • Genetics is the study of heredity and the variation of inherited characteristics. It involves understanding how traits are passed from parents to offspring. It integrates several fields including molecular biology (study of biological activity at a molecular level), biochemistry (study of chemical processes within living organisms), and population genetics (study of genetic variation within populations).

Central Dogma of Molecular Biology
  • The central dogma describes the fundamental flow of genetic information:

    • DNA Replication: The process by which DNA helicase unwinds the double helix, and DNA polymerases synthesize new complementary strands, resulting in two identical DNA molecules from one parent molecule. This ensures genetic information is accurately passed to daughter cells.

    • Transcription: The process where genetic information from a DNA template is copied into a messenger RNA (mRNA) molecule by RNA polymerase. This occurs in the nucleus of eukaryotes and the cytoplasm of prokaryotes.

    • Translation: The biochemical process where the genetic code carried by mRNA is decoded by ribosomes to synthesize specific proteins. Transfer RNA (tRNA) molecules carry specific amino acids to the ribosome according to the mRNA codons.
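
  • Example (illustrative sketch): The flow from DNA to mRNA to protein can be mimicked in a few lines of Python. The input sequence, the function names `transcribe` and `translate`, and the reduced codon table are assumptions chosen for illustration; only eight codons of the standard genetic code are included, and the DNA is given as the coding strand, whose sequence matches the mRNA with U in place of T.

```python
# Minimal sketch of transcription and translation (illustrative only).
# Only a handful of codons from the standard genetic code are listed here.
CODON_TABLE = {
    "AUG": "Met",  # start codon
    "UUU": "Phe", "GGC": "Gly", "GCU": "Ala", "AAA": "Lys",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}

def transcribe(coding_strand):
    """Return the mRNA (5'->3') matching a DNA coding strand written 5'->3'."""
    return coding_strand.upper().replace("T", "U")

def translate(mrna):
    """Decode codons from the first AUG until a stop codon is reached."""
    start = mrna.find("AUG")
    if start == -1:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "???")  # "???" = codon not in this small table
        if amino_acid == "Stop":
            break
        peptide.append(amino_acid)
    return peptide

mrna = transcribe("ATGTTTGGCAAATAA")
print(mrna)             # AUGUUUGGCAAAUAA
print(translate(mrna))  # ['Met', 'Phe', 'Gly', 'Lys']
```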

What is DNA?
  • DNA (Deoxyribonucleic Acid): A complex molecule that contains all the genetic instructions necessary for the growth, development, functioning, and reproduction of living organisms and many viruses. It serves as the blueprint for life.

  • Main role: Long-term storage of information that contains the instructions for making all the proteins needed for an organism's life. This information is encoded in the sequence of nitrogenous bases.

  • Basic Structure: DNA is a polymer made up of monomers called nucleotides. Each nucleotide consists of a deoxyribose sugar (C5H10O4), a phosphate group, and a nitrogenous base. The 'deoxy' in deoxyribose refers to the absence of a hydroxyl group at the 2' carbon position, which distinguishes it from ribose in RNA.

DNA Structure
  • A nucleotide is the basic monomeric unit of DNA, composed of three parts:

    • Phosphate Group: A negatively charged molecule that links the 3' carbon of one sugar to the 5' carbon of the next, forming the sugar-phosphate backbone.

    • Sugar:

      • Deoxyribose in DNA: A five-carbon sugar. The carbons are numbered 1' through 5'. The base attaches to the 1' carbon and the phosphate to the 5' carbon.

      • Ribose in RNA: Also a five-carbon sugar, but with a hydroxyl group at the 2' carbon.

    • Nitrogenous Base: An organic molecule containing nitrogen, responsible for carrying the genetic code.

  • Four nitrogenous bases in DNA:

    • Adenine (A): A purine, with a double-ring structure.

    • Guanine (G): A purine, also with a double-ring structure.

    • Cytosine (C): A pyrimidine, with a single-ring structure.

    • Thymine (T): A pyrimidine, also with a single-ring structure. (In RNA, Thymine is replaced by Uracil (U), which is also a pyrimidine).

  • Base pairing:

    • Adenine (A) always pairs with Thymine (T) via two hydrogen bonds (A=T).

    • Guanine (G) always pairs with Cytosine (C) via three hydrogen bonds (G≡C).

    • This specific pairing is complementary and crucial for DNA replication and repair.

    • Purines (Adenine, Guanine) are double-ring structures, and pyrimidines (Cytosine, Thymine, Uracil) are single-ring structures. Because a purine always pairs with a pyrimidine, the DNA double helix keeps a consistent diameter.
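
  • Example (illustrative sketch): The pairing rules above can be written as lookup tables. The function names and the sample sequence are assumptions made for this example. The complement is returned base by base, aligned with the input, so it reads 3' to 5' relative to the input strand.

```python
# Base-pairing rules as lookup tables (illustrative sketch).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
HYDROGEN_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}  # bonds formed by the pair each base belongs to
PURINES = {"A", "G"}

def complement(strand):
    """Return the partner base for every position of the strand."""
    return "".join(PAIRS[base] for base in strand.upper())

def hydrogen_bond_count(strand):
    """Total hydrogen bonds holding this strand to its complement."""
    return sum(HYDROGEN_BONDS[base] for base in strand.upper())

def pairs_purine_with_pyrimidine(strand):
    """Check that every pair joins exactly one purine with one pyrimidine."""
    return all((base in PURINES) != (PAIRS[base] in PURINES) for base in strand.upper())

strand = "ATGCGC"
print(complement(strand))                    # TACGCG
print(hydrogen_bond_count(strand))           # 16
print(pairs_purine_with_pyrimidine(strand))  # True
```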

Primary Structure of DNA
  • The primary structure of DNA refers to the linear sequence of nucleotides joined by phosphodiester bonds.

  • Sequences are traditionally written and read from the 5' (five-prime) end to the 3' (three-prime) end, indicating the direction of synthesis or reading. The 5' end has a free phosphate group attached to the 5' carbon of the deoxyribose, while the 3' end has a free hydroxyl group attached to the 3' carbon.

  • Antiparallel orientation: The two DNA strands run in opposite directions. One strand runs 5' to 3', and its complementary strand runs 3' to 5'. This arrangement is critical for DNA replication and transcription.

  • Phosphodiester linkages: These are strong covalent bonds that connect the 3' hydroxyl group of one deoxyribose sugar to the 5' phosphate group of the next deoxyribose sugar, forming the sugar-phosphate backbone of each strand.
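
  • Example (illustrative sketch): Because the strands are antiparallel and sequences are written 5' to 3', the partner of a given strand is its reverse complement. The function name and the sample sequence below are assumptions for illustration.

```python
# Reverse complement: the complementary strand, rewritten in its own 5'->3' direction.
COMPLEMENT_TABLE = str.maketrans("ACGT", "TGCA")

def reverse_complement(strand):
    """Complement each base, then reverse so the result reads 5'->3'."""
    return strand.upper().translate(COMPLEMENT_TABLE)[::-1]

top = "ATGCAAGT"                  # 5'-ATGCAAGT-3'
bottom = reverse_complement(top)  # 5'-ACTTGCAT-3'
print(bottom)                     # ACTTGCAT

# Applying the operation twice recovers the original strand.
print(reverse_complement(bottom) == top)  # True
```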

Watson-Crick Model (1953)
  • DNA is a right-handed double helix, resembling a twisted ladder, with a uniform diameter of about 2 nm.

  • Each chain consists of linear nucleotide sequences joined by phosphodiester bonds, forming the backbone. The nitrogenous bases project inward, forming the "rungs" of the ladder.

  • The two strands are held together by hydrogen bonds between complementary base pairs (A-T and G-C).

  • The double helix has major and minor grooves on its surface, which serve as binding sites for DNA-binding proteins involved in gene expression and regulation.

  • Key Contributors: James Watson and Francis Crick proposed the double helix model. Their work was significantly informed by the X-ray diffraction data produced by Rosalind Franklin and Maurice Wilkins, which provided crucial evidence for the helical structure and dimensions of DNA.

DNA Replication Steps
  • DNA replication is a highly regulated and accurate process that occurs during the S-phase of the cell cycle in eukaryotes.

  1. Initiation at the Origin of Replication (ORI): Replication begins at specific DNA sequences called origins of replication. These are AT-rich regions, which are easier to unwind due to fewer hydrogen bonds in A-T pairs.

  2. Unwinding by Helicase: An enzyme called DNA helicase unwinds and separates the two DNA strands by breaking the hydrogen bonds between complementary base pairs, creating a replication bubble. Single-strand binding proteins (SSBs) then bind to the separated strands to prevent them from reannealing.

  3. Formation of Replication Fork: As helicase unwinds the DNA, a Y-shaped structure known as the replication fork is formed. Prokaryotes typically have one origin (e.g., oriC in E. coli), while eukaryotes have multiple origins. Topoisomerases (like gyrase) relieve the torsional stress ahead of the replication fork caused by unwinding.

  4. RNA Primers Synthesized by Primase: DNA polymerase cannot start a new strand from scratch; it requires an existing 3'-OH group. An enzyme called primase synthesizes short RNA primers (5-10 nucleotides long) that provide this initial 3'-OH group. These primers are essential for initiating DNA synthesis.

  5. DNA Synthesis by DNA Polymerase III: DNA polymerase III (in prokaryotes, other polymerases in eukaryotes) binds to the primer and synthesizes a new complementary DNA strand by adding deoxyribonucleotides. It reads the template strand in the 3' to 5' direction and synthesizes the new strand in the 5' to 3' direction. This occurs bidirectionally from each origin.

  6. Elongation:

    • Leading Strand: This strand is synthesized continuously in the 5' to 3' direction, moving toward the replication fork, requiring only one primer.

    • Lagging Strand: This strand is synthesized discontinuously in the 5' to 3' direction, away from the replication fork, creating short fragments called Okazaki fragments. Each Okazaki fragment requires its own RNA primer.

  7. Termination: Replication terminates when replication forks meet or at specific termination sequences (e.g., ter sites in bacteria). In eukaryotes, replication of linear chromosomes ends at telomeres, repetitive DNA sequences that protect the chromosome ends.
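
  • Example (illustrative sketch): The difference in primer demand between the two strands at one fork can be made concrete with a small calculation. The template length (10,000 nucleotides) and the Okazaki fragment length (1,500 nucleotides) are hypothetical values picked for this example, and `okazaki_fragments` is a made-up helper name.

```python
# Count the primers needed on each strand of one replication fork (illustrative sketch).
def okazaki_fragments(template_length, fragment_length):
    """Return (start, end) coordinates of the fragments covering a lagging-strand template."""
    return [
        (start, min(start + fragment_length, template_length))
        for start in range(0, template_length, fragment_length)
    ]

fragments = okazaki_fragments(template_length=10_000, fragment_length=1_500)

leading_primers = 1               # continuous synthesis needs a single primer
lagging_primers = len(fragments)  # every Okazaki fragment starts from its own primer

print(leading_primers)  # 1
print(lagging_primers)  # 7
print(fragments[-1])    # (9000, 10000)
```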

DNA Replication and Synthesis Overview
  • Eukaryotic DNA replication involves multiple origins of replication distributed along each chromosome, which allows larger genomes to be replicated in a timely manner. It also uses a variety of DNA polymerases, each with specialized roles (e.g., DNA Pol α for initiation, Pol δ and Pol ε for elongation).

  • Each turn of the B-form DNA double helix contains approximately 10.4 base pairs in solution (10 in the original Watson-Crick model) and spans about 3.4-3.5 nm along the helical axis.

  • The average distance between adjacent nucleotide pairs (the stacking distance) along the helical axis is approximately 0.34 nm. This regular structure contributes to the stability of the DNA molecule.
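
  • Example (illustrative sketch): The helix dimensions above allow a quick estimate of the physical length of a DNA segment and the number of helical turns it contains. The 1,000 bp segment and the function names are assumptions for illustration.

```python
# Estimate contour length and helical turns from the helix parameters (illustrative sketch).
RISE_PER_BASE_PAIR_NM = 0.34  # stacking distance along the helical axis
BASE_PAIRS_PER_TURN = 10.4

def helix_length_nm(base_pairs):
    """Length of a DNA segment along the helical axis, in nanometres."""
    return base_pairs * RISE_PER_BASE_PAIR_NM

def helical_turns(base_pairs):
    """Number of full helical turns in the segment."""
    return base_pairs / BASE_PAIRS_PER_TURN

print(f"{helix_length_nm(1000):.1f} nm")   # 340.0 nm
print(f"{helical_turns(1000):.1f} turns")  # 96.2 turns
```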

Semi-Conservative Replication
  • The Meselson-Stahl experiment (1958) provided crucial evidence for the semi-conservative model of DNA replication. They grew E. coli in a medium containing the heavy nitrogen isotope ¹⁵N, transferred the cells to ¹⁴N medium, and separated the DNA by density after each generation.

    • If replication were conservative, the original heavy DNA would remain intact, and all new DNA would be light.

    • If replication were dispersive, all DNA molecules would be a mixture of heavy and light.

    • However, after one generation in ¹⁴N medium, they observed only DNA of an intermediate density, indicating one old ¹⁵N strand and one new ¹⁴N strand in each molecule; this ruled out the conservative model. After a second generation, both intermediate and light DNA were observed, which ruled out the dispersive model and confirmed the semi-conservative model.

  • Three postulated methods of DNA replication:

    1. Semi-Conservative (confirmed): Each new DNA molecule consists of one original (parental) strand and one newly synthesized strand. This is the correct model for all known organisms.

    2. Conservative (rejected): Proposed that the original DNA molecule would remain entirely intact and a completely new DNA molecule would be synthesized.

    3. Dispersive (rejected): Suggested that both new DNA molecules would be a mixture of old and new DNA interspersed along each strand in segments.
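
  • Example (illustrative sketch): The band pattern predicted by the semi-conservative model can be reproduced with a short simulation. Each duplex is modeled as a pair of strand labels, "H" for heavy and "L" for light; this representation and the function names are assumptions made for the example.

```python
# Simulate semi-conservative replication after a shift from heavy to light medium.
from collections import Counter

def replicate(population):
    """Each parental strand becomes the template for one new light ("L") strand."""
    next_generation = []
    for strand_a, strand_b in population:
        next_generation.append((strand_a, "L"))
        next_generation.append((strand_b, "L"))
    return next_generation

def density_classes(population):
    """Classify each duplex by how many heavy strands it contains."""
    names = {0: "light", 1: "intermediate", 2: "heavy"}
    return Counter(names[duplex.count("H")] for duplex in population)

population = [("H", "H")]  # one fully heavy duplex before the shift
for generation in range(1, 4):
    population = replicate(population)
    print(generation, dict(density_classes(population)))
# 1 {'intermediate': 2}
# 2 {'intermediate': 2, 'light': 2}
# 3 {'intermediate': 2, 'light': 6}
```

    The number of intermediate duplexes stays constant because the two original heavy strands are never destroyed, while the light fraction grows with every generation.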

Summary of DNA Replication
  • The overall process involves precise enzymatic coordination:

    • Unwinding: Helicase separates the strands.

    • Initiation: Primase lays down RNA primers.

    • Elongation: DNA polymerase synthesizes new DNA.

    • Termination: Forks meet, and telomeres are managed.

  • Leading strand: Synthesized continuously in the 5' to 3' direction towards the replication fork by DNA polymerase III.

  • Lagging strand: Synthesized discontinuously in the 5' to 3' direction, away from the replication fork, forming short Okazaki fragments. Each fragment is synthesized after a new primer is laid down.

  • RNA primer removal and ligation: In prokaryotes, DNA polymerase I removes the RNA primers and fills the gaps with DNA nucleotides. In eukaryotes, RNase H removes the RNA primers. Finally, DNA ligase forms phosphodiester bonds to join the Okazaki fragments (and any other nicks in the backbone) into a continuous strand, completing the synthesis.

DNA Damage and Repair
  • DNA is constantly susceptible to damage from various sources. Cells have evolved multiple repair mechanisms to maintain genomic integrity.

  • Endogenous Damage Types: Damage originating from within the cell.

    1. Oxidation of bases and strand interruptions: Reactive oxygen species (ROS) can chemically modify bases (e.g., an 8-oxo-guanine lesion) or cause single-strand breaks.

    2. Alkylation of bases: Addition of alkyl groups (e.g., methyl groups via S-adenosylmethionine (SAM)) to bases, which can mispair or block replication.

    3. Hydrolysis of bases: Spontaneous chemical reactions like deamination (e.g., conversion of cytosine to uracil) or depurination/depyrimidination (loss of a base from the sugar-phosphate backbone), leading to abasic sites.

    4. Bulky adduct formation: Covalent binding of large chemical groups to DNA bases, often distorting the helix and blocking replication/transcription.

    5. Base mismatches during replication: Errors made by DNA polymerase where an incorrect base is incorporated and not immediately corrected by proofreading.

  • Exogenous Damage Types: Damage originating from external sources.

    • UV light: Causes pyrimidine dimers (e.g., thymine dimers), where two adjacent pyrimidine bases on the same strand become covalently linked, distorting the helix.

    • Ionizing radiation (e.g., X-rays, gamma rays): Generates free radicals that can cause single-strand breaks, double-strand breaks (DSBs), and base damage, which are highly deleterious.

    • Chemicals: Various mutagens and carcinogens (e.g., intercalating agents, base analogs) can cause different types of DNA damage.

DNA Recombination Methods
  • Recombination: The process by which pieces of DNA are broken and recombined to produce new combinations of alleles. This exchange of genetic information between similar DNA molecules (homologous chromosomes) is vital for genetic diversity (e.g., during meiosis) and for repairing certain types of DNA damage. The primary mechanism is homologous recombination (HR).

  • Double-strand break (DSB) repair pathways: DSBs are particularly dangerous lesions. Cells have two main pathways to repair them:

    • Non-Homologous End Joining (NHEJ): A "lick-and-stick" repair mechanism that ligates the broken DNA ends directly, often resulting in small insertions or deletions. It is error-prone but quick and efficient, especially in G1 phase.

    • Homologous Recombination (HR): An accurate repair mechanism that uses a homologous DNA template (e.g., sister chromatid after replication) to guide the repair of the break, minimizing loss of genetic information. It is active primarily in S and G2 phases when a sister chromatid is available.

Experimental Evidence for DNA as Genetic Material
  • The Hershey-Chase experiment (1952) definitively demonstrated that DNA, not protein, is the genetic material. They used bacteriophages (viruses that infect bacteria).

    • They labeled the DNA of one batch of phages with radioactive phosphorus (³²P), as DNA contains phosphorus but protein does not.

    • They labeled the protein coat of another batch of phages with radioactive sulfur (³⁵S), as protein contains sulfur but DNA does not.

    • When these phages infected bacteria, they found that ³²P (DNA) entered the bacteria and was passed on to the next generation of phages, while most of the ³⁵S (protein) remained outside the bacterial cells. This showed that DNA carries the genetic instructions for viral replication.

Gene Expression and Regulation Overview
  • Gene expression is the process by which information from a gene is used in the synthesis of a functional gene product, such as a protein or non-coding RNA.

  • Gene expression is not constant; genes aren’t always “on.” Its level can vary dramatically depending on the cell type, developmental stage, and environmental conditions.

  • Regulation of gene expression is crucial for cell differentiation (determining cell type, e.g., muscle cell vs. nerve cell) and allowing organisms to adapt to internal and external environmental signals. This complex control ensures that only necessary genes are expressed at appropriate times and levels.

Levels of Gene Regulation
  • Gene expression can be regulated at multiple stages:

  1. Transcriptional Control:

    • Promoters: DNA sequences located upstream of a gene that act as binding sites for RNA polymerase.

    • Transcription Factors (TFs): Proteins that bind to specific DNA sequences (promoters, enhancers) to either activate (activators) or repress (repressors) gene transcription.

    • Epigenetic modifications: Changes to DNA (e.g., DNA methylation) or histones (e.g., acetylation) that alter chromatin structure and accessibility to transcription machinery, without changing the underlying DNA sequence.

  2. Post-Transcriptional Control: Occurs after RNA is transcribed from DNA.

    • mRNA processing: In eukaryotes, newly transcribed pre-mRNA undergoes splicing (removal of introns and joining of exons), 5' capping, and 3' polyadenylation to become mature mRNA. Alternative splicing allows one gene to produce multiple protein isoforms.

    • mRNA stability: The lifespan of an mRNA molecule in the cytoplasm, regulated by factors like microRNAs (miRNAs) or RNA-binding proteins, determines how much protein can be translated from it.

    • mRNA transport: The export of mRNA from the nucleus to the cytoplasm is regulated.

  3. Translational Control: Regulates how efficiently mRNA is used for protein production by ribosomes.

    • Involves factors like initiation factors, repressors that bind to mRNA, and the availability of tRNAs. For example, specific sequences in the 5' untranslated region (UTR) of mRNA can influence translation efficiency.

  4. Post-Translational Control: Occurs after a protein has been synthesized.

    • Protein modifications: Includes processes like phosphorylation, glycosylation, ubiquitination, or cleavage, which can alter protein activity, localization, or stability.

    • Protein degradation: Controlled breakdown of proteins (e.g., via the proteasome system after ubiquitination) to regulate their abundance and function.
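
  • Example (illustrative sketch): Alternative splicing, one of the post-transcriptional controls listed above, can be modeled as choosing which exons to join. The segment names and sequences are invented for this example, and `splice` is a hypothetical helper, not a real library function.

```python
# Model splicing as joining a chosen set of exons, in order (illustrative sketch).
PRE_MRNA = [
    ("exon1", "AUGGCU"),
    ("intron1", "GUAAGUUUCAG"),
    ("exon2", "UUUAAA"),
    ("intron2", "GUGAGUCCAG"),
    ("exon3", "GGCUAA"),
]

def splice(segments, exons_to_keep):
    """Drop every segment not listed in exons_to_keep and join the rest in order."""
    return "".join(seq for name, seq in segments if name in exons_to_keep)

# Constitutive splicing keeps all three exons.
print(splice(PRE_MRNA, {"exon1", "exon2", "exon3"}))  # AUGGCUUUUAAAGGCUAA

# Exon skipping yields a shorter isoform from the same gene.
print(splice(PRE_MRNA, {"exon1", "exon3"}))           # AUGGCUGGCUAA
```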

CRISPR-Cas9 Genome Editing
  • The CRISPR-Cas9 system is a revolutionary gene-editing tool derived from a bacterial adaptive immune system. It allows for highly precise and efficient targeted modification of genomic DNA.

  • Mechanism:

    • It uses a guide RNA (gRNA) molecule that is complementary to a specific target DNA sequence.

    • The gRNA guides the Cas9 enzyme (a nuclease) to the corresponding DNA location.

    • Cas9 then creates a double-stranded break (DSB) at that precise site.

    • The cell's natural DNA repair mechanisms are then exploited:

      • NHEJ (Non-Homologous End Joining): Often leads to small insertions or deletions (indels) that can disrupt a gene (knockout).

      • HR (Homologous Recombination): If a donor DNA template (with desired sequence changes) is provided, HR can incorporate the new sequence, allowing for precise gene editing (knock-in, correction).

  • Applications: Wide-ranging, including fundamental research, developing gene therapies for genetic diseases (e.g., sickle cell anemia, cystic fibrosis), creating disease-resistant crops, and improving traits in livestock.
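
  • Example (illustrative sketch): The targeting and repair steps of the mechanism above can be imitated on a toy sequence. Several details here are assumptions not stated in these notes: the commonly used Cas9 from Streptococcus pyogenes requires an NGG motif (the PAM) immediately 3' of the target and cuts about 3 bp upstream of it; the sequences are invented; and the repair outcomes are fixed for simplicity, whereas real NHEJ outcomes vary. All function names are hypothetical.

```python
# Toy model of Cas9 targeting followed by NHEJ or HR repair (illustrative sketch).
def find_target(genome, protospacer):
    """Return the start index of the first protospacer followed by an NGG PAM, or -1."""
    genome = genome.upper()
    position = genome.find(protospacer)
    while position != -1:
        pam_start = position + len(protospacer)
        pam = genome[pam_start:pam_start + 3]
        if len(pam) == 3 and pam.endswith("GG"):
            return position
        position = genome.find(protospacer, position + 1)
    return -1

def nhej_knockout(genome, cut, deleted_bases=2):
    """Rejoin the ends after losing a few bases (an indel that can shift the reading frame)."""
    return genome[:cut] + genome[cut + deleted_bases:]

def hr_knock_in(genome, cut, donor_insert):
    """Insert the donor sequence at the break site."""
    return genome[:cut] + donor_insert + genome[cut:]

PROTOSPACER = "GATTACAGATTACAGATTAC"  # 20-nt target matched by the guide RNA
GENOME = "TTGACC" + PROTOSPACER + "TGG" + "ACGTAA"

start = find_target(GENOME, PROTOSPACER)
cut = start + len(PROTOSPACER) - 3  # assumed cut position: 3 bp upstream of the PAM
print(start, cut)                                     # 6 23
print(len(GENOME), len(nhej_knockout(GENOME, cut)))   # 35 33
print(hr_knock_in(GENOME, cut, "AAA")[cut:cut + 3])   # AAA
```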

PCR Overview
  • Polymerase Chain Reaction (PCR) is a powerful molecular biology technique used to make millions to billions of copies of a specific DNA segment, allowing scientists to amplify a single or a few copies of a DNA sequence across several orders of magnitude.

  • The process relies on thermal cycling, which involves repeated cycles of heating and cooling steps:

    1. Denaturation (~94-98 °C): High heat separates the double-stranded DNA template into two single strands.

    2. Annealing (~50-65 °C): The temperature is lowered, allowing short, synthetic DNA primers (oligonucleotides) to bind (anneal) to their complementary sequences on the single-stranded DNA templates.

    3. Extension (~70-75 °C): A heat-stable DNA polymerase (most commonly Taq polymerase, isolated from Thermus aquaticus) synthesizes new DNA strands by extending the primers in the 5' to 3' direction, using the single-stranded DNA as a template and dNTPs (deoxyribonucleotide triphosphates) as building blocks.

  • Key components: DNA template, two primers (forward and reverse), a heat-stable DNA polymerase such as Taq, dNTPs, and a reaction buffer containing Mg²⁺.

  • Applications: PCR has revolutionized various fields, including:

    • Genotyping: Identifying genetic variations.

    • Gene expression analysis: Quantifying mRNA levels by reverse transcription followed by quantitative PCR (RT-qPCR).

    • Forensics: DNA fingerprinting from trace samples.

    • Diagnosis of infectious diseases: Detecting viral or bacterial DNA/RNA.

    • Gene cloning: Amplifying target genes for further manipulation.

    • Genetic testing: Detecting mutations associated with inherited diseases.
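
  • Example (illustrative sketch): The "millions to billions of copies" figure follows from the doubling that each thermal cycle ideally produces. The formula below is an idealized model, copies = initial × (1 + efficiency)^cycles; the `efficiency` parameter, the function name, and the cycle counts are assumptions for illustration, and real reactions plateau as reagents run out.

```python
# Idealized PCR amplification: each cycle multiplies the copy number by (1 + efficiency).
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Expected copy number after a given number of cycles (efficiency 1.0 = perfect doubling)."""
    return initial_copies * (1 + efficiency) ** cycles

# Perfect doubling from a single template molecule over 30 cycles.
print(f"{pcr_copies(1, 30):,.0f}")  # 1,073,741,824

# A lower per-cycle efficiency yields noticeably fewer copies.
print(f"{pcr_copies(1, 30, efficiency=0.9):,.0f}")
```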