Recombinant DNA Technology and Genomics

Recombinant DNA Technology

T DNA
  • Definition: The segment of the Ti plasmid that is transferred into the plant genome; in an engineered plasmid it carries the gene of interest.

  • Agrobacterium tumefaciens: A bacterium that facilitates the transfer of T DNA into plant cells using its Ti plasmid.

  • Procedure:

    1. Cut the foreign DNA and the Ti plasmid with a restriction enzyme and join them with DNA ligase, placing the gene of interest within the T DNA.

    2. Introduce the recombinant plasmid into cultured plant cells.

    3. Regenerate new plants from cultured cells.

New trait in plants
  • Plants exhibit new traits due to the inserted T DNA carrying the new gene.

Genomics

Definition of Genomics
  • Genomics: The study of whole genomes, that is, genomes in their entirety.

  • Genome: The haploid set of chromosomes in a gamete or the complete set of genes or genetic material present in a cell of an organism.

  • Key areas of inquiry:

    • Structural genomics

    • Functional genomics

    • Comparative genomics

Introduction

Genomic Analysis
  • Genomic Analysis (Genomics): A rapidly advancing area of modern genetics that provides unprecedented information about the genomes of different organisms.

  • Bioinformatics: Uses mathematical software applications to:

    • Organize, share, and analyze data on gene structure, sequence, and expression, and on protein structure and function.

Historical Perspective: Recombinant DNA

  • Recombinant DNA refers to the joining of DNA molecules from different biological sources not found together in nature.

  • Basic procedure:

    1. Generate specific DNA fragments using restriction enzymes.

    2. Join these fragments with a vector (carrier molecule).

    3. Transfer the recombinant DNA molecule to a host cell to produce many copies that can be recovered from the host cell.

Restriction Enzyme Review

Restriction Enzymes
  • Definition: Restriction enzymes bind to DNA at a specific recognition sequence (restriction site) and cleave the DNA to produce restriction fragments.
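
The cleavage described above can be sketched in a few lines of Python. This is a minimal illustration, not a model of real enzyme behavior: it treats DNA as a single-stranded string, and the function name digest and the parameter cut_offset are hypothetical names introduced here. The defaults use EcoRI, which recognizes GAATTC and cuts after the first G.

```python
def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut a linear DNA string at every occurrence of a recognition site.

    cut_offset is the position inside the site where the strand is cut
    (1 for EcoRI, which cuts G^AATTC). Only the top strand is modeled.
    """
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(dna[start:cut])  # fragment ending at this cut
        start = cut
        pos = dna.find(site, pos + 1)     # look for the next site
    fragments.append(dna[start:])         # remainder after the last cut
    return fragments


# Hypothetical sequence containing two EcoRI sites
example = "TTGAATTCGGCCGAATTCAA"
print(digest(example))  # ['TTG', 'AATTCGGCCG', 'AATTCAA']
```

Joining the fragments back together reproduces the input, which is a quick check that no bases were lost at the cut sites.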

Suicide Genes

  • Function: A suicide gene in the vector kills the host bacterium when no insert is present (an insert disrupts the gene), ensuring that only clones with inserts grow into visible colonies.

Restriction Maps Using Fluorescence

Methodologies
  1. Isolate long DNA molecules from sources such as solid tissue, blood, or cell lines.

  2. Fluorescently tag the DNA at specific sequence motifs, which occur on average about every 6 kb.

  3. Pass DNA through nanochannels to linearize and separate single molecules.

  4. Employ continuous cycling of DNA through a chip and imaging of single molecules, a technique known as "Optical Genome Mapping."

Bacterial Artificial Chromosomes (BACs)

Overview
  • BACs have an insert size capacity of 100–200 kb and are preferred as cloning vectors.

cDNA Synthesis

Techniques for cDNA Construction
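  • cDNA is synthesized from mRNA in steps: reverse transcriptase copies the mRNA into DNA, RNase H removes the RNA strand of the RNA-DNA hybrid so that a second DNA strand can be made, linkers containing EcoRI sites are added to the double-stranded cDNA, and the restriction enzyme EcoRI cuts the linkers to create ends for cloning.

    • cDNA libraries represent highly expressed genes proportionally more than weakly expressed genes, because abundant mRNAs are copied into cDNA more often.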
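
The first step of cDNA synthesis, copying an mRNA into a complementary DNA strand, can be illustrated with a short Python sketch. It only models base pairing; the function name first_strand_cdna is a hypothetical name chosen for this example, and linkers and enzymes are not modeled.

```python
# Base-pairing rules for copying RNA into DNA
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna: str) -> str:
    """Return the first-strand cDNA, written 5'->3', for an mRNA written 5'->3'.

    The cDNA is antiparallel to the mRNA, so the mRNA is read in reverse
    while each base is replaced by its DNA complement.
    """
    return "".join(COMPLEMENT[base] for base in reversed(mrna.upper()))


print(first_strand_cdna("AUGGCC"))  # GGCCAT
```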

Genomic vs. cDNA Libraries
  • Genomic Library: Contains all sequences - genic and intergenic, represented approximately equally.

  • cDNA Library: Contains only expressed genes, with frequencies reflecting gene expression levels in mature mRNA (excluding introns and intergenic sequences).
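
To make the contrast concrete, the sketch below estimates what fraction of cDNA clones each gene would contribute if clone frequency simply tracked mRNA abundance. The gene names and counts are made up for illustration, and real libraries deviate from this idealized proportionality.

```python
def expected_cdna_fractions(mrna_counts: dict[str, int]) -> dict[str, float]:
    """Expected share of cDNA clones per gene, assuming clone frequency
    is proportional to the number of mRNA molecules of that gene."""
    total = sum(mrna_counts.values())
    return {gene: count / total for gene, count in mrna_counts.items()}


# Hypothetical mRNA copy numbers in one cell type
counts = {"geneA": 900, "geneB": 90, "geneC": 10}
print(expected_cdna_fractions(counts))
```

In a genomic library, by contrast, the three genes would each be represented at roughly the same frequency regardless of expression.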

Genetic Engineering

Application Example
  • Synthetic Human Insulin: Insulin regulates glucose metabolism, and the synthetic form was initially produced in bacteria. The two insulin subunits are produced as separate fusion polypeptides, which are purified and cleaved to release the subunits; the subunits then spontaneously unite to form the functional, active protein.

Transgenic Salmon

Injection of DNA into Salmon Eggs
  • Utilizes endogenous sockeye salmon growth hormone gene and incorporates enhancer elements responsive to environmental changes.

  • Outcome: Transgenic salmon grow faster and are larger than wild-type salmon of the same age.

Sequencing Techniques

Overview
  • Old vs. New Libraries: Traditional methods such as Sanger sequencing read one fragment per reaction, which limits throughput, while next-generation sequencing reads millions of fragments in parallel and produces far larger and more complex data sets.

Whole-Genome Sequencing (WGS)

WGS Overview
  • Also known as shotgun sequencing or shotgun cloning. It involves:

    1. Cutting genomic DNA into a series of overlapping fragments using restriction enzymes.

    2. Aligning overlapping fragments using computers to assemble chromosomes.

    • Contig: Created from aligned overlapping fragments, establishing a contiguous sequence across the chromosome.

Performance Metrics
  • Each flowcell lane in Illumina sequencing can generate 6 billion reads of 150 bp per run, enough for at least 30X coverage of two human genomes.
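
Coverage figures like this follow from a simple relationship: mean coverage = (number of reads × read length) / genome size. The sketch below applies it to the numbers quoted above. The genome size of 3.1 billion bp is an assumption (an approximate haploid human genome), and the function name mean_coverage is hypothetical. The result is a raw upper bound; usable coverage is lower once duplicates and low-quality reads are discarded.

```python
def mean_coverage(n_reads: float, read_length_bp: int,
                  genome_size_bp: float, n_genomes: int = 1) -> float:
    """Average number of times each base is sequenced, with the reads
    split evenly across n_genomes samples."""
    return n_reads * read_length_bp / (genome_size_bp * n_genomes)


HUMAN_GENOME_BP = 3.1e9  # assumption: approximate haploid human genome size

cov = mean_coverage(6e9, 150, HUMAN_GENOME_BP, n_genomes=2)
print(f"{cov:.0f}X per genome")  # 145X per genome
```

Under these assumptions the raw yield comfortably exceeds the 30X threshold for two genomes.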

Next-Generation Sequencing Advantages
  • With next-generation whole-genome shotgun approaches, DNA fragments are sequenced directly, without first being cloned into vectors, which significantly streamlines genome assembly.

Bioinformatics Applications

Key Applications
  • Algorithms for DNA-sequence alignment: Identify overlapping sequences and reconstruct their order on the chromosome.

    • Contigs: Continuous fragments created from overlapping sequences.
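
How overlapping reads become a contig can be shown with a toy greedy assembler. This is a teaching sketch under strong assumptions (error-free reads, a single strand, no repeats); production assemblers use far more sophisticated graph-based methods. The function names overlap and greedy_assemble and the min_len parameter are hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b
    (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the longest overlap
    until no overlaps of at least min_len remain."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs
        # Append the non-overlapping tail of the second read to the first
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]


# Three hypothetical reads that overlap by 3 bases each
reads = ["ATGCGT", "CGTACC", "ACCTGA"]
print(greedy_assemble(reads))  # ['ATGCGTACCTGA']
```

Reads that share no sufficient overlap are returned as separate contigs, which mirrors the gaps left in a real assembly.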

Repetitive DNA in Mammalian Genomes

Composition
  • More than 50% of mammalian genomes comprise repetitive DNA elements:

    • LINEs: Long interspersed elements (~6-7 kb when full length, though most copies are truncated), roughly 500,000 in the human genome (17% of the genome).

    • SINEs: Short interspersed elements (~300 bp), approximately 1.6 million in the human genome (13% of the genome).

Sequencing Challenges
  • The paired-end shotgun sequencing strategy is particularly useful for repetitive DNA: because the two reads come from opposite ends of a fragment of known approximate size, they can bridge a repeat and link contigs into scaffolds that include repetitive segments.

Library Preparation Overview

  • Steps in DNA Library Preparation: Include fragmentation, adapter addition, PCR amplification, sequencing, and data analysis, forming an essential workflow for modern genomic studies.

Next-Generation Sequencing Characteristics

Output and Quality Metrics
  • Run times, maximum output per run, read length, and data quality metrics such as Q30 values are critical for interpreting sequencing results across different instruments.
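
A Q score is a Phred quality value, defined as Q = -10 × log10(P), where P is the probability that a base call is wrong; Q30 therefore means a 1 in 1,000 chance of error. The sketch below converts Q scores to error probabilities and computes the fraction of bases at or above Q30 in a FASTQ quality string. It assumes the common Phred+33 ASCII encoding, and the function names are hypothetical.

```python
def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score to the probability of a wrong base call."""
    return 10 ** (-q / 10)


def fraction_q30(quality_string: str, offset: int = 33) -> float:
    """Fraction of bases with Q >= 30 in a FASTQ quality string.

    offset=33 assumes Phred+33 encoding (an assumption about the input).
    """
    if not quality_string:
        raise ValueError("quality string is empty")
    scores = [ord(ch) - offset for ch in quality_string]
    return sum(q >= 30 for q in scores) / len(scores)


print(phred_to_error_prob(30))   # error probability for Q30
print(fraction_q30("IIII5555"))  # 'I' is Q40 and '5' is Q20, so 0.5
```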

Long-Read Sequencing Technologies

Overview
  • Third-Generation Sequencing Technologies: Include methods such as PacBio and Oxford Nanopore, which use different sequencing mechanisms and produce long reads that allow comprehensive variant detection and fill gaps in genome coverage.

De Novo Assembly Versus Reference-Based Assembly

  • De novo assembly constructs genomes from scratch while reference-based assembly aligns sequences to existing genomes for resequencing, highlighting variations such as single nucleotide polymorphisms (SNPs) and larger scale variations.
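
The resequencing idea can be illustrated by comparing a sample sequence with a reference and listing the single-base differences. This sketch assumes the two sequences are already aligned and of equal length, so it detects only substitutions, not insertions or deletions; real variant callers also weigh read depth and base quality. The function name call_snps is hypothetical.

```python
def call_snps(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """List single-base differences as (1-based position, reference base, sample base).

    Assumes the sequences are pre-aligned and the same length (no indels).
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [(i + 1, r, s)
            for i, (r, s) in enumerate(zip(reference, sample))
            if r != s]


# Hypothetical reference and sample sequences
print(call_snps("ACGTACGT", "ACGAACGC"))  # [(4, 'T', 'A'), (8, 'T', 'C')]
```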

Homologous Genes

Definition and Importance
  • Homologous genes are evolutionarily related genes, categorized into:

    • Orthologs: Genes in different species that descend from a single gene in their common ancestor, usually retaining the same function.

    • Paralogs: Genes within the same species derived from duplication, often taking on different functions.

Applications of Bioinformatics

Key Applications Include
  • Comparing DNA sequences, identifying genes, predicting amino acid sequences, and deducing evolutionary relationships.
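
Predicting an amino acid sequence from DNA is a direct table lookup, as the sketch below shows using the standard genetic code. It reads the coding strand in a single frame from the first base and stops at the first stop codon; the function name translate is a hypothetical name, and ambiguous bases are not handled.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA sequence into one-letter amino acids,
    stopping at the first stop codon (shown as '*' in the table)."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGGCCAAATGA"))  # MAK
```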

Genome Annotation

Importance
  • Genome annotation uses bioinformatics to locate genes, gene-regulatory sequences, and other functional elements within a genomic sequence, and is essential for turning raw sequence into a map of genes.
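
One of the simplest annotation steps is scanning for open reading frames (ORFs): stretches that begin with ATG and run in frame to a stop codon. The sketch below searches the three forward reading frames only; it ignores the reverse strand, introns, and regulatory signals, so it is a rough illustration rather than a gene finder. The function name find_orfs and the min_codons parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 2) -> list[tuple[int, int]]:
    """Return (start, end) coordinates (0-based, end exclusive) of ORFs in the
    three forward frames. min_codons counts codons before the stop codon."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # look for the next ORF
    return sorted(orfs)


# Hypothetical sequence with one ORF in the third reading frame
seq = "CCATGGCCAAATGATT"
print(find_orfs(seq))  # [(2, 14)]
```

Slicing the sequence with the returned coordinates (seq[2:14]) gives ATGGCCAAATGA, the ORF including its stop codon.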

Functional Genomics

Overview
  • Functional genomics uses DNA sequence to establish the functions of genes and of the RNAs or proteins they encode, and confirms computational predictions through experimental approaches.

Human Genome Project (HGP)

Overview

  • The HGP was a landmark project designed to sequence the human genome and identify all of its genes. It began in 1990 and was completed in 2003 with a budget of about $3 billion.

Key Findings

  • Discovered that less than 2% of the genome codes for proteins and only about 20,000 protein-coding genes exist, contrary to early predictions.

  • Revealed the significance of alternative splicing, contributing to the diversity of proteins produced by fewer genes.

Accessing HGP Data

Importance of HGP Contributions
  • The HGP provided extensive maps for genes implicated in disease conditions and advanced the identification of disease-related genes.

Comparative Genomics

Definition
  • Comparative genomics compares genomes of different organisms to support gene discovery, infer evolutionary relationships, and develop model organisms. By 2018, more than 23,000 genomes had been sequenced.

Importance of Domesticated Species
  • Domesticated species serve as important biomedical models for studying genetic diseases. Their genetic structures are often more amenable to GWAS (Genome-Wide Association Studies).

The Neanderthal Genome

Overview
  • The Neanderthal genome, sequenced from fossil samples, revealed insights into human evolution, including interbreeding events between Neanderthals and modern humans.

Findings
  • The Neanderthal genome is 99% identical to that of modern humans. Comparing the two highlights regions where the modern human genome evolved rapidly after the lineages diverged.

Metagenomics and the Human Microbiome Project

Overview
  • Metagenomics examines the genomes of microbial communities in environmental samples, whereas the Human Microbiome Project analyzed the genomes of microorganisms that inhabit the human body.

Transcriptome Analysis

RNA Sequencing
  • RNA sequencing enables the in situ analysis of gene expression, providing comprehensive data for understanding expression variability and quantitative measurements of transcript levels.

Proteomics Overview

Definition
  • Proteomics involves the identification and characterization of all proteins in a cell, tissue, or organism, allowing protein profiles to be compared across conditions and disease biomarkers to be identified.

Techniques and Technologies
  • Utilizes techniques such as two-dimensional gel electrophoresis and mass spectrometry for protein analysis, providing insights into protein structure, function, and interaction networks.