Recombinant DNA Technology and Genomics
Recombinant DNA Technology
T DNA
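Definition: Transfer DNA (T DNA), the segment of the Ti plasmid that is transferred into the plant cell genome and that is engineered to carry the gene of interest.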
Agrobacterium tumefaciens: A bacterium that facilitates the transfer of T DNA into plant cells using its Ti plasmid.
Procedure:
Insert the foreign DNA into the Ti plasmid using a restriction enzyme and DNA ligase.
Introduce the recombinant plasmid into cultured plant cells.
Regenerate new plants from cultured cells.
New trait in plants
Plants exhibit new traits due to the inserted T DNA carrying the new gene.
Genomics
Definition of Genomics
Genomics: The study of genomes in their entirety.
Genome: The haploid set of chromosomes in a gamete or the complete set of genes or genetic material present in a cell of an organism.
Key areas of inquiry:
Structural genomics
Functional genomics
Comparative genomics
Introduction
Genomic Analysis
Genomic Analysis (Genomics): A rapidly advancing area of modern genetics that provides unprecedented information about the genomes of different organisms.
Bioinformatics: Uses computer software and mathematical methods to organize, share, and analyze data on gene structure, sequence, and expression, and on protein structure and function.
Historical Perspective: Recombinant DNA
Recombinant DNA refers to the joining of DNA molecules from different biological sources not found together in nature.
Basic procedure:
Generate specific DNA fragments using restriction enzymes.
Join these fragments with a vector (carrier molecule).
Transfer the recombinant DNA molecule to a host cell to produce many copies that can be recovered from the host cell.
Restriction Enzyme Review
Restriction Enzymes
Definition: Restriction enzymes bind to DNA at a specific recognition sequence (restriction site) and cleave the DNA to produce restriction fragments.
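The cutting behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not a laboratory tool: it models only the top strand of a linear molecule, and the function name `digest`, the example sequence, and the `cut_offset` parameter are assumptions made for the example. The one biological fact it relies on is that EcoRI recognizes GAATTC and cuts between G and A.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Cut a linear DNA sequence at every occurrence of `site`.

    `cut_offset` is the position inside the recognition sequence where the
    top strand is cut (EcoRI cuts G^AATTC, so the offset is 1).
    Only the top strand is modeled; sticky ends are not represented.
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])  # remainder after the last cut
    return fragments


# Hypothetical sequence containing two EcoRI sites.
dna = "AAGAATTCGGCCTTGAATTCAA"
fragments = digest(dna)
print(fragments)  # ['AAG', 'AATTCGGCCTTG', 'AATTCAA']
```

Two recognition sites in a linear molecule give three restriction fragments, and joining the fragments in order restores the original sequence.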
Suicide Genes
Function: These genes kill bacteria that carry a vector with no insert, so only clones containing an insert survive and form colonies.
Restriction Maps Using Fluorescence
Methodologies
Isolate long DNA molecules like those from solid tissue, blood, and cell lines.
Fluorescently tag DNA at specific sequence motifs, generally averaging every 6 kb.
Pass DNA through nanochannels to linearize and separate single molecules.
Employ continuous cycling of DNA through a chip and imaging of single molecules, a technique known as "Optical Genome Mapping."
Bacterial Artificial Chromosomes (BACs)
Overview
BACs have an insert size capacity of 100-200 kb and are preferred as cloning vectors.
cDNA Synthesis
Techniques for cDNA Construction
The stepwise synthesis of cDNA from mRNA uses enzymes such as reverse transcriptase and RNase H, followed by the addition of linkers containing EcoRI sites and digestion with the restriction enzyme EcoRI so that the cDNA can be inserted into a vector.
cDNA libraries represent highly expressed genes proportionally more than weakly expressed genes, because clone frequency tracks mRNA abundance.
Genomic vs. cDNA Libraries
Genomic Library: Contains all sequences - genic and intergenic, represented approximately equally.
cDNA Library: Contains only expressed genes, with frequencies reflecting gene expression levels in mature mRNA (excluding introns and intergenic sequences).
Genetic Engineering
Application Example
Synthetic Human Insulin: Initially produced in bacteria, this hormone regulates glucose metabolism. The two insulin subunits are produced as separate fusion polypeptides, purified, and cleaved to release the subunits, which spontaneously unite to form the functional protein.
Transgenic Salmon
Injection of DNA into Salmon Eggs
Utilizes endogenous sockeye salmon growth hormone gene and incorporates enhancer elements responsive to environmental changes.
Outcome: Transgenic salmon grow faster and are markedly larger than wild-type salmon of the same age.
Sequencing Techniques
Overview
Old vs. New Libraries: Traditional sequencing methods like Sanger sequencing face limitations in efficiency, while next-generation sequencing allows for a much higher output and complexity in data.
Whole-Genome Sequencing (WGS)
WGS Overview
Also known as shotgun sequencing or shotgun cloning; it involves:
Cutting genomic DNA into a series of overlapping fragments using restriction enzymes.
Aligning overlapping fragments using computers to assemble chromosomes.
Contig: Created from aligned overlapping fragments, establishing a contiguous sequence across the chromosome.
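The alignment-and-merge idea behind contig building can be sketched with a toy greedy assembler. This is an illustration only: real assemblers use far more sophisticated graph-based methods and must handle sequencing errors, both strands, and repeats. The function names `overlap` and `greedy_assemble`, the `min_len` threshold, and the three reads are all assumptions made for the example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            break  # no overlaps left: the remaining reads stay separate contigs
        merged = reads[i] + reads[j][k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (i, j)]
        reads.append(merged)
    return reads


# Hypothetical overlapping fragments of one short sequence.
reads = ["GTACGTTA", "ATGCGTAC", "GTTAGC"]
contigs = greedy_assemble(reads)
print(contigs)  # ['ATGCGTACGTTAGC']
```

When every fragment overlaps a neighbor, the reads collapse into a single contig; if a gap remains, the function returns more than one contig.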
Performance Metrics
Each flowcell lane in Illumina sequencing can generate 6 billion reads of 150 bp per run, enough to achieve at least 30X coverage of two human genomes.
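Coverage figures like this follow from a simple relationship: mean coverage equals the number of reads times the read length, divided by the genome size. The sketch below applies it to the numbers quoted above. The haploid human genome size of about 3.1 billion bp is an assumption added for the example, as are the function and variable names; the result is a theoretical maximum that ignores duplicates, low-quality reads, and uneven coverage.

```python
def mean_coverage(num_reads, read_length_bp, genome_size_bp):
    """Average number of times each base is sequenced (idealized)."""
    return num_reads * read_length_bp / genome_size_bp


HUMAN_GENOME_BP = 3.1e9   # assumed approximate haploid human genome size
READS_PER_RUN = 6e9       # reads per run, as quoted in the notes
READ_LENGTH_BP = 150      # read length, as quoted in the notes
NUM_GENOMES = 2           # samples sharing the run

total_coverage = mean_coverage(READS_PER_RUN, READ_LENGTH_BP, HUMAN_GENOME_BP)
per_genome = total_coverage / NUM_GENOMES
print(f"Idealized coverage per genome: {per_genome:.0f}X")
```

Under these assumptions the idealized figure sits well above the 30X threshold, which leaves room for the losses that occur in a real run.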
Next-Generation Sequencing Advantages
With whole-genome shotgun approaches, DNA fragments are sequenced without cloning, which streamlines the process of genome assembly significantly.
Bioinformatics Applications
Key Applications
Algorithms for DNA-sequence alignment: Identifies overlapping sequences and reconstructs their order on the chromosome.
Contigs: Continuous fragments created from overlapping sequences.
Repetitive DNA in Mammalian Genomes
Composition
More than 50% of mammalian genomes comprise repetitive DNA elements:
LINEs: Long interspersed elements (~6-7 kb), roughly 500,000 copies in the human genome (17% of the genome).
SINEs: Short interspersed elements (~300 bp), approximately 1.6 million copies in the human genome (13% of the genome).
Sequencing Challenges
Paired-end shotgun sequencing strategy is particularly advantageous for dealing with repetitive DNA, allowing for the construction of scaffolds linking contigs that include repetitive segments.
Library Preparation Overview
Steps in DNA Library Preparation: Include fragmentation, adapter addition, PCR amplification, sequencing, and data analysis, forming an essential workflow for modern genomic studies.
Next-Generation Sequencing Characteristics
Output and Quality Metrics
Run times, maximum output per run, read length, and data quality metrics such as Q30 values are critical for interpreting sequencing results across different instruments.
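The Q30 metric can be made concrete with a short sketch. A Phred quality score Q corresponds to an error probability of 10^(-Q/10), so Q30 means a 1 in 1,000 chance that a base call is wrong. The code assumes the common Phred+33 ASCII encoding used in FASTQ files; the function names and the example quality string are hypothetical.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string (Phred+33 assumed) into integer scores."""
    return [ord(ch) - offset for ch in quality_string]


def error_probability(q):
    """Probability that a base call with Phred score `q` is wrong."""
    return 10 ** (-q / 10)


def fraction_q30(quality_string):
    """Fraction of bases in a read with quality score of at least 30."""
    scores = phred_scores(quality_string)
    return sum(q >= 30 for q in scores) / len(scores)


# Hypothetical read: four bases at Q40 ('I') and four at Q20 ('5').
quality = "IIII5555"
print(phred_scores(quality))    # [40, 40, 40, 40, 20, 20, 20, 20]
print(fraction_q30(quality))    # 0.5
```

Instrument specifications typically report this fraction over all bases in a run rather than for a single read.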
Long-Read Sequencing Technologies
Overview
3rd Generation Sequencing Technologies: Includes methods like PacBio and Oxford Nanopore that utilize different mechanisms for sequencing and allow for comprehensive variant detection, addressing gaps in genome coverage.
De Novo Assembly Versus Reference-Based Assembly
De novo assembly constructs genomes from scratch while reference-based assembly aligns sequences to existing genomes for resequencing, highlighting variations such as single nucleotide polymorphisms (SNPs) and larger scale variations.
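Once a resequenced region has been aligned to the reference, finding SNPs amounts to comparing the two sequences position by position. The sketch below shows only that final comparison step and assumes the sequences are already aligned, equal in length, and free of insertions or deletions; real variant callers work from many reads and model sequencing error. The function name and both sequences are hypothetical.

```python
def find_snps(reference, sample):
    """List single-base differences as (1-based position, ref base, sample base).

    Assumes `sample` is already aligned to `reference` with no indels.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos + 1, ref_base, sample_base)
        for pos, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


reference = "ATGCGTACGT"
sample = "ATGCATACGT"
snps = find_snps(reference, sample)
print(snps)  # [(5, 'G', 'A')]
```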
Homologous Genes
Definition and Importance
Homologous genes are evolutionarily related genes, categorized into:
Orthologs: Genes from different species that descend from a common ancestral gene, typically retaining the same function.
Paralogs: Genes within the same species derived from duplication, often taking on different functions.
Applications of Bioinformatics
Key Applications Include
Comparing DNA sequences, identifying genes, predicting amino acid sequences, and deducing evolutionary relationships.
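As one example of these applications, predicting an amino acid sequence from a coding DNA sequence is a table lookup using the standard genetic code. The sketch below builds that table and translates a single reading frame starting at the first base; it does not search for open reading frames or handle ambiguous bases. The function name `translate` and the input sequence are assumptions made for the example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding-strand DNA sequence until the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCAAGTAA"))  # MAK
```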
Genome Annotation
Importance
Genome annotation relies on bioinformatics for identifying gene-regulatory sequences and functional elements, and is essential for mapping out genes within genomic sequences.
Functional Genomics
Overview
Functional genomics interprets DNA sequence to establish the functions of genes and of the RNAs or proteins they encode, and confirms computational predictions through experimental approaches.
Human Genome Project (HGP)
Overview
The HGP was a landmark project designed to sequence the human genome and identify all of its genes; it was initiated in 1990 and completed in 2003 with a budget of $3 billion.
Key Findings
Discovered that less than 2% of the genome codes for proteins and only about 20,000 protein-coding genes exist, contrary to early predictions.
Revealed the significance of alternative splicing, contributing to the diversity of proteins produced by fewer genes.
Accessing HGP Data
Importance of HGP Contributions
The HGP provided extensive maps for genes implicated in disease conditions and advanced the identification of disease-related genes.
Comparative Genomics
Definition
Comparative genomics compares genomes across different organisms to support gene discovery, infer evolutionary relationships, and develop model organisms. By 2018, more than 23,000 genomes had been sequenced.
Importance of Domesticated Species
Domesticated species serve as important biomedical models for studying genetic diseases. Their genetic structures are often more amenable to GWAS (Genome-Wide Association Studies).
The Neanderthal Genome
Overview
The Neanderthal genome, sequenced from fossil samples, revealed insights into human evolution, including interbreeding events between Neanderthals and modern humans.
Findings
The Neanderthal genome is 99% identical to that of modern humans; the regions that differ point to where the modern human genome evolved rapidly after the two lineages diverged.
Metagenomics and the Human Microbiome Project
Overview
Metagenomics examines the genomes of microbial communities in environmental samples, whereas the Human Microbiome Project analyzed the genomes of microorganisms that inhabit the human body.
Transcriptome Analysis
RNA Sequencing
RNA sequencing enables the in situ analysis of gene expression, providing comprehensive data for understanding expression variability and quantitative measurements of transcript levels.
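One common way to turn raw RNA sequencing read counts into comparable transcript levels is transcripts per million (TPM): each gene's count is divided by its transcript length, and the resulting rates are rescaled to sum to one million. The notes do not name a specific normalization, so TPM is used here as an assumed example; the gene names, counts, and lengths are hypothetical.

```python
def tpm(counts, lengths_bp):
    """Convert read counts to transcripts per million (TPM).

    counts: mapping of gene -> number of mapped reads
    lengths_bp: mapping of gene -> transcript length in base pairs
    """
    rates = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    total = sum(rates.values())
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Hypothetical counts and transcript lengths for three genes.
counts = {"geneA": 100, "geneB": 200, "geneC": 100}
lengths_bp = {"geneA": 1000, "geneB": 2000, "geneC": 500}

expression = tpm(counts, lengths_bp)
for gene, value in expression.items():
    print(gene, round(value))
```

In this example geneB has twice as many reads as geneA but is also twice as long, so the two end up with the same TPM, while the short geneC comes out highest.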
Proteomics Overview
Definition
Proteomics involves the identification and characterization of all proteins, allowing for the comparison of protein profiles in various conditions and the identification of biomarkers for diseases.
Techniques and Technologies
Utilizes techniques such as two-dimensional gel electrophoresis and mass spectrometry for protein analysis, providing insights into protein structure, function, and interaction networks.