Human Genome Project and Genetic Research
Overview of the Human Genome Project
- An international collaborative research program established in 1988 by the NIH, with James Watson as its first director. Francis Collins later led the project.
- Officially began on October 1, 1990, and was completed in 2003, two years ahead of its original schedule and under its estimated budget.
Goals of the Human Genome Project
- Obtain a genetic linkage map of the human genome. This involved mapping the relative locations of genetic markers along chromosomes to help localize disease genes.
- Develop a physical map of the human genome. This aimed to provide landmarks across chromosomes, indicating the precise order and distance between DNA segments.
- Obtain the complete DNA sequence of the human genome. This was the primary goal: to determine the exact order of the roughly 3 billion base pairs that make up human DNA.
- Develop technology for managing human genome information. This included creating databases and computational tools for storing, retrieving, and analyzing vast amounts of genomic data.
- Analyze genomes of model organisms. Studying organisms like E. coli, yeast, fruit fly, and mouse provided insights into gene function conserved across species.
- Address ethical, legal, and social implications of findings. This involved studying the societal impacts of genomics research, including privacy, discrimination, and the responsible use of genetic information.
- Develop technological advances in genetic methodologies. This pushed forward new techniques for DNA sequencing, mapping, and analysis.
Development of Linkage Maps
- Began in the 1980s and focused on locating genes on chromosomes. Linkage mapping estimates the genetic distance between two loci from how often they recombine (their recombination frequency).
- Utilized marker sequences such as:
  - Restriction fragment length polymorphisms (RFLPs), which reflect variation at restriction enzyme sites.
  - Repeat number variations such as Variable Number Tandem Repeats (VNTRs) and Short Tandem Repeats (STRs).
- Linkage mapping established the proximity between markers and disease alleles using:
  - Pedigree analysis: studying inheritance patterns within families to trace diseases.
  - Positional cloning: a strategy to identify disease genes based on their chromosomal location without prior knowledge of their function.
- Early examples of disease-causing genes mapped and sequenced included:
  - Huntington’s disease.
  - Cystic fibrosis (CFTR gene).
  - BRCA1 breast cancer gene.
  - Retinoblastoma.
- By the late 1980s, over 3,500 markers and genes had been mapped to human chromosomes.
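
The link between recombination frequency and genetic distance used in linkage mapping can be illustrated with a short calculation. This is a minimal sketch, not a real mapping pipeline: the function names and the offspring counts are hypothetical, and the rule of thumb that 1% recombination corresponds to about 1 centimorgan (cM) is assumed, which only holds reasonably well for closely linked markers.

```python
def recombination_frequency(recombinant: int, total: int) -> float:
    """Fraction of scored offspring that are recombinant for two loci."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= recombinant <= total:
        raise ValueError("recombinant must be between 0 and total")
    return recombinant / total


def map_distance_cm(rf: float) -> float:
    """Approximate genetic distance in centimorgans (assumes 1% RF ~ 1 cM)."""
    return rf * 100.0


# Hypothetical counts: 12 recombinant offspring out of 200 scored
# for a marker and a disease allele.
rf = recombination_frequency(recombinant=12, total=200)
print(f"RF = {rf:.2f}, distance ~ {map_distance_cm(rf):.1f} cM")
```

A low recombination frequency suggests the marker lies close to the disease allele, which is what made markers useful for narrowing down the chromosomal region to search.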
Genome Sequencing Techniques
Shotgun Sequencing
- A method used extensively in the Human Genome Project, typically in a 'hierarchical' approach that starts with larger clones.
- Also applicable to sequencing novel genomes (de novo sequencing).
- The process:
  - Randomly isolate clones from a genomic or chromosomal library and sequence them.
  - Sequencing initially relied on the Sanger (dideoxy) method, which gives highly accurate individual reads. Because the clones are drawn at random, their sequences overlap.
  - Use computer algorithms to align the overlapping sequences, forming contigs (contiguous sequences).
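
The alignment step described above can be sketched with a toy greedy assembler that repeatedly merges the two reads sharing the longest suffix-to-prefix overlap. This is an illustrative simplification, not the algorithm used by the Human Genome Project: the function names, the `min_len` threshold, and the example reads are assumptions, and real assemblers must also handle sequencing errors, repeats, and reverse-complement reads.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` matching a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Merge the best-overlapping pair of sequences until none overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # no remaining overlaps: these are the contigs
        # Join the pair, writing the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical overlapping reads from a 20-base sequence.
reads = ["TAGCCTAGGA", "ATGCGTACG", "GTACGTTAGCC"]
print(greedy_assemble(reads))  # prints ['ATGCGTACGTTAGCCTAGGA']
```

Reads that share no sufficiently long overlap stay as separate contigs, which mirrors how gaps in coverage leave an assembly in several pieces.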
Shotgun Sequencing Process
- Break DNA into random fragments of approximately 160 kb.
- Clone these fragments using Bacterial Artificial Chromosomes (BACs), which can hold large DNA inserts.
- Fragment the cloned 160 kb segments into smaller 1 kb segments.
- Clone the 1 kb sequences using plasmids, which are easier to work with for sequencing.
- Sequence each 1 kb cloned fragment, identifying overlapping ends.
- Reconstruct the 160 kb fragments by aligning matching ends of the 1 kb fragments, creating a