Human Genome Project and Genetic Research

Overview of the Human Genome Project

  • An international collaborative research program established in 1988 by the NIH, with James Watson as its first director; Francis Collins later led the project.
  • Officially began on October 1, 1990, and was completed in 2003, two years ahead of its original 15-year schedule and under the estimated budget.

Goals of the Human Genome Project

  1. Obtain a genetic linkage map of the human genome. This involved mapping the relative locations of genetic markers along chromosomes to help localize disease genes.
  2. Develop a physical map of the human genome. This aimed to provide landmarks across chromosomes, indicating the precise order and distance between DNA segments.
  3. Obtain the complete DNA sequence of the human genome. This was the primary goal: to determine the exact order of the approximately 3 billion base pairs that make up human DNA.
  4. Develop technology for managing human genome information. This included creating databases and computational tools for storing, retrieving, and analyzing vast amounts of genomic data.
  5. Analyze genomes of model organisms. Studying organisms like E. coli, yeast, fruit fly, and mouse provided insights into gene function conserved across species.
  6. Address ethical, legal, and social implications of findings. This involved studying the societal impacts of genomics research, including privacy, discrimination, and the responsible use of genetic information.
  7. Develop technological advances in genetic methodologies. This pushed forward new techniques for DNA sequencing, mapping, and analysis.

Development of Linkage Maps

  • Began in the 1980s, focusing on gene location on chromosomes through genome mapping, which establishes genetic distances based on recombination frequencies.
  • Utilized marker sequences such as:
    • Restriction fragment length polymorphisms (RFLPs), which arise from variation in restriction enzyme sites.
    • Repeat number variations like Variable Number Tandem Repeats (VNTRs) and Short Tandem Repeats (STRs).
  • Linkage mapping established the proximity between markers and disease alleles using:
    • Pedigree analysis: studying inheritance patterns within families to trace diseases.
    • Positional cloning: a strategy to identify disease genes based on their chromosomal location without prior knowledge of their function.
  • Early examples of disease-causing genes mapped and sequenced included:
    • Huntington’s disease.
    • Cystic fibrosis (CFTR gene).
    • BRCA1 breast cancer gene.
    • Retinoblastoma.
  • By the late 1980s, over 3,500 markers and genes had been mapped to human chromosomes.
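
The relationship between recombination frequency and genetic distance described above can be illustrated with a minimal Python sketch. The function names and the example counts are hypothetical. The Haldane mapping function and the phase-known LOD score are standard textbook tools that these notes do not name, so treat them as one possible way to do the calculation rather than the method used in any particular study.

```python
import math


def recombination_frequency(recombinants: int, total: int) -> float:
    """Fraction of informative meioses in which marker and trait recombined."""
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need 0 <= recombinants <= total and total > 0")
    return recombinants / total


def haldane_distance_cm(r: float) -> float:
    """Map distance in centimorgans from the Haldane function (assumes no interference)."""
    if not 0 <= r < 0.5:
        raise ValueError("r must satisfy 0 <= r < 0.5")
    return -50.0 * math.log(1.0 - 2.0 * r)


def lod_score(recombinants: int, total: int, theta: float) -> float:
    """log10 odds of linkage at recombination fraction theta versus no linkage (0.5).

    Assumes phase-known meioses, which is a simplification of real pedigree data.
    """
    if not 0 < theta <= 0.5:
        raise ValueError("theta must satisfy 0 < theta <= 0.5")
    non_recombinants = total - recombinants
    return (recombinants * math.log10(theta)
            + non_recombinants * math.log10(1.0 - theta)
            - total * math.log10(0.5))


# Hypothetical data: 10 recombinants observed among 100 informative meioses.
r = recombination_frequency(10, 100)
print(f"recombination frequency: {r:.2f}")
print(f"Haldane distance: {haldane_distance_cm(r):.1f} cM")
print(f"LOD at theta = {r:.2f}: {lod_score(10, 100, r):.1f}")
```

A LOD score of 3 or more is conventionally taken as evidence of linkage. For small recombination frequencies the map distance in centimorgans is close to the frequency expressed as a percentage.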

Genome Sequencing Techniques

Shotgun Sequencing

  • A method extensively used in the early human genome sequencing project, often in a 'hierarchical' approach starting with larger clones.
  • Also applicable to de novo sequencing of genomes that have not been sequenced before.
  • The process:
    1. Randomly isolate clones from a genomic or chromosomal library and sequence them.
    2. Reads were initially obtained by Sanger (dideoxy) sequencing, which provides high accuracy for individual reads; because clones are sampled at random, the reads overlap.
    3. Use computer algorithms to align overlapping sequences, forming contigs (contiguous sequences).
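
The alignment step can be sketched as a simple greedy overlap merge in Python. This is only an illustration: the function names, the reads, and the minimum overlap of 3 bases are all hypothetical, and the sketch assumes error-free reads from a single strand. Real assemblers also deal with sequencing errors, reverse complements, and repeats.

```python
def overlap_length(left: str, right: str, min_overlap: int) -> int:
    """Longest suffix of `left` that equals a prefix of `right`, or 0 if too short."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads: list[str], min_overlap: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        # Discard any sequence fully contained in another one.
        contigs = [c for c in contigs
                   if not any(c != other and c in other for other in contigs)]
        best_size, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i == j:
                    continue
                size = overlap_length(a, b, min_overlap)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical reads: three overlap end to end, one has no overlap with the rest.
reads = ["ATGCGT", "CGTACC", "ACCGGA", "TTTTTT"]
print(greedy_assemble(reads))
```

With these reads the three overlapping fragments collapse into one contig and the unrelated read remains on its own, which is how gaps between contigs show up in practice.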

Hierarchical Shotgun Sequencing Process

  1. Break DNA into random fragments of approximately 160 kb.
  2. Clone these fragments using Bacterial Artificial Chromosomes (BACs), which can hold large DNA inserts.
  3. Fragment the cloned 160 kb segments into smaller 1 kb segments.
  4. Clone the 1 kb sequences using plasmids, which are easier to work with for sequencing.
  5. Sequence each 1 kb cloned fragment, identifying overlapping ends.
  6. Reconstruct the 160 kb fragments by aligning matching ends of the 1 kb fragments, creating a