week 3 human genome project

23 Terms

1

initial plan for HGP

  • Expect to be a 15-year initiative​

  • Gain experience with model organisms first before giving full attention to human genome​

  • In each case, map DNA first and then sequence DNA​

  • Wait to sequence human genome until a new ‘revolutionary’ DNA sequencing method(s) becomes available – replacing Sanger DNA sequencing​

  • Make generating the first sequence of the human genome the signature accomplishment of the HGP​

  • Ensure that this fundamental research benefits all by championing ELSI

2

ELSI

  • acronym for Ethical, Legal, and Social Issues ​

  • includes all non-technical issues that arise when developing emerging science and technologies and implementing them in society ​

  • Term coined in 1988 by James Watson, then director of the HGI at NIH

  • The ELSI Research Program was established in 1990 as part of the Human Genome Project, which formally launched in the US in October 1990

3

ELSI priority areas

  1. Privacy and fairness in the use and interpretation of genetic information

  2. Clinical integration of new genetic technologies ​

  3. Issues surrounding genetics research ​

  4. Public and professional education ​

4

1990 in genomics

  • The scientific community had mixed opinions about the HGP

  • No detailed start-to-finish plan for executing the HGP (i.e., an overt expectation to ‘figure it out along the way’)

  • Genomics was a ‘toddler’ field, growing up as a melting pot of other scientific disciplines

  • Either no internet or painfully early days of a functional internet​

5

HGP organisms

  • Yeast: Used to study fundamental biological processes in a eukaryotic organism

  • Fruit Fly: A long-standing model for understanding genetics, especially related to human biological processes

  • Mouse: The most common animal model due to significant genetic and physiological similarities to humans (around 85% similar protein-coding regions), allowing for the study of human diseases and the discovery of new drugs

  • Zebrafish: A vertebrate model organism with a short life cycle, allowing for the study of processes like heart and blood vessel formation and development of complex structures like eyes and brains

  • Humans​​

6

clone based physical mapping

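Restriction enzymes cut chromosomal DNA into fragments that are cloned; clones that share restriction-fragment patterns overlap and can be ordered into a physical map of the chromosome.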
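
The cutting step can be sketched in a few lines of Python. This is only an illustration: the EcoRI recognition site (GAATTC, cut after the first G) is used as an example enzyme, and the toy sequence and the `digest` helper are hypothetical, not taken from the course material.

```python
def digest(sequence: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut `sequence` at every occurrence of `site`, `cut_offset` bases into the site."""
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])  # whatever is left after the last cut
    return fragments


# Toy clone insert containing two EcoRI sites (hypothetical sequence).
clone = "AAGAATTCCCGGGAATTCTT"
fragments = digest(clone)

# The sorted fragment sizes act as a simple "fingerprint" of the clone.
fingerprint = sorted(len(f) for f in fragments)
print(fragments, fingerprint)
```

Two clones whose fingerprints share many fragment sizes are likely to overlap, which is the kind of evidence used to order clones in a physical map.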

7

contigs

A continuous stretch of DNA sequence assembled from overlapping DNA fragments, such as short reads or cloned DNA segments.

Overlapping pieces are aligned and merged to form a gap-free sequence that represents a larger, contiguous region of the genome.

Contigs are further organized into scaffolds to form a more complete genome assembly.

8

cystic fibrosis gene mapping

The gene responsible for cystic fibrosis (CFTR) was identified by mapping it to chromosome 7.

9

subclone construction

a molecular biology technique to transfer a specific DNA fragment (the "insert") from one plasmid vector to another "destination" vector.

  1. isolating the insert and preparing the destination vector

  2. joining the two DNA fragments through ligation

  3. inserting the resulting recombinant plasmid into bacterial cells

  4. screening for successful subclones.

10

shotgun sequencing

a laboratory technique for determining the DNA sequence of an organism’s genome (or part of the genome).

The method involves randomly breaking up the DNA into small fragments that are then sequenced individually. A computer program looks for overlaps in the DNA sequences and uses them to reassemble the fragments in their correct order, which gives the sequence of the starting DNA.
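
The overlap-and-reassemble idea can be sketched as a toy greedy assembler. This is a minimal illustration under simplifying assumptions (error-free reads, one strand only, no read fully contained in another); real assemblers are far more elaborate. The names `overlap` and `greedy_assemble` and the example reads are hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            return contigs  # nothing left to merge
        merged = contigs[i] + contigs[j][n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)] + [merged]


# Three overlapping reads, given in scrambled order.
reads = ["CGTTAGC", "ATGGCGTA", "CGTACGTT"]
print(greedy_assemble(reads))
```

Repeated sequences can create misleading overlaps, which is one reason repetitive DNA makes shotgun assembly hard.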

11

clone-by-clone sequencing strategy

A map of each chromosome of the genome is made before the DNA is split into fragments. These chunks of DNA are cloned into Bacterial Artificial Chromosomes (BACs) to build libraries, which are grown inside bacterial cells before sequencing.
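
Because the clones are already placed on a map, one can choose a small set of overlapping clones that spans a region before sequencing (often called a minimal tiling path). The sketch below shows one simple way to do this, a greedy interval cover; the clone names, the map positions (in kb), and the function name are all hypothetical.

```python
def minimal_tiling_path(clones, region_start, region_end):
    """Greedy interval cover: at each step pick the clone that starts at or
    before the covered position and extends furthest to the right."""
    path = []
    covered = region_start
    while covered < region_end:
        candidates = [c for c in clones if c[1] <= covered and c[2] > covered]
        if not candidates:
            raise ValueError(f"gap in the map at position {covered}")
        best = max(candidates, key=lambda c: c[2])
        path.append(best[0])
        covered = best[2]
    return path


# (name, start_kb, end_kb) for each mapped BAC clone; made-up values.
clones = [
    ("BAC_A", 0, 150),
    ("BAC_B", 100, 260),
    ("BAC_C", 120, 300),
    ("BAC_D", 250, 420),
    ("BAC_E", 280, 500),
]
path = minimal_tiling_path(clones, 0, 500)
print(path)
```

Each clone on the chosen path would then be shotgun sequenced on its own, and the map tells you where its sequence belongs.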

12

sequence reads

the specific DNA sequence obtained from a single, short piece of DNA that is sequenced during a DNA sequencing experiment

These individual reads are then computationally assembled, like pieces of a puzzle, to reconstruct the original, complete DNA sequence​​

Sequence of bases: A read represents an ordered sequence of DNA's chemical building blocks, known as bases or nucleotides (adenine, guanine, cytosine, and thymine).

Fragment-based: Each read comes from a single, fragmented section of a larger DNA molecule.

13

first eukaryotic genomes sequenced by HGP

First DNA Genome (1977): The DNA genome of bacteriophage ϕX174, a small virus, was the first DNA genome sequenced by Frederick Sanger's team.

First Cellular Genome (1995): The bacterium Haemophilus influenzae was the first complete sequence of a cellular organism.

First Eukaryotic Genome (1996): Saccharomyces cerevisiae, or baker's yeast, was the first eukaryotic genome to be fully sequenced.

First Multicellular Genome (1998): The genomic sequence for the nematode C. elegans was announced in 1998, making it the first multicellular organism to have its genome sequenced.

Fruit Fly Genome (2000): The Drosophila melanogaster (fruit fly) genome was sequenced and published in March 2000.

First Human Genome (2001): The world's first draft of the human genome was completed, a monumental map of human genetic​​

14

HGP donor genome

70% came from one individual with blended ancestry

30% came from 19 individuals, mostly of European ancestry

The result is a mosaic representation rather than any one person's genome

15

challenges of human genome sequencing

  • Human genome: ~3,000,000,000 nucleotides (bases or base pairs)

  • Sanger DNA sequencing circa 1990: ~500-800 bases per read

  • ‘Coverage’ (i.e., number of times each base is read) needed to be high (e.g., >30-fold) to attain high accuracy

  • Roughly half of the human genome consists of repetitive DNA, much of it reflecting remnants of transposable elements (difficult to read)
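
As a rough back-of-the-envelope check, the figures above can be combined using coverage = (number of reads × read length) / genome size. The sketch below uses only the numbers on this card (30-fold taken as the example coverage); the function name `reads_needed` is hypothetical.

```python
import math

GENOME_SIZE = 3_000_000_000  # bases, from this card
COVERAGE = 30                # example fold-coverage, from this card


def reads_needed(genome_size: int, coverage: int, read_length: int) -> int:
    """Number of reads so that reads * read_length = coverage * genome_size."""
    return math.ceil(genome_size * coverage / read_length)


for read_length in (500, 800):  # Sanger read lengths circa 1990
    n = reads_needed(GENOME_SIZE, COVERAGE, read_length)
    print(f"{read_length} bases/read -> {n:,} reads")
```

Under these assumptions the answer is on the order of 100 to 200 million reads, before accounting for the extra difficulty of repetitive DNA.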

16

first human genome sequence

  • 6 countries, 20 centers, 1000s of researchers

  • ~1,000 bases/second, 24 hours/day, & 7 days/week for ~6 years​

  • Brute force using Sanger DNA sequencing and massive computational help

  • Used both clone-by-clone and whole-genome shotgun sequencing
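
To get a feel for the scale of the throughput figure above, the arithmetic can be written out. This is only a rough sketch: it assumes 365-day years, takes "~6 years" as exactly 6, and uses the round 3-billion-base genome size quoted elsewhere in this set.

```python
BASES_PER_SECOND = 1_000
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365          # assumption: ignore leap years
YEARS = 6                    # "~6 years" taken as exactly 6
GENOME_SIZE = 3_000_000_000  # round figure for the human genome

total_bases = BASES_PER_SECOND * SECONDS_PER_DAY * DAYS_PER_YEAR * YEARS
fold_of_genome = total_bases / GENOME_SIZE

print(f"total bases read: {total_bases:,}")
print(f"equivalent to about {fold_of_genome:.0f}x the genome size")
```

With these round numbers the raw output is roughly 190 billion bases, many times the size of the genome itself, which fits the "brute force" description.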

17

James Watson and Francis Collins

HGP: clone-by-clone shotgun sequencing

Collins had a leading role in the HGP. He took a strategic, mapping-based approach, having already succeeded in cloning some major disease genes (CF, DMD) as a determined and elegant research scientist.

18

Celera Genomics (Craig Venter)

whole genome shotgun sequencing

Initially Venter worked within the HGP consortium, but he clashed with them over strategy and personality. There is no doubt his more high-tech and faster approach raised the game and pioneered some approaches that led to more rapid progress.

19

2022

a truly complete (‘telomere-to-telomere’) human genome sequence was finally generated

20

Bermuda Principles for Data Sharing

  • Significant attention to release and sharing of HGP genome sequence data​

  • Two seminal meetings in Bermuda in 1996 and 1997

  • Landmark agreement for rapid data release and public access to HGP genome sequence data

  • Became known as the ‘Bermuda Principles’

  • Among the most important legacies of the HGP

21

HGP output

  • Declared complete April 2003​

  • 3 billion base pairs (bp) (3164.7 million precisely)​

  • 1.1% exons, 24% introns, 75% intergenic DNA​

  • Approx. 3 million single nucleotide polymorphisms (SNPs)

  • Less than 1% of all SNPs cause changes in proteins

  • Approximately 20K genes​

  • The average gene consists of 3000 bases, but sizes vary greatly (Dystrophin gene is 2.4 million bases).​

  • Almost all (99.9%) bases are exactly the same in all people​

  • The functions are unknown for over 50% of discovered genes​

  • Chromosome 1 had the most genes (2968, now 4220), and the Y the fewest (231, now 693)​

  • Challenges: What We Still Don’t Know
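
The percentages in this card can be turned into approximate base counts. The sketch below uses only the figures listed above; note that the three composition percentages as given sum to 100.1%, presumably from rounding, and the variable names are hypothetical.

```python
GENOME_BP = 3_164_700_000  # 3164.7 million base pairs, from this card

composition_pct = {"exons": 1.1, "introns": 24, "intergenic": 75}

# Approximate number of bases in each category.
bases = {name: round(GENOME_BP * pct / 100) for name, pct in composition_pct.items()}

# 99.9% of bases are the same in all people, so about 0.1% differ.
differing_bases = round(GENOME_BP * 0.1 / 100)

for name, count in bases.items():
    print(f"{name:>10}: {count:,} bp")
print(f"bases that differ between people (0.1%): {differing_bases:,}")
```

The 0.1% figure works out to about 3.2 million bases, the same order of magnitude as the roughly 3 million SNPs listed above.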

22

reference genome

a digital nucleic acid sequence database, assembled by scientists as a representative example of the set of genes in one idealized individual organism of a species.

Reference genomes are assembled from the sequencing of DNA from a number of individual donors, so they do not reflect any single individual at the genetic level.

Instead, a reference provides a haploid mosaic of different DNA sequences from multiple donors.

23

human pangenome

  • new human pangenome (2023)​

  • The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals​

  • Captures known variants and haplotypes and reveals new alleles at structurally complex loci​

  • Adds 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference, GRCh38

This is the first draft of the human pangenome reference.

The assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels.

The draft pangenome is built from alignments of these assemblies.