Genomics Fundamentals

27.1 Mapping the Human Genome

  • Genomics: The study of whole sets of genes and their functions.
  • Human genome consists of approximately 3 billion base pairs.
  • Average chromosome contains 130,400,000 base pairs (bp).
  • Human genome contains approximately 20,500 genes.
  • Average chromosome has about 890 genes.
  • Genetic Map: A physical representation of landmarks in a genome and their relative positions.
  • Initial genetic disease studies involved identifying landmarks co-inherited with the disease gene, providing information on chromosome location.
  • Early sequencing experiments could only process about 300 base pairs.
  • The Human Genome Project (HGP) was initiated in 1990 by NIH, involving 20 groups at not-for-profit institutes and universities.
  • In 1998, Celera Genomics, a commercial biotechnology company, began a separate effort to sequence the human genome.
  • Two different strategies were used:
    • The Human Genome Project created a series of maps of finer and finer resolution.
    • Celera fragmented DNA and then relied on instrumental and computer-driven techniques to establish the sequence.
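
The per-chromosome averages quoted above follow from simple division. The sketch below reproduces them, assuming the averages are taken over the 23 distinct chromosomes of one haploid set (the notes do not state the divisor, so `CHROMOSOME_COUNT` is an assumption).

```python
# Figures quoted in the notes above.
GENOME_BP = 3_000_000_000   # approximate size of the human genome
GENE_COUNT = 20_500         # approximate number of human genes

# Assumption (not stated in the notes): the averages are taken over the
# 23 distinct chromosomes of one haploid set.
CHROMOSOME_COUNT = 23


def per_chromosome_average(total: int, chromosomes: int = CHROMOSOME_COUNT) -> float:
    """Return the mean amount of `total` carried by one chromosome."""
    return total / chromosomes


avg_bp = per_chromosome_average(GENOME_BP)
avg_genes = per_chromosome_average(GENE_COUNT)

print(f"average base pairs per chromosome: {avg_bp:,.0f}")
print(f"average genes per chromosome: {avg_genes:.0f}")
```

With these inputs the averages come out near 130.4 million base pairs and 891 genes per chromosome, in line with the rounded figures in the notes.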

Human Genome Project Strategy

  • A genetic map was generated, displaying the physical location of inherited, identifiable DNA sequences (markers).
  • The physical map refined the distance between markers to approximately 100,000 base pairs.
  • To achieve finer resolution, a chromosome was cut into large segments, and multiple copies (clones) of the segments were produced.
  • Overlapping clones, covering the entire chromosome length, were arranged to create the next level of map.
  • Each clone was then cut into 500 base-pair fragments, and the sequence of bases in each fragment was determined.
  • Finally, all 500 base-pair sequences were assembled into a completed nucleotide map of the chromosome.
  • Genetic map resolution: 1 Mb.
  • Physical map resolution: 100 kb.
  • Chromosome 21 contains 37 Mb.
  • The final step involves assembling the overlapping clone sequences into the entire genomic sequence. For example, chromosome 21 accounts for 37 Mb of the total 3.2 × 10^9 nucleotides.
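
To get a feel for the scale of each mapping level, the sketch below counts how many intervals of each resolution are needed to span chromosome 21. It uses only the numbers quoted above; the helper name `intervals` is hypothetical, and the counts are lower bounds because real clones and fragments overlap.

```python
GENOME_NT = 3.2e9          # total nucleotides, as quoted above
CHR21_BP = 37_000_000      # chromosome 21 (37 Mb)

# Resolution of each mapping level, in base pairs.
LEVELS = {
    "genetic map": 1_000_000,     # 1 Mb
    "physical map": 100_000,      # 100 kb
    "sequenced fragment": 500,    # 500 bp
}


def intervals(length_bp: int, resolution_bp: int) -> int:
    """Minimum number of non-overlapping intervals needed to span length_bp."""
    return -(-length_bp // resolution_bp)  # ceiling division


for name, resolution in LEVELS.items():
    print(f"{name}: {intervals(CHR21_BP, resolution):,} intervals")

print(f"chromosome 21 share of genome: {CHR21_BP / GENOME_NT:.1%}")
```

Chromosome 21 spans 37 genetic-map intervals, 370 physical-map intervals, and at least 74,000 fragments of 500 bp, yet it is only about 1.2% of the genome.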

Celera Genomics Project Strategy

  • Celera used a "shotgun approach," breaking the human genome into fragments without identifying the origin of each fragment.
  • Fragments were copied to generate many clones, then cut into 500-base-long pieces and modified with fluorescently labeled bases for sequencing using high-speed machines.
  • The sequences were reassembled by identifying overlapping ends, a task facilitated by the world’s largest nongovernmental supercomputing center.
  • By 2001, 90% of the human genome sequence had been mapped, after 15 months of work rather than the originally anticipated four years.
  • By October 2004, 99% of the genome was sequenced and declared to be 99.999% accurate.
  • The mapped sequence correctly identifies almost all known genes, allowing researchers to rely on highly accurate sequence information.
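
The idea of reassembling a sequence from overlapping fragment ends can be illustrated with a toy greedy assembler. This is a minimal sketch, not Celera's actual software: the function names, the `min_len` threshold, and the greedy merging rule are all assumptions chosen for clarity, and real assemblers must also handle sequencing errors, repeats, and both strands.

```python
def overlap(left: str, right: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of fragments with the longest overlap."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # no remaining overlaps; leave the contigs separate
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Three made-up fragments, supplied in scrambled order.
reads = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]
print(greedy_assemble(reads))  # ['ATGCGTACGTTAGC']
```

The fragments carry no information about where they came from; the order is recovered purely from the shared ends, which is why the approach depends so heavily on computing power.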

27.2 DNA and Chromosomes

  • Understanding DNA structure provides insight into the biotechnology revolution caused by the HGP.

Telomeres and Centromeres

  • Telomeres are specialized DNA regions at both ends of every chromosome.
  • Each telomere consists of a long, noncoding series of repeating nucleotide sequences, (TTAGGG)n.
  • Telomeres act as protective "endcaps," preventing changes to the DNA coding sequences.
  • Telomeres prevent DNA ends from fusing to other chromosomes or DNA fragments.
  • New cells start with a long stretch of telomeric DNA.
  • Telomeres shorten with each cell division as some of the repeating sequence is lost.
  • A very short telomere is associated with senescence.
  • Continued shortening leads to DNA instability and cell death.
  • Telomerase increases telomere length in DNA and is active during embryonic development and in germ cells of adults.
  • Telomere shortening is speculated to play a role in aging.
  • Experiments with mice lacking telomerase activity show premature aging, and embryos do not survive.
  • Most cancer cells contain active telomerase, conferring immortality to tumor cells.
  • Research suggests that genes regulating telomerase expression are altered in cancer cells, and experiments are ongoing to study the effects of telomerase inactivation on cancer cells.
  • As the DNA in each chromosome duplicates for cell division, the two copies remain joined at the centromere, a constricted point in the middle of the chromosome.
  • The duplicated chromosomes bound together at the centromere are called sister chromatids.
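
The (TTAGGG)n structure and its gradual loss can be sketched with plain string operations. This is an illustrative model only: the function names are hypothetical, and the number of repeats lost per division is a free parameter, since the notes give no figure for it.

```python
TELOMERE_REPEAT = "TTAGGG"


def count_terminal_repeats(sequence: str, repeat: str = TELOMERE_REPEAT) -> int:
    """Count consecutive copies of `repeat` at the 3' end of `sequence`."""
    count = 0
    end = len(sequence)
    while end >= len(repeat) and sequence[end - len(repeat):end] == repeat:
        count += 1
        end -= len(repeat)
    return count


def shorten(sequence: str, repeats_lost: int, repeat: str = TELOMERE_REPEAT) -> str:
    """Remove up to `repeats_lost` terminal repeats, never touching other DNA."""
    removable = min(repeats_lost, count_terminal_repeats(sequence, repeat))
    return sequence[: len(sequence) - removable * len(repeat)]


# A made-up chromosome end: a short coding stretch capped by five repeats.
chromosome_end = "ATGGCC" + TELOMERE_REPEAT * 5
after_division = shorten(chromosome_end, repeats_lost=2)  # hypothetical loss

print(count_terminal_repeats(chromosome_end))  # 5
print(count_terminal_repeats(after_division))  # 3
```

In this model the coding stretch is untouched until the repeats run out, which mirrors the protective endcap role described above.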

Noncoding DNA

  • Only about 1.5% of the genome codes for proteins.
  • Noncoding promoter sequences are regulatory regions that determine which genes are turned on.
  • Only genes needed by a cell are activated in that cell.
  • Out of 20,500 genes, only about 2000 are expressed in a given cell.
  • Noncoding DNA segments may be needed for DNA folding within the nucleus or may play a role in evolution.
  • Some scientists believe these segments are functional but their functions are not yet understood.
  • Debate continues over the role of noncoding DNA.

Genes

  • The nucleotides of a gene are not consecutive; coding segments (exons) alternate with noncoding segments (introns).
  • Chromosome 22 was the first to have all its nonrepetitive DNA sequenced and mapped, containing 49 million bases with 693 genes, averaging 8 exons and 7 introns per gene.
  • Chromosome 22 carries genes associated with the immune system, congenital heart disease, schizophrenia, leukemia, cancers, and other genetically related conditions.
  • The map revealed several hundred previously unknown genes.
  • With the signal (exon) to noise (intron) ratio being so low, it will be challenging to completely identify all the coding sequences present.
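
The alternation of exons and introns can be pictured as a list of coordinate ranges on the gene. The sketch below joins the exon ranges and lists the gaps between them; the sequence, the coordinates, and the function names are made up for illustration (lowercase marks the intron bases).

```python
def splice_exons(gene: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon segments of `gene`; each exon is a (start, end) slice."""
    return "".join(gene[start:end] for start, end in exons)


def introns_between(exons: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Return the (start, end) gaps that lie between consecutive exons."""
    return [(prev_end, next_start)
            for (_, prev_end), (next_start, _) in zip(exons, exons[1:])]


# Toy gene: three exons (uppercase) separated by two introns (lowercase).
gene = "ATGGCC" + "gtaagt" + "TTTAAA" + "gtcag" + "GGGTAA"
exons = [(0, 6), (12, 18), (23, 29)]

print(splice_exons(gene, exons))   # ATGGCCTTTAAAGGGTAA
print(introns_between(exons))      # [(6, 12), (18, 23)]
```

A gene with n exons has n - 1 introns between them, which matches the chromosome 22 average of 8 exons and 7 introns per gene.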

27.3 Mutations and Polymorphisms

  • Mutation: An error in base sequence that is carried along during DNA replication.
  • The term commonly refers to variations in DNA sequence found in a very small number of individuals of a species.
  • Example: an error in nucleic acid composition that occurs once in 3–4 million lobsters is responsible for the striking color of the rare lobsters that carry it.

Types of Mutations

  • Point mutations: A single base change.
    • Silent: A change that specifies the same amino acid (e.g., GUU → GUC gives Val → Val).
    • Missense: A change that specifies a different amino acid (e.g., GUU → GCU gives Val → Ala).
    • Nonsense: A change that produces a stop codon (e.g., CGA → UGA gives Arg → Stop).
  • Frameshift mutations: The number of inserted or deleted bases is not a multiple of 3, so that all triplets following the mutation are read differently.
    • Insertion: Addition of one or more bases.
    • Deletion: Loss of one or more bases.
  • Some mutations result from spontaneous, random events; the error rate of replication is about 1 error per billion bases copied.
  • Others are induced by exposure to a mutagen, such as a virus, a chemical, or ionizing radiation.
  • Polymorphisms: Variations in the nucleotide sequence of DNA that are common within a given population.
  • Most polymorphisms are simply differences in the DNA sequence between individuals due to geographical and ethnic differences and are part of the biodiversity exhibited by life on Earth.
  • The vast majority of polymorphisms seen have neither advantageous nor deleterious effects; some have been shown to give rise to various disease states.
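
The mutation types listed above can be checked against the standard genetic code. The sketch below classifies a one-base codon change as silent, missense, or nonsense, and shows how a single-base deletion shifts the reading frame. The function names are hypothetical; the codon table is the standard one, built compactly from the usual U, C, A, G ordering.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...; "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_point_mutation(original: str, mutated: str) -> str:
    """Classify a one-base codon change as silent, missense, or nonsense."""
    differences = sum(a != b for a, b in zip(original, mutated))
    if len(original) != 3 or len(mutated) != 3 or differences != 1:
        raise ValueError("expected two codons differing at exactly one base")
    before, after = CODON_TABLE[original], CODON_TABLE[mutated]
    if before == after:
        return "silent"
    if after == "*":
        return "nonsense"
    return "missense"


def codons(mrna: str) -> list[str]:
    """Split an mRNA into complete triplets, reading from the first base."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]


# The three examples from the list above.
print(classify_point_mutation("GUU", "GUC"))  # silent
print(classify_point_mutation("GUU", "GCU"))  # missense
print(classify_point_mutation("CGA", "UGA"))  # nonsense

# Frameshift: deleting one base changes every triplet after the deletion.
mrna = "AUGGUUCGA"
shifted = mrna[:3] + mrna[4:]   # delete the fourth base
print(codons(mrna))      # ['AUG', 'GUU', 'CGA']
print(codons(shifted))   # ['AUG', 'UUC']
```

The deletion example leaves the first codon intact but turns GUU (Val) into UUC (Phe) and leaves an incomplete final triplet, which is why frameshifts are usually more disruptive than single-base substitutions.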

Common Hereditary Diseases

  • Phenylketonuria (PKU): Brain damage in infants caused by the defective enzyme phenylalanine hydroxylase (1 in 40,000).
  • Albinism: Absence of skin pigment caused by the defective enzyme tyrosinase (1 in 20,000).
  • Tay-Sachs disease: Mental retardation caused by a defect in production of the enzyme hexosaminidase A (1 in 6000 Ashkenazi Jews; 1 in 100,000 general population).
  • Cystic fibrosis: Bronchopulmonary, liver, and pancreatic obstructions by thickened mucus; defective gene and protein identified (1 in 3000).
  • Sickle-cell anemia: Anemia and obstruction of blood flow caused by a defect in hemoglobin (1 in 185 African Americans).
  • Other diseases:
    • Neurofibromatosis, Type 2
    • Muscular Dystrophy
    • Hemophilia (blood fails to clot)
    • Gaucher's Disease
    • Down Syndrome
    • Amyotrophic Lateral Sclerosis
    • ADA Deficiency
    • Familial Hypercholesterolemia
    • Myotonic Dystrophy
    • Amyloidosis
    • Breast Cancer
    • Polycystic Kidney Disease
    • Tay-Sachs Disease
    • Alzheimer's Disease
    • Retinoblastoma
    • Familial Colon Cancer
    • Retinitis Pigmentosa
    • Huntington's Disease
    • Familial Polyposis of the Colon
    • Spinocerebellar Ataxia
    • Cystic Fibrosis
    • Malignant Melanoma
    • Multiple Endocrine Neoplasia, Type 2
    • Sickle-Cell Anemia

Single-Nucleotide Polymorphism (SNP) and Disease

  • The replacement of one nucleotide by another in the same location along the DNA sequence is a single-nucleotide polymorphism.
  • The biological effects of SNPs range from negligible to normal variations, such as those in eye or hair color, to genetic diseases.
  • SNPs are the most common source of variations between individual human beings.
  • In addition to producing a change in the identity of an amino acid, a SNP might specify the same amino acid (for example, changing GUU to GUC, both of which code for valine), or it might terminate protein synthesis by introducing a stop codon.
  • Industrial and academic scientists are compiling a catalog of SNPs.
  • Their frequency is roughly one SNP for about every 300 nucleotides, with many of them in coding regions.
  • Knowing their exact locations may one day help doctors to predict an individual’s risk of developing a disease.
  • The SNP catalog has been used to locate SNPs responsible for 30 abnormal conditions, including total color blindness, one type of epilepsy, and susceptibility to the development of breast cancer.
  • As of June 2015, the SNP catalog maintained by the National Human Genome Research Institute contained over 147 million SNP entries.
  • The cataloging of SNPs has ushered in the era of genetic medicine.
  • The SNP catalog may allow physicians to predict for an individual the potential age at which inherited diseases will become active, their severity, and their reactions to various types of treatment.
  • The therapeutic course will be designed to meet the distinctive genomic profile of the person.
  • Consumer genetic tests: Ancestry.com reports ancestry ("Where are you from?"); 23andMe reports ancestry and also tests for some diseases.
  • Such tests are incomplete, because dozens of SNPs or mutations are known for some genetic diseases.
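
Locating SNPs amounts to comparing sequences position by position. The sketch below lists the single-base differences between a reference and a sample, assuming the two sequences are already aligned and of equal length. The function name and the example sequences are hypothetical, and positions are 0-based.

```python
def find_snps(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """List (position, reference_base, sample_base) for each mismatch.

    Both sequences must already be aligned and of equal length; insertions
    and deletions are outside the scope of this sketch.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    return [(pos, ref, alt)
            for pos, (ref, alt) in enumerate(zip(reference, sample))
            if ref != alt]


# Made-up sequences that differ at one position (T -> C at index 5).
print(find_snps("ATGGTTCGA", "ATGGTCCGA"))  # [(5, 'T', 'C')]

# Rough spacing estimate from the frequency quoted above:
# one SNP per about 300 nucleotides across roughly 3 billion base pairs.
GENOME_BP = 3_000_000_000
expected_snps = GENOME_BP // 300
print(f"{expected_snps:,}")  # 10,000,000
```

The spacing estimate is only an order-of-magnitude figure for one genome; a catalog pooled across many individuals can hold far more entries.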

Worked Example 27.1

  • The severity of a mutation in a DNA sequence that changes a single amino acid in a protein depends on the type of amino acid replaced and the nature of the new amino acid.
  • Exchange of an amino acid with a small nonpolar side chain for another with the same type of side chain (e.g., glycine for alanine) or exchange of amino acids with very similar side chains (e.g., serine for threonine) might have little effect.
  • Conversion of an amino acid with a nonpolar side chain to one with a polar, acidic, or basic side chain could have a major effect because the side-chain interactions that affect protein folding may change.
  • Some examples of this type include exchanging threonine, glutamate, or lysine for isoleucine.
  • In hemoglobin, a single replacement of glutamic acid by valine leads to sickle-cell anemia.
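
The reasoning in this worked example can be written as a lookup on side-chain class. The sketch below is a first-pass heuristic only: the four-way grouping in `SIDE_CHAIN_CLASS` is a simplified assumption (textbooks differ on borderline residues), the function name is hypothetical, and real effects also depend on where the residue sits in the folded protein.

```python
# Simplified side-chain classes (three-letter codes). This grouping is an
# assumption for illustration; textbooks differ on borderline residues.
SIDE_CHAIN_CLASS = {
    **dict.fromkeys(["Gly", "Ala", "Val", "Leu", "Ile", "Pro", "Phe", "Met", "Trp"], "nonpolar"),
    **dict.fromkeys(["Ser", "Thr", "Cys", "Tyr", "Asn", "Gln"], "polar"),
    **dict.fromkeys(["Asp", "Glu"], "acidic"),
    **dict.fromkeys(["Lys", "Arg", "His"], "basic"),
}


def substitution_effect(original: str, replacement: str) -> str:
    """Give a first-pass guess at how disruptive a substitution might be."""
    if original == replacement:
        return "no change"
    if SIDE_CHAIN_CLASS[original] == SIDE_CHAIN_CLASS[replacement]:
        return "likely minor (same side-chain class)"
    return "potentially major (side-chain class changes)"


# Substitutions mentioned in the worked example (original, replacement).
for pair in [("Ala", "Gly"), ("Thr", "Ser"), ("Ile", "Thr"),
             ("Ile", "Glu"), ("Ile", "Lys"), ("Glu", "Val")]:
    print(pair, "->", substitution_effect(*pair))
```

The last pair is the sickle-cell substitution: an acidic side chain is replaced by a nonpolar one, so the heuristic flags it as potentially major.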

27.4 Recombinant DNA

  • Recombinant DNA: DNA that contains two or more DNA segments not found together in nature.
  • Recombinant DNA technology predates the Human Genome Project.
  • Progress in all aspects of genomics has built upon information gained in the application of recombinant DNA.
  • Using recombinant DNA technology, it is possible to cut a gene out of one organism and splice it into (recombine it with) the DNA of a second organism.
  • Bacteria provide excellent hosts for recombinant DNA.
  • Bacteria often have one large circular DNA molecule that carries nearly all of their genes.
  • Bacterial cells contain part of their DNA in small circular pieces called plasmids, each of which carries just a few genes.
    • Can be passed from one bacterium to another.
    • Many carry antibiotic resistance genes.
  • Plasmids are extremely easy to isolate, several copies of each plasmid may be present in a cell, and each plasmid replicates through the normal base-pairing pathway. Plasmids from the bacterium Escherichia coli serve as hosts for recombinant DNA.
  • The ease of isolating and manipulating plasmids plus the rapid replication of bacteria create ideal conditions for production of recombinant DNA and the proteins whose synthesis it directs.
  • The plasmid is cut open with a restriction endonuclease or restriction enzyme, which recognizes a specific sequence.
  • The restriction enzyme makes its cut at the same spot in the sequence of both strands of the double-stranded DNA when read in the same 5’ to 3’ direction.
  • This results in unpaired bases, known as sticky ends because they are available to match up with complementary base sequences.
  • Consider a gene fragment that has been cut from human DNA and is to be inserted into a plasmid.
    • The first step is cutting the gene and plasmid with the same restriction enzyme.
    • The next step is re-forming their phosphodiester bonds with ligase.
  • The altered plasmid is inserted back into a bacterial cell where the normal processes of transcription and translation synthesize the protein encoded by the inserted gene.
  • Bacteria multiply rapidly; there are soon a large number of them, all containing the recombinant DNA and all manufacturing the protein encoded by the recombinant DNA.
  • There are some technical hurdles that have to be overcome before a protein manufactured in this way can be used commercially. They include the following:
    • The recombinant plasmid must be inserted into a bacterium.
    • Host organisms may modify the protein; glycosylation is one such modification.
    • The protein of interest must be isolated from endotoxins—potentially toxic natural compounds found inside the host organism.
  • Despite the obstacles, proteins manufactured in this manner have already reached the marketplace, including human insulin, human growth hormone, and blood clotting factors for hemophiliacs.
  • A major advantage of this technology is that large amounts of these proteins can be made, thus allowing their practical therapeutic use.
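
The cut-and-ligate steps described above can be sketched on strings. The notes do not name a particular restriction enzyme, so EcoRI (recognition sequence GAATTC, cut after the G) is used here as an assumed example. Only the top strand is modeled, the circular plasmid is represented as a linear string, and all sequences and function names are made up for illustration.

```python
SITE = "GAATTC"   # EcoRI recognition sequence (used here as an example enzyme)
CUT_OFFSET = 1    # EcoRI cuts after the G: G^AATTC

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def digest(top_strand: str, site: str = SITE, offset: int = CUT_OFFSET) -> list[str]:
    """Cut a linear top strand at every occurrence of `site`."""
    pieces, start = [], 0
    pos = top_strand.find(site)
    while pos != -1:
        pieces.append(top_strand[start:pos + offset])
        start = pos + offset
        pos = top_strand.find(site, pos + 1)
    pieces.append(top_strand[start:])
    return pieces


def ligate(pieces: list[str]) -> str:
    """Rejoin pieces end to end, as ligase re-forms the backbone bonds."""
    return "".join(pieces)


# The site reads the same on both strands in the 5' to 3' direction.
print(reverse_complement(SITE) == SITE)  # True

plasmid = "TTACGAATTCGGCA"               # toy plasmid with one site
human = "AAGAATTCATGGCCGAATTCTT"         # toy gene flanked by two sites

left, right = digest(plasmid)
insert = digest(human)[1]                # the middle piece carries the gene
recombinant = ligate([left, insert, right])
print(recombinant)  # TTACGAATTCATGGCCGAATTCGGCA
```

Because the gene and the plasmid were cut with the same enzyme, their ends match, and each junction in the recombinant regenerates a complete GAATTC site.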

27.5 Genomics: Using What We Know

Genetically Modified Plants and Animals

  • The development of new varieties of plants and animals has been proceeding for centuries as the result of natural accidents and occasional success in the hybridization of known varieties.
  • The mapping and study of plant and animal genomes can greatly accelerate our ability to generate crop plants and farm animals with desirable characteristics and lacking undesirable ones.
  • Some genetically modified crops are planted in large quantities in the United States.
    • Each year, millions of tons of corn are destroyed by the European corn borer. To solve this problem, a bacterial gene (from Bacillus thuringiensis, Bt) has been transplanted into corn. The gene causes the corn to produce a toxin that kills the caterpillars.
  • Tests are under way with genetically modified coffee beans that are caffeine-free, potatoes that absorb less fat when they are fried, and “Golden Rice,” a yellow rice that provides the vitamin A desperately needed in poor populations where insufficient vitamin A causes death and blindness.
  • Will genetically modified plants and animals intermingle with natural varieties and cause harm to them?
  • Should food labels state whether the food contains genetically modified ingredients?
  • Might unrecognized harmful substances enter the food supply?
  • These hotly debated questions have led to the establishment of the Non-GMO Project, the goal of which is to offer consumers a non-GMO choice for organic and natural products.

Gene Therapy

  • Gene therapy is based on the premise that a disease-causing gene can be corrected or replaced by inserting a functional, healthy gene.
  • The most clear-cut expectations for gene therapy lie in treating monogenic diseases, those that result from defects of a single gene.
  • The focus has been on using nonpathogenic viruses as vectors, the agents that deliver therapeutic quantities of DNA directly into cell nuclei.
  • The expectation was that this method could result in lifelong elimination of an inherited disease, and many studies have been undertaken.
  • Expectations remain greater than achievements thus far.
  • The Food and Drug Administration (FDA) has, as of 2014, not yet approved any human gene therapy product for sale.
  • An early viral vector was AAV, which apparently caused an allergic reaction.
  • CRISPR-Cas9: Targeted modification of specific sequences in living cells.

A Personal Genomic Survey

  • If a patient lacks an enzyme needed for a drug’s metabolism or has a monogenic defect, therapies could be individually tailored.
  • In cancer therapy, understanding the genetic differences between normal cells and tumor cells could assist in chemotherapy. Immune cell therapy with CAR-T cells is one example.
  • Genetic screening of infants might permit the use of gene therapy to eliminate the threat of a monogenically based disease, or a lifestyle adjustment for an individual with SNPs that predict a susceptibility to a disease that results from combinations of genetic and environmental influences.

Bioethics

  • One area of major concern that has arisen from the genomics revolution is that of the ethical and social implications this groundbreaking work has brought to the fore.
  • The ELSI program of the National Human Genome Research Institute deals with the ethical, legal, and social implications of human genetic research.
  • The scope of ELSI is broad and thought-provoking. It deals with many questions such as the following:
    • Who should have access to personal genetic information and how will it be used?
    • Who should own and control genetic information?
    • Should genetic testing be performed when no treatment is available?
    • Are disabilities diseases? Do they need to be cured or prevented?
    • Preliminary attempts at gene therapy are exorbitantly expensive. Who will have access to these therapies? Who will pay for their use?
    • Should we re-engineer the genes we pass on to our children?
    • Should we get every newborn’s full genetic sequence?