7.1 Mutations are (choose the correct answer)
a. caused by genetic recombination
b. heritable changes in genetic information
c. caused by faulty transcription of the genetic code
d. usually, but not always, beneficial to the development of the individuals in which they occur
b
7.2 Answer true or false: Mutations occur more frequently when there is no need for them
False. Mutations occur spontaneously at a more or less constant frequency, regardless of selective pressure. Once they occur, however, they can be selected for or against, depending on the advantages or disadvantages they confer. Mutation frequency is a separate matter from selection.
7.3 Which of the following is not a class of mutation?
a. frameshift
b. missense
c. transversion
d. transition
e. none of the above; all are classes of mutation
e
7.4 Ultraviolet light usually causes mutations by a mechanism involving (choose the correct answer)
a. one-strand breakage in DNA
b. light-induced change of thymine to alkylated guanine
c. induction of thymine dimers and their persistence or imperfect repair
d. inversion of DNA segments
e. all of the above
c
The amino acid sequence shown in the following table was obtained from the central region of a particular polypeptide chain in the wild type and several mutant bacterial strains (see image). For each mutant, say what change has occurred at the DNA level, whether the change is a base-pair substitution mutation (transversion or transition, missense or nonsense) or a frameshift mutation, and in which codon the mutation occurred. (Refer to the codon dictionary.)
Mutant 1: The alteration of every amino acid after codon 2 suggests a frameshift mutation, most likely a base-pair deletion or addition in the DNA near codons 2 and 3. For example, inserting an A between the two Cs in CCN could produce CAY (His) and shift the reading frame of all downstream codons.
Mutant 2: Codon 5 encodes Met instead of Val, likely caused by a point mutation. Changing G in GUG to A results in AUG (Met) through a CG-to-TA transition in the DNA.
Mutant 3: Though the amino acid sequence is normal, premature termination indicates a nonsense mutation at codon 9. Substituting or inserting an A near the first G could create UAG or UGA. This could result from either a CG-to-TA transition or a frameshift.
Mutant 4: Premature termination occurs at codon 5 (nonsense codon), with missense mutations at codons 2 and 4. Comparing mutant and normal sequences can reveal if a single mutation explains all these changes. Codon 2 changes from Leu (CUN) to Pro (CCN). A deletion of U (via an AT deletion in the DNA) and a change in N to C would produce CCC (Pro) and a frameshift. Codon 3 becomes CCA (Pro), codon 4 becomes CGG (Arg), and codon 5 becomes UGA (nonsense).
Mutant 5: The fourth amino acid changes from Thr to Ser, indicating a point mutation. Changing the fourth codon from ACG to UCG (via a TA-to-AT transversion in the DNA) causes this missense mutation.
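The transition/transversion calls made for Mutants 2 and 5 can be checked with a short sketch. The function name `classify_substitution` is hypothetical; it only encodes the standard rule that purine-to-purine or pyrimidine-to-pyrimidine changes are transitions and anything else is a transversion.

```python
# Purines and pyrimidines; U and T are both listed so mRNA or DNA bases work.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U", "T"}


def classify_substitution(original: str, mutant: str) -> str:
    """Classify a single-base change as a transition or a transversion."""
    if original == mutant:
        raise ValueError("the two bases are identical, so nothing was substituted")
    pair = {original, mutant}
    # Same chemical class on both sides means a transition.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


# Mutant 2: GUG -> AUG changes G to A at the first position.
print("Mutant 2:", classify_substitution("G", "A"))
# Mutant 5: ACG -> UCG changes A to U at the first position.
print("Mutant 5:", classify_substitution("A", "U"))
```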
In mutant strain X of E. coli, a leucine tRNA that recognizes the codon 5'-CUG-3' in normal cells has been altered so that it now recognizes the codon 5'-GUG-3'. A missense mutation that affects amino acid 10 of a particular protein is suppressed in mutant X cells.
a. What are the anticodons of the two Leu tRNAs, and what mutational event has occurred in mutant X cells?
b. What amino acid would normally be present at position 10 of the protein (without the missense mutation)?
c. What amino acid is put in at position 10 if the missense mutation is not suppressed (i.e., in normal cells)?
d. What amino acid is inserted at position 10 if the missense mutation is suppressed (i.e., in mutant X cells)?
Answer:
a. If the normal codon is 5'-CUG-3', the anticodon of the normal tRNA is 5'-CAG-3'. If a mutant tRNA recognizes 5'-GUG-3', it must have an anticodon that is 5'-CAC-3'. The mutational event was a CG-to-GC transversion in the gene for the leucine tRNA. The mutant tRNA will carry leucine to a codon for valine.
b. Since a leucine-bearing (mutant) tRNA can suppress the mutation, presumably leucine is normally present at position 10.
c. The mutant tRNA recognizes the codon 5'-GUG-3', which codes for Val. In normal cells, a valine-charged tRNA (Val-tRNA.Val) would recognize the codon and insert valine.
d. Leu
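The anticodons in part (a) follow from reverse-complementing each codon. The sketch below assumes strict Watson-Crick pairing (no wobble), and `anticodon_for` is a hypothetical helper name.

```python
# RNA base-pairing partners under strict Watson-Crick rules (wobble ignored).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon: str) -> str:
    """Return the anticodon, written 5'->3', for a codon written 5'->3'."""
    # Codon and anticodon pair antiparallel, so reverse before complementing.
    return "".join(COMPLEMENT[base] for base in reversed(codon))


normal = anticodon_for("CUG")  # codon read by the normal Leu tRNA
mutant = anticodon_for("GUG")  # codon read by the altered Leu tRNA
print(normal, mutant)

# The two anticodons differ at a single position, consistent with one
# base-pair substitution in the tRNA gene.
differences = [(a, b) for a, b in zip(normal, mutant) if a != b]
print(differences)
```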
A researcher using a model eukaryotic experimental system has identified a temperature-sensitive mutation, rpIIAts, in a gene that encodes a protein subunit of RNA polymerase II. This mutation is a missense mutation. Mutants have a recessive lethal phenotype at the higher, restrictive temperature, but grow at the lower, permissive (normal) temperature. To identify genes whose products interact with the subunit of RNA polymerase II, the researcher designs a screen to isolate mutations that will act as dominant suppressors of the temperature-sensitive recessive lethal mutation.
a. Explain how a new mutation in an interacting protein could suppress the lethality of the temperature-sensitive original mutation.
b. In addition to mutations in interacting proteins, what other type of suppressor mutations might be found?
c. Outline how the researcher might select for the new suppressor mutations.
d. Do you expect the frequency of suppressor mutations to be similar to, much greater than, or much less than the frequency of new mutations at a typical eukaryotic gene?
e. How might this approach be used generally to identify genes whose products interact to control transcription?
a. The temperature sensitivity of the rpIIAts mutant could be due to a single amino acid change in the protein subunit of RNA polymerase II that causes it to be nonfunctional at the restrictive temperature. If its inability to function is due to a change in its secondary or tertiary structure that prevents it from interacting with another protein during transcription, then a mutation in this second interacting protein might be compensatory. Such a mutation would effectively suppress the initial mutation, as it would allow for transcription even at the restrictive temperature.
b. A new mutation in a second protein would be an intergenic suppressor mutation. The original mutation could also be suppressed by reverting the missense mutation or by an intragenic suppressor mutation. In an intragenic suppressor mutation, a particular second site within the protein would need to be mutated so as to compensate for the initial mutation.
c. One approach is to treat rpIIAts/rpIIAts individuals with a mutagen and mate them to rpIIAts/rpIIAts individuals at the permissive temperature. A second-site suppressor could be selected for by removing the parents and shifting the progeny to the restrictive temperature. Since all of the offspring are rpIIAts/rpIIAts, only offspring carrying a new mutation capable of suppressing the recessive lethality of the rpIIAts mutation will survive.
d. Second-site suppressors will be quite rare and will appear at a much lower frequency than will mutations in a typical eukaryotic gene. To suppress a particular defect, a very specific new mutation must occur. Hence, the vast majority of mutations that are induced by a mutagen will lack the specific compensatory ability of a suppressor mutation.
e. Since intergenic suppressors may result from mutations in interacting proteins, this approach could be used to identify genes for proteins that interact during transcription.
7.8 The mutant lacZ-1 was induced by treating E. coli cells with acridine, whereas lacZ-2 was induced with 5BU. What kinds of mutants are these likely to be? Explain. How could you confirm your predictions by studying the structure of the β-galactosidase in these cells?
Acridine is an intercalating agent that induces frameshift mutations. lacZ-1 probably is a frameshift mutation that results in a completely altered amino acid sequence after some point, although it might be truncated due to the introduction of an out-of-frame nonsense codon. In either case, the protein produced by lacZ-1 would most likely have a different molecular weight and charge. During gel electrophoresis (see text Figure 4.8, p. 70), it would migrate differently than the wild-type protein. 5BU is incorporated into DNA in place of T. During DNA replication, it can be read as C by DNA polymerase because of a keto-to-enol shift. This results in point mutations, usually TA-to-CG transitions. lacZ-2 is likely to contain a single amino acid difference due to a missense mutation, although it, too, could contain a nonsense codon. A missense mutation might lead to the protein's having a different charge, while a nonsense codon would lead to a truncated protein that would have a lower molecular weight. Both would migrate differently during gel electrophoresis.
7.9 a. The sequence of nucleotides in an mRNA is 5'-AUGACCCAUUGGUCUCGUUAG-3'. Assuming that ribosomes could translate this mRNA, how many amino acids long would you expect the resulting polypeptide chain to be?
b. Hydroxylamine is a mutagen that results in the replacement of a G–C base pair by an A–T base pair in the DNA; that is, it induces a transition mutation. When hydroxylamine was applied to the organism that made the mRNA molecule shown in part (a), a strain was isolated in which a mutation occurred at the 11th position of the DNA that coded for the mRNA. How many amino acids long would you expect the polypeptide made by this mutant to be? Why?
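a. The codons read as 5'-AUG-ACC-CAU-UGG-UCU-CGU-UAG-3'. The last codon is a nonsense (chain termination) codon, while the others are sense codons. The chain would be six amino acids long.
b. The new sequence would be 5'-AUG-ACC-CAU-UAG-... Since UAG is a nonsense (chain termination) codon, translation would stop after the third codon, and the polypeptide would be only three amino acids long.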
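Both chain lengths can be confirmed by translating the mRNA directly. This is a minimal sketch using the standard genetic code. The helper names `translate` and `substitute` are hypothetical, and translation is assumed to start at the first base and stop at the first nonsense codon.

```python
# Standard genetic code, built from the conventional UCAG ordering.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna: str) -> str:
    """Translate from the first base, stopping at the first stop codon ('*')."""
    peptide = []
    for start in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[start:start + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


def substitute(mrna: str, position: int, new_base: str) -> str:
    """Replace the base at a 1-based position."""
    return mrna[:position - 1] + new_base + mrna[position:]


wild_type = "AUGACCCAUUGGUCUCGUUAG"
# A GC-to-AT transition at position 11 turns the G in UGG into A (UAG).
mutant = substitute(wild_type, 11, "A")

print(translate(wild_type), len(translate(wild_type)))
print(translate(mutant), len(translate(mutant)))
```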
7.10 In a series of 94,075 babies born in a particular hospital in Copenhagen, 10 were achondroplastic dwarfs (an autosomal dominant condition). Two of these 10 had an achondroplastic parent. The other 8 achondroplastic babies each had two normal parents. What is the apparent mutation rate at the achondroplasia locus?
There were eight new mutations in 94,073 normal couples. Since the phenotype is dominant, the phenotype is seen when just one of the parental genes is mutated. There were 2 × 94,073 copies of the gene that could have undergone mutation. Therefore, the apparent mutation rate at this locus is 8/(2 × 94,073) = 8/188,146 ≈ 4 × 10^-5 mutations per locus per generation.
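The rate calculation can be reproduced in a few lines. The function name `apparent_mutation_rate` is hypothetical; the sketch simply divides new dominant mutants by the number of gene copies contributed by unaffected parents (two per birth).

```python
def apparent_mutation_rate(new_mutants: int, births_to_normal_parents: int) -> float:
    """Mutations per locus per generation for a dominant trait."""
    # Each baby receives two copies of the gene, either of which could mutate.
    gene_copies = 2 * births_to_normal_parents
    return new_mutants / gene_copies


# 94,075 births minus the 2 babies who had an achondroplastic parent.
rate = apparent_mutation_rate(new_mutants=8, births_to_normal_parents=94_075 - 2)
print(f"{rate:.2e} mutations per locus per generation")
```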
7.17 DNA polymerases from different organisms differ in the fidelity of their nucleotide insertion; however, even the best DNA polymerases make mistakes, usually mismatches. If such mismatches are not corrected, they can become fixed as mutations after the next round of replication.
a. How does DNA polymerase attempt to correct mismatches during DNA replication?
b. What mechanism is used to repair such mismatches if they escape detection by DNA polymerase?
c. How is the mismatched base in the newly synthesized strand distinguished from the correct base in the template strand?
a. Many DNA polymerases have proofreading ability, using 3’-to-5’ exonuclease activity to excise and replace mismatched bases during replication, stalling at errors to ensure accuracy (see Chapters 3 and 7).
b. Shortly after replication, mismatches in the newly synthesized strand can be repaired by enzymes. In E. coli, MutS binds single base mismatches or small additions/deletions, recognizing the new strand by an unmethylated A in 5'-GATC-3'. MutL and MutH form a complex with MutS, bringing the unmethylated site close to the mismatch. MutH nicks the unmethylated strand, and an exonuclease excises the mismatch. DNA polymerase III and ligase then repair the gap (see Figure 7.17).
c. In E. coli, parental strands are distinguished by methylation at 5'-GATC-3'. Hemimethylation shortly after replication allows identification of the new strand, as only the unmethylated strand is nicked by MutH, excised, and resynthesized by DNA polymerase III.
7.34 Distinguish between prokaryotic insertion elements and transposons. How do composite transposons differ from noncomposite transposons?
Insertion elements (IS elements) are simpler than transposons, containing a transposase gene flanked by inverted repeat (IR) sequences. Transposons (Tn elements) are more complex and exist as:
Composite Transposons: A central region bearing genes (e.g., antibiotic resistance genes) flanked by IS elements, which facilitate transposition.
Noncomposite Transposons: Contain genes but lack IS elements, instead terminating with inverted terminal repeats needed for transposition.
Both types integrate into non-homologous target sites, causing target site duplications. Transposition of IS elements and composite transposons requires host cell enzymes and transposase.
Transposons move by one of two mechanisms:
Replicative Transposition (e.g., the noncomposite transposon Tn3): Creates a cointegrate between the donor DNA carrying the transposon and the recipient DNA. Tn3 genes encode transposase and resolvase for this process.
Conservative Transposition (e.g., the composite transposon Tn10): Moves without replication.
All transposable elements can cause mutations upon insertion.
7.35 What properties do bacterial and eukaryotic transposable elements have in common?
The structure and, at a general level, the function of eukaryotic and prokaryotic transposable elements are very similar. For example, both Tn and Ac elements have genes within them and have inverted repeats at their ends. Both prokaryotic and eukaryotic elements may affect gene function in a variety of ways, depending on the element involved and how it integrates into or near a gene. The integration events of eukaryotic elements, like those of most prokaryotic transposable elements, involve nonhomologous recombination. Some eukaryotic elements, such as Ty elements in yeast and retrotransposons, move via an RNA intermediate, unlike the IS and Tn prokaryotic elements.
7.36 An IS element became inserted into the lacZ gene of E. coli. Later, a small deletion occurred that removed 40 base pairs near the left border of the IS element. The deletion removed 10 lacZ base pairs, including the left copy of the target site, and the 30 leftmost base pairs of the IS element. What will be the consequence of this deletion?
The left inverted repeat of the IS element has been removed, so that the two ends of this IS element are no longer homologous. The element will not be able to move out of this location and insert into another site.
8.1 Before a genome is sequenced, its DNA must be cloned. What is meant by a DNA clone, and what materials and steps are used to clone genomic DNA?
A DNA clone is a segment of DNA inserted into a vector (e.g., plasmid, phage, BAC, or YAC) and replicated to produce identical copies within host cells.
Cloning genomic DNA involves these steps:
Isolate genomic DNA.
Partially digest it with a restriction enzyme to create sticky ends.
Select DNA fragments of the desired size via agarose gel electrophoresis.
Anneal the sticky ends of the DNA fragments to a similarly cut cloning vector.
Seal nicks with DNA ligase to form a recombinant molecule.
Transform the recombinant DNA into a host cell, propagate the cells, and purify the DNA clone.
8.3 Restriction endonucleases are naturally found in bacteria. What purposes do they serve?
Restriction enzymes serve to protect their hosts from infection by invading viruses and degrade any potentially infectious foreign DNA taken up by the cell (for example, by transformation). Since restriction enzymes digest DNA (restrict it) at specific sites, any foreign DNA will be cut up. To protect its own DNA from digestion by its restriction enzyme(s), a bacterium modifies (methylates) the sites recognized by its own restriction enzymes. This prevents cleavage at these sites.
8.4 A new restriction endonuclease is isolated from a bacterium. This enzyme cuts DNA into fragments that average 4,096 base pairs long. Like many other known restriction enzymes, the new one recognizes a sequence in DNA that has twofold rotational symmetry. From the information given, how many base pairs of DNA constitute the recognition sequence for the new enzyme?
a. The enzyme recognizes a sequence that has two G-C base pairs, two C-G base pairs, one A-T base pair, and one T-A base pair in a particular order. Since 40% of the genome is composed of G-C base pairs, the chance of finding a G-C base pair or a C-G base pair at a given position is 0.20 each, and the chance of finding an A-T or a T-A base pair is 0.30 each. The chance of finding these six base pairs with this sequence is (0.20)^4 × (0.30)^2 = 0.000144. A genome with 3 × 10^9 base pairs will have about 3 × 10^9 different groups of 6-bp sequences. Thus, the number of sites in the human genome is (0.000144) × (3 × 10^9) = 432,000.
b. (3 × 10^9 bp)/(432,000 sites) = 1/0.000144 = 6,944 bp between sites.
c. The chance of finding these six base pairs in a sequence having 80% A-T base pairs is (0.10)^4 × (0.40)^2 = 0.000016, so two AvrII sites will be 1/0.000016 = 62,500 bp apart.
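The site-frequency arithmetic can be sketched as follows, assuming bases occur independently and that each strand orientation of a base pair is equally likely (so a G-C pair has probability GC content / 2). The function names are hypothetical. The same model with 50% GC content gives the familiar one site per 4^6 = 4,096 bp for a 6-bp recognition sequence.

```python
def site_probability(gc_content: float, n_gc_positions: int, n_at_positions: int) -> float:
    """Chance that a random window matches a site with the given composition."""
    p_gc = gc_content / 2          # probability of one specific G-C or C-G pair
    p_at = (1 - gc_content) / 2    # probability of one specific A-T or T-A pair
    return p_gc ** n_gc_positions * p_at ** n_at_positions


def mean_spacing_bp(probability: float) -> float:
    """Average distance between sites is the reciprocal of the per-position chance."""
    return 1 / probability


genome_bp = 3e9  # genome size used in the answer above

p_40 = site_probability(0.40, n_gc_positions=4, n_at_positions=2)
print(p_40, p_40 * genome_bp, mean_spacing_bp(p_40))

# 80% A-T base pairs means 20% G-C base pairs.
p_20 = site_probability(0.20, n_gc_positions=4, n_at_positions=2)
print(p_20, mean_spacing_bp(p_20))

# Equal base frequencies: any 6-bp site is expected once per 4**6 bp.
p_equal = site_probability(0.50, n_gc_positions=6, n_at_positions=0)
print(mean_spacing_bp(p_equal))
```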
*8.8 What features are required in all vectors used to propagate cloned DNA? What different types of cloning vectors are there, and how do these differ from each other?
Cloning vectors have three essential features:
Replication Ability: An origin of replication (e.g., ori in plasmids, ARS in YACs) allows them to replicate in host cells.
Selective Marker: Enables selection in host cells (e.g., antibiotic resistance in plasmids, auxotrophic markers in YACs).
Unique Restriction Sites: For DNA insertion.
Types of vectors include:
Plasmids: Hold <10 kb of DNA, replicate at high copy numbers in bacteria, and are used for various applications, including shotgun cloning of 2-10 kb genome inserts.
BACs: Hold up to 300 kb, are single-copy in bacteria, and preferred for physical genome mapping due to stability (no rearrangements).
YACs: Hold 0.2–2 Mb, include CEN sequences for proper segregation, and are used for genome mapping. However, they can rearrange or become chimeric, limiting their utility.
Limitations:
E. coli vectors struggle with AT-rich sequences or toxic sequences.
YACs are prone to rearrangements and often contain DNA from multiple genome sites.
Three students are working as a team to construct a plasmid library from Neurospora genomic DNA. They want the library to have, on average, about 4-kb inserts. Each student proposes a different strategy for constructing the library, as follows:
Mike: Cleave the DNA with a restriction enzyme that recognizes a 6-bp site, which appears about once every 4,096 bp on average and leaves sticky, overhanging ends. Ligate this DNA into the plasmid vector cut with the same enzyme and transform the ligation products into bacterial cells.
Marisol: Partially digest the DNA with a restriction enzyme that cuts DNA very frequently, say once every 256 bp, and that leaves sticky overhanging ends. Select DNA that is about 4 kb in size (e.g., purify fragments this size after the products of the digest are resolved by gel electrophoresis). Then, ligate this DNA to a plasmid vector cleaved with a restriction enzyme that leaves the same sticky overhangs and transform the ligation products into bacterial cells.
Hesham: Irradiate the DNA with ionizing radiation, which will cause double-stranded breaks in the DNA. Determine how much irradiation should be used to generate, on average, 4-kb fragments and use this dose. Ligate linkers to the ends of the irradiated DNA, digest the linkers with a restriction enzyme to leave sticky overhanging ends, ligate the DNA to a similarly digested plasmid vector, and then transform the ligation products into bacterial cells.
Which student’s strategy will ensure that the inserts are representative of all of the genomic sequences? Why are the other students’ strategies flawed?
Marisol’s strategy will ensure that the inserts are representative of all of the genomic sequences. Partial digestion of genomic DNA will generate a population of overlapping fragments representative of the entire genome. When the library is screened, multiple overlapping clones from a region will be identified. Mike’s strategy works in principle, but in practice has drawbacks. Analyzing a region requires each adjacent restriction fragment from that region to be cloned and recovered in a screen of a genomic library. Given the small size of the restriction fragments, the library will need to contain a very large number of clones, and screening the library to find all of the adjacent clones in a region will be very laborious. In addition, large genes will be split into multiple pieces. Hesham’s strategy is the least desirable. While using ionizing radiation to introduce double-strand breaks will result in the random fragmentation of DNA and produce a population of overlapping genomic DNA fragments, it will also introduce other types of DNA damage (see text Chapter 7). Damage to the DNA may prevent its successful cloning, and sequences that can be cloned are unlikely to be identical to the genomic DNA, because bacterial DNA repair processes will lead to alterations in the DNA sequence.
8.16 The human genome contains about 3 × 10^9 bp of DNA. How many 200-kb fragments would you have to clone into a BAC library to have a 90% probability of including a particular sequence?
From the text, N = ln(1 - p)/ln(1 - f), where N is the necessary number of recombinant DNA molecules, p is the probability of including one particular sequence, and f is the fractional proportion of the genome in a single recombinant DNA molecule. Here, p = 0.90 and f = (2 × 10^5)/(3 × 10^9), so N = 34,538.
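The library-size formula is easy to evaluate numerically. A minimal sketch, with `clones_needed` as a hypothetical function name and the result rounded up to a whole number of clones:

```python
import math


def clones_needed(p: float, insert_bp: float, genome_bp: float) -> int:
    """N = ln(1 - p) / ln(1 - f), rounded up to a whole number of clones."""
    f = insert_bp / genome_bp  # fraction of the genome in one clone
    return math.ceil(math.log(1 - p) / math.log(1 - f))


# 200-kb BAC inserts, 3 x 10^9 bp genome, 90% probability.
n_90 = clones_needed(p=0.90, insert_bp=2e5, genome_bp=3e9)
print(n_90)

# Raising the target probability increases the required library size.
n_99 = clones_needed(p=0.99, insert_bp=2e5, genome_bp=3e9)
print(n_99)
```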
*8.18 When Celera Genomics sequenced the human genome, they obtained 13,543,099 reads of plasmids having an average insert size of 1,951 bp, and 10,894,467 reads of plasmids having an average insert size of 10,800 bp.
a. Dideoxy sequencing provides only about 500–550 nucleotides of sequence. About how many nucleotides of sequence did Celera obtain from sequencing these two plasmid libraries? To what fold coverage does this amount of sequence information correspond?
b. Why did they sequence plasmids from two libraries with different-sized inserts?
c. They sequenced only the ends of each insert. How did they determine the sequence lying between the sequenced ends?
Each dideoxy sequencing reaction recovers about 500 nucleotides. With 13,543,099 + 10,894,467 = 24,437,566 reads, approximately 500 × 24,437,566 = 1.22 × 10^10 nucleotides were sequenced, providing roughly fourfold coverage of a 3 × 10^9 bp genome (1.22 × 10^10 / 3 × 10^9 ≈ 4).
Sequencing Strategy:
Libraries with inserts of different sizes are required to assemble sequences around repetitive DNA.
A plasmid with a 2-kb insert might have a unique sequence at one end and a repetitive sequence at the other, preventing assembly beyond the repetitive region.
Using plasmids with 10-kb inserts solves this problem. These inserts may overlap the unique sequence in the 2-kb plasmid at one end while extending past the repetitive sequence at the other, enabling further assembly.
Sequence Assembly:
The central region sequence is reconstructed by aligning overlapping clones during assembly.
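The coverage estimate from part (a) can be checked with a short sketch. It assumes 500 nucleotides per read and a 3 × 10^9 bp genome, as in the answer; `fold_coverage` is a hypothetical helper name.

```python
def fold_coverage(n_reads: int, read_length_bp: int, genome_bp: float) -> float:
    """Total sequenced nucleotides divided by genome size."""
    return n_reads * read_length_bp / genome_bp


reads_2kb_library = 13_543_099
reads_10kb_library = 10_894_467
total_reads = reads_2kb_library + reads_10kb_library

total_nucleotides = total_reads * 500
coverage = fold_coverage(total_reads, read_length_bp=500, genome_bp=3e9)

print(f"{total_reads:,} reads, {total_nucleotides:.2e} nt, {coverage:.1f}x coverage")
```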
8.20 Explain how the whole-genome shotgun approach to sequencing a genome differs from the biochemist’s approach described in Question 8(c). What information does it provide that the biochemist’s approach does not? What does it mean to obtain 7-fold coverage, and why did his colleague advise him to do this?
The whole-genome shotgun approach rapidly sequences many 500-nucleotide fragments and uses algorithms to assemble overlapping reads into continuous sequences for each chromosome. This method sequences clones with varying insert sizes to overcome challenges posed by repetitive elements and ensures efficient genome assembly.
In contrast, the stepwise biochemist’s approach is slower and limited. It sequences clones of a single size, making it difficult to align sequences containing repetitive elements, and cannot assemble entire genomes.
Coverage:
Coverage measures the total sequenced nucleotides relative to the genome's size. Sevenfold coverage means sequencing seven times the genome's length, reducing gaps and increasing the reliability of the assembled sequence by facilitating error correction through overlapping reads. Higher coverage improves sequence accuracy and minimizes errors.
8.23 Do all SNPs lead to an alteration in phenotype? Explain why or why not.
Not all SNPs lead to an alteration in phenotype. Some are silent. For example, if a SNP does not lie in a DNA sequence that is transcribed, or does lie in a transcribed sequence but after mRNA processing does not alter the amino acid inserted into a polypeptide chain, it will not cause a missense or nonsense mutation and could be silent. If a SNP also does not lie in a gene regulatory region, it will not affect a gene's function and could also be silent.
.28 Three of the steps in the analysis of a genome’s sequence are assembly, finishing, and annotation. What is involved in each step, and how do they differ from each other?
Assembly:
Genome assembly aligns overlapping sequences to reconstruct the order of DNA in the genome. When sequences end with repetitive elements, other clones with unique sequences spanning the repetitive regions are used to complete the assembly. This process creates a draft genome, which contains gaps and errors.
Finishing:
The finishing stage corrects errors and fills gaps, producing a highly accurate sequence with fewer than one error per 10,000 bases and minimal gaps.
Annotation:
Annotation identifies genes, repetitive elements, and SNPs. Genes are located by comparing genomic sequences to cDNA or predicted using algorithms. SNPs are found by comparing sequences from different individuals. Annotation assigns functions to genes and other sequence features.
8.37 In which type of organisms does gene number appear to be related to genome size? Explain why this is not the case in all organisms.
In Bacteria and Archaea, most of the genome consists of genes, and there is a strong correlation between the number of genes and genome size due to high gene density. However, in Eukarya, gene density varies widely and tends to decrease as organisms become more complex.
For example:
Yeast (S. cerevisiae) has a genome of about 12 Mb with 5,700 protein-coding genes.
Nematodes (C. elegans) have a genome of 100 Mb with 20,443 genes.
Fruit flies (D. melanogaster) have a genome of 178 Mb with 14,015 genes.
Although worms and flies have genomes that are about 8.3 and 14.8 times larger than that of yeast, they have only 3.6 and 2.5 times as many genes, respectively. Worms also have 45% more genes than flies, despite having about 44% less DNA. Therefore, in Eukarya, the number of genes is not directly proportional to genome size.
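The comparison can be recomputed from the genome sizes and gene counts listed above. This is only a sketch of the arithmetic; the dictionary layout and the helper `fold_vs_yeast` are hypothetical.

```python
# (genome size in Mb, number of genes), as listed in the answer above.
GENOMES = {
    "yeast": (12, 5_700),
    "worm": (100, 20_443),
    "fly": (178, 14_015),
}


def fold_vs_yeast(organism: str) -> tuple[float, float]:
    """Return (genome-size ratio, gene-number ratio) relative to yeast."""
    size, genes = GENOMES[organism]
    yeast_size, yeast_genes = GENOMES["yeast"]
    return size / yeast_size, genes / yeast_genes


for name in ("worm", "fly"):
    size_ratio, gene_ratio = fold_vs_yeast(name)
    print(f"{name}: {size_ratio:.1f}x the DNA, {gene_ratio:.1f}x the genes")

# Gene density (genes per Mb) falls as genome size grows.
density = {name: genes / size for name, (size, genes) in GENOMES.items()}
print({name: round(value) for name, value in density.items()})
```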
What is bioinformatics, and what is its role in functional and comparative genomics?
Bioinformatics is the application of computers and information science to analyze genetic data and biological structures. It merges biology, mathematics, computer science, and information science. This field is crucial for tasks like assembling genome sequences, identifying genes in databases, aligning sequences to check for similarities, predicting gene structures, and locating potential genes. In functional genomics, bioinformatics helps predict gene functions by comparing DNA or protein sequences and analyzing how gene functions affect the transcriptome or proteome. In comparative genomics, it compares gene sequences and entire genomes to identify similarities and differences.
9.2 What is the difference between a gene and an ORF? How might you identify the functions of ORFs whose functions are not yet known?
A gene is a DNA sequence that includes both the transcribed sequence and regulatory regions, such as the promoter, that control its transcription. Genes produce various types of RNA (like mRNA, rRNA, tRNA, etc.) and proteins. Functionally, genes are identified by mutations that affect or eliminate their product functions. In contrast, an open reading frame (ORF) is a segment of mRNA that codes for a protein. Unlike genes, ORFs do not include regulatory or untranscribed sequences, nor do they contain introns.
To determine the function of an unknown ORF, two main approaches are used. First, computerized sequence searches, such as BLAST, compare the ORF sequence to a database of known sequences to identify similarities that may suggest a similar function. This approach might only give partial information, as proteins often have multiple functional domains. Second, experimental approaches can investigate the gene's function by analyzing knockout mutations and observing the resulting phenotype. In humans, where such experiments might be unethical, studies in model organisms with similar genes can be used to explore the gene's role, including confirming that the ORF encodes a protein and characterizing its function.
9.5 What is meant by a conserved domain? Give an example to illustrate how identifying conserved domains within a protein can provide clues about its function.
A protein domain is a segment of a polypeptide that can fold and function independently. A conserved domain is a sequence of amino acids in a polypeptide that is similar across different polypeptides with similar functions. Domains give polypeptides specific functions, and many polypeptides have multiple domains. Identifying conserved domains can suggest the protein's function. For example, finding a DNA-binding domain suggests the protein may interact with DNA, and if it also has a domain that interacts with transcription factors, it might regulate transcription.
9.7a What is a single orphan gene? What is an orphan family?
A single orphan gene is a gene whose open reading frame (ORF) has an unknown function and for which homologs have not been identified in another organism. An orphan family is a set of homologous genes whose function is unknown.
9.8 What information and materials are needed to amplify a segment of DNA using PCR?
To amplify a specific region, one needs to know the sequences flanking the target region so that primers able to amplify the target region can be designed. Once primers are synthesized, the polymerase chain reaction can be assembled. It contains a DNA template (genomic DNA, cDNA, or cloned DNA), the pair of primers that flank the DNA segment targeted for amplification, a heat-resistant DNA polymerase (Taq), the four dNTPs (dATP, dTTP, dGTP, and dCTP), and an appropriate buffer (see text Figure 9.3, p. 222).
9.9 In the polymerase chain reaction (PCR), a DNA polymerase that can withstand short periods at very high (near boiling) temperatures is used. Why?
The polymerase chain reaction (PCR) uses repeated cycles of denaturation, annealing, and extension. In denaturation, the DNA template is heated to 94°C to separate the strands. The annealing and extension steps occur at lower temperatures. PCR uses a heat-stable DNA polymerase, like Taq polymerase from Thermus aquaticus, a bacterium that thrives in hot springs. This enzyme can withstand the high temperatures needed for denaturation, so no additional polymerase is required during the cycles.
9.10 Both PCR and cloning allow for the production of many copies of a DNA sequence. What are the advantages of using PCR instead of cloning to amplify a DNA template?
PCR is a much more sensitive and rapid technique than cloning. Many millions of copies of a DNA segment can be produced from one DNA molecule in only a few hours using PCR. In contrast, cloning requires more DNA (ng to µg quantities) for restriction digestion and at least several days to proceed through all of the cloning steps.
9.11 If you assume that each step of the PCR process is 100% efficient, how many copies of a template would be amplified after 30 cycles of a PCR reaction if the number of starting template molecules were
a. 10?
b. 1,000?
c. 10,000?
Assuming perfect doubling in every cycle, each starting template yields 2^30 ≈ 1.07 × 10^9 copies after 30 cycles, so the total scales directly with the number of starting templates:
a. 10 templates: 10 × 1.07 × 10^9 = 1.07 × 10^10 amplimers
b. 1,000 templates: 1,000 × 1.07 × 10^9 = 1.07 × 10^12 amplimers
c. 10,000 templates: 10,000 × 1.07 × 10^9 = 1.07 × 10^13 amplimers
For comparison, about 5 ng of DNA (roughly 2.3 × 10^10 copies of a 200-bp fragment) is readily detected on an agarose gel. Ideal amplification from 10 templates gives somewhat less than that amount, while 1,000 or 10,000 templates give far more.
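The arithmetic can be checked with a short Python sketch. It assumes simple doubling per cycle (N = N0 × 2^n at 100% efficiency) and an approximate average mass of 650 g/mol per base pair; the function names and the `efficiency` parameter are illustrative, not from the text.

```python
AVOGADRO = 6.022e23       # molecules per mole
BP_MASS_G_PER_MOL = 650   # approximate average mass of one base pair (assumption)


def copies_after_pcr(initial_templates: int, cycles: int = 30,
                     efficiency: float = 1.0) -> float:
    """Copies after `cycles` rounds; efficiency=1.0 means perfect doubling."""
    return initial_templates * (1 + efficiency) ** cycles


def copies_to_ng(copies: float, length_bp: int) -> float:
    """Convert a number of double-stranded fragments to nanograms of DNA."""
    grams = copies * length_bp * BP_MASS_G_PER_MOL / AVOGADRO
    return grams * 1e9


for start in (10, 1_000, 10_000):
    total = copies_after_pcr(start)
    print(f"{start:>6} templates -> {total:.2e} copies, "
          f"{copies_to_ng(total, 200):.1f} ng of a 200-bp product")
```

Setting `efficiency` below 1.0 shows how quickly the yield falls when cycles are less than perfect.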
9.18 Comparative genomics offers insights into the relationship between homologous genes and the organization of genomes. When the genome of C. elegans was sequenced, it was striking that some types of sequences were distributed nonrandomly. Consider the data obtained for chromosome V and the X chromosome shown below. The following figure shows the distribution of genes, the distribution of inverted and tandem repeat sequences, and conserved genes (the location of transcribed sequences in C. elegans that are highly similar to yeast genes).
a. On chromosome V and the X chromosome, genes are distributed uniformly. However, especially on chromosome V, conserved genes are found more frequently in the central regions. In contrast, inverted and tandem-repeat sequences are found more frequently on the arms. It appears that at least on chromosome V, there is an inverse relationship between the frequency of inverted and tandem repeats and the frequency of conserved genes.
b. Since there are fewer conserved genes on the arms, there appears to be a greater rate of change on chromosome arms than in the central regions.
c. Yes, since increased meiotic recombination provides for greater rates of exchange of genetic material on chromosomal arms.
*9.19 How does a cell’s transcriptome compare with its proteome?
a. For a specific eukaryotic cell, can you predict which has more total members? Can you predict which has more unique members?
b. Suppose you are interested in characterizing changes in the pattern of gene expression in the mouse nervous system during development. Describe how you would efficiently assess changes in the transcriptome from the time the nervous system forms during embryogenesis to its maturation in the adult.
c. How would your analyses differ if you were studying the proteome?
The transcriptome refers to all the RNA present in a cell at a specific time, while the proteome refers to all the proteins present at that time.
a. The proteome likely has more total and unique members. This is because multiple copies of a protein can be made from one mRNA transcript. Also, a single transcript can produce different protein forms through post-translational modifications, such as phosphorylation or glycosylation, leading to more unique proteins.
b. To analyze gene expression, DNA chips can be used. RNA from different stages of development can be isolated, labeled with different colored dyes, and then hybridized to a DNA chip. The ratio of fluorescence (red to green) at each spot on the chip indicates the relative gene expression in each stage.
c. To study the proteome, proteins from different developmental stages can be isolated and their abundance measured using techniques like quantitative two-dimensional gel electrophoresis, which allows comparison of protein levels at different stages.
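The red-to-green comparison described in part b is commonly summarized as a log2 ratio per spot. The sketch below assumes red labels the later stage and green the earlier one; the gene names, intensity values, and pseudocount are hypothetical.

```python
import math


def log2_ratio(red: float, green: float, pseudocount: float = 1.0) -> float:
    """log2(red/green) for one spot; the pseudocount avoids division by zero."""
    return math.log2((red + pseudocount) / (green + pseudocount))


# Hypothetical background-corrected intensities: (red = adult, green = embryo).
spots = {
    "geneA": (799.0, 199.0),   # brighter in red: higher in the adult sample
    "geneB": (150.0, 150.0),   # equal signal: no change
    "geneC": (49.0, 399.0),    # brighter in green: higher in the embryo sample
}

for gene, (red, green) in spots.items():
    print(f"{gene}: log2(red/green) = {log2_ratio(red, green):+.2f}")
```

A log scale makes a fourfold increase (+2) and a fourfold decrease (-2) symmetric around zero, which a raw ratio does not.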
3.23 Aligning DNA sequences within databases to determine the degree of matching (S, cloning and sequencing; A, annotation; F, functional genomics; C, comparative genomics)
S, A, F, C
3.23 Identification and description of putative genes and other important sequences within a sequenced genome (S, cloning and sequencing; A, annotation; F, functional genomics; C, comparative genomics)
A, F, C
Characterizing the transcriptome and proteome present in a cell at a specific developmental stage or in a particular disease state (S, cloning and sequencing; A, annotation; F, functional genomics; C, comparative genomics)
F
Preparing a genomic library containing 2-kb and 10-kb inserts (S, cloning and sequencing; A, annotation; F, functional genomics; C, comparative genomics)
S
Comparing the overall arrangements of genes and non-gene sequences in different organisms to understand how genomes evolve (S, cloning and sequencing; A, annotation; F, functional genomics; C, comparative genomics)
C
Describing the function of all genes in a genome (S, cloning and sequencing; A, annotation; F, functional genomics; C, comparative genomics)
F
Determining the functions of human genes by studying their homologs in nonhuman organisms (S, cloning and sequencing; A, annotation; F, functional genomics; C, comparative genomics)
F, C
Developing a capture array (S, cloning and sequencing; A, annotation; F, functional genomics; C, comparative genomics)
F
Developing a physical map of a genome (S, cloning and sequencing; A, annotation; F, functional genomics; C, comparative genomics)
S
Developing DNA microarrays (DNA chips) (S, cloning and sequencing; A, annotation; F, functional genomics; C, comparative genomics)
F, A, C
Obtaining a working draft of a genome sequence by assembling DNA linkages to overlap (S, cloning and sequencing; A, annotation; F, functional genomics; C, comparative genomics)
S
(S, cloning and sequencing; A, annotation; F, functional genomics; C, comparative genomics)
F
Whole-genome shotgun sequencing of a DNA sample isolated from a bacterial community growing in a hot spring in Yellowstone National Park (S, cloning and sequencing; A, annotation; F, functional genomics; C, comparative genomics)
S, C
Identifying homologs to human disease genes in organisms suitable for experimentation (S, cloning and sequencing; A, annotation; F, functional genomics; C, comparative genomics)
F, C
Identifying a large collection of SNP DNA markers within one organism (S, cloning and sequencing; A, annotation; F, functional genomics; C, comparative genomics)
S, A
Cloning and sequencing cDNAs from one organism (S, cloning and sequencing; A, annotation; F, functional genomics; C, comparative genomics)
A
Using a Virochip to characterize a new infection (S, cloning and sequencing; A, annotation; F, functional genomics; C, comparative genomics)
C
Making gene knockouts and observing the phenotypic changes associated with them (S, cloning and sequencing; A, annotation; F, functional genomics; C, comparative genomics)
F
Using microarray analysis to type SNPs in a population of individuals (S, cloning and sequencing; A, annotation; F, functional genomics; C, comparative genomics)
C
10.1 Much effort has been spent on developing cloning vectors that replicate in organisms other than E. coli.
a. Describe several different reasons one might want to clone DNA in an organism other than E. coli.
b. What is a shuttle vector, and why is it used?
c. Describe the salient features of a vector that could be used for cloning DNA in yeast.
a. Vectors are tools used to introduce DNA into yeast, plant, and animal cells. They help in studying eukaryotic genes, producing eukaryotic gene products (like drugs and antibodies), gene therapy, engineering crops, and creating transgenic animals.
b. Shuttle vectors are cloning tools that can replicate in more than one host organism. They are used to transfer DNA into organisms other than E. coli. The vector is first used in E. coli for cloning, then moved into another organism without needing further cloning.
c. A yeast shuttle vector should have markers for selecting both yeast and E. coli cells (like URA3 for yeast and ampicillin resistance for E. coli), unique restriction enzyme sites for cloning DNA, and sequences that allow it to replicate in both yeast and bacteria if it is not integrated into the yeast chromosome.
10.2 Phage vectors used for cloning kill the host bacterial cell in which they are propagated. How can this be advantageous for working with DNA clones? What advantages do phage vectors have over plasmid vectors?
Phage vectors grow using a lytic cycle, meaning they reproduce inside a bacterial cell and then burst it open to release new phage. When a single phage infects a bacterial cell in an agar layer, the released phage infect neighboring cells, leading to multiple cycles of infection and cell lysis. This forms a plaque, a clear area where all the cells have been killed by the phage.
Plaques are useful in DNA cloning because each plaque contains many identical phages, allowing for the collection and further work with the cloned DNA. Phage libraries can be screened similarly to plasmid libraries, but they have two key advantages: phages can hold larger DNA inserts, and more plaques can fit on a plate than bacterial colonies. This means phage vectors allow for screening larger numbers of clones and more DNA inserts on a single plate.
10.3 What is a cDNA library, and from what cellular material is it derived? How is a cDNA library used in cloning particular genes?
A complementary DNA (cDNA) is a DNA copy of mRNA. A cDNA library is a collection of clones containing cDNAs synthesized from the mRNA of a particular tissue or cell. Here's how it's made:
mRNAs are isolated from the tissue or cell.
The mRNA is purified by binding to an oligo-dT column, which attaches to the poly(A) tails of mRNAs, separating them from other RNAs.
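The mRNA is reverse-transcribed into single-stranded cDNA using an oligo(dT) primer and reverse transcriptase.
RNase H partially degrades the mRNA in the resulting mRNA-cDNA hybrid, leaving short RNA fragments base-paired to the single-stranded cDNA.
DNA polymerase I synthesizes the second cDNA strand using those RNA fragments as primers, and DNA ligase joins the newly made DNA fragments into a continuous strand.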
Linkers or adapters are added to the cDNA ends, which are then inserted into a cloning vector.
The resulting cDNA library represents the mRNA population from that tissue or cell. The library can be screened using probes or an expression screen, where clones are identified based on the proteins they produce. If the cDNA library is in an expression vector, the cDNA can be transcribed and translated into protein in a bacterial cell. If an antibody to the protein is available, it can be used to find clones expressing the protein.
10.4 Suppose you have cloned a eukaryotic cDNA and want to express the protein it encodes in E. coli. What type of vector would you use, and what features must this vector have? How would this vector need to be modified to express the protein in a mammalian tissue culture cell?
Use an expression vector to ensure that DNA inserts are transcribed and translated. For expression in E. coli, the vector should have a prokaryotic promoter and a ribosome-binding site (Shine-Dalgarno sequence) upstream of the cDNA insertion site, and possibly a transcription terminator downstream of it. For expression in a mammalian cell, a eukaryotic promoter is needed upstream, along with a poly(A) site downstream. If the cDNA lacks a start codon, a start codon (ATG in the DNA, AUG in the mRNA) within a Kozak sequence should be added upstream of the cDNA insertion site to ensure proper translation. Care must be taken to align the cDNA's open reading frame (ORF) with the start codon in the vector.
*10.5 Suppose you wanted to produce human insulin (a peptide hormone) by cloning. Assume that you could do this by inserting the human insulin gene into a bacterial host where, given the appropriate conditions, the human gene would be transcribed and then translated into human insulin. Which would be better to use as your source of the gene: human genomic insulin DNA or a cDNA copy of this gene? Explain your choice.
It would be preferable to use cDNA. Human genomic DNA contains introns, while cDNA synthesized from cytoplasmic poly(A)+ mRNA does not. Prokaryotes do not process eukaryotic precursor mRNAs having intron sequences, so genomic clones will not give appropriate translation products. Since cDNA is a complementary copy of a functional mRNA molecule, the mRNA transcript will be functional, and when translated human (pro-)insulin will be synthesized.
10.6 You have inserted human insulin cDNA in the cloning vector pBluescript II (described in Figure 8.4, p. 176) and transformed the clone into E. coli, but insulin was not expressed. Propose several hypotheses to explain why not.
If genomic DNA had been used, there might be concerns about introns not being removed because E. coli doesn't process RNAs like eukaryotic cells. However, cDNA is a copy of mature mRNA, so this isn't a problem. Some potential concerns include:
Reading Frame: Ensure the insulin sequence is inserted in the correct reading frame to prevent premature translation termination and ensure the correct protein is made.
Fusion Protein: If the insulin gene is inserted into the polylinker of the pBluescript II vector in the correct reading frame, it could create a fusion protein with β-galactosidase, making it larger than insulin. If a fusion protein is acceptable, ensure only the insulin gene's open reading frame is inserted.
mRNA Issues: If a full insulin mRNA transcript was used instead of just the open reading frame, it might have 5'- and 3'-UTR sequences that could interfere with prokaryotic translation, as it lacks features like the Shine-Dalgarno sequence.
Posttranslational Processing: Insulin produced in E. coli might not undergo the same posttranslational modifications that occur in eukaryotic cells. Depending on the modification, the cDNA might need to be engineered to produce a similar protein without requiring E. coli processing.
10.9 Explain how gel electrophoresis can be used to determine the sizes of the fragments produced by a restriction digest or the size of a PCR product.
In gel electrophoresis, DNA fragments are separated by size. DNA is negatively charged, so it moves toward the positive pole in an electric field. Smaller DNA fragments move faster through the agarose gel than larger ones because they pass through the gel's pores more easily. DNA is separated by size, not charge, because larger fragments, though having more total charge, have the same charge density as smaller ones.
To determine DNA fragment sizes, load a sample and a size standard (marker) into separate wells. Apply an electric current, and after electrophoresis, stain the gel with a dye like ethidium bromide, which fluoresces under UV light. Measure how far the DNA bands have migrated. Using the known sizes of the marker fragments, draw a calibration curve relating size to migration distance. Compare the migration distance of your sample to the curve to determine its size.
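The calibration-curve step can be sketched in Python. This assumes migration distance is roughly linear in log10(fragment size) over the marker range, which holds only approximately on a real gel; the marker sizes and distances below are hypothetical values chosen to fall exactly on a line.

```python
import math


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical size standard: (fragment size in bp, migration distance in mm).
marker = [(10_000, 20.0), (1_000, 40.0), (100, 60.0)]

distances = [d for _, d in marker]
log_sizes = [math.log10(bp) for bp, _ in marker]
slope, intercept = fit_line(distances, log_sizes)


def estimate_size_bp(distance_mm: float) -> float:
    """Read a fragment size off the fitted calibration line."""
    return 10 ** (slope * distance_mm + intercept)


print(f"Band at 50 mm is about {estimate_size_bp(50.0):.0f} bp")
```

Estimates are only trustworthy for bands that migrate between the smallest and largest marker fragments, since the log-linear relationship breaks down outside that range.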
10.29 During Southern blot analysis, DNA is separated by size using gel electrophoresis, and then transferred to a membrane filter. Before it is transferred, the gel is soaked in an alkaline solution to denature the double-stranded DNA, and then neutralized. Why is it important to denature the double-stranded DNA? (Hint: Consider how the membrane will be probed.)
The membrane is probed with a labeled, single-stranded nucleic acid probe that finds its target by complementary base pairing. In double-stranded DNA the bases are already paired with the opposite strand, so they are not available to hybridize with the probe. Soaking the gel in alkali separates the strands, so the DNA is single-stranded when it is transferred and bound to the membrane. The probe can then base-pair with any membrane-bound fragment that contains a complementary sequence, and the position of the bound probe reveals the size of that fragment.