Genomics & Genome Organization
genomics
Chapter 21 focus: mostly about ____________________________.
genomes; genes; DNA; grouped
Genomics = study of:
○ The structure of ____________________________
○ What ____________________________ are in them
○ What other types of ____________________________ are in them
○ How they are ____________________________ and organized
bioinformatics
Genomics is a part of ____________________________.
data; sequences
Bioinformatics = a larger field involving:
○ Processing large ____________________________ sets
○ Includes things like genome ____________________________
proteome; transcriptome
Bioinformatics can include:
○ Looking at protein sequences → the ____________________________ of an
organism
○ Looking at all mRNA sequences → the ____________________________ of an
organism
sequencing
Genomics starts with ____________________________ the DNA.
DNA
Goal: sequence all of the ____________________________ in an organism.
ten; 1990; 2003; human genome; individual; United States; Jim Watson; Francis Collins
Human Genome Project (HGP):
○ Planned as a ~_______-year project.
○ Actually ran from about _______ to _______.
○ Goal: sequence the entire ____________________________
____________________________.
○ Sequenced the genome of a single ____________________________.
○ Started as a ____________________________ government project (NIH).
○ Led initially by ____________________________ (of Watson & Crick) and
____________________________ ____________________________ (later
director of the NIH).
3; 6.4; 6.4; 3.2
Designed to:
○ Cost about $__________ billion over ten years.
○ Sequence all _______ billion bases in the diploid human genome.
■ Diploid genome: _______ billion bases
■ Haploid genome: _______ billion bases
Thomas Hunt Morgan
The HGP strategy started from existing information:
○ Linkage maps / cytogenetic maps from crossing over experiments.
○ Built from work like ____________________________
____________________________ ____________________________
(Drosophila crossing-over experiments).
genes; restriction enzymes
Strategy:
○ Use linkage information (how far apart different
____________________________ are on a chromosome).
○ Divide chromosomes into smaller, overlapping fragments.
○ Use ____________________________ ____________________________
(enzymes that cut at specific DNA sequences) to cut DNA.
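The cutting step can be sketched in a few lines. This is a minimal illustration, not the HGP's actual pipeline: it assumes the well-known EcoRI site GAATTC (cut after the first G), looks at one strand only, and the function name `digest` and the example sequence are made up for this sketch.

```python
def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut one DNA strand at every occurrence of a recognition site.

    cut_offset is the number of bases of the site that stay on the
    left-hand fragment (1 for EcoRI: G^AATTC).
    """
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        fragments.append(dna[start:pos + cut_offset])
        start = pos + cut_offset
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])  # whatever remains after the last cut
    return fragments


print(digest("AAGAATTCTTGAATTCGG"))
```

Because the enzyme cuts only at its specific sequence, the same DNA always gives the same set of fragments.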
300-400; thousand
Fragments must be relatively small because:
○ Early sequencing could read only ~_______–_______ bases at a time.
○ Now can handle up to a few ____________________________ bases at a time.
○ Fragments are cut with overlap so they can be matched and ordered.
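Why overlap matters can be shown with a toy assembler. The sketch below is a simple greedy merge, assuming error-free reads and a minimum overlap of 3 bases; the function names and the example fragments are hypothetical, and real assemblers are far more elaborate.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> str:
    """Repeatedly merge the two fragments that share the longest overlap."""
    frags = list(fragments)
    while len(frags) > 1:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing overlaps any more; stop merging
        merged = frags[best_i] + frags[best_j][best_n:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return max(frags, key=len)  # longest assembled stretch


fragments = ["ATGGCGT", "GCGTACCTA", "CCTAGGAT"]
print(greedy_assemble(fragments))
```

Without shared ends, the three fragments could be placed in any order; the overlaps are what fix their order.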
time; labor
This approach was:
○ ______-intensive
○ ____-intensive
1998; Craig Venter
Around _______ (late 1990s), ____________________________
____________________________ (Celera) entered with a private company.
shatter; different; computer
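Venter's approach (whole-genome shotgun sequencing):
○ ____________________________ the whole genome into many random fragments.
○ Sequence the ____________________________ fragments.
○ Use a ____________________________ to find overlaps and assemble the sequence.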
300
Claims:
○ Could do it for about $__________ million (≈ 1/10 the cost of the public project).
○ Could finish before the original Human Genome Project.
parallel; 98-99
Result:
○ Two human genome projects (public NIH vs. private Celera) ran in
____________________________.
○ Both ended up with ~_______–_______ % of the genome sequenced (the last few % are very hard).
○ Even scientists could not easily tell which had sequenced more.
Clinton; Collins; Watson; Craig Venter; same
Political resolution:
○ President ____________________________ brought
____________________________, ____________________________, and
____________________________ ____________________________ to the
White House.
○ After a marathon meeting (~24 hours), they held a joint press conference.
○ Announced that both groups had “finished” sequencing the human genome at
the ____________________________ time.
3; 5
Now, thousands of whole-genome sequences exist:
○ Done for many organisms (estimated __________ to __________ thousand
organisms or more).
scientific articles; public; research
PubMed & genome databases:
○ PubMed is used to look up ____________________________
____________________________.
○ A subdivision allows viewing any genome sequence obtained with
____________________________ funding.
○ Many researchers use these sequences in ____________________________
projects.
domains
One application: looking for patterns in sequences that predict folding or _______.
proteins; genome; cow; corn; human; amino acids
Example: WD40 domain:
○ Identified by comparing many related ____________________________ and
their ____________________________ sequences.
○ Proteins from __________, __________, and __________ all show the same
pattern.
○ Certain ____________________________ ____________________________
are spaced in a characteristic way → leads to a common folding pattern (beta
sheets forming spiral structures).
similar; mitosis; peroxisomes
Another application: finding families of proteins:
○ Thousands of proteins can be compared and clustered.
○ Color clusters indicate proteins with ____________________________
functions.
○ Sequence comparison can identify groups:
■ e.g., proteins involved in ____________________________
■ e.g., proteins involved in ____________________________ (like
peroxisomes)
proteomics; bioinformatics
This type of analysis falls under ____________________________ (study of all
proteins) and is a subset of ____________________________.
DNA microarrays
Another modern use of genomic information: ____________________________
____________________________ (“gene chips”).
sequence; messenger RNAs
If you know:
○ The genomic ____________________________
○ Which parts of the genome produce ____________________________
____________________________ (mRNAs)
printers; mRNA
You can:
○ Use high-resolution ____________________________ to print many short DNA
sequences onto a small chip.
○ Each spot on the chip corresponds to a specific
____________________________ (or mRNA).
messenger RNA; ssRNA; cell; mRNA
Experiment:
○ Take an unknown cell and isolate its ____________________________
____________________________.
○ Radioactively label the mRNA and wash it over the chip.
○ Single-stranded mRNA will base-pair with complementary
____________________________ or ____________________________ on the
chip.
○ Radioactive spots on the chip show which ____________________________
are being expressed in that cell.
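The base-pairing logic of the experiment can be sketched as follows. This is a toy model only: it assumes each spot holds a short single-stranded DNA probe, requires a perfect match, and uses made-up gene names and sequences.

```python
# DNA base on the chip -> RNA base that pairs with it
PAIRING = str.maketrans("ACGT", "UGCA")


def target_rna(probe_dna: str) -> str:
    """RNA sequence that base-pairs with a DNA probe (strands are antiparallel)."""
    return probe_dna.translate(PAIRING)[::-1]


def expressed_spots(probes: dict[str, str], mrnas: list[str]) -> set[str]:
    """Names of the spots to which at least one labeled mRNA would bind."""
    return {
        name
        for name, probe in probes.items()
        if any(target_rna(probe) in mrna for mrna in mrnas)
    }


probes = {"geneA": "TACCGA", "geneB": "GGGTTT"}  # hypothetical spots
mrnas = ["AAUCGGUACC"]                            # hypothetical cell mRNA
print(expressed_spots(probes, mrnas))
```

Only the spot whose probe is complementary to an mRNA in the sample lights up, which is how the chip reports which genes the cell is expressing.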
20; 2; Human Genome Project
Notes:
○ Originally extremely expensive (~$__________ thousand per chip).
○ Now much cheaper (~$_______ thousand).
○ Method is still being refined; has some sloppiness but is getting more precise.
○ It is an outgrowth of the ____________________________
____________________________ ____________________________.
small; eukaryotic; larger
Genome size:
○ Bacteria and Archaea typically have ____________________________
genomes.
○ The largest bacterial/archaeal genome is still smaller than the smallest
____________________________ genome.
○ Eukaryotic genomes are usually much ____________________________
overall.
1500; 7500; more
Gene number:
○ Most bacteria have between about _______ and _______ genes.
○ Eukaryotes usually start around the middle of that range and go up to many
____________________________ genes.
gene density; eukaryotes
Gene density:
○ Increase in gene number is not as large as increase in total genome size.
○ Therefore, eukaryotes have more DNA per gene → lower
____________________________ ____________________________.
○ Much more DNA between genes in ____________________________.
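Gene density is just gene count divided by genome size, so the contrast is easy to compute. The figures below are rough values assumed for illustration (about 4,400 genes in a 4.6 Mb E. coli genome; about 21,000 genes in the 3.2 billion base haploid human genome), not numbers from these notes.

```python
def genes_per_mb(gene_count: int, genome_size_mb: float) -> float:
    """Gene density: number of genes per million bases (Mb)."""
    return gene_count / genome_size_mb


# (approximate gene count, approximate genome size in Mb): assumed values
genomes = {
    "E. coli (bacterium)": (4_400, 4.6),
    "Human (eukaryote)": (21_000, 3_200.0),
}

for name, (genes, size_mb) in genomes.items():
    print(f"{name}: {genes_per_mb(genes, size_mb):.0f} genes per Mb")
```

With these inputs the human genome has only a few times more genes but hundreds of times more DNA, so its gene density is far lower.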
protein; RNA
Modern definition:
○ A gene = stretch of DNA that codes for a ____________________________ or a
specific ____________________________ (e.g., rRNA, tRNA).
proteins; RNA; 1.5
Human genome composition (key result of the HGP):
● Exons (expressed parts of genes):
○ Code for ____________________________ and specific
____________________________.
○ Make up only about _______ % of the genome.
5
Introns (noncoding parts within genes):
○ Spliced out of transcripts.
○ Make up about _______ % of the genome.
repressor; enhancers; 1.5
Regulatory sequences:
○ Include binding sites for ____________________________,
____________________________, and other control proteins.
○ Control the expression of the _______ % of coding DNA.
○ Occupy a large fraction of the genome.
not
Unique noncoding DNA:
○ Does ___ code for proteins or known specific RNAs.
○ Appears only once (no repeats).
58; thousands
Repetitive DNA:
○ More than _______ % of the genome.
○ Noncoding, but repeated many times.
○ Some repeats are long stretches (thousands of bases) repeated many times.
○ Others are short sequences (like 5–10 bases) repeated
____________________________ of times.
DNA (not-genes); eukaryotes; eukaryotes
Overall:
○ ~98% of the genome is ____________________________ (does not code for
specific proteins or RNAs).
○ This huge fraction of noncoding DNA is a feature especially of
____________________________.
○ In bacteria: very little noncoding DNA, few/no introns.
○ In Archaea: a bit more noncoding DNA, but far less than in
____________________________.
transposable elements; transposable elements
Repetitive DNA can be divided into:
○ Related to ____________________________
____________________________
○ Unrelated to ____________________________
____________________________
transposable; jump around; identical twins
Alu elements:
○ A specific type of repetitive DNA in humans.
○ Do not code for proteins.
○ Are ____________________________ elements → they can
“____________________________” in the genome.
○ Change rapidly enough that even ____________________________
____________________________ can sometimes be distinguished by their Alu
patterns (useful in forensic genetics).
44; sequences
Transposable-element–related DNA makes up about _______ % of the genome.
○ These sequences are actively changing over time.
○ They can insert almost anywhere, including occasionally in
____________________________.
Barbara McClintock
Transposable elements were discovered by ____________________________
____________________________.
corn; coloration
She studied ____________________________ and noticed:
○ Differences in ____________________________ patterns of kernels even on the
same ear.
copied; randomly; RNA; reverse transcriptase; DNA; genome
She hypothesized transposition events:
○ Direct transposition:
■ A part of the genome (a transposon) is
____________________________ and then inserted
____________________________ somewhere else in the genome.
■ If insertion affects a color gene, it changes kernel color in that region.
○ Retrotransposition:
■ A random part of the genome is transcribed into
____________________________ (not necessarily mRNA).
■ ____________________________ ____________________________
(enzyme present in our cells) converts that RNA back into
____________________________.
■ That new DNA copy is then inserted elsewhere in the
____________________________.
genome
In both cases:
○ A copy of some genomic region is inserted into a new
____________________________.
○ This is happening in your genome right now.
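The shared outcome of both mechanisms (a copy of one region lands somewhere new) can be mimicked with a string. This is a toy sketch with a made-up sequence and a hypothetical function name; it ignores the RNA intermediate and every biochemical detail.

```python
import random


def transpose_copy(genome: str, start: int, length: int, rng: random.Random) -> str:
    """Copy genome[start:start+length] and insert the copy at a random position."""
    element = genome[start:start + length]
    target = rng.randrange(len(genome) + 1)  # any position, including both ends
    return genome[:target] + element + genome[target:]


rng = random.Random(42)  # fixed seed so the run is repeatable
genome = "AAAACCCCGGGGTTTT"
new_genome = transpose_copy(genome, start=4, length=4, rng=rng)
print(len(genome), len(new_genome))
print(new_genome)
```

The original copy stays in place, so each event makes the genome slightly longer, and an insertion that lands inside a gene can disrupt it.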
insane; Nobel Prize
Initially, her ideas were considered ____________________________, and she had
trouble publishing.
○ Later, others confirmed her work in many organisms, including humans.
○ She eventually won the ____________________________
____________________________ for this discovery.
kitchen suddenly; apoptosis; gene duplication
The genome is therefore dynamic, not perfectly stable:
○ Large pieces of DNA can literally move, like a
“____________________________ ____________________________ jumping
into the living room.”
○ If damage is severe, the cell dies via ____________________________
(programmed cell death).
○ Transposition can also drive evolution through
____________________________ ____________________________.
gene duplication
One major outcome of transposition: ____________________________
____________________________.
ribosomes; hundreds; thousands; transposition; same
Ribosomal RNA (rRNA) genes:
○ rRNAs are long RNAs that form part of ____________________________.
○ Not encoded in just one place; there are ____________________________ to
____________________________ copies of rRNA genes.
○ Organisms need many ribosomes → many copies of rRNA genes.
○ This extra copy number is believed to result from
____________________________ events.
○ In rRNA, duplicated genes remain essentially ____________________________
(all ribosomes are the same).
globin
Another classic gene family: ____________________________ (hemoglobin genes).
alpha; beta
Adult hemoglobin:
○ Contains an ____________________________ subunit and a
____________________________ subunit.
gene duplication
There are many globin-related genes with slightly different sequences (different Greek
letters).
○ These variants arose by ____________________________
____________________________ and mutation.
42; 37
Sequence homology examples:
○ Alpha vs. beta: about _______ % homologous.
○ Alpha vs. epsilon or others: ~_______ % homologous.
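A percentage like these can be computed by counting matching positions between two aligned sequences. The sketch below assumes the sequences are already aligned and of equal length; the short peptides are invented, not real globin sequences.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two aligned sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical aligned peptide fragments (one letter per amino acid)
print(percent_identity("MVLSPADKTN", "MVHLTPEEKS"))
```

Real comparisons first align the sequences, allowing gaps, and only then count matches.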
oxygen; mother’s bloodstream
Functional specialization:
○ Fetal and embryonic hemoglobins bind ____________________________
more strongly than adult hemoglobin.
○ This helps a fetus obtain oxygen from the ____________________________
____________________________.
globin; duplication; mutations; alpha; beta
Evolutionary model:
○ Start with a single ____________________________ gene.
○ Gene ____________________________ creates two copies.
○ Each copy independently accumulates ____________________________.
○ They diverge into the ____________________________ and
____________________________ lineages.
○ Further duplications within each lineage produce multiple specialized globins with
different O2 affinities and roles.
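The duplicate-then-diverge model can be simulated in a few lines. This is a toy sketch under strong assumptions: a made-up ancestral sequence, a fixed number of random single-base substitutions per copy, and no selection.

```python
import random

BASES = "ACGT"


def mutate(seq: str, n_mutations: int, rng: random.Random) -> str:
    """Apply n random single-base substitutions to a sequence."""
    bases = list(seq)
    for _ in range(n_mutations):
        i = rng.randrange(len(bases))
        bases[i] = rng.choice([b for b in BASES if b != bases[i]])
    return "".join(bases)


def duplicate_and_diverge(ancestor: str, n_mutations: int, rng: random.Random) -> tuple[str, str]:
    """Duplicate a gene, then let each copy accumulate its own mutations."""
    return mutate(ancestor, n_mutations, rng), mutate(ancestor, n_mutations, rng)


rng = random.Random(0)  # fixed seed so the run is repeatable
ancestor = "ATGGTGCTGTCTCCTGCC"  # hypothetical ancestral gene
copy_a, copy_b = duplicate_and_diverge(ancestor, 4, rng)
print(ancestor)
print(copy_a)
print(copy_b)
```

Because the two copies mutate independently, they drift apart from each other as well as from the ancestor, which is the starting point for the separate alpha and beta lineages.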
tissue plasminogen activator; plasminogen; fibronectin; epidermal growth factor
Transposition can also move exons around → exon shuffling.
● New genes can be created by combining exons from different ancestral genes:
○ Example: ____________________________ (TPA) gene.
■ Its exons can be traced back to:
■ ____________________________ gene
■ ____________________________ gene
■ ____________________________
____________________________
____________________________ gene
transposition
DNA and protein sequence analysis shows that:
○ Exons from these different genes were brought together by
____________________________ events.
○ Resulted in a new gene encoding a protein with combined functional domains.
2; 12; 13; centromere; telomere
Beyond transposition and point mutations, whole chromosomes can be reorganized
over evolutionary time.
● Example: Human vs. Chimpanzee:
○ Human chromosome _______ appears to be a fusion of chimpanzee
chromosomes _______ and _______.
○ Evidence:
■ Presence of two ____________________________-like regions in
human chromosome 2.
■ Presence of internal ____________________________-like sequences.
16; human
Example: Human vs. Mouse:
○ Human chromosome _______ contains genes found on four different mouse
chromosomes.
○ Indicates that mouse chromosomes were rearranged and fused into a different
pattern in the ____________________________.
longer
These rearrangements:
○ Are not happening all the time in each cell (unlike transposition).
○ Occur on ____________________________ timescales as lineages diverge.
evolutionary trees
Comparative genomics data (sequences, rearrangements) are used to build
____________________________ ____________________________.
5; 65
Examples of divergence estimates:
○ Humans and chimpanzees diverged about _______ million years ago.
○ Humans and mice diverged about _______ million years ago.
globin; Alu
Different parts of the genome evolve at different speeds:
○ Some genes (e.g., ____________________________ genes) are highly
conserved.
○ Others (e.g., regions like ____________________________ elements) evolve
very fast.
slowly; rapidly; dates
Result:
○ Using ______ changing genes gives one estimate.
○ Using _____ changing regions gives slightly different estimates.
○ Overall branching pattern (e.g., chimp closer to human than mouse) does not
change, but the ____ can shift.
anterior; posterior; mouse
● Another major finding: some developmental genes are extremely conserved.
● These include cytoplasmic factors / homeotic genes:
○ Control body plan (which parts become ____________________________ vs.
____________________________).
○ Similar genes control A–P patterning in flies and in other
____________________________ animals.
evolutionary
These developmental genes are often more conserved than many structural genes (like
hemoglobins).
○ This high conservation affects how we build and interpret
____________________________ trees.
○ Raises questions about which genes to prioritize (highly conserved vs. rapidly
evolving) in analyses.
bacteria; archaea; eukaryotic
The legacy of the Human Genome Project has become the many genomes era:
○ Thousands of genomes sequenced, especially
____________________________ and ____________________________.
○ Hundreds to thousands of ____________________________ genomes have
also been sequenced.
evolutionary
These data sets:
○ Reveal genome organization, transposition, and chromosomal rearrangements.
○ Allow reconstruction of ____________________________ relationships and
divergence times among organisms.