Human Genome Project
an accurate sequence of the human genome
general steps behind genome sequencing
fragment the genome
clone the DNA fragments
sequence the DNA fragments
reconstruct the genome sequence from the fragments
restriction enzymes fragment the
genome at specific sites
each restriction enzyme recognizes a
specific sequence of bases anywhere within the genome
how long are recognition sites for restriction enzymes usually, and what property do they often have?
4-8 bp and are often palindromic (sequences of each strand identical when read 5’-3’)
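The palindrome property can be checked mechanically. The sketch below is only an illustration: the helper names `reverse_complement` and `is_palindromic` are made up for this example, and it assumes the site is written with the bases A, C, G, and T only. GAATTC is the EcoRI recognition site.

```python
# Illustrative helpers (hypothetical names), assuming plain A/C/G/T input.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5'-3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def is_palindromic(site: str) -> bool:
    """True if both strands read the same in the 5'-3' direction."""
    return site.upper() == reverse_complement(site)

print(is_palindromic("GAATTC"))   # EcoRI site
print(is_palindromic("GATTACA"))  # an arbitrary non-palindromic sequence
```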
restriction enzyme cuts
the sugar-phosphate backbones of both strands, generating restriction fragments.
Each enzyme cuts at the same place relative to its recognition sequence
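A minimal sketch of "same place relative to the recognition sequence", assuming a linear molecule and modeling only the top strand. The function name `digest` and the parameter `cut_offset` are hypothetical; the offset of 1 reflects EcoRI cutting between G and AATTC.

```python
def digest(seq: str, site: str, cut_offset: int) -> list[str]:
    """Cut the top strand cut_offset bases into every occurrence of site."""
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset          # same position within every site
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)   # look for the next site
    fragments.append(seq[start:])       # piece after the last cut
    return fragments

dna = "TTGAATTCAAGGGAATTCCC"            # made-up sequence with two EcoRI sites
print(digest(dna, "GAATTC", 1))
```

Joining the returned fragments in order gives back the input sequence, which is a quick sanity check on any digest.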
restriction enzymes were identified initially in
bacteria
blunt ends from restriction enzymes
cuts are straight through both DNA strands at the line of symmetry
sticky ends from restriction enzymes
cuts are displaced equally on either side of line of symmetry
ends have either 5’ overhangs or 3’ overhangs
different restriction enzymes produce fragments of
different lengths
general formula for the average fragment length is
4^n bp (n = number of bases in the recognition site), assuming the four bases are equally frequent and randomly distributed in the genome
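As a quick worked example of the 4^n estimate, the sketch below evaluates it for 4-, 6-, and 8-base sites. The function name is hypothetical, and the result holds only under the random-distribution assumption stated on the card.

```python
def expected_fragment_length(site_length: int) -> int:
    """Average spacing (bp) between sites of the given length: 4^n."""
    return 4 ** site_length

for n in (4, 6, 8):
    print(n, expected_fragment_length(n))  # 256, 4096, 65536
```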
in the discovery of restriction enzymes: the bacterium prevents phage replication (restriction) which happens because
the bacterial restriction endonucleases digest viral DNA into fragments, restricting the biological activity of the virus (hence the name restriction enzymes)
restriction and modification together are called
restriction-modification systems - a prokaryotic “immune system” against invaders
some experiments require random cutting of the DNA, meaning that
mechanical forces break phosphodiester bonds randomly and the resulting ends can be blunt or may have protruding single-stranded regions
molecular cloning
isolate a specific DNA fragment from all other fragments, making many identical copies
two basic steps of molecular cloning
insert DNA fragments into cloning vectors that ensure transport, replication, and purification of DNA inserts
insert recombinant DNA into living cells to be copied
three main features of a cloning vector (DNA fragments by themselves cannot replicate so they have to be inserted into a cloning vector)
an origin of replication
a selectable marker gene (antibiotic resistance)
a polylinker = region with several restriction enzyme sites
vector and DNA fragment (insert) digested with the same restriction enzyme generates
complementary sticky ends even if DNA comes from different organisms (bacterial and human)
DNA ligase
used to seal the phosphodiester backbones between vector and fragment
plasmids
double-stranded DNA
replicate in bacterial cells
independent of the chromosome
usually small inserts (<20 kb)
artificial chromosomes
larger
bacterial artificial chromosomes (BACs), inserts up to 300 kb
yeast artificial chromosomes (YACs), inserts up to 2 Mb
each genomic DNA fragment can form a
different recombinant molecule
transformation
the process by which a cell or organism takes up foreign DNA (chemical transformation or electroporation)
genomic library
long-lived collection of cellular clones, contains copies of every sequence in the whole genome inserted into a suitable vector
each colony on the agar plate contains a
different recombinant plasmid each with a part of the human genome (or the genome of interest)
a “perfect” genomic library has
one copy of every sequence in the entire genome
genomic equivalent
number of clones in a perfect library
= length of genome / average size of inserts
problems with cloning
never 100% efficient
some fragments can be cloned more than once
in reality we need 4-5 genomic equivalents
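The library-size arithmetic can be sketched as follows. The genome length (3 billion bp) and the average insert size (150 kb, within the BAC range given earlier) are round figures assumed for illustration, and `genomic_equivalent` is a hypothetical helper name.

```python
import math

def genomic_equivalent(genome_bp: int, insert_bp: int) -> int:
    """Clones in a perfect library = genome length / average insert size."""
    return math.ceil(genome_bp / insert_bp)

genome = 3_000_000_000   # assumed round figure for the genome length
insert = 150_000         # assumed average insert size

one = genomic_equivalent(genome, insert)
print(one)               # clones in one genomic equivalent
print(one * 4, one * 5)  # clones for 4-5 genomic equivalents
```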
sanger developed a method to sequence DNA based on knowledge of how
DNA replicates in cells
sanger sequencing generates a series of
single-stranded DNA fragments
the size of a genome and the number of genes do not reflect the
complexity of an organism
sequencing is challenging especially for large genomes
whole-genome shotgun sequencing (Celera)
create a genomic library of overlapping fragments (500-1000 bp) in plasmid vectors
sequence randomly chosen plasmids (“shotgun”)
assemble sequences into contigs - continuous base pair sequences
stop sequencing when the 24 chromosomes are covered
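A toy sketch of the assembly step, assuming error-free reads from one strand and a simple greedy rule (merge the pair with the longest suffix-prefix overlap first). Real assemblers are far more elaborate; the names `overlap` and `assemble`, the `min_len` threshold, and the reads are invented for this example.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Greedily merge overlapping reads until no overlaps remain."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while True:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            return contigs                # nothing left to merge
        merged = contigs[i] + contigs[j][k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (i, j)]
        contigs.append(merged)

reads = ["ATGCGTAC", "GTACGTTA", "GTTAGC"]  # made-up overlapping reads
print(assemble(reads))
```

A greedy rule like this can join the wrong reads when a repeated sequence is longer than a read.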
the whole-genome shotgun strategy for genome sequencing is a problem because of
transposable elements
thousands of copies and each repeat copy is longer than a sequence read
one cannot determine (human or computer) which unique sequences flanking a repeat belong together
paired-end sequencing
to get around the problems with the whole-genome shotgun strategy
a BAC clone library
2 sequence reads for each insert, one from each of 2 primers
some paired reads will include unique sequences on both sides of a repeat
annotation
the process of determining which sequences do which tasks
a key aspect of the Human Genome Project
a way to find protein-coding exons is by
looking for open reading-frames (ORFs) (a reading-frame uninterrupted by stop codons)
DNA can be read in six reading frames, three from each strand, because a codon is three bases long
if an open reading frame is >21 triplets, it likely encodes a
real protein
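The six-frame scan can be sketched as below. It follows the card's definition of an open reading frame (a stretch uninterrupted by stop codons) and does not require a start codon, which is a simplification. The function name `open_stretches` and the test sequence are made up; `min_codons=22` encodes the ">21 triplets" rule.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]

def open_stretches(seq: str, min_codons: int = 22):
    """Yield (strand, frame, start, codon_count) for stop-free runs."""
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):            # three frames per strand
            run_start, count = frame, 0
            for i in range(frame, len(s) - 2, 3):
                if s[i:i + 3] in STOPS:
                    if count >= min_codons:
                        yield (strand, frame, run_start, count)
                    run_start, count = i + 3, 0
                else:
                    count += 1
            if count >= min_codons:       # run that reaches the end
                yield (strand, frame, run_start, count)

seq = "ATG" + "GCT" * 25 + "TAA"          # made-up 81-bp sequence
for hit in open_stretches(seq):
    print(hit)
```

Start positions on the "-" strand are coordinates along the reverse complement, not the original sequence.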
Darwin proposed
descent with modification
the evolution of species from (extinct) ancestors
how do we trace changes in a genome?
finding conserved sequences
calculate the probability that another 50 bp DNA is 100% identical by chance
probability is obtained by raising 0.25 to the 50th power
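Evaluating that power shows how small the number is. The sketch assumes each position matches independently with probability 0.25; the function name is hypothetical.

```python
def chance_identity(length: int, p_match: float = 0.25) -> float:
    """Probability that `length` independent positions all match by chance."""
    return p_match ** length

print(chance_identity(50))  # roughly 8e-31
```

A probability this small is why a perfect 50 bp match is read as evidence of conservation rather than coincidence.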
a DNA sequence is a homolog of another DNA sequence if the
two derived from a common ancestor
when homology is present across species
the sequence is conserved
RNA is difficult to _________ and some RNAs are
sequence, low-abundance
cDNA (complementary DNA)
we can copy RNA into DNA then sequence that DNA
converting RNA transcripts to cDNA process:
isolate RNA from the cell
mRNAs sorted from rest of RNA based on poly A tails at 3’ end of mRNA
in vitro synthesis (DNA synthesis is primed, polymerization of first cDNA strand from 3’ end of the mRNA)
mRNA is digested with RNase (3’ end of cDNA folds back and acts as a primer for 2nd strand synthesis)
in the presence of dNTPs and DNA polymerase, the first cDNA strand acts as a template to synthesize the second cDNA strand
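The strand relationships in this process can be sketched with simple base pairing. The mRNA below is a made-up example with a short poly-A tail, and the function names are hypothetical; the sketch ignores the fold-back hairpin and models only which sequence each strand ends up with.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}

def first_strand(mrna: str) -> str:
    """First cDNA strand: complementary to the mRNA, written 5'-3'."""
    return "".join(RNA_TO_DNA[b] for b in reversed(mrna.upper()))

def second_strand(cdna: str) -> str:
    """Second cDNA strand: complementary to the first, written 5'-3'."""
    return "".join(DNA_TO_DNA[b] for b in reversed(cdna))

mrna = "AUGGCCUAAAAAA"            # made-up mRNA ending in a poly-A tail
cdna1 = first_strand(mrna)
cdna2 = second_strand(cdna1)
print(cdna1)                      # begins with the T run paired to the poly-A tail
print(cdna2)                      # same sequence as the mRNA, with T in place of U
```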
cDNA library includes only ____ and tells us?
includes only exons
tells us what was transcribed in that cell
alternative splicing means a single gene can ____________________ which?
a single gene can produce different proteins
complicates the prediction of the proteome
important to sequence many individual cDNA clones from libraries made using mRNAs from different tissues
the part of the genome corresponding to exons is the
exome (1.5-2% of the genome)
the increase in genome size correlates with the increase of
noncoding and repetitive sequences in more complex multicellular organisms
an increase in the number of epigenetic mechanisms
two types of particular DNA sequences found many times in genome
multicopy tandem repeats (CAG repeats)
transposable elements (sequences that can move around the genome)
most repetitive DNA does not have
known functions
repetitive DNA has important functions in several contexts:
telomeres - prevent the shortening of chromosomal ends
centromeres - bind proteins that help chromosomal sorting in mitosis/meiosis
gene deserts
regions of the genome that contain few or no genes
exons often encode protein domains which are
sequence of amino acids that fold into functional units
exon shuffling
generates new genes and new proteins
mediated by transposons
during meiotic crossover
gene families
groups of genes closely related in sequence and function
abundant throughout genomes
gene families evolved by
duplication and divergence
duplicated DNA sequence products start out identical but eventually
diverge via accumulation of mutations
orthologous genes
arose from the same gene in the common ancestor
usually retain the same function
paralogous genes
arise by duplication
often refers to members of a gene family
pseudogenes
sequences that look like, but do not function as, genes
accumulate mutations faster than coding or regulatory sequences
eventually it may not be possible to recognize the gene they derive from
de novo genes
genes without homologs
many genes have no homologs/only have homologs in closely related species
the more we go back in time, the more
chromosomal rearrangements
the average size of syntenic blocks gets smaller with increasing
evolutionary distance
combinatorial amplification
combining a set of basic elements in many different ways
combinatorial amplification at the DNA level results in
greater complexity from fewer genes
combinatorial strategies at the RNA level may lead to
gene amplification and diversity
problems with the human reference genome
the reference genome is not a “healthy” genome and is also not the most common genome
individuals who provided DNA may have been carriers of disease alleles
the 1,000 genomes project reveals that there is a
high degree of copy number variation and structural variation among healthy individuals
what did sequencing 8 isolates of group B Streptococcus reveal?
a “core genome” - ~80% of each genome
a “variable genome” - not present in all strains
the order of hemoglobin genes on the chromosomes reflects the
timing of expression
chromosomal organization
genes placed in the order of their expression during development
genes oriented in same direction on the chromosome
locus control region (LCR)
controls the sequential gene expression by interacting with enhancers and transcription factors
hereditary persistence of fetal hemoglobin
rare condition
should be lethal
deletion of delta and beta genes
gamma genes continue to be expressed during adulthood
results in near normal level of health
hemolytic anemias
some mutations change amino acids in alpha or beta-globin and change their 3D structure
causes destruction of RBCs
examples: sickle cell anemia, other anemias
thalassemias
mutations reduce/eliminate production of one of the two globin polypeptides
associated with alpha-globin deletions
a range of phenotypes
alpha-globin
required during fetal and adult development
alpha-thalassemias detrimental in utero
beta-globin
required after birth
beta-thalassemias detrimental after birth
transgenic organism
has a gene from another individual in the same species/a different species
transgene
the gene introduced into a transgenic organism
several strategies to introduce the DNA
inject the DNA into the cell
a viral particle carries the DNA into the cell
temporarily disrupt the plasma membrane
transgenes can be integrated into
the chromosome, the plasmid
for a transgene to be transmitted across generations, it has to be integrated into
gametes
a fertilized egg contains 2
pronuclei (maternal pronucleus and paternal pronucleus)
pronuclear injection is used to create a (and what is the process)
transgenic mouse (fertilized eggs harvested from the female mouse, linear DNA (transgene) is injected into one of the pronuclei, fertilized egg inserted into the reproductive tract of pseudo-pregnant mice)
25-50% of the time, integration is into a
random genomic location
if insertion of transgene happened before the first mitotic division,
every cell has the insert
if insertion of the transgene happened after the first cell division, the transgene will be in
some cells and not others (mosaic)
mice are bred to produce stable
transgenic lines
the P element (and how are transgenic flies made using P element transformation)
a type of transposable element in Drosophila
used as a vector for gene transfer
contains a transgene and a visible marker between inverted repeats
a helper plasmid contains the transposase gene
the two plasmids are injected into embryos
transgenic flies are recognized because they have red eyes
the resulting adult flies are mosaic
the visible marker helps identify transgenic flies in subsequent crosses
posttranslational modifications are absent/present in
absent in bacteria
present in mammalian cells
Gaucher disease
autosomal recessive
glucocerebrosidase deficiency in lysosomes
glucosylceramide accumulates in cells
the first plant-made drug approved by the FDA
glucocerebrosidase
pharming
farming and pharmaceutical
the use of transgenic animals/plants to produce protein drugs
blood coagulation
thrombin cleaves fibrinogen
fibrinogen helps form a blood clot
antithrombin III
prevents inadvertent activation of thrombin
ATryn
blood factor antithrombin III
produced in goats, used to treat blood clots
antithrombin III deficiency
congenital disease
autosomal dominant but can also be recessive
predisposition to blood clots
reproductive cloning
the genome of a somatic cell becomes the genome of every somatic cell in a different individual
somatic cell nuclear transfer (SCNT)
used to create reproductive clones
a diploid nucleus of a somatic cell from one individual inserted into an egg cell whose nucleus has been removed
EPSPS (5-enolpyruvylshikimate-3-phosphate synthase) is needed by what, to make what?
plants need it to make aromatic amino acids