Studies of the genome must include what?
The transcriptome and proteome
How the transcriptome and proteome establish and coordinate the network of linked biochemical pathways in a living cell
What is the most direct way to study the transcriptome? What are the cons to this?
convert all the mRNA to cDNA, then sequence every clone in the library
The cons are that it is laborious and time-consuming
How do you generally characterize the transcriptome?
identify all the mRNAs present and quantify their abundances
What are two tests that are used to study the transcriptome?
Serial analysis of gene expression (SAGE)
Microarray or DNA chip analysis
How does SAGE address the issue of the labor intensity of characterizing the transcriptome?
Instead of sequencing whole cDNA strands, SAGE uses short 12 bp sequence tags from each cDNA to identify the genes being expressed at that moment
What are the steps for SAGE?
mRNA is immobilized in a chromatography column by annealing the poly(A) tails to oligo (dT) strands that are attached to cellulose beads
mRNA is converted to cDNA and cut with a restriction enzyme (AluI) that cuts frequently in each cDNA
the terminal restriction fragment remains attached to the bead while the other fragments are eluted
An oligo with a BsmFI recognition sequence is ligated to the free end of each cDNA
BsmFI will cut 10-14 bp downstream of the recognition sequence
The fragments cut off the bead are then collected, ligated, and sequenced. The tagged sequences are compared with the sequences of genes in the genome
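The last step above (counting tags and comparing them with known genes) can be sketched in Python. This is a minimal illustration, not a real SAGE pipeline: it assumes the tags are simply joined end to end with a fixed length of 12 bp, and the tag sequences and gene names in `TAG_TO_GENE` are made up.

```python
from collections import Counter

def count_sage_tags(concatemer, tag_length=12):
    """Split a concatenated tag sequence into fixed-length tags and count each tag."""
    tags = [concatemer[i:i + tag_length]
            for i in range(0, len(concatemer) - tag_length + 1, tag_length)]
    return Counter(tags)

# Hypothetical tag-to-gene lookup table (invented sequences and gene names).
TAG_TO_GENE = {
    "ACGTACGTACGT": "geneA",
    "TTGACCATGGCA": "geneB",
}

def gene_abundance(concatemer, tag_to_gene=TAG_TO_GENE, tag_length=12):
    """Convert tag counts into per-gene counts; unmatched tags are grouped as 'unknown'."""
    counts = Counter()
    for tag, n in count_sage_tags(concatemer, tag_length).items():
        counts[tag_to_gene.get(tag, "unknown")] += n
    return counts

example = "ACGTACGTACGT" * 2 + "TTGACCATGGCA"
abundance = gene_abundance(example)  # geneA seen twice, geneB once
```

The count for each gene is the estimate of how abundant its mRNA was in the sample.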
What is the difference between a microarray and a DNA chip analysis?
A DNA chip carries an array of immobilized oligos on a wafer of silicon
A microarray is made up of DNA molecules (PCR products) on a glass slide or nylon membrane
What are the two objectives of studying the transcriptome and how does a microarray accomplish them?
Identify the genes whose mRNAs are present ← all relevant genes must be represented by at least one probe on the array
Determine the relative amounts of the various mRNAs ← each position in the microarray contains up to 10^9 copies of the probe molecule
Why are both the 5’ and 3’ ends of an mRNA represented on a microarray?
It is to look at the 5’/3’ ratio. If the ratio is significantly different from one, then there is likely something wrong with the RNA (or the gene is truncated)
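A small Python sketch of the 5’/3’ ratio check. The `tolerance` cutoff is an assumption chosen for illustration; the card only says the ratio should not be "significantly different from one".

```python
def five_to_three_ratio(signal_5p, signal_3p):
    """Ratio of the 5' probe signal to the 3' probe signal for one mRNA."""
    if signal_3p <= 0:
        raise ValueError("3' signal must be positive")
    return signal_5p / signal_3p

def rna_looks_intact(signal_5p, signal_3p, tolerance=0.5):
    """True when the ratio lies within 1 +/- tolerance (tolerance is a hypothetical cutoff)."""
    return abs(five_to_three_ratio(signal_5p, signal_3p) - 1.0) <= tolerance
```

For example, signals of 950 (5’) and 1000 (3’) give a ratio of 0.95 and pass the check, while 200 and 1000 give 0.2 and fail it.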
What are the steps of a microarray?
1. RNA Isolation
2. RNA quantitation
3. cDNA synthesis
4. Labelling (kind of included in the cDNA synthesis)
5. Hybridization
6. Washing unhybridized material
7. Scanning (where we obtain the data)
8. Data normalization
What are the two limitations of microarrays and chip analysis?
Insufficient specificity between some of the mRNA can lead to cross-hybridization (this can happen with paralogous genes)
Two or more mRNA may be derived from the same gene via alternative splicing
What happens when saturation happens in a microarray and how do you prevent it?
When saturation happens, the signal stops increasing with the amount of mRNA, so the expression of abundant genes is underestimated. This can be prevented by ensuring there are enough probe molecules at each position on the chip
What is used as a positive control for microarrays?
the actin gene: its expression is expected to stay constant under changing experimental conditions
When comparing transcriptomes, what are some biases that could look like differences in levels of mRNA but are not? Then how do you prevent these biases?
The amount of target DNA
the efficiency with which the probe has been labelled
the effectiveness of hybridization
These can be avoided with
Using controls like the actin gene
scanning arrays at the same wavelength
Data normalization: reduce background signal
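The background correction and actin control described above can be sketched as follows. This is one simple way to normalize, shown under assumptions: a single background value per array, and signal values that are invented for the example.

```python
def normalize_to_control(signals, background, control="actin"):
    """Subtract background, then express every gene relative to the control gene."""
    corrected = {gene: max(value - background, 0.0) for gene, value in signals.items()}
    control_signal = corrected[control]
    if control_signal == 0:
        raise ValueError("control gene has no signal above background")
    return {gene: value / control_signal for gene, value in corrected.items()}

# Two hypothetical arrays: the second was labelled less efficiently,
# so all of its raw signals are lower.
array_1 = {"actin": 1100, "geneX": 2100}
array_2 = {"actin": 600, "geneX": 1100}

norm_1 = normalize_to_control(array_1, background=100)
norm_2 = normalize_to_control(array_2, background=100)
```

After normalization, geneX is 2.0 on both arrays, so the difference in raw signal reflects labelling or hybridization efficiency and not a real change in mRNA level.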
What is a dendrogram?
Any sort of tree diagram
What study system is ideal for researching the transcriptome?
Yeast is an ideal study system, specifically looking at genes for sporulation
What has been studied in the human transcriptome?
Transcriptomes for different cell types have been mapped onto the genome
This has provided a global view of patterns of gene expression for whole chromosomes
It led to the discovery of transcripts from regions of the genome where no genes are known to exist
What was the benefit of the transcriptome analysis of human chromosomes 21 and 22?
10,000 transcribed sequences have been identified from regions not previously thought to contain genes, signifying the role of transcriptome analysis in genome annotation
What are the advantages to studying the proteome?
Proteome plays a central role as the link between genome and biochemical pathways of the cell
Proteome shows how the genome operates and how its dysfunction can lead to diseases
Some proteins undergo physical and/or chemical modifications before becoming functional
What is the study of the proteome called?
proteomics
What are the two methods of identifying proteins in the proteome? (protein profiling)
protein electrophoresis
mass spectrometry (MALDI-TOF)
What kind of gel is used for identifying proteins in the proteome?
Polyacrylamide gel
What chemical is used to separate proteins by their molecular mass and how?
Sodium dodecyl sulfate (SDS): this denatures the proteins and confers a negative charge roughly proportional to the length of the unfolded polypeptide
What is isoelectric focusing?
This separates proteins in a pH gradient: each protein's charge depends on pH, so it migrates until it reaches its isoelectric point
What is the isoelectric point?
The position of a protein in a gradient where its net charge is zero
How does two-dimensional gel electrophoresis work?
The protein sample is loaded to the top left corner of the gel
First electrophoresis is done to separate proteins by isoelectric focusing
The gel is rotated 90 degrees so the line of proteins is at the top, and sodium dodecyl sulfate (SDS) is added
A second electrophoresis is run so proteins are then separated by molecular mass
Protein spots are visualized by staining with a silver solution
What are the steps for matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF)?
The protein of interest is purified and digested with a sequence-specific protease like trypsin
Following ionization, the mass-to-charge ratio of a peptide is deduced from its time-of-flight within the mass spectrometer as it passes from the ionization source to the detector
The mass-to-charge ratio can be used to calculate molecular mass, allowing the amino acid composition of the peptide to be deduced
The resulting compositional information is related to the genome sequence in order to identify the gene that specifies that protein
How does trypsin work?
Trypsin cuts following arginine or lysine residues to produce peptides 5-75 amino acids in length
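The cutting rule can be shown with a short in-silico digest in Python. This is a simplified sketch: it cuts after every lysine (K) or arginine (R) exactly as stated above and ignores real-world details such as missed cleavages. The protein sequence is made up.

```python
def trypsin_digest(protein):
    """Cut a protein (one-letter amino acid codes) after every K or R."""
    peptides = []
    current = ""
    for residue in protein:
        current += residue
        if residue in ("K", "R"):
            peptides.append(current)
            current = ""
    if current:  # C-terminal peptide that does not end in K or R
        peptides.append(current)
    return peptides

peptides = trypsin_digest("MAGKTLLRSEQK")  # ['MAGK', 'TLLR', 'SEQK']
```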
How does mass spectrometry work for MALDI-TOF?
Peptides are ionized by a pulse of energy from a laser and then accelerated down the column to the reflector and onto a detector
Time-of-flight is related to its mass-to-charge ratio
The computer contains a database of the predicted molecular masses of every trypsin fragment of every protein encoded by the genome of the organism under study
Computer compares mass of detected peptide with database to identify the most likely source protein
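The database comparison can be sketched as a simple matching score. This is an illustration under assumptions, not how any particular search program works: the protein names, the predicted masses, and the `tolerance` value are all hypothetical.

```python
def count_matches(observed, predicted, tolerance=0.5):
    """Number of observed peptide masses that match any predicted mass within the tolerance."""
    return sum(any(abs(o - p) <= tolerance for p in predicted) for o in observed)

def best_protein(observed, database, tolerance=0.5):
    """Return the protein whose predicted tryptic masses match the most observed masses."""
    scores = {name: count_matches(observed, masses, tolerance)
              for name, masses in database.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical database of predicted tryptic peptide masses (Da).
database = {
    "proteinA": [500.3, 842.5, 1045.6],
    "proteinB": [500.3, 910.4, 1300.7],
}
observed = [842.4, 1045.7, 500.2]
best, scores = best_protein(observed, database)
```

Here all three observed masses match proteinA and only one matches proteinB, so proteinA is reported as the most likely source protein.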
What is the point of isotope coded affinity tag (ICAT) in proteomics?
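It is used to compare two proteomes: each proteome is labelled with a different isotopic version of the tag (for example, normal hydrogen versus deuterium) before being put through MALDI-TOF, so the relative amounts of each protein can be measured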
What is the chemical label commonly used in ICAT?
deuterium
What are the two common methods for identifying proteins that interact with one another?
Phage display
Yeast-two hybrid studies
What is phage display?
Phage display uses bacteriophage lambda or M13
The gene is cloned in so that it fuses with one of the phage coat proteins (via a unique restriction enzyme site located within a gene for a coat protein)
Disadvantages to phage-display?
Time-consuming
Requires prior info on possible interactions
What is a phage display library? and how does it work for testing protein interactions?
A collection of clones displaying a range of proteins
The test protein is immobilized within the well of a microtiter plate
Phage display library is added to plate
plate is washed and all proteins that interact with test protein will remain in the plate
How does yeast two-hybrid screening (Y2H) work?
A gene of a human protein is ligated to the gene for a DNA-binding domain of a yeast activator, creating a construct that is part human and part yeast activator
A variety of fusion proteins for the actual yeast activation domain are created using different human DNA fragments
These two constructs are mixed into the yeast for screening
If there is protein-protein interaction, then RNA polymerase will be activated to express the reporter gene
How do you identify the components of a multiprotein complex?
With affinity chromatography
How does standard affinity chromatography work?
Test protein is attached to the chromatography resin and placed in a column
Cell extract is placed in a low-salt buffer, which causes formation of hydrogen bonds that hold proteins together in a complex
Proteins that interact with the test protein are retained in the column
Interacting proteins are then eluted with a high-salt buffer
What are two disadvantages to affinity chromatography? (one minor, one major)
Minor: the need to purify the test protein is time-consuming, making the method difficult to use in large screening programs
Major: a single member of the protein complex is used as bait, so a member of the complex may not be isolated if it does not interact with the bait
How does Tandem-affinity Purification (TAP) work?
Similar to affinity chromatography
The test protein is modified so that a C-terminal extension binds to calmodulin
The resin in the column contains the modified test proteins
Modified test proteins trap protein complex
How are proteins identified in affinity chromatography and TAP?
Through mass spectrometry
What is a metabolome?
The complete collection of metabolites present in a cell or tissue under a particular set of conditions
What is metabolomics? (aka biochemical profiling)
Gives a precise description of the biochemistry underlying different physiological states (Ex. disease states)
What is a phylogeny?
A classification scheme indicating similarities among species and their evolutionary relationship
How are evolutionary relationships among a group of organisms illustrated?
With a phylogenetic tree
What are the main two portions of a phylogenetic tree?
nodes and branches
What are internal versus external nodes?
Internal nodes represent ancestral taxonomic units such as species or genes
External nodes (at the ends of branches) represent the living organisms being compared
What are two examples of what branch length can represent in a phylogenetic tree?
It can represent elapsed time
It can represent the number of molecular changes (aka mutations)
What are operational taxonomic units (OTUs)?
The extant (aka not extinct) taxonomic units under comparison
What are hypothetical taxonomic units (HTUs)?
Inferred ancestral units (the internal nodes of the tree), called hypothetical because there is no empirical data for them
What is the difference between a rooted and unrooted tree?
Rooted trees have a root (aka most recent common ancestor) of all taxonomic units under study and each path corresponds to evolutionary time
An unrooted tree simply specifies the degree of kinship among taxonomic units
What is the difference between a scaled and unscaled tree?
A scaled tree draws branch lengths proportional to the degree of relation, while an unscaled tree does not
What is the topology of the phylogenetic tree?
This refers to the branching pattern of a phylogenetic tree
Does a DNA or protein tree yield more phylogenetic information and why?
DNA yields more phylogenetic information than proteins
Because the genetic code is redundant, different DNA sequences can code for the same amino acids. For example, three organisms may have the same amino acid sequence (which would show them as equally related on a protein tree), yet two may share the same DNA sequence while the third has a different one, showing that those two organisms are more closely related to each other than to the third
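A tiny Python example of the point above. The three sequences are invented, and `CODON_TABLE` holds only the four codons needed here (GCT and GCC for alanine, GAA and GAG for glutamate), not the full genetic code.

```python
# Subset of the standard genetic code, just enough for this example.
CODON_TABLE = {"GCT": "A", "GCC": "A", "GAA": "E", "GAG": "E"}

def translate(dna):
    """Translate a DNA sequence codon by codon into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def count_differences(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

# Hypothetical sequences from three organisms.
seqs = {"org1": "GCTGAA", "org2": "GCTGAA", "org3": "GCCGAG"}

proteins = {name: translate(dna) for name, dna in seqs.items()}
dna_diff_1_2 = count_differences(seqs["org1"], seqs["org2"])
dna_diff_1_3 = count_differences(seqs["org1"], seqs["org3"])
```

All three proteins are identical ("AE"), so a protein tree cannot separate the organisms. At the DNA level org1 and org2 differ at 0 positions while org1 and org3 differ at 2, which is the extra information the DNA tree uses.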
What is an outgroup?
An OTU for which we have external knowledge that clearly shows it diverged from the common ancestor prior to all the other OTUs (the ingroup taxa)
What are the differences between monophyletic, clade, paraphyletic, and polyphyletic? Can you show it in a diagram?
Monophyletic - two or more organisms or DNA sequences that are derived from a single ancestor
Clade - group of monophyletic organisms
Paraphyletic - a group of sequences or taxa that excludes some members of a clade
Polyphyletic - a group of DNA sequences derived from two or more distinct ancestral sequences
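In place of a diagram, the idea can be checked on a small tree in Python. This sketch assumes a rooted tree written as nested tuples, and it treats a group as monophyletic only when it contains all descendants of one ancestor (a whole clade). The tree and OTU names are made up.

```python
def leaves(tree):
    """Set of OTU names at the tips of a (sub)tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    result = frozenset()
    for child in tree:
        result |= leaves(child)
    return result

def clades(tree):
    """Leaf set of every subtree, i.e. every clade in the tree."""
    found = {leaves(tree)}
    if not isinstance(tree, str):
        for child in tree:
            found |= clades(child)
    return found

def is_monophyletic(tree, group):
    """True when the group is exactly the set of tips under one ancestor."""
    return frozenset(group) in clades(tree)

# Hypothetical rooted tree: ((A,B),(C,(D,E)))
tree = (("A", "B"), ("C", ("D", "E")))
```

With this tree, {D, E} and {C, D, E} are monophyletic. {C, D} is not, because it leaves out E from the clade (paraphyletic), and {A, C} is not, because its members come from separate lineages (polyphyletic).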
What is a gene tree?
A gene tree is constructed by comparing the sequences of orthologous genes and is used to make inferences about the evolutionary history of the species from which the genes were obtained
Is a gene tree the same as a species tree?
While a gene tree is a more accurate reflection of a species tree than a tree based on morphological data, it does not mean that a gene tree is always the same as a species tree
What is a molecular clock?
It is used to date the time at which a gene divergence took place
Because mutations do not occur simultaneously with speciation events, the molecular clock cannot give an accurate time of speciation
What is random genetic drift?
The change in allele frequency in a population caused by chance (random sampling from one generation to the next) rather than by selection
What are the two types of molecular phylogenetic data?
Character data and distance data
What is character data?
A well-defined feature that in a taxonomic unit can assume one out of two or more mutually exclusive character states
What is a character state?
The value of the character in a particular OTU
What is the difference between a quantitative and a qualitative character?
quantitative characters show a normal distribution and are continuous
qualitative characters are discrete, meaning they are either one thing or another and nothing in between
HYPOTHETICAL: We are looking at the length of middle fingers between male and female gorillas. We find that female fingers are 8 inches while male fingers are 9 inches. Can you tell me what is the character, character states, and OTU?
Character: finger length
Character state: 8 and 9 inches
OTU: gorillas
What is distance data?
A quantitative statement concerning the dissimilarity between two OTUs
What is the degree of divergence?
It is the proportion of differences between two organisms or DNA sequences. It is calculated with the number of nucleotide differences (n) divided by the sequence length (N), multiplied by 100
Looking at a sequence from Organism 1 and Organism 2, can you calculate the degree of divergence between them? What can you conclude from the degree of divergence?
n= 3 N= 19: (3/19)*100 = 15.79%
The degree of divergence is pretty low, indicating that they are closely related
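The same calculation in Python. The two 19-nucleotide sequences below are made up for illustration (the sequences from the original card are not reproduced here); they differ at 3 positions, which gives the same n = 3 and N = 19.

```python
def degree_of_divergence(seq1, seq2):
    """Percentage of positions that differ between two aligned, equal-length sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    n = sum(a != b for a, b in zip(seq1, seq2))  # number of nucleotide differences
    return n / len(seq1) * 100

# Hypothetical aligned sequences, 19 nucleotides each, 3 differences.
organism_1 = "ATGCCGTAAGCTTAGCCTA"
organism_2 = "ATTCCGTAACCTTAGCGTA"

divergence = round(degree_of_divergence(organism_1, organism_2), 2)  # 15.79
```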
What are the two major types of phylogenetic trees?
A cladogram and a phenogram
What is the difference between cladistics and phenetics?
Cladistics: a tree that expresses ancestor-descendent relationship (rooted)
Phenetics: a tree that establishes the relationship among a group of organisms on the basis of the degree of similarity (unrooted)
What are four methods of tree reconstruction?
maximum parsimony
UPGMA
transformed distance method
neighbor-joining method
Unweighted pair-group method with arithmetic means (UPGMA)
UPGMA employs a sequential clustering algorithm, in which local topological relationships are identified in order of decreasing similarity, and the phylogenetic tree is built in a step-wise manner
In simple terms: identify the two OTUs that are the most similar and join them into a pair (aka a composite OTU). Then find the next most similar OTU, join it to the composite, and repeat
How do you calculate the distance matrix between the composite OTU of A and C (d(AC)) and D? AKA how do you calculate d(AC)D? BONUS How do you calculate d(ACD)B?
d(AC)D= (dAD+dCD)/2
BONUS
d(ACD)B=(dAB+dCB+dDB)/3
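The two formulas above are both "average all the pairwise distances between the two groups", which can be sketched in Python. The distance values are made up, and `upgma_merge_order` is a bare-bones illustration of the clustering order only (it does not compute branch lengths).

```python
from itertools import combinations, product

def cluster_distance(cluster_x, cluster_y, d):
    """Arithmetic mean of all pairwise OTU distances between two clusters."""
    pairs = list(product(cluster_x, cluster_y))
    return sum(d[frozenset(p)] for p in pairs) / len(pairs)

def upgma_merge_order(otus, d):
    """Repeatedly join the two closest clusters; return the joins in order."""
    clusters = [(otu,) for otu in otus]
    merges = []
    while len(clusters) > 1:
        x, y = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(pair[0], pair[1], d))
        clusters.remove(x)
        clusters.remove(y)
        clusters.append(x + y)  # the new composite OTU
        merges.append((x, y))
    return merges

# Hypothetical pairwise distances between four OTUs.
raw = {("A", "B"): 8, ("A", "C"): 2, ("A", "D"): 6,
       ("B", "C"): 10, ("B", "D"): 9, ("C", "D"): 4}
d = {frozenset(pair): value for pair, value in raw.items()}

d_AC_D = cluster_distance(("A", "C"), ("D",), d)       # (6 + 4) / 2 = 5.0
d_ACD_B = cluster_distance(("A", "C", "D"), ("B",), d)  # (8 + 10 + 9) / 3 = 9.0
merges = upgma_merge_order(["A", "B", "C", "D"], d)
```

With these numbers A and C are joined first, then D joins (AC), and B joins last.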
What is an assumption you are making when calculating distance matrix?
That multiple substitutions did not occur
Steps of the neighbor-joining method
Start with a star-pattern tree where all OTUs stem from the same ancestor
Pick a pair at random
Calculate branch length for this new pair
Repeat with all possible pair combinations
Final tree will have the shortest total branch length
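In practice, the "try every pair and keep the one giving the shortest total branch length" step is usually computed with a formula called the Q criterion, which is not on the cards above: Q(i, j) = (n - 2) d(i, j) - (sum of distances from i) - (sum of distances from j), and the pair with the lowest Q is joined. The sketch below shows only that pair-selection step, with made-up distances.

```python
from itertools import combinations

def nj_pick_pair(otus, d):
    """Return the pair of OTUs with the lowest Q value (the pair to join first)."""
    n = len(otus)
    # Sum of distances from each OTU to every other OTU.
    total = {i: sum(d[frozenset((i, k))] for k in otus if k != i) for i in otus}

    def q(pair):
        i, j = pair
        return (n - 2) * d[frozenset((i, j))] - total[i] - total[j]

    return min(combinations(otus, 2), key=q)

# Hypothetical distance matrix for five OTUs.
raw = {("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
       ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
       ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3}
d = {frozenset(pair): value for pair, value in raw.items()}

first_pair = nj_pick_pair(["a", "b", "c", "d", "e"], d)  # ('a', 'b')
```

Note that d and e have the smallest raw distance (3), but a and b are joined first because the Q criterion also accounts for how far each OTU is from everything else.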
What is the maximum parsimony method?
The identification of tree topology that requires the smallest number of evolutionary changes to explain the observed differences among the OTUs under study
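Counting the changes a given topology requires can be sketched with the Fitch algorithm, a standard counting method that is not named on the cards. The sketch assumes rooted binary trees written as nested tuples and counts every change equally; the alignment is made up.

```python
def fitch(tree, states):
    """Return (possible ancestral states, minimum number of changes) for one site."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    common = left_set & right_set
    if common:  # children can agree: no change needed at this node
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1  # one change needed

def parsimony_score(tree, alignment):
    """Total number of changes the tree needs, summed over all alignment sites."""
    length = len(next(iter(alignment.values())))
    return sum(fitch(tree, {otu: seq[i] for otu, seq in alignment.items()})[1]
               for i in range(length))

# Hypothetical two-site alignment for four OTUs.
alignment = {"A": "AT", "B": "AT", "C": "GT", "D": "GT"}

candidates = {
    "((A,B),(C,D))": (("A", "B"), ("C", "D")),
    "((A,C),(B,D))": (("A", "C"), ("B", "D")),
    "((A,D),(B,C))": (("A", "D"), ("B", "C")),
}
scores = {name: parsimony_score(tree, alignment) for name, tree in candidates.items()}
most_parsimonious = min(scores, key=scores.get)
```

The topology ((A,B),(C,D)) needs 1 change while the other two need 2 each, so it is the maximum parsimony tree for this data.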
What is the bootstrap method?
A computational technique used as a means to estimate the confidence level of phylogenetic hypotheses using a null hypothesis and a series of pseudosamples
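The pseudosamples are typically made by resampling the columns of the alignment with replacement. A minimal Python sketch of that resampling step is below; the alignment is made up, and the `seed` parameter is only there to make the example repeatable.

```python
import random

def bootstrap_alignments(alignment, n_replicates, seed=0):
    """Build pseudosamples by drawing alignment columns at random, with replacement."""
    rng = random.Random(seed)
    length = len(next(iter(alignment.values())))
    replicates = []
    for _ in range(n_replicates):
        # Pick 'length' column indices; the same column may be drawn more than once.
        columns = [rng.randrange(length) for _ in range(length)]
        replicates.append({otu: "".join(seq[c] for c in columns)
                           for otu, seq in alignment.items()})
    return replicates

# Hypothetical alignment of three OTUs.
alignment = {"A": "ACGTAC", "B": "ACGTTC", "C": "AGGTTC"}
pseudosamples = bootstrap_alignments(alignment, n_replicates=100)
```

A tree is then rebuilt from each pseudosample, and the fraction of those trees that contain a given grouping is reported as the bootstrap support for that grouping.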