logical bases for bioinformatics
all living things are ultimately related to a common ancestor
two genes from different species may be similar because the species are closely related, so too little time has passed for random changes to make the genes diverge
two genes from different species may also be similar because the sequence is constrained to perform an important function (conserved); natural selection disfavors changes to it because they are generally deleterious
finding protein-coding genes in the genome
human protein-coding genes have an average of 9 introns
many exons are very short and their ORFs cannot be distinguished from those that would arise by chance
a computer can generate the possible ORFs from a segment of the genome and compare them to protein sequence databases; if several ORFs from the same region show similarity to the same protein, they probably constitute exons of the same gene
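A minimal sketch of the ORF-and-database idea above, under stated assumptions: only the three forward frames are scanned, an "ORF" is taken to be any stop-free stretch (internal exons need not start with ATG), and the database search itself is not shown. The names `open_frames` and `candidate_genes`, and the `(orf_id, protein_id)` hit format, are hypothetical and used only for illustration.

```python
from collections import defaultdict

STOPS = {"TAA", "TAG", "TGA"}

def open_frames(dna, min_codons=3):
    """Return (frame, start, end) for stop-free stretches in the three forward frames."""
    dna = dna.upper()
    found = []
    for frame in range(3):
        start = frame
        for i in range(frame, len(dna) - 2, 3):
            if dna[i:i + 3] in STOPS:
                # Close the current stretch at the stop codon.
                if (i - start) // 3 >= min_codons:
                    found.append((frame, start, i))
                start = i + 3
        # Stretch running to the end of the segment without a stop codon.
        end = frame + ((len(dna) - frame) // 3) * 3
        if (end - start) // 3 >= min_codons:
            found.append((frame, start, end))
    return found

def candidate_genes(hits, min_orfs=2):
    """Group ORFs by the database protein they resemble.

    hits: iterable of (orf_id, protein_id) pairs from a similarity search
    (assumed to be computed elsewhere). Proteins matched by several ORFs
    from the same region suggest exons of one gene.
    """
    by_protein = defaultdict(list)
    for orf_id, protein_id in hits:
        by_protein[protein_id].append(orf_id)
    return {p: orfs for p, orfs in by_protein.items() if len(orfs) >= min_orfs}
```

The minimum-length cutoff reflects the point made above: very short open stretches arise by chance, so length alone cannot confirm an exon and the shared database hit carries most of the evidence.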
deducing functions of protein-coding genes from their sequence
functional domains of proteins usually have conserved sequence motifs
databases of motifs and their consensus sequences facilitate such analyses
because of the degeneracy of the genetic code, similarity searches with predicted protein sequences work better than those with DNA sequences
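The degeneracy point can be illustrated with a small sketch. The two DNA sequences below are invented for the example (not from any real gene), and the helper names `translate` and `identity` are hypothetical. The codon table is the standard genetic code written in the usual TCAG order.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate a DNA string in frame 0 using the standard code."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

# Two made-up coding sequences that use different synonymous codons.
gene_a = "ATGCTTTCTCGTGGTGCT"
gene_b = "ATGTTGAGCAGAGGCGCC"

dna_identity = identity(gene_a, gene_b)
protein_identity = identity(translate(gene_a), translate(gene_b))
```

Here the DNA sequences differ at many positions while the predicted proteins are identical, which is why a protein-level search can detect similarity that a DNA-level search would score poorly.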
analysis of full genomes
basic cellular toolkit shared by each organism is strikingly conserved
genes required for cellular metabolism make up a large proportion of the total number of genes
transcription and translation related genes are also present in significant number
vast majority of genes identified are considered to be of unknown function
modifications to yeast genome
disruption construct is introduced into diploid yeast cells to replace the appropriate region
presence of the dominant selectable marker confers resistance to the drug G-418, so cells carrying it can grow in the drug's presence
when allowed to sporulate, the haploid progeny will either have a wild type chromosome or a recombinant chromosome
effects of gene replacement can then be assessed (viability, growth rate)
functional genomic analysis on C. elegans
researchers “knocked-down” every predicted transcription unit on chromosome 1 by using feeding RNAi
339 genes were assigned some function as determined by the visible RNAi phenotype
genes that cause embryonic lethality and sterility were often involved in basal cellular function
interacting protein network
model of interactions between proteins, centered around a hub protein
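One way to make the "hub" idea concrete is to count interaction partners per protein. This is only a sketch: the threshold `min_partners` and the function names are assumptions for illustration, and the source does not define a cutoff for what counts as a hub.

```python
from collections import defaultdict

def degrees(interactions):
    """Count distinct partners for each protein in a list of (a, b) pairs."""
    partners = defaultdict(set)
    for a, b in interactions:
        partners[a].add(b)
        partners[b].add(a)
    return {protein: len(found) for protein, found in partners.items()}

def hubs(interactions, min_partners=3):
    """Proteins with at least min_partners partners (an arbitrary cutoff)."""
    counts = degrees(interactions)
    return sorted(p for p, d in counts.items() if d >= min_partners)
```

For example, with the hypothetical pairs `[("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]`, only protein A has three partners and would be reported as the hub.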
modular nature of transcription factors
if proteins A and B interact, the two fusion proteins (A-DNA binding domain, B-transactivational domain) may be brought into proximity to reconstitute a functional transcription factor
the reconstituted factor can activate a selectable marker or a reporter gene that is driven by a control element such as a UAS in yeast cells
types of protein fragment complementation
reconstitution of a functional protein: its N-terminal and C-terminal fragments are fused to bait and prey proteins and are brought together if bait and prey interact
reconstitution of enzyme activity (DHFR) when the bait-prey interaction brings the enzyme fragments into proximity
fluorescence from a complete GFP reassembled from N-terminal and C-terminal fragments fused to interacting bait and prey proteins
proteomic analysis
study of all, or a large subset, of proteins in a biological system
encompasses systematic study of the amounts, modifications, interactions, localization, and functions of large sets of proteins at the whole-organism, tissue, cellular, or sub-cellular levels
LC-MS/MS (liquid chromatography-tandem mass spectrometry)
complex mixture of proteins is digested with a protease, and the many resulting fragments are separated by LC into multiple less complex fractions
peptides in each fraction get ionized, and each peptide gets analyzed for its mass and sequence
using protein sequence databases, computational methods identify the proteins in the original sample
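The database-matching step can be sketched roughly as follows. Several things here are assumptions: the protease is taken to be trypsin (cleaving after K or R, but not before P), peptide sequences are assumed to have been read from the spectra already, and the function names and toy database are hypothetical. Real search tools score spectra against predicted masses rather than comparing strings.

```python
import re

def tryptic_peptides(protein, min_len=5):
    """In-silico trypsin digest: cut after K or R unless the next residue is P."""
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return {p for p in pieces if len(p) >= min_len}

def identify(observed_peptides, database, min_len=5):
    """Rank database proteins by how many observed peptides they explain.

    database: dict mapping protein name to amino acid sequence.
    """
    observed = set(observed_peptides)
    counts = {}
    for name, seq in database.items():
        matched = tryptic_peptides(seq, min_len) & observed
        if matched:
            counts[name] = len(matched)
    return sorted(counts.items(), key=lambda item: -item[1])
```

A protein supported by several distinct peptides is a more convincing identification than one supported by a single peptide, which is why the sketch ranks by match count.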
proximity-dependent labeling
used to identify proteins that are physically near a protein of interest
gene encoding the protein of interest fused to a specialized enzyme is introduced into cells
when the cells are provided with the substrate for the enzyme, it is converted to a highly reactive chemical that adds a label to any protein in its immediate vicinity (within 10-20 nm)
proteins then get extracted and the labeled ones get identified by LC-MS/MS
bioID
using proximity labelling to tag the proteins in your neighbourhood
a mutant biotin ligase fused to the protein of interest is expressed from a vector in cells, and proteins in proximity to the fusion protein are tagged with biotin
new variations of bioID
APEX and APEX2
turboID