what are two problems that standard BLAST cannot solve?
it cannot find homologs that are too distantly related
it is not built for large queries (around 10,000 bp); its search is too detailed for sequences of that size
what does PSI BLAST stand for?
position specific iterated BLAST
what is the purpose of PSI BLAST?
used to detect weak but biologically meaningful relationships between proteins
What are the steps of PSI BLAST?
select a query and search it against a protein database
PSI-BLAST constructs a multiple sequence alignment from the hits, then creates a specialized position-specific scoring matrix (PSSM)
the PSSM is used as a query against the database
PSI-BLAST estimates the statistical significance
repeat steps 3 and 4 iteratively, typically five times. At each new search, a new profile is used as the query.
What does PSSM stand for?
position-specific scoring matrix
what are the steps to calculate the PSSM?
create a multiple sequence alignment
calculate raw frequencies - at each position in the alignment, count how many times each of the 20 amino acids appears
normalize by overall frequency - divide each position-specific frequency by the overall (background) frequency of that amino acid
take the log of the ratios - find the log base 2 of each value
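The steps above can be sketched in a few lines of Python. This is a simplified illustration, not the actual PSI-BLAST calculation: the toy alignment, the uniform background frequency of 1/20, and the pseudocount of 1 (added so that no frequency is zero before taking the log) are all assumptions made for the example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BACKGROUND = 1 / 20   # assumption: every amino acid is equally common overall
PSEUDOCOUNT = 1       # assumption: avoids log2(0) for residues never observed

def build_pssm(aligned_seqs):
    """Return one {amino acid: log2 score} dict per alignment column."""
    pssm = []
    for col in range(len(aligned_seqs[0])):
        # Step 2: raw counts of each amino acid in this column
        counts = Counter(seq[col] for seq in aligned_seqs)
        total = len(aligned_seqs) + PSEUDOCOUNT * len(AMINO_ACIDS)
        scores = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + PSEUDOCOUNT) / total
            # Steps 3 and 4: divide by the background, then take log base 2
            scores[aa] = math.log2(freq / BACKGROUND)
        pssm.append(scores)
    return pssm

# Step 1: a toy multiple sequence alignment (hypothetical sequences)
alignment = ["ACDA", "ACEA", "ACDG", "GCDA"]
pssm = build_pssm(alignment)

# Column 2 is always C, so C scores positive and unseen residues score negative
print(round(pssm[1]["C"], 2), round(pssm[1]["W"], 2))
```

A positive score means the residue appears at that position more often than expected by chance; a negative score means it appears less often.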
what are and how to interpret the results of PSI-BLAST?
iteration - which round of the PSI-BLAST search you are on
number of hits
number of hits > threshold - how many hits were significant
what does it mean to approach convergence?
as you move down the iterations, the rate at which new sequences are added slows down, until no more new sequences are found
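One way to picture convergence is to track how many new sequences each iteration adds. The sketch below assumes the hits from each iteration are available as sets of sequence IDs; the IDs and function names are hypothetical and are not part of any BLAST tool.

```python
def new_hits_per_iteration(hits_by_iteration):
    """Count the sequences seen for the first time in each iteration."""
    seen = set()
    new_counts = []
    for hits in hits_by_iteration:
        new = set(hits) - seen
        new_counts.append(len(new))
        seen |= new
    return new_counts

def converged_at(hits_by_iteration):
    """Return the first iteration (1-based) that adds no new sequences, or None."""
    for i, n_new in enumerate(new_hits_per_iteration(hits_by_iteration), start=1):
        if n_new == 0:
            return i
    return None

# Hypothetical hit lists from four iterations
hits = [
    {"P1", "P2"},
    {"P1", "P2", "P3", "P4"},
    {"P1", "P2", "P3", "P4", "P5"},
    {"P1", "P2", "P3", "P4", "P5"},
]
print(new_hits_per_iteration(hits))  # [2, 2, 1, 0]
print(converged_at(hits))            # 4
```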
What is corruption?
presence of at least one false positive alignment with an E-value < 10^-4 after 5 iterations
what are approaches to stopping corruption?
apply filtering of biased composition regions
lower the E-value threshold, for example from 0.001 to 0.0001
visually inspect the output from each iteration. remove suspicious hits by unchecking the box.
What is the result of one false positive?
one false positive can be amplified to many and it becomes permanent due to the nature of the math for the PSSM.
why proteomics?
-proteins are the functional molecules (almost every disease is the result of a protein failing, not the gene)
-transcriptome information is only loosely related to protein levels
what does ELISA stand for?
enzyme-linked immunosorbent assay
what is ELISA used to do?
used to detect and measure specific proteins or antibodies in a liquid sample.
steps of ELISA
1 - antibody is added; only binds to target protein
2 - enzyme-linked secondary antibody is added; it binds to the first antibody
3 - substrate is added
4 - the enzyme catalyzes a reaction with the substrate that produces color; the more color, the more protein
steps of single protein analysis
1 - gel based separation: each black dot is a different protein
2 - spot excision: cut out one specific dot; same dot, same protein
3 - digestion: add an enzyme to chop up the protein into smaller pieces called peptides
4 - MS analysis
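The digestion step can be imitated in silico. The sketch below assumes the enzyme is trypsin, which the card does not name, and uses the common simplified rule that trypsin cuts after K or R unless the next residue is P. The protein sequence is made up.

```python
def digest(protein, cleave_after="KR", blocked_by="P"):
    """Split a protein into peptides using a simplified trypsin-like rule."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        is_last = i + 1 == len(protein)
        # Cut after K or R, but not when the next residue is P
        if not is_last and residue in cleave_after and protein[i + 1] not in blocked_by:
            peptides.append(protein[start:i + 1])
            start = i + 1
    peptides.append(protein[start:])  # whatever remains after the last cut
    return peptides

print(digest("AKRPGKA"))  # ['AK', 'RPGK', 'A']
```

Each resulting peptide is what would then be sent to MS analysis in step 4.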
what does protein sequence analysis allow for?
protein classification
What is the primary purpose of analyzing protein sequences using computational (in silico) methods?
to characterize proteins in silico, allowing the prediction of protein structure and function
If a BLAST search shows high sequence homology between two proteins, what can you safely assume about their similarities?
they may not have the same function, but they almost always have the same structural fold
steps of shotgun analysis
1 - digestion of protein mixture
2 - liquid chromatography
3 - MS analysis
what are protein sequence databases?
-Atlas of Protein Sequence and Structure (currently known as the Protein Information Resource (PIR))
-Protein Data Bank
-UniProt
What can patterns of conservation in protein sequences tell us about specific amino acid residues?
They help determine which residues are under selective constraints, meaning those residues are critical for the protein's function
What is the defining characteristic of homologous proteins?
They share a common ancestor.
Different proteins evolve at ____ depending on their functional importance and structural requirements.
different rates
protein analysis is ____ sensitive than DNA analysis
more
How is the amino acid sequence of a protein typically generated for comparison?
It is generated from proteomics experiments
How is the % similarity between two protein sequences determined?
By aligning the two sequences and counting the number of identical residues or using an index of similarity (like a substitution matrix)
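A minimal sketch of the identity count, assuming the two sequences are already aligned and gaps are written as "-". Skipping gapped columns in the denominator is one convention among several, and the sequences are made up. Scoring similarity with a substitution matrix is not shown here.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free aligned columns where both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: columns with a gap are not counted
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared

# 7 gap-free columns, 5 of them identical
print(round(percent_identity("MKT-AYIA", "MRTQAYLA"), 1))  # 71.4
```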
What is the primary method used to compare the 3D shapes of different proteins?
Superimposition (or structural overlay). This involves physically rotating and shifting the 3D models to see how well their "skeletons" match up.
pairwise alignment
multiple sequence alignment