Why DNA sequencing?
Compare sequences between organisms to find conserved regions; conservation shows which sequences are more important.
Make phylogenetic trees.
Underlies pretty much everything in bioinformatics
Find mutations such as SNPs
AlphaFold
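As a rough illustration of comparing sequences and finding SNP-like differences, here is a minimal sketch. It assumes the two sequences are already aligned and the same length; the function name `compare_aligned` and the example sequences are made up for illustration.

```python
def compare_aligned(seq_a, seq_b):
    """Return (fraction identical, list of differences) for two aligned sequences.

    Assumption: the sequences are already aligned and have equal, non-zero length.
    Each difference is reported as (position, base_in_a, base_in_b).
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and the same length")
    differences = [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]
    identity = 1 - len(differences) / len(seq_a)
    return identity, differences


# Hypothetical example: one single-base difference at position 4.
identity, differences = compare_aligned("ACGTACGT", "ACGTTCGT")
print(identity, differences)
```

High identity across distant organisms hints at conservation; single-position differences like the one reported here are candidate SNPs.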
BLAST steps
remove low-complexity regions
make a dictionary of all words that are 3 amino acids long or 11 nucleotides long.
augment the list / dictionary to include similar words (this will include sequences that are similar but not identical)
scan database for occurrences of words
connect nearby occurrences
extend matches in both directions to see how far they align
prune the list of matches using a score threshold. This gets rid of sequences that matched just due to chance.
evaluate the significance of each remaining match. This is done by looking at the E value and further removing sequences that matched just due to chance.
perform Smith-Waterman to get the alignment. With the sequences left from the above steps, we can run dynamic programming (Smith-Waterman) to align them.
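Two of the steps above, building the word dictionary and the final Smith-Waterman alignment, can be sketched in a few lines. This is a simplified illustration, not how BLAST itself is implemented: it uses exact word matches only (no similar words, no low-complexity filtering, no E values), and the scoring values (match +2, mismatch -1, gap -2) and function names are assumptions chosen for the example.

```python
from collections import defaultdict


def build_word_index(query, k):
    """Map every length-k word in the query to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(query) - k + 1):
        index[query[i:i + k]].append(i)
    return index


def find_seeds(query, subject, k):
    """Scan the subject for query words; return (query_pos, subject_pos) hits."""
    index = build_word_index(query, k)
    seeds = []
    for j in range(len(subject) - k + 1):
        for i in index.get(subject[j:j + k], []):
            seeds.append((i, j))
    return seeds


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between a and b (assumed linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below 0, which is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical sequences; k=4 here only to keep the example small.
print(find_seeds("ACGTACGT", "TTACGTAA", 4))
print(smith_waterman_score("TTACGTAA", "GGACGTCC"))
```

The word scan is cheap and narrows the search to a few candidate regions, which is why the expensive dynamic programming step is saved for last.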
GWAS
Comparing SNPs in people with a disease versus people without the disease. You need at least 1000 people in both the control and test groups to do this. You use SNP chips to compare many SNPs at once, not just one.
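For a single SNP, the case-versus-control comparison can be sketched as a chi-square test on allele counts. This is one common way to test association, shown here as an assumption rather than the specific method the notes refer to; the counts and the function name `allele_chi_square` are made up for illustration.

```python
import math


def allele_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Rows are cases and controls; columns are alternate and reference alleles.
    Returns (chi-square statistic, p-value).
    """
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][c] + table[1][c] for c in range(2)]
    total = sum(row_totals)

    chi2 = 0.0
    for r in range(2):
        for c in range(2):
            expected = row_totals[r] * col_totals[c] / total
            chi2 += (table[r][c] - expected) ** 2 / expected

    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts for one SNP.
chi2, p_value = allele_chi_square(60, 40, 40, 60)
print(chi2, p_value)
```

A SNP chip repeats this kind of test across many SNPs, so the p-value cutoff has to be much stricter than for a single test.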
AlphaFold
input amino acid sequence
runs a template search and an MSA (multiple sequence alignment), and gets a general idea of the protein shape.
based on the results of the above steps, a machine learning algorithm learns relationships between the amino acids.
Models the protein structure
Evaluate performance with legend and dot plot and Ramachandran plot (shows the regions of backbone angles where we expect amino acids to be)
Structure helps you infer function and vice versa
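A Ramachandran plot places each residue by its two backbone dihedral angles, phi and psi. As a minimal sketch of the underlying calculation, the function below computes one dihedral angle from four atom positions. The coordinates are made-up test points, not real protein atoms, and the function name `dihedral_degrees` is hypothetical.

```python
import math


def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def _cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def dihedral_degrees(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four points, measured about p1 -> p2."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)

    # Normalise the central bond so the projections below are simple.
    length = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / length, b1[1] / length, b1[2] / length)

    # Remove the components of b0 and b2 that lie along the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))

    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Made-up points: the outer atoms sit at a right angle around the central bond.
print(dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

For phi, the four points would be the C of the previous residue, then N, CA, and C of the current residue; for psi, they would be N, CA, C, and the N of the next residue.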