what is a sequence alignment
same set of sequences with zero or more gaps inserted into them, such that
all sequences have the same length
there is no alignment position where every sequence contains a gap
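The two conditions above can be checked mechanically. This is a minimal sketch, assuming each gapped sequence is a string and `-` marks a gap; the function name `is_valid_alignment` is made up for illustration.

```python
def is_valid_alignment(rows, gap="-"):
    """Return True if the gapped sequences satisfy both conditions."""
    if not rows:
        return False
    length = len(rows[0])
    # Condition 1: all gapped sequences have the same length.
    if any(len(row) != length for row in rows):
        return False
    # Condition 2: no column consists only of gaps.
    return all(any(row[i] != gap for row in rows) for i in range(length))


print(is_valid_alignment(["AC-GT", "A-TGT", "ACTG-"]))  # every column has a residue
print(is_valid_alignment(["AC-GT", "AT-GT"]))           # third column is all gaps
```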
what is a pairwise alignment
the optimal alignment of a pair of biological sequences
what is a global alignment
aligns whole sequences end to end
give an example of a global alignment
needle
implementation of Needleman-Wunsch alignment
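To see what a global (end to end) alignment optimises, here is a small sketch of the Needleman-Wunsch scoring recurrence. It is not the code of the `needle` program; the scoring values (match +1, mismatch -1, linear gap -1) are assumptions chosen for illustration, and only the optimal score is returned, not the alignment itself.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of a and b (linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap inserted in b
            left = score[i][j - 1] + gap    # gap inserted in a
            score[i][j] = max(diag, up, left)
    # The bottom-right cell covers both sequences end to end.
    return score[-1][-1]


print(needleman_wunsch_score("ACGT", "AGT"))  # 3 matches and 1 gap -> 2
```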
what is local alignment
focusses on the best matching part of sequences
give an example of local alignment
water
implementation of Smith-Waterman alignment
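For contrast with global alignment, here is a small sketch of the Smith-Waterman recurrence. It is not the code of the `water` program; the scoring values (match +2, mismatch -1, linear gap -1) are assumptions for illustration. The two differences from the global case are that cell scores never drop below zero and that the answer is the best cell anywhere in the table.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between any substrings of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets a local alignment start anywhere.
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


# Only the shared core "ACGT" contributes: 4 matches * 2 = 8.
print(smith_waterman_score("GGACGTGG", "TTACGTTT"))
```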
when does pairwise alignment not scale
when comparing many sequences
what is the equation for the number of pairwise comparisons in a multiple alignment
n(n - 1)/2 comparisons for n sequences
4 sequences = 6 comparisons
8 sequences = 28 comparisons
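The number of distinct pairs among n sequences is n(n - 1)/2, which is why all-against-all pairwise comparison grows quickly. A quick check (the function name is made up for illustration):

```python
def pairwise_comparisons(n):
    """Number of distinct sequence pairs among n sequences."""
    return n * (n - 1) // 2


for n in (4, 8, 100):
    print(n, "sequences =", pairwise_comparisons(n), "comparisons")
# 4 sequences = 6 comparisons
# 8 sequences = 28 comparisons
# 100 sequences = 4950 comparisons
```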

what is a multiple sequence alignment viewer
JalView
constructing alignments by eye does not scale, though…
there are tools that enable this (JalView, Seaview, CINEMA)
what is an algorithm
a process or set of rules followed to solve a problem
what approaches are there to solve multiple sequence alignment
heuristic approaches
what's a heuristic approach
a pragmatic approach that is not optimal, but is good enough
what are the 2 main classes of multiple sequence alignment algorithms
progressive and iterative
when was Clustal multiple sequence alignment released
old method with many versions (Clustal V 1992, Clustal W 1994, Clustal X 1997, Clustal Omega 2011)
when was T-Coffee multiple sequence alignment released
released in 2000
progressive alignment followed by optimisation
when was MAFFT multiple sequence alignment released
released in 2002
progressive alignment with iterative refinement
when was MUSCLE multiple sequence alignment released
released in 2004
three stages: draft alignment, improved alignment, refinement
what are the 3 steps of the Clustal algorithm for multiple sequence alignment
compare sequences to obtain a similarity matrix
make a guide tree that relates all the sequences
perform progressive alignments, adding new sequences according to the guide tree
how do we compare sequences to obtain a similarity matrix in Clustal multiple sequence alignment
long vector alignment
clustering using a standard algorithm
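genetic distances between each pair of sequences are computed: the number of mismatched positions in the pairwise alignment divided by the total number of matched positions (positions where both sequences have a residue, i.e. gaps excluded)
how do we compute genetic distances
the number of mismatched positions in the pairwise alignment divided by the total number of matched positions (positions where both sequences have a residue, i.e. gaps excluded)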
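A minimal sketch of that distance for one aligned pair. It assumes "matched positions" means columns where both sequences have a residue, so columns containing a gap are skipped; the function name `genetic_distance` is made up for illustration.

```python
def genetic_distance(row_a, row_b, gap="-"):
    """Mismatched positions divided by compared (gap-free) positions."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned sequences must have the same length")
    compared = mismatched = 0
    for x, y in zip(row_a, row_b):
        if x == gap or y == gap:
            continue  # assumption: gap columns are not counted
        compared += 1
        if x != y:
            mismatched += 1
    if compared == 0:
        raise ValueError("no gap-free positions to compare")
    return mismatched / compared


# 5 gap-free columns, 1 mismatch (C vs G) -> 0.2
print(genetic_distance("ACGT-A", "AGGTTA"))
```

Computing this for every pair of sequences fills the matrix that the guide tree is built from.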
how do we use a similarity matrix to make a guide tree that relates all the sequences in Clustal multiple sequence alignment
the genetic distances from the alignments are used to form a phylogenetic tree for the sequences
the tree is termed a ‘guide tree’ and it is used to control the order in which sequences are added progressively to the multiple alignment
the relative contributions of sequences to the alignment are weighted according to the evolutionary positions of the sequences in the tree
how do we perform progressive alignments, adding new sequences according to the guide tree
sequences are aligned progressively
most closely related pair or pairs are aligned first
the next closely related sequences are then added by aligning them with the existing alignment
describe how Clustal alignments, a progressive alignment strategy, would be performed
4 with 5 → alignment 4-5
1 with 2 → alignment 1-2
alignment 4-5 with 3 → alignment 3-4-5
alignment 3-4-5 with alignment 1-2 → alignment 1-2-3-4-5
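The merge order above can be reproduced from a distance matrix. This is a sketch only: it uses simple average-linkage clustering as a stand-in for the tree-building step, not Clustal's actual code, and the distances below are made-up values chosen so that 4 and 5 are closest, then 1 and 2, with 3 nearer to 4-5 than to 1-2.

```python
from itertools import combinations


def merge_order(dist, names):
    """List the merges (left, right, merged) in the order they happen."""
    clusters = [(n,) for n in names]

    def average_distance(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    steps = []
    while len(clusters) > 1:
        # Always join the two closest clusters next.
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_distance(*pair))
        clusters.remove(c1)
        clusters.remove(c2)
        merged = tuple(sorted(c1 + c2))
        clusters.append(merged)
        steps.append((c1, c2, merged))
    return steps


# Hypothetical pairwise distances for sequences 1..5 (illustration only).
raw = {(4, 5): 0.1, (1, 2): 0.2, (3, 4): 0.3, (3, 5): 0.3,
       (1, 3): 0.6, (2, 3): 0.6, (1, 4): 0.7, (1, 5): 0.7,
       (2, 4): 0.7, (2, 5): 0.7}
dist = {frozenset(pair): d for pair, d in raw.items()}

for left, right, merged in merge_order(dist, [1, 2, 3, 4, 5]):
    print(left, "with", right, "->", merged)
```

In a real progressive aligner each merge step would align the two groups (sequence or sub-alignment) rather than just record their labels.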
what does MAFFT mean
Multiple Alignment using Fast Fourier Transform
what is MAFFT
a flexible alignment method that constructs a progressive alignment that improves iteratively → i.e. a hybrid between the two main approaches
how does MAFFT add new protein sequences to the alignment
uses a substitution matrix
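A substitution matrix assigns a score to every pair of residues, so similar amino acids are penalised less than dissimilar ones. The sketch below shows the general idea only; it is not MAFFT's scoring code, and the matrix values and gap penalty are invented toy numbers, not taken from BLOSUM, PAM or any published matrix.

```python
# Toy scores for three amino acids (illustrative values only).
TOY_MATRIX = {
    ("A", "A"): 4, ("L", "L"): 4, ("I", "I"): 4,
    ("L", "I"): 2,                   # similar residues: mild positive score
    ("A", "L"): -1, ("A", "I"): -1,  # dissimilar residues: negative score
}


def substitution_score(x, y, matrix=TOY_MATRIX):
    """Look up a symmetric substitution score for residues x and y."""
    return matrix[(x, y)] if (x, y) in matrix else matrix[(y, x)]


def aligned_pair_score(row_a, row_b, gap_penalty=-2, gap="-"):
    """Sum substitution scores over columns; gap columns cost gap_penalty."""
    total = 0
    for x, y in zip(row_a, row_b):
        if x == gap or y == gap:
            total += gap_penalty
        else:
            total += substitution_score(x, y)
    return total


# A/A 4 + L/I 2 + I/L 2 + gap -2 + A/A 4 = 10
print(aligned_pair_score("ALI-A", "AILLA"))
```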