Advantages for Distance Methods
fast and computationally efficient
suitable for large datasets
useful for setting the initial upper bound in branch-and-bound searches
perform well when phylogenetic signal is strong
wide range of evolutionary models available
long history of use
Disadvantages of Distance Methods
loss of information: all differences between two sequences are collapsed into a single distance value
cannot identify which sites are informative
difficult to incorporate indels
can produce unintuitive results
estimation error increases with sequence divergence
Two step approach for distance matrix
calculate pairwise evolutionary distances between all taxa
transform the distance matrix into a tree with branch lengths
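A minimal sketch of the first step, assuming aligned DNA sequences of equal length held in a dict. The Jukes-Cantor (JC69) correction is only one of the available models, and the function names are illustrative.

```python
import math
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites, skipping positions gapped in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs)


def jc69_distance(a, b):
    """Jukes-Cantor corrected distance; undefined (math domain error) once p >= 0.75."""
    p = p_distance(a, b)
    return -0.75 * math.log(1 - 4 * p / 3)


def distance_matrix(seqs, dist=jc69_distance):
    """Build a symmetric dict-of-dicts matrix from {taxon: aligned_sequence}."""
    names = list(seqs)
    d = {n: {n: 0.0} for n in names}
    for i, j in combinations(names, 2):
        d[i][j] = d[j][i] = dist(seqs[i], seqs[j])
    return d
```

The corrected distance is always at least as large as the raw p-distance because it accounts for multiple hits at the same site.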
What happens if distances are additive?
each pairwise distance equals the sum of branch lengths connecting them
What happens if datasets are real?
data are often non additive due to rate variation, multiple hits and noise
methods must therefore reconcile the mismatch between observed distances and path lengths on the tree
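One way to test whether a matrix is additive is the four-point condition: for every four taxa, the two largest of the three pairwise sums must be equal. The sketch below assumes a dict-of-dicts matrix; the function name and tolerance are illustrative.

```python
from itertools import combinations


def is_additive(d, tol=1e-9):
    """Check the four-point condition on every quartet of taxa."""
    for a, b, c, e in combinations(list(d), 4):
        sums = sorted([
            d[a][b] + d[c][e],
            d[a][c] + d[b][e],
            d[a][e] + d[b][c],
        ])
        # Additive distances: the two largest sums are equal.
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True
```

Real data will usually fail this check, which is why tree-building methods fit a tree to the distances rather than read it off directly.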
How do we find the tree that best fits the observed distances?
Neighbour joining
Least squares
minimum evolution
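As a rough illustration of what "best fit" means under least squares, the sketch below scores a candidate tree by the unweighted sum of squared differences between observed distances and the tree's path-length (patristic) distances. Both matrices are assumed to be dicts of dicts over the same taxa, and the function name is illustrative.

```python
from itertools import combinations


def least_squares_score(observed, patristic):
    """Unweighted sum of squared differences over all taxon pairs; lower is a better fit."""
    return sum(
        (observed[a][b] - patristic[a][b]) ** 2
        for a, b in combinations(list(observed), 2)
    )
```

A score of zero means the tree reproduces the observed distances exactly, which only happens when the distances are additive on that tree.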
Why should we never use UPGMA?
assumes strict molecular clock
produces ultrametric trees
extremely sensitive to rate variation
Neighbour Joining
Works by
identifying pairs of taxa that minimise total tree length
treating joined taxa as a single composite taxon
Features
does not assume a molecular clock
produces an unrooted tree
deterministic
progressive clustering
Step by Step for NJ
Start with a star tree from the distance matrix
calculate total tree length
evaluate all possible taxon pairs
select the pair that minimises total tree length
join the pair into a new internal node X
calculate branch lengths to X
recalculate distances from X to all remaining taxa
reduce the matrix size by one
repeat until the tree is fully resolved
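The steps above can be sketched as follows, assuming the distance matrix is a symmetric dict of dicts. Pair selection uses the Q criterion, (n - 2) * d(i, j) - r_i - r_j, which is a standard way of picking the pair that minimises total tree length. The function name and the internal node labels X0, X1, ... are illustrative.

```python
from itertools import combinations


def neighbor_joining(d):
    """Return a list of (node, node, branch_length) edges for the NJ tree."""
    d = {i: dict(row) for i, row in d.items()}  # work on a copy
    edges = []
    next_id = 0
    while len(d) > 2:
        n = len(d)
        # Row sums: total distance from each taxon to all others.
        r = {i: sum(d[i][k] for k in d if k != i) for i in d}
        # Select the pair with the smallest Q value.
        i, j = min(
            combinations(list(d), 2),
            key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]],
        )
        # Branch lengths from the joined taxa to the new internal node.
        li = 0.5 * d[i][j] + (r[i] - r[j]) / (2 * (n - 2))
        lj = d[i][j] - li
        u = f"X{next_id}"
        next_id += 1
        edges += [(u, i, li), (u, j, lj)]
        # Distances from the new node to every remaining taxon.
        new_row = {
            k: 0.5 * (d[i][k] + d[j][k] - d[i][j])
            for k in d if k not in (i, j)
        }
        # Shrink the matrix by one: drop i and j, add u.
        del d[i], d[j]
        for row in d.values():
            row.pop(i, None)
            row.pop(j, None)
        for k, v in new_row.items():
            d[k][u] = v
        d[u] = dict(new_row)
        d[u][u] = 0.0
    a, b = d  # two nodes remain; connect them directly
    edges.append((a, b, d[a][b]))
    return edges
```

On an additive matrix the recovered branch lengths match the generating tree. On noisy data some lengths can come out negative, and this sketch makes no attempt to correct them.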
Properties of NJ
returns a single unrooted tree with branch lengths
if distances are truly additive then NJ guarantees the correct tree
with strong phylogenetic signal, NJ is highly accurate
closely approximates LS (least squares) and ME (minimum evolution) solutions
Most widely used distance method in phylogenetics
Negative branch length
can arise when distance data violate the triangle inequality
indicate signal conflict or noise in the data
Tree Space and Parsimony
tree space contains all possible tree topologies for a given number of taxa
Parsimony
maps characters onto each topology
calculates the total number of evolutionary changes each topology requires
creates a conceptual landscape of tree lengths across tree space
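To give a sense of how large tree space is, the sketch below counts unrooted, fully bifurcating topologies for n taxa using the double factorial (2n - 5)!!. The function name is illustrative.

```python
def n_unrooted_trees(n):
    """Number of unrooted bifurcating topologies for n >= 3 taxa: (2n - 5)!!"""
    count = 1
    for k in range(3, 2 * n - 4, 2):  # 3 * 5 * ... * (2n - 5)
        count *= k
    return count


for n in (4, 5, 10):
    print(n, n_unrooted_trees(n))  # 4 -> 3, 5 -> 15, 10 -> 2027025
```

The count grows faster than exponentially, which is why exhaustive evaluation is only feasible for small numbers of taxa.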
What is maximum parsimony?
minimises the total number of evolutionary changes required to explain the data
Cladistic Nature of Parsimony
parsimony is explicitly cladistic
uses only synapomorphies
only phylogenetically informative characters matter
What makes a character informative?
at least two different character states are present
at least two of those states each occur in at least two taxa
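A short sketch of this test for one alignment column, assuming the column is a string of character states with one entry per taxon. Treating "-", "?" and "N" as missing data is an assumption of this example, and the function name is illustrative.

```python
from collections import Counter


def is_parsimony_informative(column):
    """True if at least two states each occur in at least two taxa."""
    counts = Counter(s for s in column if s not in "-?N")
    return sum(c >= 2 for c in counts.values()) >= 2
```

For example, the column "AAGG" is informative, whereas "AAAG" (a single autapomorphy) and "AAAA" (invariant) are not.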
Parsimony Procedure (Unweighted)
map character-state changes onto the tree
count the number of steps per character
sum the steps across all characters to obtain the tree length
record topology and length
select tree with minimum length
Fitch Algorithm
assumes characters are unordered
all changes have equal weight
uses post order traversal
efficiently computes the minimum number of steps per character
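A minimal sketch of the Fitch post-order pass, assuming the tree is written as nested 2-tuples of taxon names and rooted arbitrarily (the step count does not depend on root placement). Function names and the tree encoding are illustrative.

```python
def fitch_steps(tree, states):
    """Return (state_set, steps) for one character on a binary tree."""
    if isinstance(tree, str):  # leaf: its observed state, zero steps
        return {states[tree]}, 0
    left, right = tree
    left_set, left_steps = fitch_steps(left, states)
    right_set, right_steps = fitch_steps(right, states)
    common = left_set & right_set
    if common:  # children agree on some state: no extra change
        return common, left_steps + right_steps
    # children disagree: take the union and count one change
    return left_set | right_set, left_steps + right_steps + 1


def tree_length(tree, alignment):
    """Sum Fitch steps over all sites of {taxon: sequence}."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_steps(tree, {t: s[i] for t, s in alignment.items()})[1]
        for i in range(n_sites)
    )
```

Running `tree_length` on each candidate topology and keeping the minimum corresponds to the unweighted parsimony procedure.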
How can we speed up MP analysis?
remove invariant characters
remove autapomorphic characters
remove heavily gapped characters
reuse identical character-state patterns
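Two of these speed-ups can be sketched together: dropping invariant columns and collapsing identical site patterns so each pattern is scored once and multiplied by its count. The alignment is assumed to be a dict of equal-length sequences, and the function name is illustrative.

```python
from collections import Counter


def site_patterns(alignment):
    """Count each distinct variable column; keys are tuples ordered by sorted taxon name."""
    taxa = sorted(alignment)
    columns = zip(*(alignment[t] for t in taxa))
    return Counter(col for col in columns if len(set(col)) > 1)
```

The tree length is then the sum over patterns of (steps for the pattern) times (its count), which avoids repeating the same per-character calculation.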
How can we estimate branch lengths in Parsimony?
perform a pre-order traversal
identify all possible character state reconstructions
What can transformation costs reflect?
nucleotide changes required for amino acid substitutions
empirical substitution matrices
In parsimony: frequent substitutions are down-weighted (given a lower cost)
In alignment: frequent substitutions are up-weighted (given a higher score)
Sankoff Algorithm
generalises parsimony with substitution cost matrices
uses dynamic programming
Allows:
weighted transformation
Polytomies
branch length estimation
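A minimal sketch of the Sankoff dynamic programme for one character, assuming the tree is written as nested tuples of taxon names (tuples with more than two children represent polytomies) and the cost matrix is a dict of dicts. The transition/transversion costs of 1 and 2 are example values only, and all names are illustrative.

```python
INF = float("inf")


def ti_tv_cost(ti=1, tv=2):
    """Example cost matrix: transitions cost ti, transversions cost tv."""
    purines = {"A", "G"}
    return {
        a: {
            b: 0 if a == b else (ti if (a in purines) == (b in purines) else tv)
            for b in "ACGT"
        }
        for a in "ACGT"
    }


def sankoff(tree, states, cost, alphabet="ACGT"):
    """Return {state: minimum cost of the subtree if its root has that state}."""
    if isinstance(tree, str):  # leaf: zero for the observed state, infinite otherwise
        return {s: (0 if s == states[tree] else INF) for s in alphabet}
    child_costs = [sankoff(child, states, cost, alphabet) for child in tree]
    # For each root state, add the cheapest way to reach each child subtree.
    return {
        s: sum(min(cost[s][t] + cc[t] for t in alphabet) for cc in child_costs)
        for s in alphabet
    }
```

The parsimony cost for the character is the minimum over the root's entries, for example `min(sankoff(tree, states, ti_tv_cost()).values())`. With all substitution costs set to 1 this gives the same count as Fitch.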
Limitations of Maximum Parsimony
ignores much of the data (only parsimony-informative characters are used)
no explicit evolutionary model
sensitive to long branch attraction
heuristic searches may miss optimal trees
often multiple equally parsimonious solutions