Distance Methods and Maximum Parsimony


22 Terms

1

Advantages of Distance Methods

  • fast and computationally efficient

  • suitable for large datasets

  • useful for setting the initial upper bound in branch-and-bound searches

  • perform well when phylogenetic signal is strong

  • wide range of evolutionary models available

  • long history of use

2

Disadvantages of Distance Methods

  • loss of information: all differences between two sequences are collapsed into a single distance value

  • cannot identify which sites are informative

  • difficult to incorporate indels

  • can produce unintuitive results

  • estimation error increases with sequence divergence

3

Two-step approach for distance-matrix methods

  • calculate pairwise evolutionary distances between all taxa

  • transform the distance matrix into a tree with branch lengths
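The first step can be sketched in a few lines. This is only an illustration, assuming aligned DNA sequences of equal length and the Jukes-Cantor (JC69) correction as the evolutionary model; the function names and the choice to skip gapped positions are assumptions made for the example, not part of the original notes.

```python
import math
from itertools import combinations

def p_distance(a, b):
    """Proportion of differing sites, skipping positions gapped in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)

def jc69_distance(a, b):
    """Jukes-Cantor distance: corrects the p-distance for multiple hits."""
    p = p_distance(a, b)
    if p >= 0.75:
        return math.inf  # saturated: the correction is undefined
    return -0.75 * math.log(1 - 4 * p / 3)

def distance_matrix(seqs, dist=jc69_distance):
    """Step 1 of the two-step approach: all pairwise distances as a dict of dicts."""
    names = list(seqs)
    d = {n: {m: 0.0 for m in names} for n in names}
    for a, b in combinations(names, 2):
        d[a][b] = d[b][a] = dist(seqs[a], seqs[b])
    return names, d

# One difference in four sites: p = 0.25, JC69 distance is about 0.304
names, d = distance_matrix({"A": "ACGT", "B": "ACGA", "C": "ACGT"})
```

The corrected distance is always at least as large as the raw p-distance, and the gap between them widens as sequences diverge. The resulting matrix is the input to the second step (tree building).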

4

What happens if distances are additive?

  • each pairwise distance equals the sum of the branch lengths on the path connecting the two taxa
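Additivity can be tested with the four-point condition: for every quartet of taxa, the two largest of the three pairwise sums must be equal. The sketch below assumes the matrix is a symmetric list of lists; the function name and tolerance are illustrative choices.

```python
from itertools import combinations

def is_additive(d, tol=1e-9):
    """Four-point condition on a symmetric distance matrix (list of lists)."""
    for a, b, c, e in combinations(range(len(d)), 4):
        sums = sorted([
            d[a][b] + d[c][e],
            d[a][c] + d[b][e],
            d[a][e] + d[b][c],
        ])
        # The two largest sums must match for the quartet to fit a tree.
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True

# Distances read off the tree ((A:1,B:2):1,(C:3,D:4)): additive by construction
tree_like = [
    [0, 3, 5, 6],
    [3, 0, 6, 7],
    [5, 6, 0, 7],
    [6, 7, 7, 0],
]
# Same matrix with the A-C distance perturbed: no longer additive
noisy = [row[:] for row in tree_like]
noisy[0][2] = noisy[2][0] = 5.5
```

Here `is_additive(tree_like)` holds because the sums are 10, 12 and 12, while the perturbed matrix gives 10, 12 and 12.5 and fails.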

5

What happens if datasets are real?

  • data are often non-additive due to rate variation, multiple hits and noise

  • methods must therefore reconcile the conflicting distances when fitting a tree

6

How do we find the tree that best fits the observed distances?

  • neighbour joining (NJ)

  • least squares (LS)

  • minimum evolution (ME)

7

Why should we never use UPGMA?

  • assumes strict molecular clock

  • produces ultrametric trees

  • extremely sensitive to rate variation

8

Neighbour Joining

Works by

  • identifying pairs of taxa that minimise total tree length

  • treating joined taxa as a single composite taxon

Features

  • does not assume a molecular clock

  • produces an unrooted tree

  • deterministic

  • progressive clustering

9

Step by Step for NJ

  1. Start with a star tree from the distance matrix

  2. calculate total tree length

  3. evaluate all possible taxon pairs

  4. select the pair that minimises total tree length

  5. join the pair into a new internal node X

  6. calculate the branch lengths from each joined taxon to X

  7. recalculate distances from X to all remaining taxa

  8. reduce the matrix size by one

  9. repeat until the tree is fully resolved
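The steps above can be sketched as a short function. This is a minimal illustration rather than a reference implementation: it uses the standard Q-matrix form of the pair-selection criterion (equivalent to choosing the pair that minimises total tree length), takes the distances as a dict of dicts, and names new internal nodes `X0`, `X1`, ... (a hypothetical naming scheme that assumes no taxon already uses those labels).

```python
from itertools import combinations

def neighbour_joining(names, d):
    """Return the edges (node, node, length) of an unrooted NJ tree."""
    d = {a: dict(row) for a, row in d.items()}  # work on a copy
    nodes = list(names)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all others
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Steps 3-4: evaluate all pairs, pick the one with the smallest Q value
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]],
        )
        # Step 5: join the pair into a new internal node X
        x = f"X{next_id}"
        next_id += 1
        # Step 6: branch lengths from i and j to X
        li = 0.5 * d[i][j] + (r[i] - r[j]) / (2 * (n - 2))
        lj = d[i][j] - li
        edges += [(i, x, li), (j, x, lj)]
        # Step 7: distances from X to every remaining node
        d[x] = {x: 0.0}
        for k in nodes:
            if k not in (i, j):
                d[x][k] = d[k][x] = 0.5 * (d[i][k] + d[j][k] - d[i][j])
        # Step 8: the matrix shrinks by one
        nodes = [k for k in nodes if k not in (i, j)] + [x]
    a, b = nodes
    edges.append((a, b, d[a][b]))  # connect the last two nodes
    return edges

# Additive distances from the tree ((A:1,B:2):1,(C:3,D:4))
pairs = {("A", "B"): 3, ("A", "C"): 5, ("A", "D"): 6,
         ("B", "C"): 6, ("B", "D"): 7, ("C", "D"): 7}
taxa = ["A", "B", "C", "D"]
dist = {t: {t: 0.0} for t in taxa}
for (a, b), v in pairs.items():
    dist[a][b] = dist[b][a] = float(v)
edges = neighbour_joining(taxa, dist)
```

Because the example distances are additive, the function recovers the generating tree: A and B are joined with branch lengths 1 and 2, C and D with 3 and 4, and the internal branch has length 1. Ties between equally good pairs are broken by input order, which is why the procedure is deterministic.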

10

Properties of NJ

  • returns a single unrooted tree with branch lengths

  • if distances are truly additive then NJ guarantees the correct tree

  • with strong phylogenetic signal, NJ is highly accurate

  • closely approximates least-squares (LS) and minimum-evolution (ME) solutions

  • most widely used distance method in phylogenetics

11

Negative branch lengths

Can arise when:

  • the distance data violate the triangle inequality

  • there is conflicting signal or noise in the data

12

Tree Space and Parsimony

Tree space

  • contains all possible tree topologies for a given number of taxa

Parsimony

  • maps characters onto each topology

  • calculates the total number of evolutionary changes for each topology

  • creates a conceptual landscape of tree lengths across tree space

13

What is maximum parsimony?

minimises the total number of evolutionary changes required to explain the data

14

Cladistic Nature of Parsimony

  • parsimony is explicitly cladistic

  • uses only synapomorphies

  • only phylogenetically informative characters matter

15

What makes a character informative?

  • at least two different character states are present

  • at least two of those states each occur in at least two taxa
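As a quick check, the usual definition (at least two states, at least two of which occur in two or more taxa) can be written as a one-column test. The function name and the set of ignored symbols (gaps and ambiguity codes) are assumptions made for this sketch.

```python
from collections import Counter

def is_parsimony_informative(column, ignore="-?N"):
    """True if at least two states each occur in at least two taxa."""
    counts = Counter(s for s in column if s not in ignore)
    return sum(1 for c in counts.values() if c >= 2) >= 2

# One alignment column per string, one character per taxon
examples = {
    "AAGG": is_parsimony_informative("AAGG"),  # two states, each in two taxa
    "AAAG": is_parsimony_informative("AAAG"),  # G is a singleton
    "AAAA": is_parsimony_informative("AAAA"),  # invariant
    "ACGT": is_parsimony_informative("ACGT"),  # every state is a singleton
}
```

Only the first column is informative; the other three require the same number of steps on every topology, so they cannot favour one tree over another.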

16

Parsimony Procedure (Unweighted)

  • map character-state changes onto the tree

  • count the number of steps per character

  • sum the steps across all characters to obtain the tree length

  • record topology and length

  • select tree with minimum length

17

Fitch Algorithm

  • assumes characters are unordered

  • all changes have equal weight

  • uses post order traversal

  • efficiently computes the minimum number of steps per character
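A minimal sketch of the Fitch pass, assuming a rooted binary tree written as nested tuples with taxon names at the leaves (a representation chosen for this example). The recursion visits children before their parent, which is the post-order traversal; a step is counted whenever the two child state sets do not intersect.

```python
def fitch_score(tree, states):
    """Return (state set, steps) for one unordered, equally weighted character."""
    if isinstance(tree, str):  # leaf: its observed state, no steps yet
        return {states[tree]}, 0
    left, right = tree
    lset, lsteps = fitch_score(left, states)
    rset, rsteps = fitch_score(right, states)
    common = lset & rset
    if common:  # intersection non-empty: no change needed at this node
        return common, lsteps + rsteps
    return lset | rset, lsteps + rsteps + 1  # empty intersection: one step

def tree_length(tree, alignment):
    """Sum the Fitch steps over every column of an alignment (taxon -> sequence)."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_score(tree, {t: s[i] for t, s in alignment.items()})[1]
        for i in range(n_sites)
    )

alignment = {"A": "GA", "B": "GA", "C": "TA", "D": "TC"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
```

On this toy alignment `tree_1` needs 2 steps and `tree_2` needs 3, so `tree_1` is the more parsimonious of the two. The position of the root does not change the count for unordered characters, so an arbitrary rooting is enough for scoring.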

18

How can we speed up MP analysis?

  • remove invariant characters

  • remove autapomorphic characters

  • remove heavily gapped characters

  • reuse identical character-state patterns
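Two of these shortcuts are easy to illustrate: dropping invariant columns and collapsing identical column patterns so that each pattern is scored once and multiplied by its count. The sketch assumes the alignment is a dict of equal-length sequences; the function name is hypothetical.

```python
from collections import Counter

def compress_patterns(alignment):
    """Return (taxa, Counter of column patterns), skipping invariant columns."""
    taxa = sorted(alignment)
    columns = zip(*(alignment[t] for t in taxa))
    patterns = Counter(col for col in columns if len(set(col)) > 1)
    return taxa, patterns

taxa, patterns = compress_patterns(
    {"A": "GAGA", "B": "GAGA", "C": "TATA", "D": "TATC"}
)
# patterns: ("G","G","T","T") seen twice, ("A","A","A","C") seen once
```

The tree length is then the sum of `count * steps(pattern)` over the distinct patterns, which gives the same ranking of trees as scoring every column separately.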

19

How can we estimate branch lengths in Parsimony?

  • perform a pre-order traversal

  • identify all possible character state reconstructions

20

What can transformation costs reflect?

  • nucleotide changes required for amino acid substitutions

  • empirical substitution matrices

In parsimony: frequent substitutions are down-weighted (given a lower cost)

In alignment: frequent substitutions are up-weighted (given a higher score)

21

Sankoff Algorithm

  • generalises parsimony with substitution cost matrices

  • uses dynamic programming

Allows:

  • weighted transformation

  • polytomies

  • branch length estimation
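The dynamic-programming idea can be sketched as follows, assuming trees written as nested tuples with taxon names at the leaves and a cost matrix stored as a dict of dicts. The transition/transversion costs (1 and 2) are illustrative values, not taken from the notes. Because the recursion loops over any number of children, polytomies are handled without extra code.

```python
import math

def sankoff(tree, states, alphabet, cost):
    """Return {state: minimum cost of the subtree if its root has that state}."""
    if isinstance(tree, str):  # leaf: only the observed state is allowed
        return {s: (0 if s == states[tree] else math.inf) for s in alphabet}
    child_tables = [sankoff(child, states, alphabet, cost) for child in tree]
    # For each parent state, add the cheapest option for every child.
    return {
        s: sum(min(cost[s][t] + table[t] for t in alphabet) for table in child_tables)
        for s in alphabet
    }

alphabet = "ACGT"
purines = set("AG")

def ts_tv_cost(s, t):
    """Illustrative weights: transitions cost 1, transversions cost 2."""
    if s == t:
        return 0
    return 1 if (s in purines) == (t in purines) else 2

weighted = {s: {t: ts_tv_cost(s, t) for t in alphabet} for s in alphabet}
unit = {s: {t: int(s != t) for t in alphabet} for s in alphabet}

tree = (("A", "B"), ("C", "D"))
states = {"A": "A", "B": "G", "C": "C", "D": "T"}
weighted_length = min(sankoff(tree, states, alphabet, weighted).values())
unit_length = min(sankoff(tree, states, alphabet, unit).values())
```

With unit costs the result (3) matches the Fitch count, which shows that Sankoff generalises equal-weight parsimony; with the weighted matrix the same character costs 4 because the change between the purine pair and the pyrimidine pair is a transversion.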

22

Limitations of Maximum Parsimony

  • ignores much of the data: only parsimony-informative characters affect the result

  • no explicit evolutionary model

  • sensitive to long-branch attraction

  • heuristic searches may miss optimal trees

  • often multiple equally parsimonious solutions