Depict the evolutionary process as a branching process.
One lineage splits into two daughter lineages.
The tree sets out the expected relationships among taxa.
A bifurcating tree is a model that summarizes evolution.
Phylogenetics is rich in jargon; familiarity is needed.
Units at the tips of the tree can be species, higher taxa (genera, families, etc.), individuals, samples, or gene lineages.
Fossils: a direct window into the past.
Ideal: complete sequence of fossils placed unambiguously along branches.
Reality: fossilization is rare; incomplete fossil records.
Phylogenetics: using characteristics of extant species to reconstruct evolutionary history.
Homology: Similarity by descent from a common ancestor.
Example: Tetrapod limbs are homologous characters inherited from a common ancestral tetrapod.
Homologous characters can be shaped by natural selection into different forms but retain a common underlying structure.
Convergence: Similarity by independent evolution.
Example: Wings in bats and birds evolved separately.
Wing structures are different in birds and bats, indicating convergent evolution.
Parsimony is related to Occam's razor: the simplest explanation is preferred when there is nothing else to go on.
Fewer ad hoc assumptions.
60 species of macropods, 11 arboreal (tree kangaroos).
Question: Evolutionary history of arboreal lifestyle.
Hypotheses:
Each tree kangaroo species independently evolved a tree-dwelling lifestyle (11 changes).
The ancestral macropod was a ground dweller, and tree dwelling evolved once in the common ancestor of tree kangaroos (1 change).
The ancestral macropod was a tree dweller, with a change to ground dwelling at the base of the phylogeny, and then a change back to tree dwelling in the tree kangaroo ancestor (2 changes).
The second hypothesis is the most parsimonious.
Example: Three hypothetical species (A, B, C) with seven phenotypic characters.
Binary characters: presence or absence.
Outgroup: closely related but not part of the ingroup; lacks derived characters.
Process:
List all possible branching arrangements.
Determine the number of character changes needed to support each arrangement.
Choose the tree that requires the fewest changes.
For three species, there are only three possible branching arrangements.
The tree requiring the fewest evolutionary changes is considered the most parsimonious.
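The three-step process above can be sketched in a few lines. This is a minimal illustration, not the lecture's own worked example: the seven-character matrix is hypothetical, trees are written as nested tuples, and changes are counted with the Fitch small-parsimony procedure, which is one standard way of counting the minimum number of changes on a given tree.

```python
def fitch(tree, states):
    """Fitch pass for one character: return (possible ancestral states, changes)."""
    if isinstance(tree, str):              # a tip: its observed state, no changes
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:                             # children agree: no change needed here
        return shared, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, matrix):
    """Total number of changes over all characters for one tree."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch(tree, {taxon: row[i] for taxon, row in matrix.items()})[1]
        for i in range(n_chars)
    )


# Hypothetical data: seven binary characters (1 = present, 0 = absent).
# The outgroup lacks every derived character.
MATRIX = {
    "Out": "0000000",
    "A":   "1111010",
    "B":   "1110110",
    "C":   "1000101",
}

# Step 1: list all three branching arrangements of the ingroup.
TREES = {
    "((A,B),C)": ("Out", (("A", "B"), "C")),
    "((A,C),B)": ("Out", (("A", "C"), "B")),
    "((B,C),A)": ("Out", (("B", "C"), "A")),
}

# Step 2: count the changes each arrangement requires.
SCORES = {name: parsimony_score(tree, MATRIX) for name, tree in TREES.items()}

# Step 3: choose the arrangement with the fewest changes.
BEST = min(SCORES, key=SCORES.get)

print(SCORES)   # {'((A,B),C)': 8, '((A,C),B)': 11, '((B,C),A)': 10}
print(BEST)     # ((A,B),C)
```

With this made-up matrix, grouping A with B needs 8 changes, against 11 and 10 for the alternatives, so ((A,B),C) is the most parsimonious tree.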
Homoplasy: when a character must have evolved more than once given a particular phylogenetic tree.
The character is then convergent in at least two species.
Parsimony minimizes homoplasy.
Homologous characters (evolved once) are most useful.
As the number of species grows, it quickly becomes impractical to calculate parsimony scores for all possible trees, because the number of possible topologies increases massively.
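To see how fast tree space grows, the number of rooted bifurcating trees for n labelled tips can be computed as the product of the odd numbers up to 2n - 3. The function name below is just for illustration.

```python
def count_rooted_trees(n_taxa):
    """Number of rooted bifurcating trees for n labelled tips: 1 * 3 * 5 * ... * (2n - 3)."""
    count = 1
    for k in range(3, 2 * n_taxa - 2, 2):
        count *= k
    return count


for n in (3, 4, 5, 10, 20):
    print(n, count_rooted_trees(n))
```

Three species give 3 trees, four give 15, five give 105, and ten already give 34,459,425, which is why exhaustive scoring is only feasible for very small datasets.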
Heuristic Search:
Imagine tree space as a landscape with height representing parsimony.
Start with a random tree, calculate its parsimony score.
Make a small change to the tree (e.g., swapping two species).
Recalculate the parsimony score; keep the new tree if it's more parsimonious.
Repeat this process to "climb" the hill to the most parsimonious tree.
Problem: Getting stuck on local optima.
Solution: Start the process at multiple points to maximize the chance of finding the global optimum tree.
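The hill-climbing search with multiple starting points might look roughly like the sketch below. It makes several simplifying assumptions: the data matrix is hypothetical, the only move is swapping two tips (so each climb keeps the shape of its random starting tree), and the function names are made up for illustration. Phylogenetics software uses richer rearrangement moves, but the logic of "propose, score, keep if better, restart elsewhere" is the same.

```python
import random


def fitch(tree, states):
    """Fitch pass for one character: return (possible ancestral states, changes)."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:
        return shared, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, matrix):
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch(tree, {taxon: row[i] for taxon, row in matrix.items()})[1]
        for i in range(n_chars)
    )


def random_tree(taxa, rng):
    """Build a random bifurcating tree by repeatedly joining two random subtrees."""
    nodes = list(taxa)
    while len(nodes) > 1:
        a = nodes.pop(rng.randrange(len(nodes)))
        b = nodes.pop(rng.randrange(len(nodes)))
        nodes.append((a, b))
    return nodes[0]


def swap_tips(tree, x, y):
    """Return a copy of the tree with tips x and y exchanged."""
    if isinstance(tree, str):
        return y if tree == x else x if tree == y else tree
    return (swap_tips(tree[0], x, y), swap_tips(tree[1], x, y))


def hill_climb(taxa, matrix, rng, max_steps=200):
    """One climb: start at a random tree, accept only swaps that lower the score."""
    tree = random_tree(taxa, rng)
    best = parsimony_score(tree, matrix)
    for _ in range(max_steps):
        x, y = rng.sample(taxa, 2)
        candidate = swap_tips(tree, x, y)
        score = parsimony_score(candidate, matrix)
        if score < best:                   # lower score = more parsimonious
            tree, best = candidate, score
    return tree, best


def search(taxa, matrix, restarts=10, seed=0):
    """Run several climbs from different random starts and keep the best result."""
    rng = random.Random(seed)
    results = [hill_climb(taxa, matrix, rng) for _ in range(restarts)]
    return min(results, key=lambda result: result[1])


# Hypothetical data: five taxa, four binary characters.
TAXA = ["A", "B", "C", "D", "E"]
DATA = {"A": "1100", "B": "1100", "C": "0011", "D": "0011", "E": "0000"}

best_tree, best_score = search(TAXA, DATA)
print(best_tree, best_score)
```

A single climb can stall on a tree that no single swap improves; taking the best of several restarts reduces, but does not remove, the chance of reporting a local optimum.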
DNA carries a record of evolutionary change.
Mutations, natural selection, and advantageous changes are recorded in DNA sequences.
Multiple Sequence Alignment:
Arranging sequences to identify homologous characters.
Each site (column) in the alignment represents a homologous character.
Nucleotides (A, T, G, C) are the character states.
Grouping Species based on Sequence Similarity
Divide species into clades based on shared nucleotides at particular sites.
Alternative branching arrangements are possible, so the order in which sites are considered matters.
Distance matrix: a measure of overall similarity/distance between sequences for each pair of species.
Constructed using sequence data.
Algorithms like neighbour-joining group species with the shortest distances.
Used as a first approximation for large datasets.
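As an illustration of how a distance matrix is built from sequence data, the sketch below computes the proportion of differing sites (the p-distance) for each pair in a small alignment. The three sequences are hypothetical, gaps and ambiguous bases are not handled, and the clustering step of neighbour-joining itself is not shown.

```python
from itertools import combinations


def p_distance(seq1, seq2):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(seq1, seq2))
    return differences / len(seq1)


def distance_matrix(alignment):
    """Pairwise p-distances, keyed by (name1, name2) with names in sorted order."""
    return {
        (a, b): p_distance(alignment[a], alignment[b])
        for a, b in combinations(sorted(alignment), 2)
    }


# Hypothetical alignment of three species, ten sites each.
ALIGNMENT = {
    "sp1": "ACGTACGTAC",
    "sp2": "ACGTACGTTC",
    "sp3": "ACCTAGGTTC",
}

DISTANCES = distance_matrix(ALIGNMENT)
CLOSEST = min(DISTANCES, key=DISTANCES.get)
print(DISTANCES)
print(CLOSEST)   # ('sp1', 'sp2')
```

Here sp1 and sp2 differ at one site in ten, so they have the shortest distance and would be the first pair a distance method considers joining.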
High rate of change in DNA sequences erases previous history of change at a site.
Assuming one difference equals one evolutionary event underestimates the amount of change, because several changes at the same site are seen as a single difference or as none.
Estimate the probability of change from one base to another along a branch.
Depends on:
Rate of change (μ): changes per unit time.
Time elapsed (t).
Evolutionary distance: μ × t
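One concrete way to relate observed differences to evolutionary distance (rate times time) is the Jukes-Cantor model, which is an assumption here since the notes do not name a specific model. It treats all bases and all changes as equally likely, and gives a simple formula for converting the observed proportion of differing sites into an estimated number of substitutions per site.

```python
import math


def jc69_distance(p):
    """Estimated substitutions per site from the observed proportion of differences p."""
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1 - 4 * p / 3)


def jc69_expected_p(d):
    """Expected proportion of differing sites after d substitutions per site."""
    return 0.75 * (1 - math.exp(-4 * d / 3))


for p in (0.05, 0.30, 0.60):
    print(p, round(jc69_distance(p), 4))
```

The corrected distance is always at least as large as the raw proportion of differences, and the gap widens as sequences become more divergent: an observed 30% difference corresponds to roughly 0.38 substitutions per site under this model.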
Substitution models account for:
Redundancy in the genetic code (changes in the third codon position are less likely to be deleterious).
Transition-transversion bias (transitions are more frequent than transversions).
A transition is a change from purine to purine (A to G or G to A) or from pyrimidine to pyrimidine (T to C or C to T). Transitions are often less disruptive because they are more likely to produce a codon for the same amino acid.
A transversion is a change from a purine to a pyrimidine or vice versa. Transversions tend to be less frequent and can have greater effects on the resulting protein.
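The two classes of change can be told apart with a simple lookup. The sketch below classifies each differing site between two aligned sequences; the sequences are hypothetical and only the four standard bases are handled.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_change(base1, base2):
    """Return 'none', 'transition' or 'transversion' for a pair of bases."""
    if base1 == base2:
        return "none"
    pair = {base1, base2}
    if pair <= PURINES or pair <= PYRIMIDINES:   # both in the same chemical class
        return "transition"
    return "transversion"


def count_changes(seq1, seq2):
    """Count (transitions, transversions) between two aligned sequences."""
    kinds = [classify_change(a, b) for a, b in zip(seq1, seq2)]
    return kinds.count("transition"), kinds.count("transversion")


print(count_changes("AACGT", "GACTA"))   # (1, 2)
```

The ratio of the two counts is one of the quantities a substitution model with transition-transversion bias has to estimate from the data.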
Maximum Likelihood:
Finds the set of parameters (topology, branch lengths, substitution model parameters) that maximizes the probability of seeing the observed sequence data.
Hill-climbing process as with parsimony.
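A full maximum likelihood analysis optimizes topology, branch lengths and model parameters together, which is too much for a short example. The sketch below shows the idea on the smallest possible case: a single branch length between two sequences, under the Jukes-Cantor model (an assumption, chosen for simplicity), found by scanning a grid of candidate values. The sequences are hypothetical.

```python
import math


def jc69_log_likelihood(d, n_same, n_diff):
    """Log-likelihood of the data given distance d (substitutions per site)."""
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 * (0.25 + 0.75 * decay)   # site shows the same base in both sequences
    p_diff = 0.25 * (0.25 - 0.25 * decay)   # site shows one specific pair of different bases
    return n_same * math.log(p_same) + n_diff * math.log(p_diff)


def ml_distance(seq1, seq2, step=0.001, max_d=3.0):
    """Grid search for the distance that maximizes the likelihood."""
    n_diff = sum(a != b for a, b in zip(seq1, seq2))
    n_same = len(seq1) - n_diff
    grid = [step * k for k in range(1, int(max_d / step) + 1)]
    return max(grid, key=lambda d: jc69_log_likelihood(d, n_same, n_diff))


# Hypothetical pair of aligned sequences differing at 30 of 100 sites.
SEQ1 = "A" * 100
SEQ2 = "A" * 70 + "G" * 30

print(ml_distance(SEQ1, SEQ2))
```

For these sequences the likelihood peaks at about 0.383 substitutions per site, noticeably more than the 0.30 differences actually observed, because the model allows for changes that have been overwritten.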
Bayesian Phylogenetics:
Combines the likelihood of the data with prior information.
Estimates posterior probabilities by updating prior beliefs with new data.
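The updating step is Bayes' rule: the posterior for each hypothesis is proportional to its prior multiplied by the likelihood of the data under it. The sketch below applies this to three candidate topologies. The prior and likelihood values are invented for illustration; in real analyses the space of trees is far too large to enumerate, so the posterior is usually approximated by sampling.

```python
def posterior(priors, likelihoods):
    """Posterior probability of each hypothesis: prior * likelihood, renormalized."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())
    return {h: value / evidence for h, value in joint.items()}


# Hypothetical values: equal prior belief in each topology,
# and a likelihood four times higher for the first one.
PRIORS = {"((A,B),C)": 1 / 3, "((A,C),B)": 1 / 3, "((B,C),A)": 1 / 3}
LIKELIHOODS = {"((A,B),C)": 0.008, "((A,C),B)": 0.002, "((B,C),A)": 0.002}

POSTERIOR = posterior(PRIORS, LIKELIHOODS)
for topology, probability in POSTERIOR.items():
    print(topology, round(probability, 3))
```

With equal priors the posterior simply follows the likelihoods (about 0.667, 0.167 and 0.167 here); an informative prior would shift these values.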
Classification of Primates:
Traditional view: Prosimians (lemurs, lorises, tarsiers) vs. higher primates (monkeys, apes).
Molecular data indicate that tarsiers group with simiiforms (monkeys and apes), not strepsirrhines (lemurs and lorises).
Prosimians are therefore paraphyletic.
New classification: Strepsirrhines (lemurs, lorises) and Haplorrhines (tarsiers, monkeys, apes) are monophyletic clades.
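Whether a named group is monophyletic can be checked mechanically: the group is monophyletic if some node in the tree has exactly that set of tips as its descendants. The sketch below encodes the molecular tree described above as nested tuples, simplified to five tip groups, and tests the three groupings.

```python
def tips(tree):
    """Set of tip names below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return tips(tree[0]) | tips(tree[1])


def clades(tree):
    """Yield the tip set of every node in the tree."""
    yield tips(tree)
    if not isinstance(tree, str):
        yield from clades(tree[0])
        yield from clades(tree[1])


def is_monophyletic(tree, group):
    """True if some node has exactly the members of the group as its tips."""
    return frozenset(group) in set(clades(tree))


# Simplified primate tree following the molecular result.
PRIMATES = (("lemurs", "lorises"), ("tarsiers", ("monkeys", "apes")))

print(is_monophyletic(PRIMATES, ["lemurs", "lorises", "tarsiers"]))   # False: prosimians
print(is_monophyletic(PRIMATES, ["lemurs", "lorises"]))               # True: strepsirrhines
print(is_monophyletic(PRIMATES, ["tarsiers", "monkeys", "apes"]))     # True: haplorrhines
```

This check only separates monophyletic from non-monophyletic groups; it does not by itself distinguish paraphyly from polyphyly.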
Converting phylograms (branch lengths reflect amount of evolution) into chronograms/time trees (branch lengths proportional to time).
Molecular Clock
Ideal: DNA substitutions tick over at a regular constant rate.
Reality: Substantial variation in the rate of molecular evolution across the tree and through time.
Associated with biological, life history, and environmental characteristics.
Calibration Using External Data
Fossil Record: Fossil age estimate provides a minimum age bound.
Geological/Biogeographic Events: Island formation puts a maximum bound on divergence.
Hawaiian Islands: Precisely estimated ages of formation used to date divergences of endemic species.
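The arithmetic behind a calibration can be sketched under the idealized assumption of a constant rate (a strict clock). Two lineages that split t time units ago are separated by two branches of length t, so a pairwise distance d implies a rate of d / (2t) per lineage. All numbers below are hypothetical, and the calibration age is treated as the actual divergence time, although an island age strictly gives only a maximum bound.

```python
def calibrate_rate(distance, calibration_age):
    """Substitutions per site per unit time, per lineage, from one dated split."""
    return distance / (2.0 * calibration_age)


def divergence_time(distance, rate):
    """Age of a split from a pairwise distance, assuming the same constant rate."""
    return distance / (2.0 * rate)


# Hypothetical calibration: two island endemics differ by 0.10 substitutions
# per site, and their island formed 5 million years ago.
rate = calibrate_rate(0.10, 5.0)          # per site per million years

# Hypothetical undated pair separated by 0.04 substitutions per site.
age = divergence_time(0.04, rate)
print(rate, age)
```

With these made-up values the rate is 0.01 substitutions per site per million years and the undated split is placed at about 2 million years; rate variation across the tree is what makes real dating analyses more involved than this.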
Uncertainty in Age Estimates
Arises from variation in molecular clock, dating of fossils, etc.
Represented by uncertainty bounds on the age estimates of nodes.