Lecture 4: Systematics and Phylogenetics
Learning Objectives
Describe the Linnaean system of classification and explain its limitations.
Analyze a phylogenetic tree for a group of organisms and describe the evolutionary history it portrays.
Determine if a group of organisms represents a monophyletic, polyphyletic, or paraphyletic taxon.
Evaluate evidence that morphological structures are homologous in two or more species or representatives of higher groups.
Explain the advantages and disadvantages of using molecular sequence data in a phylogenetic analysis.
Generate a parsimonious phylogenetic tree using cladistic methods to analyze a matrix of character states in a set of organisms.
Develop a phylogenetic hypothesis that can explain where in an evolutionary lineage a particular trait evolved.
Provide support for the idea that the vertical inheritance of genes is not the only mechanism through which genes are transferred from one organism to another.
The Malaria Problem
Malaria puzzled scientists for thousands of years; Hippocrates linked fevers and swollen spleens to people living near malodorous marshes.
The name malaria derives from the Italian mala aria, “bad air.”
By 1900, scientists had established that mosquitoes transmit the malaria parasite to humans; the mosquitoes act as vectors.
Until the 1920s, Europeans attributed malaria to a single mosquito species, Anopheles maculipennis, but the observed pattern of malaria incidence did not fit a single-species explanation.
Variability in both the mosquitoes and malaria incidence suggested multiple factors; Dutch and French researchers identified two forms of this supposed single species that differed in their capacity to carry malaria.
Practical takeaway: knowing which mosquito species carry malaria lets countermeasures be targeted (for example, which areas to avoid and where to direct control efforts).
Nomenclature and Classification
Carolus Linnaeus: founder of modern taxonomy; system of binomial nomenclature.
Binomial nomenclature = species are assigned a Latinized two-part name (binomial): first part is the genus, second part is the specific epithet (species name).
For bacteria, naming uses a combination of the generic name and the specific epithet to yield a unique species name.
Examples:
Ursus maritimus = polar bear
Ursus arctos = brown bear
Taxonomic Hierarchy
Similarity tends to increase as you move downward in the hierarchy; closely related organisms share more characteristics.
Traits used to describe organisms should be robust; avoid relying on size or uncertain characteristics as defining traits.
Example consideration: Loxodonta africana (African elephant) and other elephants share many traits; classification reflects shared derived features rather than uncertain attributes like size alone.
Phylogenetic Trees
Phylogenetic trees are historical hypotheses about evolutionary relationships; they are built to reflect likely relationships among species and higher groups.
They are testable and revisionary as new data become available.
The breadth of analyses can be scaled to fit research questions.
Reading trees involves recognizing taxa at tips, nodes, and branches, and interpreting ancestry and descent.
First the Basics
Taxa: named classification units assigned to individuals or groups; organisms at the tips of branches.
Tip: the living or fossil group at the end of a branch (e.g., species or lineage).
Node: a speciation event or ancestral split; represents a common ancestor.
Root: the most recent common ancestor of all taxa in the tree.
Branch: a lineage segment between nodes; represents evolutionary time and divergence.
Nodes and Branches
Illustration (conceptual): a tree with labeled tips, nodes, branches, and a root.
Tip examples: Leopard, Domestic cat.
Shared history of B and C indicates a common ancestor before their divergence.
Leopard and cat may be sister taxa (share a recent common ancestor).
Node = speciation event; Root = common ancestor; Branch = ancestral lineage leading to subsequent divergences.
Terms: Taxa, Phylogeny, most recent common ancestor (MRCA).
Taxa and Examples (Evolutionary Relationships)
Examples of inferred relationships:
Panthera pardus (leopard), Mephitis mephitis (striped skunk), Lutra lutra (European otter)
Canis familiaris (domestic dog), Canis lupus (wolf)
Estimated MRCA of dog and wolf; MRCA of otter, dog, and wolf; and roots leading to broader clades (e.g., Carnivora, Felidae, Canidae, Mustelidae).
These illustrate how nodes mark ancestral relationships and how clades form from common ancestors.
Monophyletic Groups (Clades)
Monophyletic taxon (a clade): an ancestral species and all its descendants.
Shared Derived Characters (synapomorphies): traits shared by members of a clade because their common ancestor possessed the trait.
Why clades matter: they represent natural evolutionary units reflecting true descent.
Phylogeny Continued: Monophyly, Polyphyly, Paraphyly
Monophyletic taxon (clade): includes an ancestral species and all of its descendants.
Polyphyletic taxon: includes species from different evolutionary lineages; the most recent common ancestor is not included.
Paraphyletic taxon: includes an ancestral species and only some of its descendants.
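The monophyly test can be checked mechanically on a rooted tree. The following is a minimal Python sketch, assuming the tree is written as nested tuples with tip names as strings. The small carnivore tree and the helper names (`tips`, `mrca_tips`, `is_monophyletic`) are illustrative assumptions, not part of the lecture.

```python
# Hypothetical rooted tree as nested tuples; strings are tips.
TREE = (("leopard", "cat"), (("dog", "wolf"), ("otter", "skunk")))

def tips(node):
    """Return the set of tip names at or below a node."""
    if isinstance(node, str):
        return {node}
    return set().union(*(tips(child) for child in node))

def mrca_tips(node, group):
    """Return all tips descending from the MRCA of `group`."""
    if not isinstance(node, str):
        for child in node:
            if group <= tips(child):
                return mrca_tips(child, group)  # the MRCA lies deeper in the tree
    return tips(node)

def is_monophyletic(tree, group):
    """A group is monophyletic if it equals ALL descendants of its MRCA."""
    group = set(group)
    return mrca_tips(tree, group) == group

print(is_monophyletic(TREE, {"dog", "wolf"}))           # True
print(is_monophyletic(TREE, {"dog", "wolf", "otter"}))  # False
```

In this assumed tree, the group dog + wolf + otter fails the test because its MRCA also gave rise to the skunk; leaving out a descendant is what makes a group paraphyletic rather than monophyletic.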
Node Rotation and Tree Orientation
Node rotation is arbitrary; re-arranging the positions of sister groups around a node does not alter evolutionary relationships.
Tree orientation (up/down) is also arbitrary; only branching order and grouping matter for interpretation.
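One way to convince yourself that node rotation changes nothing is to compare the set of clades implied by two drawings of a tree. This is a small Python sketch under the assumption that trees are written as nested tuples; the three example trees and the function names are hypothetical.

```python
def clade_sets(node):
    """Return (tips below node, set of every clade at or below node)."""
    if isinstance(node, str):
        tip = frozenset([node])
        return tip, {tip}
    below, found = frozenset(), set()
    for child in node:
        child_tips, child_clades = clade_sets(child)
        below |= child_tips
        found |= child_clades
    found.add(below)
    return below, found

def same_relationships(tree_a, tree_b):
    """Two drawings show the same relationships if they imply the same clades."""
    return clade_sets(tree_a)[1] == clade_sets(tree_b)[1]

original = (("human", "chimpanzee"), "gorilla")
rotated = ("gorilla", ("chimpanzee", "human"))     # nodes rotated only
regrouped = (("human", "gorilla"), "chimpanzee")   # branching order changed

print(same_relationships(original, rotated))    # True
print(same_relationships(original, regrouped))  # False
```

Rotation reorders children but leaves every clade intact, whereas regrouping produces a different clade (human + gorilla) and therefore a different hypothesis.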
Phylogenetic Trees Continued: Major Clades and Examples
ANTHROPOIDEA (New World monkeys, Old World monkeys, and hominoids) vs HOMINOIDEA (apes and humans only).
Gibbons, Orangutans, Humans (and other great apes) within Hominoidea.
HOMINIDAE, HOMININAE, HOMININI: successively narrower nested clades leading from the great apes to humans.
Example narrative: the common ancestor of all anthropoids sits at the root; successive nodes represent cladogenesis events producing descendant clades (e.g., New World monkeys vs Old World monkeys plus hominoids; then, within hominoids, the splits leading to orangutans, gorillas, chimpanzees, and humans); bipedal locomotion arises in the hominin lineage, nested within the African apes.
Data Sources for Phylogenetic Analyses
Linnaeus classified organisms based on morphology and phenotypic similarities/differences (e.g., Birds as oviparous with feathers, two wings, two feet, and a bony beak).
Core premise: phenotypic similarities reflect underlying genetic similarities (homology).
Homology (literally, “the study of likeness”): characters are similar because they were inherited from a common ancestor.
Homologous vs Non-homologous (Analogous) Similarities
Homologous traits: similarity due to shared ancestry; not necessarily identical in appearance.
Analogous traits (homoplasy): phenotypic similarity that evolved independently in different lineages (convergent evolution).
Examples:
Shark and whale: streamlined shapes and tails; not homologous (one is fish, the other a mammal).
Pitcher plant leaves modified into pitchers, Venus flytrap leaves modified into jaws, and cactus spines are all derived from leaves: homologous structures that look very different because each is adapted to a different function.
Convergence occurs when similar environmental pressures yield similar solutions in different lineages.
Assessing Homology
Differentiating homologous from homoplastic traits requires multiple lines of evidence (morphology, development, genetics, etc.).
Example: digits and limb bones (radius, ulna, humerus) across vertebrates may show different patterns of development and fusion, informing homology vs analogy.
Differentiation Through Behavior
Behavior can reveal differences where external morphology is insufficient.
Example: Hyla versicolor vs Hyla chrysoscelis (tree frogs) differ in mating calls; a prezygotic reproductive isolating mechanism.
Chromosome numbers can differ (H. chrysoscelis is diploid; H. versicolor is tetraploid); contributes to postzygotic isolation.
Molecular Sequencing and Phylogenetic Data
Modern systematists rely heavily on molecular characters (DNA/RNA sequences).
Changes in nucleotide sequences (substitutions, insertions, deletions) provide clues to relationships.
PCR (polymerase chain reaction) enables rapid amplification of specific DNA segments, even from minute samples (e.g., preserved museum specimens, some fossils).
Sequencing improvements reduce cost and increase accuracy; data are stored in online databases for comparison.
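To make "changes in nucleotide sequences" concrete, the simplest measure is the proportion of aligned sites at which two sequences differ. The sketch below assumes the sequences are already aligned, with `-` marking gaps from insertions or deletions; the sequences and the function name `p_distance` are made up for illustration.

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned sites that differ, skipping gap positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only sites where neither sequence has a gap.
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    differing = sum(a != b for a, b in compared)
    return differing / len(compared)

# Hypothetical aligned sequences, for illustration only.
reference = "ACGTACGTAC"
close_relative = "ACGTTCGTAC"    # one substitution
distant_relative = "ACG-ACGTAA"  # one deletion (skipped) and one substitution

print(p_distance(reference, close_relative))    # 0.1
print(p_distance(reference, distant_relative))  # 1 difference over 9 compared sites
```

Skipping gapped sites is one common convention, chosen here for simplicity; real analyses often treat insertions and deletions as informative characters in their own right.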
Molecular Sequencing: Advantages and Drawbacks
Advantages:
Abundant data: every base can serve as a character for analysis.
Can compare distantly related organisms lacking obvious morphological similarity.
Nucleic acids are less influenced by environmental factors that cause non-genetic morphological variation.
Drawbacks:
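Potential biases and model-dependence in analyses; specifics depend on the methods used. With only four possible bases per site, the same substitution can also arise independently in different lineages, so homoplasy is hard to detect.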
Quick reflection: what are the limitations of relying solely on molecular data?
Traditional Classification and Paraphyletic Groups
Traditional systematics often classified by phenotypic divergence and branching patterns.
Classifications did not always strictly reflect actual branching evolution, leading to paraphyletic groupings in some traditional schemes.
The Cladistic Revolution
Cladistics = classifications based solely on evolutionary relationships, emphasizing branching patterns over overall morphology.
Ignored or deprioritized overall morphological divergence in favor of shared derived characters (synapomorphies).
Key terms:
Character: a heritable attribute (e.g., presence/absence of a trait).
Character states: ancestral vs derived states.
Ancestral character state: trait present in a distant common ancestor.
Derived character state (apomorphy): a new version of the character that arose in the most recent common ancestor of a group.
Synapomorphy: a derived state shared by two or more species due to inheritance from their last common ancestor.
Example concept: ancient fish fins vs limbs illustrate how derived states led to major evolutionary innovations.
Distinguishing Ancestral and Derived Character States
A crucial question: how to determine the direction of character evolution?
Outgroup comparison is a common technique: compare the group of interest to a related species outside the study group to infer ancestral vs derived states.
Outgroup Comparison
Outgroup: species outside the focal group used to infer ancestral states.
By comparing characters in the ingroup to the outgroup, researchers infer which states are ancestral (shared with the outgroup) and which are derived (absent in the outgroup).
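The outgroup rule can be written as a short procedure: whatever state the outgroup shows is treated as ancestral, and any ingroup taxon with a different state is scored as derived. This Python sketch uses a tiny hypothetical matrix (True = trait present); the function name `derived_taxa` is an assumption for illustration.

```python
# Hypothetical scores: True = trait present, False = trait absent.
MATRIX = {
    "lancelet": {"vertebral column": False, "jaws": False},  # outgroup
    "lamprey":  {"vertebral column": True,  "jaws": False},
    "shark":    {"vertebral column": True,  "jaws": True},
}

def derived_taxa(matrix, outgroup):
    """For each character, list ingroup taxa whose state differs from the outgroup's."""
    ancestral = matrix[outgroup]
    result = {}
    for character, ancestral_state in ancestral.items():
        result[character] = [
            taxon for taxon, states in matrix.items()
            if taxon != outgroup and states[character] != ancestral_state
        ]
    return result

print(derived_taxa(MATRIX, "lancelet"))
# {'vertebral column': ['lamprey', 'shark'], 'jaws': ['shark']}
```

The inference is only as good as the outgroup: if the outgroup itself has changed state since it diverged, the polarity assigned to that character will be wrong.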
Using Synapomorphies to Reconstruct Evolutionary History
Cladistic method: group species that share derived character states (synapomorphies).
Why not rely on ancestral traits? Ancestral traits do not define the monophyly of a group; derived traits provide the signal of branching relationships.
Output: a phylogenetic tree illustrating the hypothesized sequence of evolutionary branching that produced the study organisms.
Key reminders:
A common ancestor is hypothesized at each node.
The node and all branches stemming from it form a strictly monophyletic group.
What It Looks Like in Practice: Step 1–Step 5
Step 1: Choose an organism set for the example (nine vertebrate groups): lampreys, sharks and relatives, bony fishes, amphibians, turtles, lizards (and snakes), crocodilians (including alligators), birds, and mammals. Outgroup: lancelets (Chordata, Cephalochordata).
Step 2: Choose characters for the analysis (examples):
Vertebral column
Jaws
Swim bladder or lungs
Paired limbs (with one bone linking each limb to body)
Extraembryonic membranes (amnion, etc.)
Mammary glands
Dry, scaly skin somewhere on the body
One opening on each side of skull in front of the eye
Feathers
Step 3: Score the characters (states for each group; e.g., −, +, or intermediate): Lancelets, Lampreys, Sharks, Bony fishes, Amphibians, Mammals, Lizards, Crocodilians, Birds.
Example table highlights which states are present (+) or absent (−) for each group across the nine characters.
Step 4: Construct the phylogenetic tree from information in the table by grouping organisms that share derived character states.
Step 5: Add the rest of the groups to complete the tree; interpret results; consider incorporating molecular sequence data to supplement the morphological matrix.
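Steps 3 and 4 can be sketched in code. The example below assumes a conventional textbook-style scoring of the nine characters (the actual table from the slides is not reproduced in these notes, so the scores are an assumption), treats each derived state as evidence for one clade, and checks that the resulting clades nest without conflict. The names `DERIVED`, `compatible`, and `clades_from_synapomorphies` are illustrative.

```python
TAXA = ["lancelets", "lampreys", "sharks", "bony fishes", "amphibians",
        "mammals", "turtles", "lizards", "crocodilians", "birds"]

# ASSUMED scoring: each character maps to the groups scored "+" (derived state).
DERIVED = {
    "vertebral column": TAXA[1:],
    "jaws": TAXA[2:],
    "swim bladder or lungs": TAXA[3:],
    "paired limbs": TAXA[4:],
    "extraembryonic membranes": TAXA[5:],
    "mammary glands": ["mammals"],
    "dry, scaly skin": ["turtles", "lizards", "crocodilians", "birds"],
    "opening in front of eye": ["crocodilians", "birds"],
    "feathers": ["birds"],
}

def compatible(derived):
    """True if every pair of groups is nested or disjoint (no conflicting characters)."""
    groups = [frozenset(taxa) for taxa in derived.values()]
    return all(a <= b or b <= a or not (a & b) for a in groups for b in groups)

def clades_from_synapomorphies(derived):
    """List (character, clade) pairs from the most inclusive clade to the least."""
    clades = [(character, frozenset(taxa)) for character, taxa in derived.items()]
    return sorted(clades, key=lambda item: -len(item[1]))

print(compatible(DERIVED))  # True
for character, clade in clades_from_synapomorphies(DERIVED):
    print(f"{character}: {sorted(clade)}")
```

Because the groups are perfectly nested here, reading the list from top to bottom traces the tree from the root toward birds. Real matrices usually contain conflicting characters, which is why an optimality criterion is needed to choose among trees.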
Optimizing Phylogenetic Trees
Real analyses are far more complex than simplified examples.
Often reliant on data from hundreds of characters across many species.
After scoring, computer programs generate multiple alternative trees.
Example scales of possibilities:
Five species can yield about 15 possible trees.
Fifty species can yield about $3 \times 10^{74}$ possible trees (for unrooted branching trees, the count for $n$ species is $(2n-5)!!$).
Question: How to decide which hypothesis best represents the data?
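The rapid growth in the number of candidate trees follows from a standard counting result: for $n$ species there are $(2n-5)!!$ distinct unrooted, fully bifurcating trees. A short Python sketch, with a function name chosen here for illustration, reproduces the five-species figure and shows the scale for fifty species.

```python
def unrooted_tree_count(n_species):
    """Number of distinct unrooted bifurcating trees: (2n - 5)!! for n >= 3."""
    if n_species < 3:
        raise ValueError("need at least three species")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5.
    for factor in range(3, 2 * n_species - 4, 2):
        count *= factor
    return count

print(unrooted_tree_count(5))            # 15
print(f"{unrooted_tree_count(50):.2e}")  # on the order of 10^74
```

With numbers this large, programs cannot score every tree for realistic data sets and rely on search heuristics instead.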
Parsimony Approach
Principle of parsimony: traits are unlikely to evolve independently in separate lineages.
The best tree is the one that requires the smallest number of evolutionary changes to account for observed character states.
This minimizes homoplasies (convergent or parallel evolution).
Note: Parsimony is not always the best approach for molecular data; other methods may be preferred.
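Counting "the smallest number of evolutionary changes" on a given tree can be done with Fitch's algorithm, a standard procedure that these notes do not name; it is used here as one reasonable way to illustrate the idea. The sketch assumes binary trees written as nested tuples and a tiny hypothetical matrix of two characters (0 = ancestral, 1 = derived).

```python
def fitch(node, states):
    """Return (possible ancestral states, minimum changes) for one character."""
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:  # children agree: no extra change needed at this node
        return shared, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1  # one change required

def tree_length(tree, matrix):
    """Parsimony score: minimum changes summed over all characters."""
    return sum(fitch(tree, column)[1] for column in matrix)

# Hypothetical character matrix, for illustration only.
JAWS = {"lamprey": 0, "shark": 1, "frog": 1, "bird": 1}
LIMBS = {"lamprey": 0, "shark": 0, "frog": 1, "bird": 1}
MATRIX = [JAWS, LIMBS]

TREE_A = ("lamprey", ("shark", ("frog", "bird")))
TREE_B = (("lamprey", "frog"), ("shark", "bird"))

print(tree_length(TREE_A, MATRIX))  # 2
print(tree_length(TREE_B, MATRIX))  # 3
```

Tree A explains the data with one origin of each trait, while Tree B needs an extra, independent change, so parsimony prefers Tree A.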
Statistical Approaches
To address molecular data and rate variation, statistical methods are used (not just parsimony).
These include:
Maximum likelihood methods
Genetic distance methods
Each method has advantages and drawbacks depending on data and model assumptions.
Genetic Distance Method
Concept: genetic distances increase with evolutionary divergence; closely related species have smaller distances.
Process:
Compute genetic distances between pairs (and groups) of species.
Build trees so that branch lengths are proportional to the amount of genetic change since divergence.
Pros/Cons:
Con: not as powerful as maximum likelihood in some cases.
Pro: doesn’t rely on specific mutation likelihood models and requires less computing power.
Applying the Genetic Distance Method
Example with humans and great apes:
Identify the pair with the smallest genetic distance (e.g., chimpanzee and human).
Compute average distances from this cluster to other species (gorilla, orangutan).
Determine the outgroup (e.g., orangutan) based on distance patterns.
In a four-species example, these distances define the inferred phylogenetic tree.
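The clustering steps above correspond closely to average-linkage clustering (often called UPGMA); the notes do not name the algorithm, so treat this as one plausible reading. The Python sketch below uses made-up pairwise distances purely for illustration and returns only the branching order, not branch lengths.

```python
from itertools import combinations

NAMES = ["human", "chimpanzee", "gorilla", "orangutan"]

# HYPOTHETICAL pairwise distances (e.g., percent sequence difference).
DIST = {
    ("human", "chimpanzee"): 1.2,
    ("human", "gorilla"): 1.6,
    ("chimpanzee", "gorilla"): 1.7,
    ("human", "orangutan"): 3.1,
    ("chimpanzee", "orangutan"): 3.2,
    ("gorilla", "orangutan"): 3.1,
}

def distance(a, b):
    """Look up a pairwise distance regardless of the order of the pair."""
    return DIST.get((a, b), DIST.get((b, a)))

def upgma(names):
    """Repeatedly merge the two clusters with the smallest average distance."""
    members = {name: (name,) for name in names}  # cluster -> tips it contains

    def average(x, y):
        pairs = [distance(a, b) for a in members[x] for b in members[y]]
        return sum(pairs) / len(pairs)

    while len(members) > 1:
        x, y = min(combinations(members, 2), key=lambda pair: average(*pair))
        members[(x, y)] = members.pop(x) + members.pop(y)
    return next(iter(members))

print(upgma(NAMES))
# ('orangutan', ('gorilla', ('human', 'chimpanzee')))
```

With these assumed distances, human and chimpanzee join first, gorilla joins that cluster next, and orangutan is left as the most divergent lineage.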
Molecular Clocks
Concept: if mutation rates are approximately constant over time, DNA differences accumulate at a roughly steady rate, enabling dating of divergence events.
Definition: a molecular clock is a technique for dating divergence times based on the number of molecular differences.
Key ideas:
Large sequence differences imply ancient divergence; small differences imply recent divergence.
Different molecules evolve at different rates; each molecule can act as an independent clock.
Calibration uses fossil record divergence estimates or biogeographic data to translate genetic differences into time.
Example: mitochondrial DNA (mtDNA) is commonly used as a clock in certain lineages.
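A worked calculation shows how calibration turns sequence differences into time. The sketch assumes a strict clock in which differences per site accumulate along both diverging lineages, so $d \approx 2rt$; the numbers are hypothetical and the function names are chosen for illustration.

```python
def calibrate_rate(differences_per_site, known_divergence_time):
    """Per-lineage rate from a dated split (e.g., from fossils): r = d / (2t)."""
    return differences_per_site / (2 * known_divergence_time)

def divergence_time(differences_per_site, rate):
    """Time since two lineages split, assuming a constant rate: t = d / (2r)."""
    return differences_per_site / (2 * rate)

# HYPOTHETICAL calibration: two species differ at 4% of sites and the
# fossil record dates their split to 10 million years ago.
rate = calibrate_rate(0.04, 10.0)  # substitutions per site per million years

# A second pair differs at 1% of sites in the same molecule.
print(round(divergence_time(0.01, rate), 3))  # 2.5 (million years)
```

The estimate inherits every assumption of the calibration: if the rate differs between lineages or between molecules, the inferred date shifts accordingly.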
Phylogenetic Trees and the Comparative Method
The comparative method compares traits across species to assess homology and evolutionary origins.
Use phylogenetic context to infer when and where a trait appeared on the tree.
Example questions:
Did parental care behavior evolve independently in birds and crocodilians, or is it a synapomorphy?
Did most Mesozoic archosaurs care for their young like birds and crocodilians today?
Molecular Phylogenetic Analyses
Molecular data can pinpoint disease origins and track pathogen evolution.
Example: HIV strains:
HIV-1 is the more prevalent and virulent strain worldwide, common in central Africa.
HIV-2 occurs in West Africa.
Questions addressed: did these strains arise within human hosts or predate human transmission?
Horizontal Gene Transfer (HGT)
HGT differs from vertical gene transfer (parent to offspring).
Three major mechanisms in bacteria:
Conjugation
Transformation
Transduction
HGT appears to have occurred frequently in the history of life; estimates suggest that a substantial fraction of the genes in contemporary bacteria originated via HGT.
Implications:
Challenges traditional views of linear, vertical evolutionary histories.
Complicates reconstruction of phylogenetic relationships across distant taxa.
Notes on Key Concepts and Symbols
Monophyly: a group consisting of an ancestor and all its descendants.
Paraphyly: an ancestor and some, but not all, of its descendants.
Polyphyly: a group that does not include the most recent common ancestor of its members.
Synapomorphy: a shared derived character state that defines a clade.
Ancestral vs Derived: ancestral states are inherited from distant ancestors; derived states arise more recently within a lineage.
Outgroup: a lineage outside the group of interest used to infer ancestral character states.
Clade: synonymous with monophyletic group in this context.
Key Formulas and Notation (LaTeX)
Parsimony objective: $L(T) = \sum_{i} c_i(T)$, where $c_i(T)$ is the number of evolutionary changes for character $i$ on tree $T$; the best tree is $T^{*} = \arg\min_{T} L(T)$.
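Molecular clock concept: $d \approx 2rt$, where $d$ is the number of differences (or substitutions) per site between two sequences, $r$ is the substitution rate per site per unit time along each lineage, and $t$ is divergence time; the factor of 2 reflects changes accumulating on both lineages.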
Genetic distance models can yield branch lengths proportional to the amount of genetic change since divergence.
Quick Recap of the Big Picture
Systematics and phylogenetics seek to reconstruct evolutionary history using morphology and/or molecular data.
Cladistics emphasizes monophyletic groups and synapomorphies to infer relationships.
Different data sources (morphology, behavior, molecular sequences) offer complementary evidence; conflicts can arise and require integrated analysis.
Modern methods combine parsimony, likelihood, and distance-based approaches, often with molecular data and computational tools.
Horizontal gene transfer adds complexity to our understanding of lineage relationships, especially in microbes.
References to Examples from the Slides (Provided Context)
Malaria vector discovery and the role of Anopheles species in Europe and beyond.
The morphological examples (e.g., leaves modified into pitchers, fusion in limb bones) illustrating homologous vs analogous traits.
The stepwise vertebrate example illustrating character-state scoring from lancelets to birds and mammals.
The HIV example showing molecular phylogenetics in disease origins.
The illustrative tree sections showing anthropoid and hominid relationships to contextualize clades and nodes.
How to Use These Notes for Exam Prep
Focus on definitions (monophyly, paraphyly, polyphyly, synapomorphy, outgroup, ingroup).
Be able to read and interpret a phylogenetic tree: identify MRCA, sister groups, clades, and whether a taxon is monophyletic.
Distinguish homologous vs non-homologous (analogous) traits with reasoning based on ancestry and development.
Explain how data types (morphology vs molecular) influence tree construction and interpretation.
Understand methods: parsimony (with the caveat that it is not always the best approach for molecular data), maximum likelihood, and genetic distance approaches; know when each is advantageous.
Recognize the impact of HGT on deep evolutionary history and why it complicates tree-based narratives.