Preactivity 6 Notes: Phylogenetics

Phylogenetics and Character State Changes

Learning Objectives

  • Use a data matrix to map character state changes on a phylogenetic hypothesis.
  • Calculate the minimum number of character state changes using the data matrix and the phylogeny.
  • Calculate tree length and consistency index.
  • Determine which phylogenetic hypothesis is best supported or most parsimonious using tree length and consistency index.
  • Apply phylogenetics to solve real-world biological problems.

Real-World Application: Viral Origins

  • Phylogenies are crucial for determining the likely host and country of origin of pathogenic viruses, such as the virus that caused the COVID-19 pandemic.
  • Ancestry, as revealed by phylogenies, is a powerful tool for solving biological problems.
  • Phylogenies are essential in conservation biology (endangered species) and medicine (origins of viruses such as HIV and the virus that causes COVID-19).

Phylogenies and Viral Strains

  • Scientists build phylogenies for new strains of viruses, such as the virus that causes COVID-19, to understand how the strains are related.
  • Mutation is ongoing, so new strains continue to arise.
  • Phylogenies help determine the origin and spread of new strains.
  • This information is valuable for informing public health measures, such as travel restrictions.

Phylogenetic Hypotheses

  • Multiple phylogenetic hypotheses can exist for a group of taxa (e.g., virus strains).
  • The challenge is to determine which hypothesis is best supported by the evidence.

Tree Length

  1. Mapping Character States:

    • Observations from the data matrix (character states) are mapped onto the branches of the phylogenetic tree.
    • Character states for common ancestors are often missing and need to be inferred.
  2. Inferring Character States using Parsimony:

    • Various possibilities for ancestral character states are tested to find the most parsimonious combination.
    • The most parsimonious arrangement is the one that minimizes the number of character state changes.
  3. Calculating Tree Length:

    • Map all characters, indicate character states, and tally the number of changes on each possible phylogeny.
    • The tree length is the total number of character state changes for a given phylogeny.
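
The mapping and tallying steps above can be sketched in code. The notes do not name a specific algorithm; this sketch assumes Fitch-style parsimony counting on a rooted, fully bifurcating tree with unordered character states. The taxa, character states, and the two hypotheses are hypothetical and chosen only for illustration.

```python
def fitch_changes(tree, states):
    """Return (possible ancestral states, minimum changes) for one character.

    tree   : a taxon name (str) for a tip, or a 2-tuple (left, right)
             for a common ancestor
    states : dict mapping taxon name -> observed character state
    """
    if isinstance(tree, str):
        # Tip: the state is observed directly in the data matrix.
        return {states[tree]}, 0
    left_set, left_changes = fitch_changes(tree[0], states)
    right_set, right_changes = fitch_changes(tree[1], states)
    shared = left_set & right_set
    if shared:
        # Descendants can share a state: no extra change at this ancestor.
        return shared, left_changes + right_changes
    # Descendants disagree: one change is needed on a branch below this ancestor.
    return left_set | right_set, left_changes + right_changes + 1


def tree_length(tree, matrix):
    """Total character state changes summed over all characters."""
    return sum(fitch_changes(tree, states)[1] for states in matrix.values())


# Hypothetical data matrix: character -> {taxon: state}
matrix = {
    "tail color": {"A": "light", "B": "light", "C": "dark", "D": "dark"},
    "horns": {"A": "absent", "B": "present", "C": "present", "D": "present"},
    "segments": {"A": "few", "B": "few", "C": "many", "D": "many"},
}

# Two hypothetical phylogenetic hypotheses written as nested pairs.
hypothesis_1 = (("A", "B"), ("C", "D"))
hypothesis_2 = (("A", "C"), ("B", "D"))

print(tree_length(hypothesis_1, matrix))  # 3
print(tree_length(hypothesis_2, matrix))  # 5
```

Under these assumptions, hypothesis 1 needs fewer changes (3 versus 5), so it is the more parsimonious of the two.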

Minimum Number of Character State Changes

  1. Data Matrix:

    • This calculation relies solely on the data matrix; no phylogeny is needed.
  2. Calculation:

    • The minimum number of character state changes for a set of taxa is the sum of the minimum number of changes for each character.
    • For a character with n states, the minimum number of changes is n - 1.
      • Example: Tail color has 2 character states (light and dark). Minimum number of changes = 2 - 1 = 1.
  3. Total Minimum Changes:

    • Sum the minimum changes for all characters.
    • Example: If tail color, horns, and segments each have two character states, the total minimum changes = 1 + 1 + 1 = 3.
  4. Significance:

    • Represents the best-case scenario, independent of any specific phylogeny.
    • Used to assess how well a given hypothesis fits the data by comparing the tree length to this minimum value.
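
As a minimal sketch, the n - 1 rule can be applied to a data matrix directly. The matrix below is hypothetical: it reuses the tail color, horns, and segments characters from the example above, with made-up taxa and states.

```python
def minimum_changes(matrix):
    """Sum (number of distinct states - 1) over every character."""
    return sum(len(set(states.values())) - 1 for states in matrix.values())


# Hypothetical data matrix: character -> {taxon: state}
matrix = {
    "tail color": {"A": "light", "B": "light", "C": "dark", "D": "dark"},
    "horns": {"A": "absent", "B": "present", "C": "present", "D": "present"},
    "segments": {"A": "few", "B": "few", "C": "many", "D": "many"},
}

print(minimum_changes(matrix))  # 1 + 1 + 1 = 3
```

No tree appears anywhere in this calculation, which is why the value is the same for every hypothesis built from the same matrix.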

Consistency Index (CI)

  1. Formula:

    • The consistency index (CI) is calculated as the minimum number of character state changes divided by the tree length:
      CI = \frac{\text{minimum number of changes}}{\text{tree length}}
  2. Interpretation:

    • A CI of 1.0 means the tree length equals the minimum possible number of changes, so the hypothesis requires no extra changes.
    • A higher CI (closer to 1.0) suggests a better-supported, more parsimonious hypothesis.
  3. Example:

    • If the minimum number of changes is 3, and the tree lengths for three hypotheses are 4, 3, and 4, then the CIs are:
      • Hypothesis 1: CI = \frac{3}{4} = 0.75
      • Hypothesis 2: CI = \frac{3}{3} = 1.0
      • Hypothesis 3: CI = \frac{3}{4} = 0.75
    • Hypothesis 2 is the best supported: it has the shortest tree length (3) and therefore the highest CI (1.0).
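
The same comparison can be written as a short script. This is only a sketch: the function and variable names are illustrative, and the numbers are the ones from the example above.

```python
def consistency_index(minimum, tree_length):
    """CI = minimum number of changes / tree length."""
    return minimum / tree_length


minimum = 3  # minimum number of character state changes from the data matrix
tree_lengths = {"Hypothesis 1": 4, "Hypothesis 2": 3, "Hypothesis 3": 4}

# Compute the CI for each hypothesis, then pick the highest.
cis = {name: consistency_index(minimum, length)
       for name, length in tree_lengths.items()}
best = max(cis, key=cis.get)

for name, ci in cis.items():
    print(f"{name}: CI = {ci:.2f}")
print("Best supported:", best)
```

Because the minimum is the same for every hypothesis, ranking by highest CI gives the same order as ranking by shortest tree length.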

Common Mistakes

  1. Incorrect Approach:

    • Ignoring the phylogeny and common ancestors, and only considering the differences between taxa.
    • This approach will lead to incorrect answers in most cases.
  2. Correct Approach:

    • Taking the common ancestors into account and tracing changes along the branches from the root (bottom of the tree) to the tips (top).
  3. Conceptual Misunderstanding:

    • The incorrect approach implies that one taxon evolved directly into another (e.g., taxon C evolved into taxon D), which is a non-phylogenetic perspective.
    • The correct approach recognizes that taxa share a common ancestor (e.g., taxa C and D share a common ancestor).

Worksheet Instructions

  1. Data Matrix:

    • The worksheet includes a data matrix with character states for different taxa (e.g., capsule shape, spike protein number, envelope protein for taxa A, B, C, D, and E).
  2. Character State Mapping:

    • Map character states on the provided phylogeny to indicate where character states occur and where changes occur.
  3. Calculations:

    • Calculate the tree length, minimum number of character state changes, and consistency index for the assigned hypothesis, and record each value in the space provided on the worksheet.
  4. Submission:

    • Upload the completed worksheet as a PDF or JPEG file.
    • Ensure that the character mapping and all calculated values are clear and accurate.