Microbial Classification and Identification
Introduction
- At the end of this lecture, you should be able to:
- Define key terms used in microbial taxonomy.
- Describe the major characteristics used in bacterial taxonomy.
- Understand the methodology for studying molecular characteristics.
- Understand how classification schemes can be used to develop identification schemes.
Taxonomy: The Practice & Science of Classification
- Taxonomy has three components:
- Classification: Grouping microbes into taxa, based on their characteristics.
- Identification: Determining to which taxon an isolate belongs by determining the distinguishing characteristics of taxonomic groups.
- Nomenclature: Assigning names to taxonomic groups according to defined rules.
- Taxonomy facilitates:
- Organization of microbial diversity into an accessible, logical structure.
- Arrangement into groups (taxa) with commonly understood names.
- Identification of microorganisms (clinical, industrial, environmental).
- Research predictions or hypotheses based on knowledge of similar organisms.
Classification of Living Things
- All life is classified into 3 domains, then 23 divisions.
- Viruses are considered “non-living” and do not appear on the classification diagram.
- Carl Woese, Otto Kandler, and Mark Wheelis (1990) proposed the three-domain system: Archaea, Bacteria, and Eucarya.
Taxonomy: Classification - Methods/Systems
- Carl Woese used sequence analysis (16S rRNA) to compare sequences and construct a phylogenetic tree, showing the relatedness of all living organisms and proposing archaea as a third domain.
- This was a departure from the “kingdoms” previously used to classify life (eubacteria, archaea, animals, plants, fungi, protists, chromista), which were based on phenotype.
- The use of molecular techniques and genetic analyses has revolutionized taxonomy, classification, and our understanding of the evolution of life.
- Especially in microbiology, this involves the analysis of genes (DNA sequencing) and gene products (RNA, protein).
Taxonomy: An Organizational Framework
- First component of taxonomy: Classification: grouping microbes into taxa, based on their characteristics.
- Uses a combination of:
- Classical characteristics (includes phenetic/phenotypic).
- Molecular characteristics (biochemical, e.g., proteins; genetic, e.g., DNA, RNA sequences).
- Phylogenetic analyses: Analyze relationships between isolates using all available information.
Classification Methods & Systems
A) Classical Characteristics
- Phenetic (phenotypic) classification.
- Uses:
- Morphological characteristics.
- Physiological characteristics.
- Ecological characteristics.
- Organisms are grouped by mutual similarities (i.e., they look the same).
- The first system used.
- Still the primary basis for plant and animal classification.
- Requires consideration of multiple characteristics.
- Does not necessarily reflect evolutionary relatedness.
Classification: Phenotype
- Morphology:
- Easy to study (e.g., shape, size, color, staining, motility, spores).
- Of limited use for prokaryotes; more useful for plants, animals, and fungi.
- Physiology, biochemistry, & metabolism:
- Type of metabolism, O2 requirements, optimum growth temperature/pH, enzymes produced, ability to use diverse carbon compounds as sole sources of carbon and energy.
- Useful: directly related to the activity of many genes.
- Ecological characteristics:
- Life-cycle, diseases caused, habitat occupied.
- Useful for some organisms, not for others.
- (Assessment of genetic exchange):
- Important in many eukaryotes (ability to interbreed = species).
- Not useful for prokaryotes because they lack sexual reproduction.
Classification: Methods/Systems
C) Phylogenetic Analyses:
- Organisms are grouped based on characteristics that reflect evolutionary relationships.
- This approach is known as ‘cladistics’ (from the Greek for ‘branch’).
- Organisms are arranged into an evolutionary or phylogenetic tree.
- The tree can be based on:
- Fossil records.
- Anatomy and morphology.
- Physiology and biochemistry.
- Molecular characteristics (biochemical and genetic).
Classification: Molecular Characteristics
1. Biochemical Characteristics
1. Proteins
- Analyzed by mass spectrometry (MALDI-TOF).
- Determines the profile of highly abundant proteins within a bacterial isolate.
- The profile is compared to a database to determine identity.
2. Fatty acids
- Analyzed by fatty acid methyl ester (FAME) analysis.
- The fatty acid profile of an isolate is determined and compared to a database.
Classification: Molecular Characteristics
2. Genetic (Sequence) Characteristics
- Each gene is defined by the sequence of base pairs (A-T, G-C).
- Average gene size for prokaryotes: ~950 base pairs.
- Average gene size for eukaryotes: ~1350 base pairs.
- The base sequence of specific genes differs between species and, to a lesser extent, within species; gene variants are termed alleles.
- The larger the difference in the sequences between two organisms, the more likely they belong to different taxonomic groups.
- Sequencing is used to:
- Identify bacteria.
- Create phylogenetic trees.
Classification: Molecular Characteristics
2. Genetic (Sequence) Characteristics
- What should we sequence?
- One or more individual genes - which ones?
- Whole genome?
A) 16S ribosomal RNA (rRNA) gene sequencing
- Also known as small subunit (SSU) rRNA.
- Known as a highly conserved gene.
B) Whole genome sequencing
- Used to be prohibitively expensive.
- Now cheap enough for routine use.
Highly Conserved Genes (HCG)
- Highly conserved genes:
- Are found in all organisms.
- Have the same function in all organisms.
- Have a critical (essential) role (e.g., ATP synthesis; protein synthesis, as for the 16S rRNA gene).
- Base sequence is very similar even across different organisms; no major mutations because:
- Genes/organisms cannot tolerate large mutations.
- These would be fatal for the cell.
- Differences:
- Reflect evolutionary divergence.
- Can be used for classification and identification.
- Can be used for assessing relatedness.
Nucleic Acid Sequencing
(i) Classification and Identification
- A consensus sequence gives the nucleotide most commonly present at each position in the nucleic acid, determined by comparing the sequences of many organisms.
- Deviations from the consensus are characteristic of particular groups and are called signature sequences.
- Nucleic acid sequence analysis is being increasingly used in species identification.
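As a rough illustration of the consensus idea, the sketch below takes a small set of sequences that are assumed to be already aligned and of equal length, and picks the most common base in each column. The toy alignment and the function names `consensus` and `deviations` are made up for this example; real analyses would start from a proper multiple sequence alignment.

```python
from collections import Counter


def consensus(aligned_seqs):
    """Return the most common base at each column of equal-length aligned sequences."""
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*aligned_seqs):
        # most_common(1) returns [(base, count)]; ties go to the base seen first
        result.append(Counter(column).most_common(1)[0][0])
    return "".join(result)


def deviations(seq, cons):
    """List (position, consensus_base, observed_base) where seq differs from the consensus."""
    return [(i, c, s) for i, (c, s) in enumerate(zip(cons, seq)) if c != s]


# Toy alignment (invented data, positions are 0-based)
alignment = ["ACGTAC", "ACGTAC", "ACGAAC", "TCGTAC"]
cons = consensus(alignment)
print(cons)
print(deviations("ACGAAC", cons))
```

Positions where a group of isolates consistently departs from the consensus are the candidates for group-specific signatures.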
Nucleic Acid Sequencing
- Analysis of sequences enables the construction of phylogenetic trees.
- Trees have nodes and branches.
- Node: A divergence event.
- Branch length: Represents the number of changes between two nodes.
- Used to build a picture of relationships between organisms (phylogenetics).
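Tree-building methods generally start from some measure of how many changes separate each pair of sequences. The sketch below computes a simple proportion of differing positions (often called a p-distance) for every pair in a toy alignment. The isolate names and sequences are invented, and the sequences are assumed to be pre-aligned; turning the distances into an actual tree would need a clustering method that is not shown here.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of aligned positions at which two equal-length sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Invented, pre-aligned toy sequences
seqs = {
    "isolate_1": "ACGTACGTAC",
    "isolate_2": "ACGTACGTAT",
    "isolate_3": "ACGAACGGAT",
}

# Distance for every unordered pair of isolates
distances = {
    (m, n): p_distance(seqs[m], seqs[n]) for m, n in combinations(seqs, 2)
}

# The pair with the smallest distance would be joined first in a tree
closest = min(distances, key=distances.get)
print(distances)
print(closest)
```

Larger distances correspond to longer branches between the organisms concerned.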
Whole Genome Sequencing
- Taxonomy is shifting to WGS for identification and classification.
- 16S rRNA is useful for determining ID to the genus level.
- WGS may be required for the species level.
- Other techniques (e.g., MLST, SNP) are required for sub-species/strain analysis.
- The whole genome sequence of one organism is compared with that of a closely related organism:
- Look for sequence similarity (measured by ANI, average nucleotide identity).
- Two genomes of the same species should share at least 95-96% ANI.
- WGS data can also be used to quickly calculate G+C content.
- ANI is replacing DNA-DNA hybridization.
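The idea behind ANI can be sketched as follows: identity is calculated for each pair of matching genome fragments and then averaged. This is a simplified illustration only. It assumes the fragment pairs have already been matched and aligned (real ANI tools do that search themselves), the sequences are invented, and the 95% cut-off is the lower bound quoted above.

```python
def percent_identity(a, b):
    """Percentage of identical positions between two aligned, equal-length fragments."""
    if len(a) != len(b) or not a:
        raise ValueError("fragments must be non-empty and the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def average_nucleotide_identity(fragment_pairs):
    """Mean percent identity over all matched fragment pairs."""
    if not fragment_pairs:
        raise ValueError("need at least one fragment pair")
    identities = [percent_identity(q, r) for q, r in fragment_pairs]
    return sum(identities) / len(identities)


def same_species(ani, threshold=95.0):
    """Apply the species cut-off (assumed here to be the 95% lower bound)."""
    return ani >= threshold


# Invented fragment pairs (query genome fragment, matching reference fragment)
pairs = [
    ("ACGTACGTACGTACGTACGT", "ACGTACGTACGTACGTACGT"),  # identical
    ("ACGTACGTACGTACGTACGT", "ACGTACGTACGAACGTACGT"),  # one mismatch in 20
    ("TTGACCGTAAGCTTGACCGT", "TTGACCGTAAGCTTGACCGA"),  # one mismatch in 20
]

ani = average_nucleotide_identity(pairs)
print(round(ani, 2), same_species(ani))
```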
Nucleic Acid Base Composition (%G+C)
- Was the first, simplest nucleic acid technique.
- $\%\,(G+C) = \dfrac{G+C}{G+C+A+T} \times 100$
- Historically determined indirectly from DNA melting temperature (Tm), the temperature at which 50% of the double-stranded DNA becomes single-stranded DNA.
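With sequence data available, %G+C can be counted directly rather than inferred from Tm. A minimal sketch, assuming the sequence is given as a plain string of bases; the function name `gc_percent` and the example sequence are invented for illustration, and any characters other than A, C, G, and T (such as N) are simply ignored.

```python
def gc_percent(seq):
    """Return %G+C = (G + C) / (G + C + A + T) * 100 for a DNA sequence string."""
    seq = seq.upper()
    g_c = seq.count("G") + seq.count("C")
    total = g_c + seq.count("A") + seq.count("T")
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return 100.0 * g_c / total


# Invented example: 3 G, 3 C, 2 A, 2 T
print(gc_percent("ATGCGCGCTA"))
```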
G+C Content
- G+C content varies widely between organisms (25–80%), but within a species it varies little and is a stable characteristic.
- Two unrelated bacteria can have similar G+C content but a different sequence; i.e., G+C is not a reflection of base sequence.
- So how is % G+C data interpreted?
- If % G+C of two organisms differs by >10%: genomes are quite different; organisms are not closely related.
- BUT organisms with similar G+C may not be related (different base sequences).
- Similar G+C AND similar phenotype = likely to be related.
- %G+C is easily calculated from WGS data.
DNA-DNA Hybridization
- Measures the degree of re-association between single-stranded DNA from two organisms.
- Tells you more about relatedness than G+C.
- Heat to separate strands of both organisms.
- Combine single-stranded DNA of both organisms.
- Cool to allow double-stranded DNA to re-form; non-complementary bases will remain unpaired.
- Determine the degree of hybridization (determine the melting temperature of hybridized strands; a high degree requires higher temperature to separate strands).
- DDH values can be calculated from WGS data.
Identification of Subspecies and Strains
- Achieved by analyzing several genes.
- Choose genes that evolve more quickly than 16S rRNA genes.
- Examples:
- Multilocus sequence typing (MLST):
- Compares sequences of several (at least 5) conserved housekeeping genes.
- Many variations (alleles) of genes exist.
- Two isolates with the exact same alleles for multiple genes indicate a very close relationship (or the same strain!).
- Single nucleotide polymorphisms (SNP):
- Targets specific, conserved regions.
- Specific genes, intergenic regions, or non-coding regions.
- Single base pair differences show evolutionary change.
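The MLST comparison described above amounts to comparing allelic profiles, where each isolate is recorded as the allele number it carries at each locus. The sketch below is illustrative only: the gene names are examples of housekeeping genes, and the allele numbers and isolate names are invented rather than taken from any real typing scheme.

```python
def shared_alleles(profile_a, profile_b):
    """Return (number of loci with the same allele, number of loci compared)."""
    loci = profile_a.keys() & profile_b.keys()
    matches = sum(profile_a[locus] == profile_b[locus] for locus in loci)
    return matches, len(loci)


def same_profile(profile_a, profile_b):
    """True when both isolates were typed at the same loci and every allele matches."""
    matches, total = shared_alleles(profile_a, profile_b)
    return profile_a.keys() == profile_b.keys() and total > 0 and matches == total


# Invented allelic profiles: locus -> allele number
isolate_x = {"adk": 10, "gyrB": 4, "icd": 8, "mdh": 8, "recA": 2}
isolate_y = {"adk": 10, "gyrB": 4, "icd": 8, "mdh": 8, "recA": 2}
isolate_z = {"adk": 10, "gyrB": 11, "icd": 8, "mdh": 8, "recA": 7}

print(same_profile(isolate_x, isolate_y))    # identical at all five loci
print(shared_alleles(isolate_x, isolate_z))  # differs at two loci
```

Isolates with identical profiles are very closely related; the more loci that differ, the more distant the relationship.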
Taxonomic Resolution of Molecular Techniques
- Techniques resolve different ranks, from strain up to domain: genome sequencing has the highest resolution, whereas 16S rDNA sequencing has lower resolution.
- New technologies can lead to better understanding of relationships between organisms and taxonomic changes (e.g., Pseudomonas).
3. Identification
- Important for:
- Understanding and cataloging microbial diversity in nature.
- Giving a name (may be an existing name or a new name).
- Disease: identifying the organism causing an infectious disease.
- Industry: where particular species are used in product manufacture (e.g., cheeses, beers and wines, pharmaceuticals including antibiotics).
- Involves determining to which taxon an isolate belongs, using a polyphasic approach:
- A. Phenetic criteria: must know distinguishing characteristics of taxonomic groups; can develop keys comprising differentiating phenotypic characters AND/OR
- B. Molecular characteristics: compare sequence of an unknown to sequences of known organisms held in international databases, e.g., GenBank.
Phenotypic Tests
- See earlier slides: morphology (e.g., Gram stain), physiology, biochemistry, and metabolism.
- Many commercial tests and kits use key biochemical characteristics that distinguish between genera or species (e.g., API (Analytical Profile Index) strips).
- Some of these tests:
- Are easy, quick, and cheap.
- Do not require specialized equipment.
Use of a Dichotomous Key
- Dichotomous key example: separating genera of the Enterobacteriaceae family using phenotypic characteristics.
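A dichotomous key is a series of two-way choices, so it maps naturally onto a small decision tree. The sketch below shows only the structure. The test names are common phenotypic tests, but the taxa ("Genus A" to "Genus D") are hypothetical placeholders and the branching is not the key from the lecture slide.

```python
# Each internal node is (test_name, branch_if_positive, branch_if_negative);
# each leaf is the name of a taxon. All leaves here are hypothetical placeholders.
KEY = (
    "lactose fermentation",
    ("indole production", "Genus A", "Genus B"),
    ("H2S production", "Genus C", "Genus D"),
)


def identify(key, results):
    """Walk the key using a dict of test name -> True (positive) / False (negative)."""
    node = key
    while not isinstance(node, str):
        test, if_positive, if_negative = node
        node = if_positive if results[test] else if_negative
    return node


# Invented test results for an unknown isolate
unknown = {"lactose fermentation": True, "indole production": False}
print(identify(KEY, unknown))
```

Only the tests along the path actually taken need to be performed, which is why keys are quick to use at the bench.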
Nomenclature
- Names assigned to taxonomic groups according to defined rules.
- Bacteria are named under the International Code of Nomenclature of Bacteria (ICNB); a comparable code covers botanical nomenclature, including fungi (ICBN), and virus names are overseen by the International Committee on Taxonomy of Viruses (ICTV).
- Pioneered by the Swedish botanist Carolus Linnaeus (1707-1778), who introduced the system of binomial nomenclature: each species is assigned a Latin scientific name.
- Genus name + species name
- e.g., Homo sapiens (humans)
- e.g., Escherichia coli (E. coli)
- Traditional “polyphasic” approach uses all available data: chemotaxonomic, phenotypic, and genotypic data - difficult to use for organisms identified by genetic sequence alone.
Taxonomic Ranks
- Microbes are grouped into categories (or ranks) containing similar organisms.
- Groups at each level share common properties with the group they belong to in higher ranks.
- Rank names and hierarchy are common to both phylogenetic and phenetic classification schemes.
- Strain: a genetic variant or subtype of a bacterial species (e.g., some strains of Staphylococcus aureus have antibiotic resistance genes or virulence factors, whereas others do not).
- Important ranks in microbial taxonomy: Domain, Kingdom, Phylum, Class, Order, Family, Genus, Species, Subspecies, Strain.
Taxonomy of Uncultured Microbes
- Unique genetic sequences generated by metagenomics.
- Terms for these sequences:
- ASV (amplicon sequence variant).
- OTU (operational taxonomic unit): sequences grouped by similarity (commonly at a 97% identity threshold).
- Phylotype: an organism identified solely by nucleic acid sequence (lacks sufficient data to confirm a species name but is definitely a real organism).
- Candidatus: a “candidate species” (e.g., Candidatus Carsonella ruddii (no italics)); usually given where the candidate cannot be cultivated as a pure culture.
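Grouping sequences into OTUs at 97% similarity can be sketched with a simple greedy scheme: each sequence joins the first existing OTU whose representative it matches at or above the threshold, otherwise it founds a new OTU. This is one possible approach, shown under simplifying assumptions (equal-length, pre-aligned reads; invented toy data; hypothetical helper names), not a description of any particular software.

```python
def identity(a, b):
    """Fraction of identical positions between two aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_otus(seqs, threshold=0.97):
    """Greedy clustering; each OTU is a list of indices, the first is its representative."""
    otus = []
    for i, seq in enumerate(seqs):
        for otu in otus:
            if identity(seq, seqs[otu[0]]) >= threshold:
                otu.append(i)
                break
        else:
            # No existing representative was similar enough: start a new OTU
            otus.append([i])
    return otus


def mutate(seq, positions):
    """Toy helper: swap the base at each given position (A<->T, C<->G)."""
    swap = {"A": "T", "T": "A", "C": "G", "G": "C"}
    chars = list(seq)
    for p in positions:
        chars[p] = swap[chars[p]]
    return "".join(chars)


# Invented 100-base reads
base = "ACGT" * 25
reads = [
    base,
    mutate(base, [3, 50]),              # 98% identical to read 0
    mutate(base, [1, 10, 20, 30, 40]),  # 95% identical to read 0
    mutate(base, [1, 10, 20, 30, 41]),  # 95% to read 0, 98% to read 2
]

print(cluster_otus(reads))
```

Because the result depends on the order in which sequences are processed and on the chosen threshold, OTUs are operational groupings rather than formally named taxa.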
Resources
- Prescott’s Microbiology (12th edition).
- Chapter 1: The evolution of microorganisms and microbiology.
- Chapter 17: Microbial DNA technologies.
- Chapter 26: Exploring microbes in ecosystems.
- Pallen, MJ. Bacterial nomenclature in the era of genomics. doi: 10.1016/j.nmni.2021.100942
- How to name a prokaryote?: Etymological considerations, proposals and practical advice in prokaryote nomenclature. https://academic.oup.com/femsre/article/23/2/231/524593
- International Code of Nomenclature of Bacteria (Bacteriological Code) (1990) Revision (Lapage, S.P., Sneath, P.H.A., Lessel, E.F., Skerman, V.D.B.; Seeliger, H.P.R., Clarke, W.A., Eds.). American Society for Microbiology, Washington, DC. https://www.ncbi.nlm.nih.gov/books/NBK8817/