Microbial Systematics and Taxonomy Notes

Microbial systematics combines taxonomy, which describes distinct life forms and their classification, with systematics in the broader sense, the study of organism diversity and relationships. Systematics includes nomenclature (naming species), classification (grouping similar species), identification (recognizing known species), and preservation in culture collections for future reference. It has applications in disease diagnosis in humans, animals, and plants; source tracking of food contaminants; and biotechnology and bioprocesses.

Life is organized into a hierarchical system: domain (the highest level of biological classification), phylum, class, order, family, genus, and species. The biological species concept defines a species as a group of organisms that interbreed or could interbreed, with reproductive isolation protecting genotypes; it is applied in botany and zoology. The phylogenetic species concept differentiates between species, and recognizes known species, using DNA sequences; the sequenced gene should be universal and not easily transferred between organisms. A polyphasic approach integrates DNA sequences (genotypic/phylogenetic data) with phenotypic characteristics. The prokaryotic species definition is based on the genomic species: if two organisms have <97% 16S rRNA gene identity and <70% DNA-DNA hybridization, they are considered different species, and multilocus sequencing may be necessary for further definition.

Carl Woese pioneered the use of rRNA for phylogenetic studies in the 1970s, establishing the three domains of life (Bacteria, Archaea, and Eukarya) and providing a unified phylogenetic framework for Bacteria. The most widely used rRNAs are the small subunit ribosomal RNA (SSU rRNA) genes, which are found in all domains of life (16S rRNA in prokaryotes and 18S rRNA in eukaryotes), are functionally constant, are sufficiently conserved (change slowly), and are of sufficient length. Characteristics of the 16S rRNA sequence include universal distribution; functional homology; regions of high sequence conservation (where secondary structure is needed for function); regions of high sequence variability, with enough sequence to remain sensitive over long evolutionary periods without saturation; and variable regions that are phenotypically neutral. 16S rRNA can therefore be used to classify microorganisms operationally.
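The contrast between conserved and variable regions can be illustrated with a small sketch that scores each column of an alignment by how often its most common base occurs. The function name `column_conservation` and the toy sequences are assumptions made for illustration; they are not real 16S rRNA data.

```python
from collections import Counter


def column_conservation(aligned_seqs):
    """Return, for each alignment column, the fraction of sequences
    that carry the most common character in that column."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*aligned_seqs):
        most_common_count = Counter(column).most_common(1)[0][1]
        scores.append(most_common_count / len(column))
    return scores


# Toy alignment (made-up sequences): only the fifth column varies.
toy_alignment = [
    "ACGTACGA",
    "ACGTTCGA",
    "ACGTGCGA",
    "ACGTCCGA",
]
conservation = column_conservation(toy_alignment)

# Columns scoring below 1.0 are variable; the rest are fully conserved.
variable_columns = [i for i, score in enumerate(conservation) if score < 1.0]
print(variable_columns)
```

In a real gene, runs of fully conserved columns would correspond to the slowly changing regions, and low-scoring columns to the variable regions that carry most of the phylogenetic signal.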

Phylogeny, the evolutionary history of a group of organisms, is inferred indirectly from nucleotide sequence data. The universal phylogenetic tree based on SSU rRNA genes is a genealogy of all life on Earth. Domain Bacteria contains at least 80 major evolutionary groups (phyla); many are defined from environmental sequences alone (no cultured representatives), and many are phenotypically diverse (physiology and phylogeny are not necessarily linked). Domain Archaea consists of seven major phyla, including Crenarchaeota, Euryarchaeota, Nanoarchaeota, Korarchaeota, and Thaumarchaeota; two phyla do not contain cultivatable species. In Domain Eukarya, the organelles originated within Bacteria: mitochondria arose from Proteobacteria, and chloroplasts arose from Cyanobacteria. Each of the three domains of life can be characterized by various phenotypic properties.

To generate a phylogenetic tree, one must isolate DNA, amplify the 16S rRNA gene by PCR, run the product on an agarose gel to check for the correct size, sequence it, and align the sequences to build the tree. The first step is a sequence alignment, in which the sequence of interest is aligned with sequences of homologous (orthologous) genes from other strains or species. Evolutionary analysis then uses character-state methods (cladistics) for tree reconstruction. Cladistic methods define phylogenetic relationships by examining changes in nucleotides at individual positions in the sequence, using those characters that are phylogenetically informative and that define monophyletic groups. Common tree-building methods include clustering algorithms, such as the unweighted pair group method with arithmetic mean (UPGMA) and neighbor-joining, which work from pairwise distances rather than individual characters, and optimality criteria that pick the best of many possible trees, such as parsimony, maximum likelihood, and Bayesian analysis.
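The alignment-to-tree step can be sketched with UPGMA, the simplest of the methods named above. This is a minimal illustration, assuming already-aligned sequences of equal length and an uncorrected p-distance (the fraction of differing positions); the strain names and sequences are invented, and real analyses would normally rely on dedicated phylogenetics software.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(seqs):
    """Cluster aligned sequences with UPGMA; return a nested-tuple tree."""
    trees = {i: name for i, name in enumerate(seqs)}  # cluster id -> subtree
    sizes = {i: 1 for i in trees}                     # cluster id -> leaf count
    dist = {(i, j): p_distance(seqs[trees[i]], seqs[trees[j]])
            for i, j in combinations(trees, 2)}
    next_id = len(trees)
    while len(trees) > 1:
        i, j = min(dist, key=dist.get)  # the two closest clusters
        new_dist = {}
        for k in trees:
            if k in (i, j):
                continue
            d_ik = dist[(min(i, k), max(i, k))]
            d_jk = dist[(min(j, k), max(j, k))]
            # Size-weighted mean, i.e. the average over all leaf pairs.
            new_dist[(k, next_id)] = (
                (sizes[i] * d_ik + sizes[j] * d_jk) / (sizes[i] + sizes[j])
            )
        # Drop distances involving the merged clusters, then add the new ones.
        dist = {pair: d for pair, d in dist.items()
                if i not in pair and j not in pair}
        dist.update(new_dist)
        trees[next_id] = (trees.pop(i), trees.pop(j))
        sizes[next_id] = sizes.pop(i) + sizes.pop(j)
        next_id += 1
    return next(iter(trees.values()))


# Toy aligned sequences (hypothetical strains A-D).
toy_seqs = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "ACGTTCGAAT",
    "D": "TCGATCGAGG",
}
tree = upgma(toy_seqs)
print(tree)
```

UPGMA assumes a roughly constant rate of change along all lineages, which is one reason neighbor-joining or the optimality-criterion methods are often preferred for real data.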

Phylogenetic trees can be unrooted or rooted, and rotating the branches about a node does not change the tree topology. Phylogenetic trees assume that the gene under analysis was inherited from a common ancestor, that is, vertically from a mother cell to a daughter cell; violations include convergent evolution and horizontal gene transfer. Classification of Bacteria involves polyphasic taxonomy with genotypic and phenotypic analyses. DNA-DNA hybridization, useful for differentiating very similar organisms, provides a rough index of similarity between two organisms and is a useful complement to SSU rRNA gene sequencing. Hybridization values of 70% or higher suggest that strains belong to the same species, while values of at least 25% suggest the same genus.
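The point that rotation about a node leaves the topology unchanged can be checked with a short sketch. It assumes a rooted tree written as nested tuples with unique leaf names; the helper `canonical` is a hypothetical name introduced here.

```python
def canonical(tree):
    """Convert a nested-tuple tree into nested frozensets.

    Frozensets ignore the order of their members, so two trees that
    differ only by rotations about nodes produce the same value."""
    if isinstance(tree, str):
        return tree
    return frozenset(canonical(child) for child in tree)


tree_a = (("A", "B"), ("C", "D"))
tree_b = (("D", "C"), ("B", "A"))  # tree_a with every node rotated
tree_c = (("A", "C"), ("B", "D"))  # a genuinely different grouping

print(canonical(tree_a) == canonical(tree_b))
print(canonical(tree_a) == canonical(tree_c))
```

The first comparison holds because only the drawing order changed; the second fails because the leaves are grouped into different clades.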

Multi-locus sequence typing (MLST) is a method in which several different “housekeeping genes” from an organism are sequenced; it has sufficient resolving power to distinguish between very closely related strains. Traditional and current classification of organisms requires biochemical and physiological analysis as well as genetic analysis: morphology, Gram stain, fatty acid methyl ester (FAME) analysis, growth characteristics (temperature, pH, etc.), electron acceptors, electron donors, etc. Because we have no satisfactory way of classifying prokaryotes as species, we fall back on polyphasic taxonomy, though problems include the basic inability to grow most organisms in the same manner and the dynamic nature of some of those tests (e.g., FAME results depend on growth conditions).
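The idea behind multi-locus typing can be sketched by turning the sequences at each locus into allele numbers and comparing the resulting profiles. The locus names (`geneA`, `geneB`, `geneC`), strain names, and sequences below are hypothetical placeholders, and the numbering is local to this example rather than drawn from any curated allele database.

```python
def allelic_profiles(strains):
    """Map {strain: {locus: sequence}} to {strain: tuple of allele numbers}.

    Within each locus, every distinct sequence receives the next unused
    integer, in order of first appearance."""
    loci = sorted(next(iter(strains.values())))
    allele_ids = {locus: {} for locus in loci}
    profiles = {}
    for strain, genes in strains.items():
        profile = []
        for locus in loci:
            seen = allele_ids[locus]
            # A sequence not seen before at this locus gets a new number.
            profile.append(seen.setdefault(genes[locus], len(seen) + 1))
        profiles[strain] = tuple(profile)
    return profiles


def loci_differing(profile_a, profile_b):
    """Count the loci at which two allelic profiles differ."""
    return sum(a != b for a, b in zip(profile_a, profile_b))


# Toy data: strain3 differs from the other two at geneB only.
toy_strains = {
    "strain1": {"geneA": "ATGC", "geneB": "GGCA", "geneC": "TTAG"},
    "strain2": {"geneA": "ATGC", "geneB": "GGCA", "geneC": "TTAG"},
    "strain3": {"geneA": "ATGC", "geneB": "GGTA", "geneC": "TTAG"},
}
profiles = allelic_profiles(toy_strains)
print(profiles)
print(loci_differing(profiles["strain1"], profiles["strain3"]))
```

Strains with identical profiles cannot be told apart by these loci, while a difference at even one locus separates strains that 16S rRNA sequencing alone might not resolve.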

Fatty acid profiles can be used to identify microbes by comparison against a database of known profiles, and can also be used to discern differences between genera. Polyphasic taxonomy based on genotypic analysis assigns organisms to species on the basis of overall genotypic similarity, although phenotypic differences (pathogenicity, for instance) should play a role in fine-scale differentiation. These methods can only be applied to microorganisms isolated from the environment and continuously cultivated in the laboratory (<1% of known microorganisms have been cultivated). The gold standard for assigning two isolates to one species is a value of ≥70% in a standardized DNA-DNA hybridization experiment; a simpler measure, small subunit (SSU, or 16S) rRNA sequence identity, can be used to determine which strains are not the same species (those with <97% identity). 16S rRNA gene sequences are used in environmental samples to assess the diversity of previously undescribed microorganisms, and multi-locus sequencing of common functional genes (“housekeeping genes”) is necessary to determine differences between organisms with greater than 97% identity.
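The two thresholds in this paragraph combine into a simple decision procedure, sketched below. The function name `species_call` and the input values are illustrative assumptions; the 97% and 70% cutoffs are the ones given in these notes.

```python
def species_call(identity_16s, ddh=None):
    """Apply the 16S rRNA and DNA-DNA hybridization thresholds.

    identity_16s: percent 16S rRNA gene sequence identity (0-100)
    ddh: percent DNA-DNA hybridization (0-100), or None if not measured
    """
    if identity_16s < 97:
        # Below 97% identity, 16S data alone rules out the same species.
        return "different species"
    if ddh is None:
        # At 97% or above, 16S data cannot settle the question by itself.
        return "unresolved"
    return "same species" if ddh >= 70 else "different species"


print(species_call(95.0))        # 16S identity below the cutoff
print(species_call(99.0))        # needs hybridization or multi-locus data
print(species_call(99.0, 82.0))  # hybridization above the 70% cutoff
print(species_call(99.0, 40.0))  # hybridization below the 70% cutoff
```

The "unresolved" branch reflects that 16S identity can only exclude strains from a species; confirming membership needs hybridization or multi-locus data.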

Whole-genome sequence analyses are becoming more common; they examine genome structure (size and number of chromosomes, GC ratio, etc.), gene content, and gene order. An example of enrichment and isolation: enrichment of an organism from some environment (an anaerobic, Fe(II)-oxidizing bacterium), isolation (Strain 2002), morphological characterization, physiological characterization, and DNA extraction followed by phylogenetic characterization of the 16S rRNA gene sequence.
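Of the genome-structure measures listed, the GC ratio is the easiest to compute directly from sequence. The sketch below assumes a plain DNA string and counts only unambiguous bases; the function name `gc_content` and the example sequences are illustrative.

```python
def gc_content(sequence):
    """Percent of unambiguous bases (A, C, G, T) that are G or C."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return 100 * sum(b in "GC" for b in bases) / len(bases)


print(gc_content("ATGC"))    # half of the bases are G or C
print(gc_content("ggccNN"))  # lowercase is accepted; N is ignored
```

Ambiguous characters such as N are skipped rather than counted as mismatches, so the result reflects only the bases that were actually called.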