Numerical Taxonomy
Overview of Numerical Taxonomy
Definition and Synonyms: Numerical taxonomy, also known as Taximetrics or Phenetics, is a system of grouping species using numerical methods based on their character states. It involves the application of mathematical procedures to numerically encoded character state data for organisms.
Founders and Origin: The field was initiated by Professor Peter H. A. Sneath and colleagues, and was largely developed and popularized by Sneath and Sokal.
Core Concepts:
* Monothetic Classification: Classification based on only one or a few characters.
* Polythetic Classification: Classification based on multiple characters. Numerical taxonomy is inherently polythetic.
* Basis of Affinity: It involves analyzing taxonomic data through mathematical or computerized methods to evaluate the numerical affinities (similarities or dissimilarities) between taxonomic units, which are then arranged into taxa.
Key Software in Taximetrics
Modern numerical taxonomy relies heavily on software to compute similarity matrices, perform multivariate statistical analysis, and construct dendrograms. Key tools include:
NTSYSpc (Numerical Taxonomy and Multivariate Analysis System):
* Developed by F. J. Rohlf.
* One of the most frequently used software packages in the field.
* Capabilities: Similarity/distance matrices, cluster analysis (UPGMA, single linkage), Principal Component Analysis (PCA), and multidimensional scaling.
PAST (Paleontological Statistics Software):
* Free statistical software widely used in biology.
* Capabilities: Cluster analysis, PCA, similarity indices (e.g., Jaccard, Dice), and dendrogram construction.
PHYLIP (Phylogeny Inference Package):
* A collection of programs for phylogenetic and clustering analysis.
* Supports distance matrix methods like UPGMA and neighbor-joining.
PRIMER:
* A multivariate statistical package used primarily for ecological data analysis.
* Includes cluster analysis, PCA, multidimensional scaling (MDS), and similarity analysis.
Tanagra:
* An open-source data mining software supporting clustering, classification, and multivariate statistics.
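The Jaccard and Dice indices listed among the capabilities above can be illustrated with a minimal sketch for binary (presence/absence) characters. The function names and the two example OTUs are hypothetical, and this does not reproduce the implementation of any of the packages named here.

```python
def jaccard(a, b):
    """Jaccard index for two equal-length 0/1 character vectors."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    present_in_either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Joint absences (0, 0) are ignored by this index.
    return shared / present_in_either if present_in_either else 0.0


def dice(a, b):
    """Dice index: shared presences counted twice, divided by all presences."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    total_presences = sum(a) + sum(b)
    return 2 * shared / total_presences if total_presences else 0.0


# Hypothetical OTUs scored for five binary characters (1 = present, 0 = absent).
otu_a = [1, 1, 0, 1, 0]
otu_b = [1, 0, 0, 1, 1]

print(jaccard(otu_a, otu_b))  # 2 shared / 4 present in either
print(dice(otu_a, otu_b))     # 2 * 2 shared / (3 + 3) presences
```

Both indices ignore characters that are absent in both OTUs; Dice gives shared presences double weight, so it is never lower than Jaccard for the same pair.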
Historical Development and Theory
Heywood’s Definition: Defined numerical taxonomy as the numerical evaluation of the similarity between groups of organisms and the ordering of these groups into higher-ranking taxa based on those similarities.
The Adansonian Influence: The system is often called Neo-Adansonian because it is based on principles first put forward by Michel Adanson, a French botanist.
* Adanson proposed that equal weight should be given to all characters.
* He utilized as many characters as possible for classification.
Timeline: The period from to was critical for the development of the primary methods and theory underlying numerical taxonomy.
Classification vs. Identification and Relationships
Classification: The process of grouping organisms on the basis of like properties.
Identification: The allocation of additional unidentified objects to groups after the classification has been established.
Taxonomic Relationships: Conventional taxonomists usually equate taxonomic relationships with evolutionary ones. Numerical taxonomists distinguish between three types:
1. Phenetic: Based on overall similarity.
2. Cladistic: Based on a common line of descent.
3. Chronistic: Based on the relationships among various evolutionary branches over time.
The Two Aspects of Taxonomic Grouping
Construction of Taxonomic Groups:
* Individuals are selected and their characters are identified.
* There is no limit to the number of characters; a larger number allows for better generalization of the taxa.
* Resemblances are established via character analysis, often using computers.
* The methodology emphasizes using the maximum number of characters, each given equal weight.
Discrimination of the Taxonomic Groups:
* When chosen groups show overlapping characters, discrimination analysis is used to select and distinguish them.
* Various specially devised techniques are employed for this purpose.
The Seven Principles of Sneath and Sokal (Neo-Adansonian Principles)
Sneath and Sokal enumerated seven fundamental principles for numerical taxonomy:
1. The information content of a taxon and the quality of the classification increase with the number of characters considered.
2. Every character should be given equal weight, a priori, when creating new taxa.
3. Overall similarity between two entities is a function of their individual similarities in each of the many characters compared.
4. Distinct taxa can be recognized because correlations of characters differ between groups of organisms.
5. Phylogenetic conclusions can be drawn from the taxonomic structure and character correlations of a group, assuming certain evolutionary pathways.
6. Taxonomy is practiced and viewed as an empirical science.
7. Classifications are based solely on phenetic similarity.
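The principles of equal weighting and of overall similarity as a function of per-character similarities can be sketched as a simple average. The scoring rule used here (one minus the range-normalized difference between coded states) is an assumption chosen for illustration, not a coefficient prescribed by Sneath and Sokal, and the function names and example OTUs are hypothetical.

```python
def character_similarity(x, y, n_states):
    """Similarity of two OTUs on one character coded 0 .. n_states - 1.

    Assumed rule: 1 for identical states, 0 for the two extreme states.
    """
    if n_states < 2:
        raise ValueError("a character needs at least two states")
    return 1 - abs(x - y) / (n_states - 1)


def overall_similarity(a, b, n_states):
    """Unweighted mean of the per-character similarities (equal weighting)."""
    scores = [character_similarity(x, y, k) for x, y, k in zip(a, b, n_states)]
    return sum(scores) / len(scores)


# Hypothetical characters: one 4-state, two binary, one 3-state.
n_states = [4, 2, 2, 3]
otu_a = [3, 1, 0, 2]
otu_b = [0, 1, 1, 2]

# Per-character scores are 0, 1, 0, 1, so the overall similarity is 0.5.
print(overall_similarity(otu_a, otu_b, n_states))
```

Because every character contributes the same share of the mean, adding more characters dilutes the influence of any single one, which is the practical effect of equal weighting.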
Methodology and Data Processing
Character Coding: Characters are recorded numerically. Codes are assigned so that the numerical difference between two states is proportional to their dissimilarity.
* Multi-state Coding Example (Hairiness of Leaf):
* Hairless = 0
* Sparsely haired = 1
* Regularly haired = 2
* Densely haired = 3
* In this system, the dissimilarity between "densely haired" (3) and "hairless" (0) is considered three times greater than the dissimilarity between "sparsely haired" (1) and "hairless" (0).
* Binary Coding: Characters are represented by two states: 0 for absence and 1 for presence. This is commonly used in microbiology.
Data Matrix and Mapping:
* Characters and taxonomic units (OTUs - Operational Taxonomic Units) are arranged in a data matrix.
* OTUs are represented as dots in a multidimensional space where characters act as coordinates.
* Similar objects are plotted close together; dissimilar objects are farther apart.
* A Similarity Matrix is computed for all pairs of OTUs. When the matrix is displayed with shading, darker cells usually indicate higher similarity.
* The matrix is rearranged to identify clusters, and results are typically displayed as Phenograms (dendrograms).
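The steps above (data matrix, pairwise distances in character space, clustering into a phenogram) can be sketched in plain Python. This is a minimal illustration that assumes Euclidean distance and UPGMA (average linkage); the OTU names and coded character values are hypothetical, and real analyses would normally use one of the dedicated packages.

```python
import math
from itertools import combinations


def distance_matrix(data):
    """Euclidean distance between every pair of OTUs.

    `data` maps an OTU name to its list of coded character values,
    so each OTU is a point with characters as coordinates.
    """
    return {frozenset((a, b)): math.dist(data[a], data[b])
            for a, b in combinations(data, 2)}


def upgma(data):
    """Return the merge steps as (cluster_1, cluster_2, distance) tuples."""
    pair = distance_matrix(data)
    clusters = [(name,) for name in data]  # start with one cluster per OTU
    merges = []

    def average_distance(c1, c2):
        # UPGMA: unweighted mean over all OTU pairs between the two clusters.
        total = sum(pair[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda p: average_distance(*p))
        merges.append((c1, c2, average_distance(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical data matrix: rows are OTUs, columns are three coded characters.
otus = {
    "A": [0, 1, 0],
    "B": [0, 1, 1],
    "C": [3, 0, 1],
    "D": [3, 0, 3],
}

for left, right, height in upgma(otus):
    print(left, right, round(height, 3))
```

Each printed row is one join in the phenogram: A and B merge first at distance 1, C and D next at distance 2, and the two pairs join last at the average of the four remaining between-pair distances.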
Merits and Demerits
Merits
Data Improvement: Utilizes a higher number of characters from diverse sources (morphology, chemistry, physiology).
Efficiency: Highly sensitive in delimiting taxa; facilitates the creation of better keys, maps, descriptions, and catalogues via electronic data processing.
Reinterpretation: Has led to the re-evaluation of existing biological concepts and prompted fundamental changes in conventional systems.
Labor: Allows taxonomic work to be performed by less highly skilled workers once the system is established.
Demerits
Scope: Primarily useful for phenetic classification, not phylogenetic classification.
Biological Conflict: Proponents of the "biological" species concept may not accept the limits defined by these numerical methods.
Character Selection: If the chosen characters are inadequate, statistical methods may yield unsatisfactory results.
Procedural Variation: Different taximetric procedures can yield different results. It is also difficult to determine whether a larger number of characters always gives more satisfactory results than a smaller, well-chosen set.
Professional Applications and Case Studies
Microbiology and Zoology: Used to study similarities/differences in bacteria, other microorganisms, and various animal groups.
Angiospermic Genera: Successfully applied to delimit genera such as Oryza, Sarcostemma, Solanum, Chenopodium, Apocynum, Crotalaria, Cucurbita, Oenothera, Salix, Zinnia, and various cultivars of wheat and maize.
Phytochemical and DNA Analysis (Mondal et al.):
* Study of interspecific variations among eight species of Cassia L. using seed protein and mitochondrial DNA (mtDNA) RFLP studies.
* Similarity Index: The Degree of Pairing Affinity (PA), or similarity index, was calculated using the method of Sokal & Sneath and Romero Lopes et al.
* Cluster Analysis (UPGMA): Dendrograms showed the eight species split into two main clusters:
1. Cluster 1: C. alata, C. siamea, C. fistula, and C. renigera. Characteristics: Trees or large shrubs, absence of foliar glands on petiole/rachis, dense axillary terminal racemes more than 30 cm long.
2. Cluster 2: C. occidentalis, C. sophera, C. mimosoides, and C. tora. Characteristics: Herbs or undershrubs, presence of foliar glands, short corymbose racemes less than 10 cm long.