ABSTRACT

One approach to addressing long-standing concerns associated with the taxonomic impediment and occasional low reproducibility of taxonomic data is through development of automated species identification systems. Such systems can, in principle, be combined with image-based or image- and text-based taxonomic databases to add elements of expert system functionality. Two generalized approaches are considered relevant in this context: morphometric systems based on some form of linear discriminant analysis (LDA) and artificial neural networks (ANNs). In this investigation, digital images of 202 specimens representing seven modern planktonic foraminiferal species were used to compare and contrast these approaches in terms of system accuracy, generality, speed and scalability. Results demonstrate that both approaches could yield systems whose models of morphological variation are over 90% accurate for small data sets. Performance of distance- and landmark-based LDA systems was enhanced substantially through application of least-squares superposition methods that normalize such data for variations in size and (in the case of landmark data) two-dimensional orientation. Nevertheless, this approach is practically limited to the detailed analysis of small numbers of species by a variety of factors, including the complexity of basis morphologies, speed and sample dependencies. An ANN variant based on the concept of a plastic self-organizing map combined with an n-tuple classifier was found to be marginally less accurate, but far more flexible, much faster and more robust to sample dependencies. Both approaches are considered valid within their own analytic domains, and both can be usefully synthesized to compensate for their complementary deficiencies. Based on these results (as well as others reviewed here), it is concluded that fast and efficient automated species recognition systems can be constructed using available hardware and software technology. These systems would be sufficiently accurate to be of great practical value notwithstanding the fact that the already impressive performance of current systems can be improved further with additional development.
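
The least-squares superposition step mentioned above can be illustrated with a minimal sketch. It assumes two-dimensional landmark configurations stored as NumPy arrays with one row per landmark, and it uses an ordinary Procrustes fit of one specimen onto a reference. The function name `superimpose` and the array layout are illustrative assumptions, not the procedure used in this investigation.

```python
import numpy as np

def superimpose(reference, target):
    """Least-squares (Procrustes) fit of `target` onto `reference`.

    Both arguments are (k, 2) arrays holding the same k landmarks.
    Returns the normalized reference and the fitted target.
    """
    # Remove position: centre each configuration on its centroid.
    ref = reference - reference.mean(axis=0)
    tgt = target - target.mean(axis=0)

    # Remove size: scale each configuration to unit centroid size.
    ref = ref / np.linalg.norm(ref)
    tgt = tgt / np.linalg.norm(tgt)

    # Remove orientation: the rotation minimizing the summed squared
    # landmark distances comes from the SVD of the cross-product matrix.
    u, _, vt = np.linalg.svd(tgt.T @ ref)
    rotation = u @ vt
    if np.linalg.det(rotation) < 0:
        # Exclude reflections so that only a pure rotation is applied.
        u[:, -1] *= -1
        rotation = u @ vt

    return ref, tgt @ rotation
```

After this step, differences between the two returned configurations reflect shape alone, so the fitted coordinates can be passed to a discriminant analysis without size or orientation acting as nuisance variables.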