ABSTRACT

Hierarchy and geometry support the understanding of, and work in, complex systems. Applications are surveyed in search and discovery, information retrieval, clusterwise regression, and knowledge discovery. Metric geometry and ultrametric topology play complementary roles. There is a long tradition of research in metric methods for data analysis. The Baire metric, or longest common prefix distance, is simultaneously an ultrametric. In the pharmaceutical industry, clustering of compounds based on chemical descriptors or chemical representations is used for screening large corporate databases. Here, m-adic hierarchical clustering is developed for applications that include search and retrieval using the Baire metric, with links to many related topics such as hashing and the precision properties of data measurement. A Baire space consists of countably infinite sequences with a metric defined in terms of the longest common prefix: the longer the common prefix, the closer the pair of sequences.
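
As an illustration of the longest common prefix definition, the following minimal Python sketch computes a Baire-type distance between two finite digit strings. It rests on assumptions not stated in the abstract: sequences are truncated to finite strings, the distance is taken as base^(-k) for a common prefix of length k (with base 10 as a default), and the names common_prefix_length, baire_distance and bucket_by_prefix are hypothetical. The last function hints at the link to hashing: strings sharing a prefix of length k fall into the same bucket.

```python
from collections import defaultdict


def common_prefix_length(x: str, y: str) -> int:
    """Number of leading positions at which x and y agree."""
    n = 0
    for a, b in zip(x, y):
        if a != b:
            break
        n += 1
    return n


def baire_distance(x: str, y: str, base: int = 10) -> float:
    """Assumed form of the distance: base ** (-k), where k is the
    length of the longest common prefix; 0 for identical strings."""
    if x == y:
        return 0.0
    return float(base) ** -common_prefix_length(x, y)


def bucket_by_prefix(items, k):
    """Group strings by their first k symbols (a hashing-style step).
    Members of one bucket are within base ** (-k) of each other."""
    buckets = defaultdict(list)
    for s in items:
        buckets[s[:k]].append(s)
    return dict(buckets)
```

For example, under these assumptions "3141" and "3142" share a prefix of length 3 and are therefore closer to each other than either is to "2718", with which they share no prefix. Because the distance depends only on the shared prefix, the strong triangle inequality d(x, z) <= max(d(x, y), d(y, z)) holds, which is what makes the distance an ultrametric.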