Locality Sensitive Hashing | 24 | v2 | Encyclopedia of Computer Scienc

ABSTRACT

When querying in a massive dataset, many searching methods generate a high-dimensional vector for each object and then conduct the k-nearest-neighbor searching.[1] However, such a method is not efﬁcient when the dataset size is very large and the dimension is very high. Other methods relying on a tree structure (such as kd-trees, BDD-trees, and vp-trees) require substantial memory space and time.[2,3]

Sometimes, they are even less efﬁcient than the linear search approach that compares a query record with each record in the dataset one at a time. Moreover, all these methods compare a query with records during the searching process to locate similar records, degrading the searching performance.