ABSTRACT

This chapter demonstrates the theoretical foundations of the fastest data structures for direct access to point query searches. It discusses the relevant technical concepts readers find useful for more advanced applications. The chapter deals with definitions and concepts of basic hashing, collisions, birthday paradoxes, load factors, and well known concepts like chaining and probing. It examines related topics like bipartite graph analysis in cuckoo hashing and probabilities of false positives in bloom filters. A perfect hash function for a specific that can search elements in constant time and has values in a bounded range is generally created by a randomized algorithm. Locality Sensitive Hashing Locality sensitive hashing is a data structure to perform probabilistic dimension reduction of high-dimensional data and efficiently handle similarity searches, thus defeating the “curse of dimensionality.” There are different kind of universal hash families satisfying different sets of properties and providing different bounds on collision probability.