Unprecedented technological advancements in diverse fields have led to an explosion in the amount of data stored worldwide. With the volume of information surpassing the petabyte barrier, efficient management, indexing, retrieval, and processing have become a central research focus in most data-intensive applications. Myriad sources, such as telecommunication call data records, telescopic imagery, online transactions, web pages, stock markets, climate warning systems, and medical records, demand resource- and compute-efficient handling of such rapidly growing data. Removal of redundancy from such multibillion-record data sets constitutes an important area of study.