ABSTRACT

Data science underpins the smart world and the artificial intelligence of the future. This chapter discusses applications of synopsis data structures and various hashing techniques for analyzing large amounts of data in reasonable time. Traditionally, in information retrieval and machine learning, data is represented in the form of vectors. For low-rank approximation, any symmetric positive semi-definite matrix can be approximated from a subset of its columns using Nyström methods. A sketch vector forms a synopsis of the data, and the synopsis is smaller than the original data. Data mining algorithms can then be applied to the sketch vector. High crime rates worldwide call for effective protection. As one application, a crime prediction system can use a user's crime data to compute future crime rates. As another, an information leak detection system can take data allocation as an input that helps identify data leaks. Such a system should provide fast access and information retrieval.
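To make the idea of a sketch vector concrete, the following is a minimal illustration using a Count-Min sketch, which is one common hashing-based synopsis and not necessarily the structure used in the chapter. The class name `CountMinSketch`, the `width` and `depth` parameters, and the sample event stream are all assumptions introduced for this example.

```python
import hashlib


class CountMinSketch:
    """Illustrative Count-Min sketch: a small table of counters that
    summarizes item frequencies in a stream (hypothetical example)."""

    def __init__(self, width=64, depth=4):
        self.width = width    # counters per row (assumed value)
        self.depth = depth    # number of hash rows (assumed value)
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # Derive one hash function per row by salting the item with the row number.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        # Increment one counter in every row.
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions can only inflate counters, so the minimum over rows
        # is an upper bound on the true count that is never too low.
        return min(self.table[row][self._index(item, row)]
                   for row in range(self.depth))


# Made-up stream of event labels, for illustration only.
stream = ["theft"] * 50 + ["fraud"] * 20 + ["arson"] * 5

sketch = CountMinSketch()
for event in stream:
    sketch.add(event)
```

The table size is fixed at `width * depth` counters regardless of how long the stream grows, which is the sense in which the synopsis is smaller than the original data. A query such as `sketch.estimate("theft")` returns a value that is at least the true count and may overestimate it when items collide.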