Cloud Query Processing System for Big Data Management | 27

ABSTRACT

This chapter discusses a query-processing system that runs in the cloud and manages a large number of Resource Description Framework (RDF) triples. As cloud computing grows in popularity, service providers face ever-increasing challenges: they must maintain huge quantities of heterogeneous data while still providing efficient information retrieval. The key requirements for cloud computing solutions are therefore scalability and query efficiency. Semantic web-based social networks provide the ability to specify and query heterogeneous data in a standardized manner. The chapter shows that the ideal model for generating the best query plan is neither practical nor cost-effective. Several issues make it less attractive in practice: it considers only simple abstract costs, namely the number of triples read and written by the different phases, and ignores both the actual cost of copying and sorting these triples and the overhead of running jobs in Hadoop. The chapter therefore presents a heuristic and a greedy approach for generating a plan that approximates the best one.