ABSTRACT

This chapter introduces the operating principle of phase-reconfigurable shuffle optimization. It has two focuses: the first is big data and service computing in cloud computing; the second is phase-reconfigurable shuffle optimizations for MapReduce in cloud computing. The Shuffle phase is the group of procedures that run between Map and Reduce, which usually include sorting, grouping, and Hypertext Transfer Protocol (HTTP) transfer. Big data is a term for techniques that extract information from very large data sets, usually to generate value from large volumes of stored or processed data. The premise of big data is that it can reveal information that traditional analysis techniques cannot obtain, or can obtain only with difficulty. In most cases, data is transferred over tree-type networks and HTTP connections. Big data processing has two main steps: integrating the data and applying Hadoop MapReduce.
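
As a rough illustration of the sorting and grouping steps of the Shuffle phase, the following minimal Python sketch partitions map output across reducers, sorts each partition by key, and groups values that share a key. It is an in-memory toy, not Hadoop code: the function names (partition, sort_and_group, shuffle) and the byte-sum partitioner are assumptions made for this example, and the HTTP transfer between nodes is replaced by a simple list concatenation.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def partition(pairs, num_reducers):
    """Assign each (key, value) pair from the map side to a reducer bucket."""
    buckets = defaultdict(list)
    for key, value in pairs:
        # Deterministic toy partitioner (assumption): sum of the key's bytes.
        index = sum(key.encode("utf-8")) % num_reducers
        buckets[index].append((key, value))
    return buckets


def sort_and_group(bucket):
    """Sort one bucket by key, then group the values that share a key."""
    ordered = sorted(bucket, key=itemgetter(0))
    return [(key, [value for _, value in group])
            for key, group in groupby(ordered, key=itemgetter(0))]


def shuffle(map_outputs, num_reducers):
    """Combine all map outputs and return the grouped input of each reducer."""
    # Stand-in for the HTTP transfer step (assumption): concatenate in memory.
    all_pairs = [pair for output in map_outputs for pair in output]
    buckets = partition(all_pairs, num_reducers)
    return {i: sort_and_group(buckets.get(i, [])) for i in range(num_reducers)}
```

Calling shuffle with the outputs of two hypothetical map tasks, for example [[("a", 1), ("b", 1)], [("a", 1), ("c", 1)]], and two reducers yields one sorted, grouped list of (key, values) entries per reducer, which is the form a Reduce function would consume.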