ABSTRACT

MapReduce [Dean and Ghemawat, 2008] has emerged as one of the most popular frameworks for distributed cloud computing. Its simple but powerful programming model benefits a wide spectrum of data-intensive applications such as search indexing, social network mining, recommendation services, and advertising backends. These applications allow datacenter computing to support daily activities as well as to help solve social problems. As computing becomes more instrumented, interconnected, intelligent, and pervasive than ever before, it brings many challenges in systems design, modeling, and engineering. Emerging classes of cloud-based applications can benefit from stronger timing guarantees. For example, real-time advertising requires a real-time prediction of user intent based on users' search histories. Meeting deadlines here can translate into higher profits for content providers. In control datacenters, various sensors must periodically collect and report enormous amounts of real-time data. In addition, ambient intelligence needs a networked database to integrate these sensor data streams in time and to return real-time analysis results in response to event requests. Therefore, computing in clouds, where billions of events occur simultaneously, is not linear in time but falls into the category of real-time computing.