ABSTRACT

Scheduling and information services are two grid components that play an important role in the overall performance of an application running on the grid. The information services complement the grid scheduling system by providing information about the status and availability of resources in the grid. A resource can be either a physical resource, such as processors, memory, and network bandwidth, or a service offered by a node in the grid. Scheduling a job on a node raises two questions. First, does the resource fulfill the minimum requirements, and any specific QoS requirements, for executing the job? Second, is the resource available to serve the job? The grid information service supplies the information needed to answer both. A scheduling decision, however, is rarely that simple, and several cases complicate it. First, a task may be composed of several sub-tasks that are executed on different nodes and that depend on one another in their order of execution; a scheduling algorithm must take such dependencies into account. Second, a job may have a very large input file. In this case the task should not be assigned to a node without regard to the location of the data, because transferring the data to the node executing the task might incur significant communication overhead. Third, a node in the grid might fail because of a hardware or network failure. In such cases the grid scheduler must reschedule the task onto a different node, a decision it makes by consulting the grid information service. This chapter covers the scheduling aspects of these three cases as workflow scheduling, data-intensive service scheduling, and fault tolerance, respectively.