ABSTRACT

The load balancing and data placement techniques presented in the last two chapters are intended to ensure the scalability and high availability of Internet servers. A request accepted by a server is often handled in a best-effort manner to maximize the quality of service (QoS). However, there are many situations where requests from different clients, for the same or different content, should be treated differently. For example, media requests from a black-and-white, low-resolution PDA should never be treated the same way as those from a high-end desktop with a broadband network connection. QoS differentiation is a key technology for such purposes. In this chapter, we review QoS-aware request scheduling and resource management techniques on Internet servers.