ABSTRACT

The efficiency of the storage system for an MPP system depends heavily on the data-movement scheme implemented by the file system. This is primarily because MPP systems, unlike traditional clusters, tend to use a “partitioned architecture” [168] (illustrated in Figure 14.1), in which the system is divided into sets of nodes that serve different functions and have different hardware and software requirements. For example, the Cray XT-3 and IBM Blue Gene systems have compute, I/O, network, and service nodes. The compute nodes run a “lightweight kernel” [273, 369, 425] operating system with no support for threading, multi-tasking, or memory management. I/O and service nodes run a more “heavyweight” operating system (e.g., Linux) to provide shared services. Recent decisions to use Compute Node Linux on Cray XT-4 compute nodes make some of these services available, but Compute Node Linux still provides only a restricted set of functionality compared to a standard Linux distribution. These restrictions are a significant barrier to porting file systems originally designed for clusters of workstations.