ABSTRACT

Several failure management systems provide language support for configuring failure management (Joshi et al. 2005, Haeberlen et al. 2007, Geels et al. 2007, Levin et al. 2009, Kang et al. 2010). D-SMART, for example, has a policy language derived from the structure of fault detection and the corresponding recovery actions (Lutfiyya et al. 2000). Similarly, CIM-SPL defines a rule-based language for managing CIM-compatible components (Pan et al. 2009). J. Field and C.A. Varela of IBM Research Center proposed an actor-based language for managing distributed systems in a very robust manner (Field and Varela 2005, Bloom 2009). Each of these languages is designed specifically for its own predicted failure model, so its expressive power is strictly bounded by what that model can capture. Since not all failures can be predicted or enumerated at design time, a general-purpose scripting capability is necessary to deal with any type of failure. The focus here is on scripting that better supports failure handling tasks such as fault detection and fault mitigation. Thus, D-Script is designed with the following properties:

• Heterogeneity - Many scripting languages, such as Bourne shell and Perl, are already deployed in the field of system administration. D-Script integrates these existing scripting efforts into a more dependable management mechanism,

• Security - Scripting languages support the eval operator, which can rewrite executable code at runtime. Unauthorized code modification is usually considered dangerous and can itself become another cause of system failure. Such a situation must be avoided,

• Robustness - The execution of a script may fail for several reasons. The system must preserve the consistency of its state in the case of partial failure.
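
The robustness property can be illustrated with a minimal sketch. It is written in Python, not in D-Script syntax, and every name in it (run_with_rollback, stop_service, swap_config, and so on) is hypothetical and does not come from the systems cited above. The sketch assumes that each step is paired with a compensating action and that a step either completes fully or raises before changing any state. When a step fails, the steps that already completed are undone in reverse order, so the state returns to what it was before the script started.

```python
from typing import Callable, Dict, List, Tuple

# A step is (name, action, undo). All names here are hypothetical.
State = Dict[str, str]
Step = Tuple[str, Callable[[State], None], Callable[[State], None]]


def run_with_rollback(state: State, steps: List[Step]) -> bool:
    """Run steps in order; on failure, undo completed steps in reverse.

    Assumption: each action either fully applies or raises before
    modifying the state, so only completed steps need compensation.
    """
    completed: List[Step] = []
    for step in steps:
        _name, action, _undo = step
        try:
            action(state)
        except Exception:
            # Partial failure: compensate in reverse order of completion.
            for _, _, prev_undo in reversed(completed):
                prev_undo(state)
            return False
        completed.append(step)
    return True


def stop_service(state: State) -> None:
    state["service"] = "stopped"


def start_service(state: State) -> None:
    state["service"] = "running"


def swap_config(state: State) -> None:
    # Simulated failure in the middle of a maintenance script.
    raise RuntimeError("config file missing")


def noop(state: State) -> None:
    pass


state: State = {"service": "running"}
ok = run_with_rollback(
    state,
    [("stop", stop_service, start_service), ("swap", swap_config, noop)],
)
```

In this run the second step fails, so the first step is compensated and the service is left running rather than stopped. Compensation is only one possible design; a real mechanism would also have to handle the case where an undo action itself fails.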