ABSTRACT

CONTENTS 5.1 Introduction 106 5.2 Formal Modeling for Resilient Systems 108 5.3 Modeling Framework 109

5.3.1 Event-B 109 5.3.1.1 Modeling Actions Over States 109 5.3.1.2 Model Renement 110

5.3.2 Temporal Logic of Actions (TLA) 110 5.3.3 The Service-as-Event Paradigm 112

5.3.3.1 The Self- Mechanism 113 5.3.3.2 Renement Diagrams 113 5.3.3.3 Applying the Service-as-Event Paradigm 116

5.4 Stepwise Formal Development of Self-Healing P2P-Based Approach 118 5.4.1 Self-Healing P2P-Based Approach 118 5.4.2 Stepwise Formal Development 119

5.4.2.1 Abstracting the Self-Healing Approach (M0) 119 5.4.2.2 Abstracting the Self-Detection (M1) 120 5.4.2.3 Abstracting the Self-Activation (M2) 122 5.4.2.4 Abstracting the Self-Conguration (M3) 123 5.4.2.5 Abstracting the Global Behavior of the System (M4) 124 5.4.2.6 Localizing Models (fromM4 toM20) 125 5.4.2.7 The Local Model (M21) 125

5.5 Discussion, Conclusion, and Future Work 127 References 128

Distributed systems and applications require ecient and eective techniques(e.g., self-(re)conguration and self-healing) for ensuring safety, security, and, more generally, dependability properties, including stabilization and resilience. The complexity of these systems is increased by several factors-for example, dynamic topology, interconnection of heterogeneous components, and automatic failure detection. This chapter presents a methodology for developing protocols satisfying safety and convergence requirements of the distributed self- systems. The self- systems are based on the idea of managing complex infrastructures, software, and distributed systems, with minimal user interactions. Correct-by-construction and service-as-event paradigms are used for formalizing the system requirements, where the formalization process is based on incremental renement in Event-B. We describe a fully mechanized proof of correctness of self- systems along with an interesting case study related to P2P-based self-healing protocols. Moreover, we also address the formal reasoning steps to introduce resilience in the distributed networks.