ABSTRACT

This chapter presents a model-based technology for system-level health management based on the concept of timed failure propagation graphs (TFPGs). It describes the modeling language and the fault diagnostics and prognostics algorithms, which are applicable to complex systems, and shows how the techniques can be used for the health management of software systems. The chapter reviews system-level fault diagnostics techniques, describes the TFPG model and the reasoning algorithms used, and presents examples. It also discusses how data mining techniques can be used to improve a TFPG-based system and offers some observations for future research. The TFPG model captures observable failure propagations between discrepancies in dynamic systems. The chapter examines example TFPG models and the results of experiments performed using these models. Each test case involved loading the appropriate TFPG model into the centralized TFPG reasoner and feeding the reasoner a timed sequence of events.