CONTENTS

Introduction
Risk Management in the Requirements Phase
    Risk Assessment in the Requirements Phase
        The UML Method
        The CPN Method
    Risks Associated with Requirement Changes
Risk Management in the Architecture Design Phase
    Quantitative Architecture-Based Reliability Models
    Design of a Reliable Real-Time System
        Simplex Architecture: No Restart Capability
        Simplex Architecture: With Restart Capability
        Active-Standby Architecture with Restart and Failover Capability
    Sensitivity Analysis and Risk Assessment
Software Reliability and Risk Management during the Coding Phase
    Defect-Prediction Model Using Size and Complexity
    Identification of Fault-Prone Module Using Multivariate Approach
        Product Metrics for Software
        Software Fault-Proneness
    Evaluating Development Process for Risk Management
        Orthogonal Defect Classification
        Capability Maturity Model Integration
        Fault Propagation Model
Software Reliability and Risk Management during the Testing Phase
    Software Reliability Models
        Failure Rate Models
        Software Reliability Growth Models
    System Test Phase Risk Assessment
    Predicting Software Failure Rate and Residual Defects
        Assessing Deployability
Field Data for Software Reliability and Risk Management
    Controlled Introduction
    Massive Field Operation
    Metrics Estimated from Field Outage Data
        Exposure Time
        Outage Rates
        Outage Duration
        Coverage Factor
    Building Reliability Roadmaps and Risk Mitigation Plan
Summary
    Requirement Phase Activities
    Design Phase Activities
    Coding Phase Activities
    Testing Phase Activities
    Field Operation Phase Activities
References

ABSTRACT

Enterprise integration, like other software development and deployment projects, suffers chronically from cost overruns, schedule delays, unmet customer needs, and buggy systems. Frequently, this results from a failure to address the uncertainties associated with complex, software-intensive systems. Better risk management depends on more structured and systematic ways of handling these uncertainties, particularly as they relate to developers, customers, and end-users. For all three categories of stakeholders, risk management entails assessing what can go wrong in order to estimate the likelihood of failures, understand the severity of their impact, and devise coping strategies. In the case of enterprise integration systems, the questions to answer are: Is the system designed to be fault-tolerant? What is the likelihood that end-users will encounter service-affecting failures? How quickly can failures be detected and fixed? Finally, what is the expected average downtime, annually and per incident?
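
The downtime question at the end of the abstract can be made concrete with simple arithmetic. The following is a minimal sketch rather than the chapter's own model: it assumes a steady rate of service-affecting outages and a fixed mean outage duration, and it ignores coverage and failover effects. The function name `downtime_summary` and the input values are hypothetical, chosen only for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_summary(outages_per_year, mean_outage_minutes):
    """Return (expected annual downtime in minutes, availability).

    Assumes outages arrive at a steady average rate and each lasts
    mean_outage_minutes on average (an illustrative simplification).
    """
    annual_downtime = outages_per_year * mean_outage_minutes
    availability = 1.0 - annual_downtime / MINUTES_PER_YEAR
    return annual_downtime, availability


# Hypothetical inputs: 2 outages per year, 30 minutes per incident on average.
annual_minutes, availability = downtime_summary(2.0, 30.0)
print(f"Expected annual downtime: {annual_minutes:.1f} minutes")
print(f"Availability: {availability:.6f}")
```

Under these assumed inputs, the per-incident downtime is the mean outage duration itself, while the annual figure scales with the outage rate. Reducing either factor therefore lowers the expected annual downtime.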