ABSTRACT

Modern computing systems are instrumented to generate large volumes of log and trace data. The entries in these files indicate the status of each component and are usually collected or reported when an event occurs. An entry may record the running state of a component (e.g., started, interrupted, connected, or stopped), its CPU utilization, and its parameter values. Because most computing systems record their internal operations, status, and errors in logs, system events can be obtained directly from the system logs. This chapter focuses on methodologies for generating events from system logs. In system management, many studies investigate system event mining and have developed algorithms for discovering abnormal system behaviors and relationships among events or system components [176, 232, 102, 133, 80, 168, 227, 120]. In those studies, the data is a collection of discrete items or structured events rather than textual log messages. Discrete or structured events are much easier for human experts to visualize and explore than raw textual log messages, and many visualization toolkits have been developed to provide a quick overview of system behavior over a large collection of discrete events. However, most computing systems generate only textual logs containing detailed information. The textual logs therefore need to be converted into discrete or structured events, and this chapter presents several data-mining-based approaches for achieving this goal.
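To make the distinction between a textual log message and a structured event concrete, the following minimal sketch converts one log line into a record with named fields. The log layout, the field names, the sample line, and the helper `to_event` are all assumptions made for illustration. The sketch uses a hand-written regular expression, which is not one of the data mining approaches discussed in the chapter; it only shows what the target representation might look like.

```python
import re
from typing import Optional

# Assumed (hypothetical) log layout:
#   "<date> <time> <component> <LEVEL>: <free-text message>"
LOG_PATTERN = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<component>\S+) "
    r"(?P<level>[A-Z]+): "
    r"(?P<message>.*)"
)


def to_event(line: str) -> Optional[dict]:
    """Return the line as a dict of named fields, or None if it does not match."""
    match = LOG_PATTERN.fullmatch(line.strip())
    return match.groupdict() if match else None


# Invented sample line, not taken from any real system.
sample = "2011-03-14 09:26:53 db-server ERROR: connection to 10.0.0.7 interrupted"
event = to_event(sample)
print(event)
```

A fixed pattern like this only works when the log format is known in advance and stays stable, which is one reason automated approaches are of interest.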