ABSTRACT

As the number of Internet users has increased drastically over the past few years, the Web log data collected on a popular Web server over even a short period of time can be huge. For the discovered patterns to remain reliable and useful, the patterns discovered from the original data must be updated continuously to reflect new log data. To minimize or avoid searching the whole database whenever new data become available, several incremental versions of the basic sequence mining algorithms have utilized the candidate patterns from previous mining results. A candidate pattern is a pattern that can become 'frequent' in the future. In incremental mining, it is important to keep track of the candidate patterns and the related information acquired previously, because they have to be combined with newly found patterns in the incremental process [1].