ABSTRACT

This chapter discusses exact string-matching algorithms and approximate string-matching algorithms. It describes different types of exact string-matching algorithms such as the brute force algorithm, Aho-Corasick algorithm, Boyer-Moore algorithm, Karp-Rabin algorithm, and Knuth-Morris-Pratt algorithm. The chapter provides an overview of the approximate string-matching algorithms; and presents some example approximate string-matching algorithms, such as longest common subsequence, longest increasing subsequence, and longest common substring. Exact string matching is efficient to generate signatures for polymorphic worms. Approximate string-matching algorithms use a dynamic programming method. In Machine learning algorithms, every instance in any dataset used is represented using the same set of features. RIPPER is a well-known rule-based algorithm. Instance-based learning algorithms are lazy-learning algorithms as they delay the induction or generalization process until classification is performed. Support vector machines are supervised learning models with associated learning algorithms that analyze data and recognize patterns; they are used for classification and regression analysis.