ABSTRACT

As natural language is the most common form of communication among people, finding useful knowledge in texts has become very attractive with the growing activity of people on the web. Text mining, a discipline focused on knowledge discovery, usually in large collections of texts, can use statistical and machine learning methods for the analysis, and is therefore closely related to data mining. The chapter describes some basic data and text mining tasks, the specifics of text mining, and the ideas behind the inductive machine learning approach (an approach in which general conclusions are drawn from many specific examples). Three main directions are introduced: supervised, unsupervised, and semisupervised learning.