ABSTRACT

Data mining, also referred to as knowledge discovery from data, is a multidisciplinary field. The concepts and techniques used in data mining are a convergence of machine learning (ML) and artificial intelligence, database systems, and statistics (we describe this in more detail in the next section). This chapter focuses on the application of data mining techniques to the electronic health record (EHR). These techniques are characterized by an inductive approach to uncovering patterns in data and are examples of ML or, more specifically, techniques for knowledge discovery in databases.

CONTENTS

7.1 Introduction
7.2 Related Work and Further Readings
    7.2.1 Machine Learning
    7.2.2 Statistics
    7.2.3 Databases and Data Warehouses
7.3 Supervised and Unsupervised Learning
7.4 Feature Sets and Data Quality
7.5 The Knowledge Discovery Process
7.6 Data Mining Techniques
    7.6.1 Bayes’ Theorem and the Naïve Bayes Classifier
        7.6.1.1 An Example of Bayesian Classifiers
    7.6.2 Decision Tree Induction
        7.6.2.1 An Example of Decision Tree Analysis
    7.6.3 Neural Networks
        7.6.3.1 Example of Neural Network Analysis
    7.6.4 Support Vector Machines
        7.6.4.1 Examples of SVMs
    7.6.5 Association Rule Mining
        7.6.5.1 Examples of Association Rule Mining
    7.6.6 Clustering Data
    7.6.7 Text Mining Clinical Narratives
        7.6.7.1 Examples of Text Mining
7.7 Evaluating Accuracy of Data Mining Techniques
7.8 Chapter Summary
References