ABSTRACT

The versatile capabilities and large set of add-on packages make R an excellent alternative to many existing and often expensive data mining tools. Exploring this area from the perspective of a practitioner, Data Mining with R: Learning with Case Studies uses practical examples to illustrate the power of R and data mining.

Assuming no prior knowledge of R or data mining/statistical techniques, the book covers a diverse set of problems that pose different challenges in terms of size, type of data, goals of analysis, and analytical tools. To present the main data mining processes and techniques, the author takes a hands-on approach that utilizes a series of detailed, real-world case studies:

  1. Predicting algae blooms
  2. Predicting stock market returns
  3. Detecting fraudulent transactions
  4. Classifying microarray samples

For each case study, the author supplies all the necessary steps, code, and data.

Web Resource
A supporting website mirrors the do-it-yourself approach of the text. It offers a collection of freely available R source files that encompass all the code used in the case studies. The site also provides the data sets from the case studies as well as an R package of several functions.
