ABSTRACT
Mechanizing hypothesis formation is an approach to exploratory data analysis. Its development started in the 1960s, inspired by the question “Can computers formulate and verify scientific hypotheses?” The development resulted in a general theory of the logic of discovery. It comprises theoretical calculi, which deal with theoretical statements, and observational calculi, which deal with observational statements concerning finite results of observation. The two kinds of calculi are related through statistical hypothesis tests. A GUHA method is a tool of the logic of discovery. It uses a one-to-one relation between theoretical and observational statements to get all interesting theoretical statements. A GUHA procedure generates all interesting observational statements and verifies them in the given observational data. The output of the procedure consists of all interesting observational statements that are true in the given data. Several GUHA procedures dealing with association rules, couples of association rules, action rules, histograms, couples of histograms, and patterns based on general contingency tables are implemented in the LISp-Miner system developed at the Prague University of Economics and Business. Various results about observational calculi have been achieved and applied together with the LISp-Miner system.
The book gives a brief overview of the logic of discovery. It presents many examples of applying the GUHA procedures to real problems relevant to data mining and business intelligence. It also provides an overview of recent research results on dealing with domain knowledge in data mining and on its automation. First-hand experience with implementing the GUHA method in Python is presented as well.
TABLE OF CONTENTS
Chapter 1 (21 pages)
Introduction
Chapter 2 (18 pages)
Datasets
Part I (45 pages)
Procedures
Chapter 3 (11 pages)
Principle and Simple Examples
Chapter 4 (17 pages)
Common Features
Chapter 5 (15 pages)
LISp-Miner System
Part II (171 pages)
Applying the GUHA Procedures
Chapter 6 (15 pages)
Examples Overview
Chapter 7 (39 pages)
4ft-Miner—GUHA Association Rules
Chapter 8 (26 pages)
CF-Miner—Histograms
Chapter 9 (12 pages)
KL-Miner—Pairs of Categorical Attributes
Chapter 10 (15 pages)
SD4ft-Miner—Couples of GUHA Association Rules
Chapter 11 (15 pages)
SDCF-Miner—Couples of Histograms
Chapter 12 (8 pages)
SDKL-Miner—Couples of Pairs of Categorical Attributes
Chapter 13 (17 pages)
Ac4ft-Miner Action Rules
Chapter 14 (11 pages)
GUHA Procedures and Business Intelligence
Chapter 15 (11 pages)
Clever Miner GUHA and Python
Part III (63 pages)
Research and Theory