ABSTRACT

Data mining has attracted increasing attention in recent years. Automated data collection tools and abundant data sources, ranging from remote sensing, bioinformatics, and scientific simulations, through the web, e-commerce, transaction, and stock data, to social networking, YouTube, and other means of data recording, have produced an explosion of data and the paradox known as "drowning in data, but starving for knowledge." Data mining covers tasks ranging from association rules and regression analysis to various intelligent and machine learning techniques such as neuro-fuzzy systems, support vector machines, and Bayesian techniques. Operational databases are based on entity-relationship (ER) data models, or schemas, which describe a set of entities and the relationships among them. Data warehouse schemas are subject-oriented and therefore more suitable for on-line analytical processing (OLAP). Transaction-oriented on-line transaction processing (OLTP) systems, in contrast, handle known, routine operations, such as daily searches for particular records, and emphasize high performance and availability for flat relational or ER-type transactions.