ABSTRACT

Data is everywhere, and it is growing at an unprecedented rate. Making sense of all that data is a challenge. Data Mining is the process of discovering patterns and knowledge in large data sets, and Data Mining with Python takes a hands-on approach to learning it. The book shows how to use Python packages to carry out each stage of the Data Mining pipeline: collecting, integrating, manipulating, cleaning, processing, organizing, and analyzing data to extract knowledge.

The contents are organized around the Data Mining pipeline, so readers can progress naturally through the process step by step. Topics, methods, and tools are explained from three angles: “what it is” as theoretical background, “why we need it” as application orientation, and “how we do it” as a case study.

This book is designed to give students, data scientists, and business analysts a practical understanding of Data Mining concepts. Its interactive tutorials can be run and modified for a more comprehensive learning experience, helping readers gain the practical skills to apply Data Mining techniques in their own work.

Part I | 168 pages

Data Wrangling

Chapter 1 | 34 pages

Data Collection

Chapter 2 | 16 pages

Data Integration

Chapter 3 | 13 pages

Data Statistics

Chapter 4 | 65 pages

Data Visualization

Chapter 5 | 38 pages

Data Preprocessing

Part II | 220 pages

Data Analysis

Chapter 6 | 71 pages

Classification

Chapter 7 | 56 pages

Regression

Chapter 8 | 58 pages

Clustering

Chapter 9 | 14 pages

Frequent Patterns

Chapter 10 | 19 pages

Outlier Detection