ABSTRACT

The authors introduce big data and MapReduce by clearly presenting the basic terminology and concepts. They use over 100 illustrations and many worked examples to convey the concepts and methods used in big data, the inner workings of MapReduce, and single-node and multi-node installation on physical and virtual machines. The book covers almost all the information on Hadoop MapReduce needed for most online certification exams. After completing it, readers should find it easier to understand other big data processing tools such as Spark and Storm.



Ultimately, readers will be able to:

• understand what big data is and the factors involved
• understand the inner workings of MapReduce, which is essential for certification exams
• learn the features and weaknesses of MapReduce
• set up Hadoop clusters with hundreds of physical or virtual machines
• create a virtual machine in AWS
• write MapReduce programs with Eclipse in a simple way
• understand other big data processing tools and their applications

Chapter 1: Big Data (45 pages)
Chapter 2: Hadoop Framework (65 pages)
Chapter 3: Hadoop 1.2.1 Installation (39 pages)
Chapter 4: Hadoop Ecosystem (13 pages)
Chapter 5: Hadoop 2.7.0 (30 pages)
Chapter 6: Hadoop 2.7.0 Installation (159 pages)
Chapter 7: Data Science (13 pages)