ABSTRACT

This chapter lists the commands for installing Hadoop 2.7.0. Because the focus is on the commands themselves, readers are advised to consult the Hadoop 1.x installation side by side to understand each step in detail. For the single-node setup, the chapter outlines how to set the path to Hadoop and run a simple wordcount job. For pseudo-distributed mode, it explains generating a public-private key pair for passwordless communication, editing the configuration files, changing ownership and access mode, and creating the namespace. For the multi-node implementation, it explains setting the domain name on every node in the cluster, setting up passwordless communication, and editing the configuration files. A Hadoop administrator has several responsibilities in keeping the cluster alive for application programmers. These include commissioning and decommissioning slaves; checking for block corruption and copying blocks in and out across the cluster; tuning performance to optimize the network, latency, and so on; troubleshooting jobs; and upgrading to a newer version.