ABSTRACT

Yet Another Resource Negotiator (YARN) is a software system for managing the different distributed frameworks deployed in the same cluster. It pools cluster resources and shares them among those frameworks and tools in a fine-grained manner, which improves cluster utilization. Hadoop cluster traffic is high during data loading, replication, intermediate data transfer, and similar tasks. Increasing the memory size of map and reduce tasks comes at the expense of parallelism, because it reduces the number of containers that can run on a node. During the execution of MapReduce jobs, the Node Manager runs auxiliary services such as shuffle, sort, and group.
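
The trade-off between task memory and parallelism can be illustrated with a small calculation. The following is a minimal sketch, not YARN's actual scheduling logic: it considers memory only and ignores virtual cores and any rounding of requests by the scheduler. The function name `max_containers` and the 64 GB node size are hypothetical values chosen for illustration. In a real cluster the corresponding settings would typically be `yarn.nodemanager.resource.memory-mb` for the node and `mapreduce.map.memory.mb` or `mapreduce.reduce.memory.mb` for the tasks.

```python
def max_containers(node_memory_mb: int, container_memory_mb: int) -> int:
    """Memory-only upper bound on containers that fit on one node.

    Illustrative helper (hypothetical, not part of any Hadoop API).
    """
    if node_memory_mb < 0 or container_memory_mb <= 0:
        raise ValueError("node memory must be >= 0 and container memory > 0")
    # Each container needs its full memory allocation, so only whole
    # containers count: use integer (floor) division.
    return node_memory_mb // container_memory_mb


if __name__ == "__main__":
    node_memory_mb = 65536  # assumed: 64 GB of node memory offered to YARN
    for task_memory_mb in (1024, 2048, 4096, 8192):
        count = max_containers(node_memory_mb, task_memory_mb)
        print(f"{task_memory_mb} MB per task -> at most {count} containers")
```

Under these assumptions, each doubling of the per-task memory halves the number of tasks a node can run concurrently.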