After nearly six years of development, Apache released Hadoop v1.0 this January, setting the tone for 2012 to be the year of Big Data, with wide adoption globally. The next challenge for Apache and its Hadoop contributors was to scale the existing MapReduce framework to support alternate programming paradigms.
YARN, the NextGen MapReduce, is primarily architected to divide the two major functions of the JobTracker, resource management and job scheduling/monitoring, into separate daemons. The idea is to have a global ResourceManager (RM) and a per-application ApplicationMaster (AM). The per-application ApplicationMaster is, in effect, a framework-specific library tasked with negotiating resources from the ResourceManager, which holds the ultimate authority to allocate resources.
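The split described above can be illustrated with a toy model. This is only a minimal sketch, not the YARN API: the class names mirror the two roles, but the methods (`allocate`, `release`, `run`) and the idea of counting containers as plain integers are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    """Toy global ResourceManager: the only component allowed to allocate."""
    free_containers: int
    allocations: dict = field(default_factory=dict)

    def allocate(self, app_id: str, requested: int) -> int:
        # Grant at most what is currently free; the RM has the final say.
        granted = min(requested, self.free_containers)
        self.free_containers -= granted
        self.allocations[app_id] = self.allocations.get(app_id, 0) + granted
        return granted

    def release(self, app_id: str) -> None:
        # Return every container held by the application to the pool.
        self.free_containers += self.allocations.pop(app_id, 0)


@dataclass
class ApplicationMaster:
    """Toy per-application master: negotiates containers, then runs its own tasks."""
    app_id: str
    tasks: list

    def run(self, rm: ResourceManager) -> list:
        finished = []
        pending = list(self.tasks)
        while pending:
            # Ask the RM for one container per pending task.
            granted = rm.allocate(self.app_id, len(pending))
            if granted == 0:
                raise RuntimeError("no containers available")
            # Scheduling and monitoring of tasks stay inside the AM.
            batch, pending = pending[:granted], pending[granted:]
            finished.extend(f"{task}:done" for task in batch)
            rm.release(self.app_id)
        return finished


rm = ResourceManager(free_containers=2)
am = ApplicationMaster(app_id="app-1", tasks=["t1", "t2", "t3"])
result = am.run(rm)
```

With two free containers and three tasks, the hypothetical AM needs two rounds of negotiation. The point of the sketch is that the RM never looks at tasks and the AM never allocates on its own.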
While the previous version of HDFS allowed only a single namespace for the entire cluster, version 2 (now in alpha) addresses this limitation by adding support for multiple namespaces in HDFS. The other features in the alpha release, still a work in progress according to Arun Murthy, Release Manager at Hadoop, are hot failover for the HDFS NameNode and changes to make Hadoop more scalable and performant.
This division of work within MapReduce, together with the addition of HDFS Federation, brings three benefits: namespace scalability; better performance, since adding namespaces to a cluster scales the file system's read/write throughput; and isolation, since different namespaces can be allocated to different applications and/or users.
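One way to picture multiple namespaces is a client-side table that maps path prefixes to the NameNode serving them, loosely analogous to the mount table that ViewFs provides in HDFS Federation. The sketch below is a simplified illustration under stated assumptions: the `MountTable` class, its `resolve` method, and the host names are all hypothetical and are not Hadoop code.

```python
class MountTable:
    """Toy mapping from path prefixes to the NameNode that owns each namespace."""

    def __init__(self, mounts: dict):
        # Check longer prefixes first so the most specific mount wins.
        self._mounts = sorted(mounts.items(), key=lambda kv: len(kv[0]), reverse=True)

    def resolve(self, path: str) -> str:
        for prefix, namenode in self._mounts:
            # Match the mount point itself or anything beneath it.
            if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
                return namenode
        raise KeyError(f"no namespace mounted for {path}")


# Hypothetical host names, one NameNode per namespace.
table = MountTable({
    "/user": "nn1.example.com",
    "/data": "nn2.example.com",
    "/data/logs": "nn3.example.com",
})

user_nn = table.resolve("/user/alice/report.txt")
data_nn = table.resolve("/data/raw/part-00000")
logs_nn = table.resolve("/data/logs/2012/05")
```

Because each prefix is served by its own NameNode, metadata load is spread across several servers, and an application confined to `/data/logs` does not compete with users working under `/user`.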
For all avid Hadoop explorers, the alpha release is available on the Hadoop Common releases download page.