SLO-Driven Hadoop
or
ARIA: Automatic Resource Inference and Allocation for MapReduce Environments

MapReduce and Hadoop represent an economically compelling alternative for efficient large-scale data processing and advanced analytics in the enterprise. A growing number of MapReduce applications associated with live business intelligence require completion time guarantees (SLOs). However, performance models and workload analysis tools for automated performance management of such MapReduce jobs are lacking, and none of the existing Hadoop schedulers supports completion time guarantees. A key challenge in shared MapReduce clusters is the ability to automatically tailor and control resource allocations to different applications so that they achieve their performance SLOs.

SLO stands for Service Level Objective. SLOs are routinely used for defining a set of performance goals.

We pursue a few research threads. They are inter-related through the set of performance tools and models that we have designed: our MapReduce job profiling approach and a set of novel performance models.

SLO-based scheduler for Hadoop
In this work, we propose a framework, called ARIA, to address this problem. It comprises three inter-related components. First, for a production job that is routinely executed on a new dataset, we build a job profile that compactly summarizes critical performance characteristics of the underlying application during the map and reduce stages. Second, we design a MapReduce performance model that, for a given job (with a known profile) and its SLO (soft deadline), estimates the amount of resources required for job completion within the deadline. Finally, we implement a novel SLO-based scheduler in Hadoop that determines job ordering and the amount of resources to allocate for meeting the job deadlines. We validate our approach using a set of realistic applications. The new scheduler effectively meets the jobs' SLOs until the job demands exceed the cluster resources. The results of the extensive simulation study are validated through detailed experiments on a 66-node Hadoop cluster.
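
To make the second component more concrete, here is a minimal sketch of how a job profile and a deadline could be turned into a resource estimate. This page gives no formulas, so everything in the sketch is an assumption: the `StageProfile` fields, the function names, and the use of a simple upper bound on the makespan of greedily assigned tasks, (n - 1) * avg / k + max for n tasks on k slots. It also gives the map and reduce stages the same number of slots, which a real allocator need not do.

```python
from dataclasses import dataclass


@dataclass
class StageProfile:
    """Hypothetical compact profile of one stage (map or reduce)."""
    num_tasks: int
    avg_task_sec: float
    max_task_sec: float


def stage_upper_bound(stage: StageProfile, slots: int) -> float:
    """Upper bound on the stage makespan when tasks are assigned greedily
    to `slots` slots (an assumed model, not taken from this page)."""
    return (stage.num_tasks - 1) * stage.avg_task_sec / slots + stage.max_task_sec


def estimate_completion(map_stage, reduce_stage, map_slots, reduce_slots):
    """Estimated job completion time: map stage followed by reduce stage."""
    return (stage_upper_bound(map_stage, map_slots)
            + stage_upper_bound(reduce_stage, reduce_slots))


def min_slots_for_deadline(map_stage, reduce_stage, deadline_sec, max_slots):
    """Smallest slot count (same for map and reduce) whose estimate meets
    the deadline, or None if even `max_slots` is not enough."""
    for slots in range(1, max_slots + 1):
        if estimate_completion(map_stage, reduce_stage, slots, slots) <= deadline_sec:
            return slots
    return None
```

For example, `min_slots_for_deadline(StageProfile(100, 10.0, 20.0), StageProfile(20, 30.0, 40.0), 300.0, 64)` searches upward from one slot and stops at the first allocation whose estimate fits within 300 seconds.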

Right-Sizing of Resource Allocation for MapReduce Apps
Cloud computing offers an attractive option for businesses to rent a suitably sized Hadoop cluster, consume resources as a service, and pay only for the resources that were utilized. One of the open questions in such environments is the amount of resources that a user should lease from the service provider. In this work, we outline a novel framework for SLO-driven resource provisioning and sizing of MapReduce jobs. First, we propose an automated profiling tool that extracts a compact job profile from past application run(s) or by executing the application on a smaller dataset. Then, by applying a linear regression technique, we derive scaling factors to accurately project the application performance when processing a larger dataset. Moreover, we design a model for estimating the impact of node failures on job completion time to evaluate worst-case scenarios.
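
The regression step can be illustrated with a short sketch. It assumes, hypothetically, that one profiled metric (say, average reduce task duration in seconds) has been measured on a few small input sizes and grows roughly linearly with input size. The function names and the sample numbers are illustrative and do not come from the profiling tool described above.

```python
def fit_line(sizes, durations):
    """Ordinary least-squares fit of duration = intercept + slope * size.
    Returns (slope, intercept); the slope plays the role of a scaling factor."""
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(durations) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, durations))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def project(slope, intercept, size):
    """Projected duration for a (larger) dataset of the given size."""
    return intercept + slope * size


# Illustrative measurements: input size in GB, task duration in seconds.
slope, intercept = fit_line([1, 2, 4], [12.0, 22.0, 42.0])
projected = project(slope, intercept, 16)
```

The projected value is only as good as the linearity assumption, so such a fit would need to be checked against at least one larger run before it is trusted for sizing.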

MapReduce Simulator SimMR
To ease the task of evaluating and comparing different provisioning and scheduling approaches in MapReduce environments, we have designed and implemented a simulation environment, SimMR, which comprises three inter-related components: i) a Trace Generator that creates a replayable MapReduce workload; ii) a Simulator Engine that accurately emulates the job master functionality in Hadoop; and iii) a pluggable scheduling policy that dictates the scheduler decisions on job ordering and the amount of resources allocated to different jobs over time.
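
The three-part structure (trace, engine, pluggable policy) can be sketched in a few lines. This is a toy illustration, not SimMR itself: the `Job` fields, the `simulate` function, and the two policies are hypothetical, and the engine is heavily simplified (one task type, one shared pool of slots, no separate map and reduce stages).

```python
import heapq
from dataclasses import dataclass


@dataclass
class Job:
    """One entry of a hypothetical replayable trace."""
    name: str
    arrival: float
    num_tasks: int
    task_sec: float
    deadline: float


def fifo_policy(pending):
    """Pick the job that arrived first."""
    return min(pending, key=lambda j: j.arrival)


def edf_policy(pending):
    """Pick the job with the earliest deadline."""
    return min(pending, key=lambda j: j.deadline)


def simulate(trace, slots, policy):
    """Replay the trace on `slots` slots; return {job name: completion time}."""
    remaining = {j.name: j.num_tasks for j in trace}  # tasks not yet started
    finish = {j.name: 0.0 for j in trace}
    free_at = [0.0] * slots  # min-heap of times at which each slot frees up
    heapq.heapify(free_at)
    while any(remaining.values()):
        now = heapq.heappop(free_at)
        waiting = [j for j in trace if remaining[j.name] > 0]
        pending = [j for j in waiting if j.arrival <= now]
        if not pending:
            # The slot idles until the next job arrives.
            now = min(j.arrival for j in waiting)
            pending = [j for j in waiting if j.arrival <= now]
        job = policy(pending)  # the pluggable scheduling decision
        remaining[job.name] -= 1
        end = now + job.task_sec
        finish[job.name] = max(finish[job.name], end)
        heapq.heappush(free_at, end)
    return finish
```

Swapping `fifo_policy` for `edf_policy` in the call to `simulate` changes only the ordering decision, which is the point of keeping the policy separate from the engine.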

Meeting Service Level Objectives of Pig Programs
We consider the popular Pig framework, which provides a high-level SQL-like abstraction on top of the MapReduce engine for processing large datasets. Programs written in such frameworks are compiled into directed acyclic graphs (DAGs) of MapReduce jobs. We aim to solve the resource provisioning problem: given a Pig program with a completion time goal, estimate the amount of resources (the number of map and reduce slots) required for completing the program within a given (soft) deadline. We develop a simple yet elegant performance model that provides completion time estimates of a Pig program as a function of allocated resources. This model is then used as a basis for solving the inverse resource provisioning problem for Pig programs.
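
The forward model and its inverse can be sketched as follows. The sketch rests on stated assumptions rather than on the model described above: each job is reduced to a hypothetical `(num_tasks, avg_task_sec)` pair, tasks run in waves of `slots` tasks, and the jobs of the DAG execute one after another in topological order. Because the estimate never increases as slots are added, the inverse problem can be solved by binary search. `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def job_time(num_tasks, avg_task_sec, slots):
    """Hypothetical per-job estimate: tasks run in waves of `slots` tasks."""
    waves = -(-num_tasks // slots)  # ceiling division
    return waves * avg_task_sec


def program_time(dag, jobs, slots):
    """Estimated completion time of the whole program, assuming its jobs
    run sequentially in an order that respects the DAG dependencies."""
    order = TopologicalSorter(dag).static_order()
    return sum(job_time(*jobs[name], slots) for name in order)


def min_slots(dag, jobs, deadline, max_slots):
    """Smallest slot count whose estimate meets the deadline, or None."""
    if program_time(dag, jobs, max_slots) > deadline:
        return None
    lo, hi = 1, max_slots
    while lo < hi:
        mid = (lo + hi) // 2
        if program_time(dag, jobs, mid) <= deadline:
            hi = mid
        else:
            lo = mid + 1
    return lo


# Illustrative three-job program: each key maps to the jobs it depends on.
dag = {"load": [], "join": ["load"], "group": ["join"]}
jobs = {"load": (40, 10.0), "join": (20, 30.0), "group": (10, 20.0)}
needed = min_slots(dag, jobs, deadline=150.0, max_slots=40)
```

A model that lets independent jobs of the DAG run concurrently would replace the plain sum in `program_time`, but the search in `min_slots` would stay the same as long as the estimate remains monotone in the number of slots.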

Related Papers and Reports

  • Z. Zhang, L. Cherkasova, A. Verma, B. T. Loo: Optimizing Completion Time and Resource Provisioning of Pig Programs. Will appear in Proc. of Cloud Computing Optimization Workshop (CCOPT'2012), collocated with CCGrid'2012, May 13-16, 2012, Ottawa, Canada.

  • A. Verma, L. Cherkasova, V. S. Kumar, R. Campbell: Deadline-based Workload Management for MapReduce Environments: Pieces of the Performance Puzzle. Will appear in Proc. of the IEEE/IFIP Network Operations and Management Symposium (NOMS'2012), Maui, Hawaii, USA, April 16-20, 2012.

  • Z. Zhang, L. Cherkasova, A. Verma, B. T. Loo: Meeting Service Level Objectives of Pig Programs. Will appear in Proc. of the 2nd Intl Workshop on Cloud Computing Platforms (CloudCP'2012), in conjunction with EuroSys'2012, Bern, Switzerland, April 10, 2012.

  • A. Verma, L. Cherkasova, R. Campbell: Resource Provisioning Framework for MapReduce Jobs with Performance Goals. Proc. of the ACM/IFIP/USENIX 12th International Middleware Conference (Middleware'2011), Lisboa, Portugal, December 12-16, 2011.

  • A. Verma, L. Cherkasova, V. S. Kumar, R. Campbell: Three Pieces of the MapReduce Workload Management Puzzle. Poster at the 23rd ACM Symposium on Operating System Principles (SOSP'2011), Cascais, Portugal, Oct. 23-26, 2011.

  • A. Verma, L. Cherkasova, R. Campbell: Play It Again, SimMR! Proc. of the IEEE Cluster 2011 (Cluster'2011), Austin, Texas, USA, September 26-30, 2011.

  • A. Verma, L. Cherkasova, R. Campbell: SLO-Driven Right-Sizing and Resource Provisioning of MapReduce Jobs. Proc. of the 5th Workshop on Large Scale Distributed Systems and Middleware (LADIS'2011), held in conjunction with VLDB'2011, Seattle, Washington, Sept. 2-3, 2011.

  • A. Verma, L. Cherkasova, R. Campbell: ARIA: Automatic Resource Inference and Allocation for MapReduce Environments. Proc. of the 8th IEEE International Conference on Autonomic Computing (ICAC'2011), June 14-18, 2011, Karlsruhe, Germany.

  • L. Cherkasova: Performance Modeling in MapReduce Environments: Challenges and Opportunities. Invited Talk at the 2nd ACM/SPEC International Conference on Performance Engineering (ICPE'11), March 14-16, 2011, Karlsruhe, Germany.

