Improve resource utilization in MapReduce
Project Information
- Discipline
- Computer Science (401)
- Orientation
- Research
Hadoop partitions physical resources into conceptual map and reduce slots to control the maximum number of tasks that can run concurrently on each slave node. We observed that this mechanism can result in low resource utilization when not all task slots on a node are in use. In this project, we propose a new mechanism called resource stealing to increase resource utilization. In addition, the default mechanism for triggering speculative execution may launch many non-beneficial speculative tasks that are killed before completion. To address this, we propose Benefit Aware Speculative Execution (BASE), which reduces the number of non-beneficial speculative tasks without sacrificing performance.
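The slot mechanism and the idea of resource stealing can be illustrated with a minimal sketch. It is not the project's implementation: the Node class, its fields, and the policy of lending every idle slot to running tasks and reclaiming one slot whenever the scheduler assigns a new task are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical model of one slave node (not a Hadoop class)."""
    total_slots: int      # configured task slots on this node
    running_tasks: int    # tasks currently occupying slots
    stolen: int = 0       # idle slots currently lent to running tasks

    def idle_slots(self) -> int:
        # Slots that no task occupies; without stealing their resources sit unused.
        return self.total_slots - self.running_tasks

    def steal(self) -> int:
        # Assumed policy: lend all idle slots to running tasks, if any are running.
        self.stolen = self.idle_slots() if self.running_tasks > 0 else 0
        return self.stolen

    def assign_task(self) -> bool:
        # Normal scheduling takes priority: a new task always gets a slot back.
        if self.idle_slots() == 0:
            return False
        self.running_tasks += 1
        self.stolen = min(self.stolen, self.idle_slots())
        return True
```

In this sketch, a node with four slots and one running task lends three slots to that task; when the scheduler assigns a second task, one lent slot is returned, so regular scheduling is never blocked by stealing.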
Intellectual Merit
This project addresses the inefficiencies of Hadoop. Our proposed resource stealing increases resource utilization without interfering with normal Hadoop task scheduling. In addition, our proposed Benefit Aware Speculative Execution (BASE) can eliminate most of the non-beneficial speculative tasks without degrading performance.
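One way to picture a benefit-aware speculation check is sketched below. This is an assumed decision rule, not the BASE algorithm itself: it supposes that a task's remaining time can be extrapolated from a constant progress rate, and that a speculative copy is worth launching only if its expected run time is shorter than that remaining time. The function names and parameters are hypothetical.

```python
def estimated_remaining(progress: float, elapsed: float) -> float:
    """Remaining time of a running task, assuming a constant progress rate.

    progress is a fraction in [0, 1]; elapsed is in seconds.
    """
    if progress <= 0.0:
        # No progress yet: the remaining time cannot be bounded.
        return float("inf")
    return elapsed * (1.0 - progress) / progress


def should_speculate(progress: float, elapsed: float,
                     expected_new_runtime: float) -> bool:
    """Launch a speculative copy only if it is expected to finish first."""
    return expected_new_runtime < estimated_remaining(progress, elapsed)
```

Under these assumptions, a task that is 20% done after 100 seconds has about 400 seconds left, so a copy expected to take 150 seconds is beneficial. A task that is 90% done after 100 seconds has about 11 seconds left, so the same copy would be killed before completion and is not launched.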
Broader Impacts
MapReduce/Hadoop has been used by both industry and academia to run large-scale data processing applications. The approaches proposed and evaluated in this project increase resource utilization, which can improve throughput. They enable users to run MapReduce jobs more efficiently and therefore reduce job run time. As a result, scientists become more productive because they get results faster and can tune their applications accordingly.
Project Contact
- Project Lead
- Zhenhua Guo (zhguo)
- Project Manager
- Zhenhua Guo (zhguo)
Resource Requirements
- Hardware Systems
- alamo (Dell optiplex at TACC)
- hotel (IBM iDataPlex at U Chicago)
- india (IBM iDataPlex at IU)
- sierra (IBM iDataPlex at SDSC)
We used the High-Performance Computing (HPC) environments provided by FutureGrid to run experiments to evaluate our proposed approaches.
Scale of Use
We used 20 to 40 bare-metal machines on a periodic basis.
Project Timeline
- Submitted
- 09/06/2012 - 15:54
- Completed
- 09/06/2012