Fault management in Map-Reduce
Project Information
- Discipline
- Computer Science (401)
- Orientation
- Research
The purpose of this project is to evaluate the performance penalties experienced by Map-Reduce jobs in the presence of different types of injected faults. We will begin with the Hadoop implementation of Map-Reduce. Hadoop has built-in fault-tolerance mechanisms. However, these mechanisms incur performance penalties in the presence of faults, as indicated by prior research on our in-house clusters and by other recent literature. This project will enable us to make large-scale evaluations of these penalties, especially in the heterogeneous environment provided by FutureGrid.
Intellectual Merit
Predicting performance for distributed applications is a challenging problem. Quantifying the performance of Map-Reduce applications in the presence of faults will enable us to propose mechanisms to overcome these penalties, making Map-Reduce more readily usable for applications that require performance guarantees.
Broader Impacts
Maintaining performance in the presence of faults is a critical goal for applications executing in enterprise data centers and cloud computing environments. The technologies to achieve this will be helpful to a wide range of communities in academia, industry, and government that use Map-Reduce for bioinformatics, text mining, machine learning, web indexing, ad analytics, etc.
Project Contact
- Project Lead
- Selvi Kadirvel (selvik)
- Project Manager
- Selvi Kadirvel (selvik)
Resource Requirements
- Hardware Systems
- alamo (Dell OptiPlex at TACC)
- foxtrot (IBM iDataPlex at UF)
- hotel (IBM iDataPlex at U Chicago)
- sierra (IBM iDataPlex at SDSC)
The scale of FutureGrid resources and its heterogeneity will help extend research conducted on in-house Map-Reduce clusters.
Scale of Use
I would like to begin with a 16-VM cluster and be able to expand to a few hundred VMs as my experiments proceed.
Project Timeline
- Submitted
- 06/29/2012 - 06:43