Investigate provenance collection for MapReduce

Project Information

Discipline
Computer Science (401) 
Orientation
Research 
Abstract

Demonstrate provenance collection on FutureGrid by collecting provenance from Twister and visualizing it.

Intellectual Merit

This is the first investigation of data provenance in Twister/Hadoop. The work is research in the D2I center, supervised by Professor Beth Plale.

Broader Impacts

Collect and investigate data provenance from the MapReduce framework.

Project Contact

Project Lead
Jiaan Zeng (jzeng) 
Project Manager
Jiaan Zeng (jzeng) 
Project Members
Felix Terkhorn, Abhirup Chakraborty  

Resource Requirements

Hardware Systems
  • hotel (IBM iDataPlex at U Chicago)
  • india (IBM iDataPlex at IU)
  • sierra (IBM iDataPlex at SDSC)
  • xray (Cray XT5m at IU)
 
Use of FutureGrid

Investigate provenance collection in MapReduce on FutureGrid.

Scale of Use

20 or more VMs

Project Timeline

Submitted
12/27/2010 - 11:30