Hadoop built on Apache Mesos layer

Project Information

Discipline
Computer Science (401) 
Orientation
Research 
Abstract

This allocation is requested to evaluate “Hadoop built on Apache Mesos layer” in the FutureGrid environment. Apache Mesos is a common resource-sharing layer on which diverse frameworks such as Hadoop and Spark can run at the same time. To test the installation, we are particularly interested in running I/O-intensive and memory-intensive applications and benchmarks written in the MapReduce programming model. The lessons learned during this project will guide our strategy for a comparative study of Hadoop and Apache Mesos on high performance computing platforms with differing architectures and performance characteristics.

Intellectual Merit

This project will attempt to demonstrate the usability of Hadoop running in the Apache Mesos environment and provide a method for researchers to run large MapReduce problems on FutureGrid.

Broader Impacts

Once proven stable, Apache Mesos can be used by researchers to run large Hadoop problems on FutureGrid and other XSEDE resources.

Project Contact

Project Lead
John Lockman (jlockman) 
Project Manager
John Lockman (jlockman) 
Project Members
Ritu Arora  

Resource Requirements

Hardware Systems
  • alamo (Dell Optiplex at TACC)
  • foxtrot (IBM iDataPlex at UF)
  • hotel (IBM iDataPlex at U Chicago)
  • india (IBM iDataPlex at IU)
  • sierra (IBM iDataPlex at SDSC)
  • xray (Cray XT5m at IU)
  • bravo (large memory machine at IU)
 
Use of FutureGrid

We plan to use both OpenStack and Nimbus to start virtual machines that will run a set of Hadoop benchmarks on top of Apache Mesos.

Scale of Use

A typical experiment will use a few VMs. We may also need to reserve a handful of nodes for a few days at a time.

Project Timeline

Submitted
06/12/2013 - 17:50