Hadoop built on Apache Mesos layer
Abstract
This project allocation is requested to evaluate “Hadoop built on Apache Mesos layer” in the FutureGrid environment. Apache Mesos is a common resource-sharing layer over which diverse frameworks such as Hadoop and Spark can run at the same time. To test the installation, we are particularly interested in running I/O-intensive and memory-intensive applications and benchmarks that use the MapReduce programming model. The lessons learned during this project will guide our strategy for a comparative study of Hadoop and Apache Mesos on different high-performance computing platforms with varying architectures and performance characteristics.
Intellectual Merit
This project will attempt to demonstrate the usability of Hadoop running in the Apache Mesos environment and to provide a method for researchers to run large MapReduce problems on FutureGrid.
Broader Impact
Once proven stable, Apache Mesos can be used by researchers to run large Hadoop problems on FutureGrid and other XSEDE resources.
Use of FutureGrid
We plan to use both OpenStack and Nimbus to start virtual machines that will run a set of Hadoop benchmarks on top of the Apache Mesos layer.
Scale Of Use
Typical use will be a few VMs for an experiment. We may also need to gather a handful of nodes for a few days at a time.