Hadoop Testing

Project Information

Discipline
Computer Science (401) 
Orientation
Research 
Abstract

Testing and experimentation with cluster-based Hadoop, for comparison with Hadoop performance on the Blacklight system at PSC.

Intellectual Merit

This project will enhance our understanding of Hadoop when used on a large shared-memory machine as compared to a traditional cluster installation, delineating use cases where each configuration will be most likely to yield the best results.

Broader Impacts

With a large shared-memory Hadoop installation, scientific research that involves iterative processing of large data sets should see significant performance gains.

Project Contact

Project Lead
Bryon Gill (bgill) 
Project Manager
Bryon Gill (bgill) 

Resource Requirements

Hardware System
  • Not sure
 
Use of FutureGrid

I will use FutureGrid nodes to test Hadoop performance and stability in configurations that allow a reasonably fair comparison with the jobs I am able to run on the Blacklight system.

Scale of Use

I will initially want to provision a cluster of 8 nodes. I will likely want to expand that cluster later to match the scale of my experiments on Blacklight.

Project Timeline

Submitted
07/26/2011 - 12:02