Hadoop Testing
Abstract
Testing and experimentation with cluster-based Hadoop, for comparison with Hadoop performance on the Blacklight system at PSC.
Intellectual Merit
This project will enhance our understanding of Hadoop when used on a large shared-memory machine as compared to a traditional cluster installation, delineating use cases where each configuration will be most likely to yield the best results.
Broader Impact
With a large shared-memory Hadoop installation, scientific research that involves iterative processing of large data sets should see significant performance gains.
Use of FutureGrid
I will use FutureGrid nodes to test Hadoop performance and stability in configurations that allow a reasonably fair comparison with jobs of the size I am able to run on the Blacklight system.
Scale Of Use
I will initially provision a cluster of 8 nodes. I will likely expand that cluster later to match the scale of my experiments on Blacklight.