Hadoop Testing

Abstract

Testing and experimentation with cluster-based Hadoop, for comparison with Hadoop performance on the Blacklight system at PSC.

Intellectual Merit

This project will enhance our understanding of Hadoop when used on a large shared-memory machine as compared to a traditional cluster installation, delineating use cases where each configuration will be most likely to yield the best results.

Broader Impact

A large shared-memory Hadoop installation should yield significant performance gains for scientific research that involves iterative processing of large data sets.

Use of FutureGrid

I will use FutureGrid nodes to test Hadoop performance and stability in configurations that allow a reasonably fair comparison with jobs of the size I am able to run on the Blacklight system.

Scale Of Use

I will initially provision a cluster of 8 nodes. I will likely want to expand that cluster later to match the scale of my experiments on Blacklight.

Publications


FG-137
Bryon Gill
Carnegie Mellon University
Active

FutureGrid Experts

Shava Smallen
Zhenhua Guo

Keywords