Big Data Kaleidoscope
Abstract
At http://hpc-abds.org/kaleidoscope/ we have recorded and classified ~120 software systems relevant to the Big Data arena. At http://bigdatawg.nist.gov/usecases.php there are 51 Big Data use cases. In collaboration with the NIST Big Data working group, we intend to extract kernels from these use cases and implement them with a few of these HPC-ABDS software environments. These implementations will be compared with the NIST reference architecture. The idea is expanded in http://www.slideshare.net/Foxsden/multifaceted-classification-of-big-data-uses-and-proposed-architecture-integrating-high-performance-computing-and-the-apache-stack. We will develop Chef and Puppet specifications of the needed software environments.
Intellectual Merit
Setting up such environments is complex. We want to provide users with automated deployment templates and images.
Broader Impact
These kernel implementations will be incorporated into a MOOC being developed by Fox for the IU data science certificate. The MOOC will be offered to minority-serving institutions through a collaboration with ADMI (Association of Computer/Information Sciences and Engineering Departments at Minority Institutions).
Use of FutureGrid
Virtual machines to test image templating.
Scale Of Use
Fewer than 3 VMs at the same time, for one semester.