Big Data Kaleidoscope

Project Information

Orientation
Research 
Abstract

At http://hpc-abds.org/kaleidoscope/ we have recorded and classified ~120 software systems relevant to the Big Data arena, and http://bigdatawg.nist.gov/usecases.php lists 51 Big Data use cases. In collaboration with the NIST Big Data working group, we intend to extract kernels from these use cases and implement them with a few of these HPC-ABDS software environments. These implementations will be compared to the NIST reference architecture. The idea is expanded in http://www.slideshare.net/Foxsden/multifaceted-classification-of-big-data-uses-and-proposed-architecture-integrating-high-performance-computing-and-the-apache-stack. We will develop Chef and Puppet specifications of the needed software environments.
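As a rough illustration of what a Chef specification for one such environment might look like, the sketch below installs a Java runtime and unpacks a Hadoop release. The package name, Hadoop version, URL, and install path are illustrative assumptions, not the project's actual specifications.

```ruby
# Hypothetical Chef recipe sketch for one HPC-ABDS kernel environment.
# All names, versions, and paths below are assumptions for illustration.

package 'openjdk-7-jdk'   # Java runtime required by the Apache Big Data stack

# Fetch a Hadoop release tarball (version chosen only as an example)
remote_file '/tmp/hadoop-2.4.0.tar.gz' do
  source 'http://archive.apache.org/dist/hadoop/common/hadoop-2.4.0/hadoop-2.4.0.tar.gz'
  action :create_if_missing
end

# Unpack it once; guarded so repeated Chef runs stay idempotent
execute 'unpack-hadoop' do
  command 'tar -xzf /tmp/hadoop-2.4.0.tar.gz -C /opt'
  not_if { ::File.directory?('/opt/hadoop-2.4.0') }
end
```

An equivalent Puppet manifest would express the same resources in Puppet's declarative DSL.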


Intellectual Merit

Setting up such software environments is complex. We will provide automated deployment templates and images for users.

Broader Impacts

These kernel implementations will be incorporated into a MOOC being developed by Fox for the IU data science certificate. This will be offered to minority-serving institutions through a collaboration with ADMI (Association of Computer/Information Sciences and Engineering Departments at Minority Institutions).

Project Contact

Project Lead
Geoffrey Fox (gcf) 
Project Manager
Geoffrey Fox (gcf) 
Project Members
Geoffrey Fox, Gregor von Laszewski  

Resource Requirements

Hardware Systems
  • india (IBM iDataPlex at IU)
  • sierra (IBM iDataPlex at SDSC)
  • bravo (large memory machine at IU)
  • delta (GPU Cloud)
 
Use of FutureGrid

Virtual machines to test image templating.

Scale of Use

At most 3 VMs running concurrently, over one semester.

Project Timeline

Submitted
05/26/2014 - 19:26