FRIEDA: Flexible Robust Intelligent Elastic Data Management

Project Information

Discipline
Computer Science (401) 
Orientation
Research 
Abstract

Scientific applications are increasingly using cloud resources for their data analysis workflows. We use the term "cloud" loosely to signify transient environments. However, managing data effectively and efficiently over these cloud resources is challenging because of the myriad storage choices with different performance and cost trade-offs, complex application choices, and the complexity associated with elasticity and failure rates in these environments. The different data access patterns of data-intensive scientific applications require a more flexible and robust data management solution than those currently in existence. FRIEDA is a Flexible Robust Intelligent Elastic Data Management framework that employs a range of data management strategies in elastic environments.

Specifically, we are investigating:

  • Semi-automated storage choices and data management strategies for scientific data analysis workflows
  • Management of the life cycle of scientific applications in hybrid environments using HPC and cloud resources

Intellectual Merit

The proposed research will significantly impact the current understanding of storage and data management in computer science. The approaches will be useful for data-intensive applications in both cloud and future HPC environments.

Broader Impacts

The project is expected to impact scientific discoveries through the use of cloud resources.

Project Contact

Project Lead
Lavanya Ramakrishnan (lavanya) 
Project Manager
Lavanya Ramakrishnan (lavanya) 
Project Members
Val Hendrix, Pradeep Kumar Mantha, Eugen Feller, Zhao Zhang, Tonglin Li, Mike Hance, Sowmya Balasubramanian  

Resource Requirements

Hardware Systems
  • alamo (Dell optiplex at TACC)
  • foxtrot (IBM iDataPlex at UF)
  • hotel (IBM iDataPlex at U Chicago)
  • india (IBM iDataPlex at IU)
  • sierra (IBM iDataPlex at SDSC)
  • xray (Cray XT5m at IU)
  • bravo (large memory machine at IU)
  • delta (GPU Cloud)
  • Network Impairment Device
 
Use of FutureGrid

FutureGrid will be used to investigate the following issues:

  • Trade-offs between different storage options in cloud environments
  • Elastic data management at scale in cloud environments

Scale of Use

Many of our tests will need to run at scale, since many of the data management issues we are studying appear at scale. We will start with a few nodes for development and testing, and will then need to scale up for larger tests. We will use as many resources as can be made available to us across the sites.

Project Timeline

Submitted
12/06/2012 - 13:38