Optimizing Scientific Workflows on Clouds

Project Information

Discipline
Computer Science (401) 
Subdiscipline
11.04 Information Sciences and Systems 
Orientation
Research 
Abstract

This project aims to run scientific workflows on clouds and to optimize their performance by exploiting cloud features such as virtualization and on-demand provisioning. We plan to examine several benchmark workflows, such as Montage (an astronomy application), Epigenomics (a pipeline workflow), and CyberShake (a seismographic application). The project also aims to integrate the Pegasus Workflow Management System and the Virtual Infrastructure System.

Intellectual Merit

This project aims to address a newly emerging problem: how to improve the performance of scientific workflows on popular cloud platforms. Scientists are considering migrating their workflow execution environments from their own infrastructure to more cost-effective platforms. The challenge is to create a virtualization system that seamlessly integrates the workflow management system and the execution engine. The team is well prepared to undertake this challenge, with strong experience from past projects in data-intensive workflows, data placement services, dynamic virtual machine provisioning, and grid computing.

Broader Impacts

This project aims to enhance the understanding of scientific workflows and cloud computing. Building on the results, team members will give science and engineering presentations to the community and participate in multidisciplinary and interdisciplinary conferences, workshops, and other research activities.

Project Contact

Project Lead
Weiwei Chen (weiweich) 
Project Manager
Weiwei Chen (weiweich) 
Project Members
Craig Ward, David Smith, Soma Prathibha, Jia Li

Resource Requirements

Hardware System
  • Not sure
 
Use of FutureGrid

This project intends to launch several virtual machines provided by FutureGrid and to build on them a runtime environment for workflow management systems. The team members would like to configure, launch, and store the virtual machines.

Scale of Use

Every experiment would require about 32 VMs for a few days.

Project Timeline

Submitted
06/06/2011 - 03:30