Running workflows in the cloud with Pegasus

Project Information

Discipline
Computer Science (401) 
Orientation
Research 
Abstract

In this work, we intend to study the benefits and drawbacks of using cloud computing for scientific workflows. In particular, we are interested in the benefits of specifying the execution environment of a workflow application as a virtual machine image. Using VM images has the potential to reduce the complexity of deploying workflow applications in distributed environments and to let scientists reproduce their experiments easily. In addition, we are interested in investigating the challenges of on-demand provisioning for scientific workflows in the cloud.

Intellectual Merit

Cloud computing is an important platform for future computational science applications. It is particularly well suited to loosely coupled applications such as scientific workflows, which do not require the high-speed interconnects and large, shared file systems typical of existing HPC systems. However, many current-generation workflow tools were developed for the grid and may not be ready for use in the cloud. Although the cloud has many potential benefits, it also brings additional challenges. We plan to investigate the use of clouds for workflows to determine what tools and techniques the workflow community needs to develop so that scientists using workflow technologies can take advantage of cloud computing.

Broader Impacts

Many research groups in physics, astronomy, molecular biology, and earth science use the Pegasus workflow management system in their work. These groups are interested in the potential of cloud computing to improve the speed, quality, and reproducibility of their computational workloads. We intend to apply what we learn from using FutureGrid to develop tools and techniques that help scientists do their work better.

Project Contact

Project Lead
Gideon Juve (juve) 
Project Manager
Gideon Juve (juve) 
Project Members
Sepideh Azarnoosh  

Resource Requirements

Hardware System
  • Not sure
 
Use of FutureGrid

Running workflow applications in the cloud using Pegasus

Scale of Use

I only need a few VMs, using no more than 128 cores at a time.

Project Timeline

Submitted
11/05/2010 - 16:53