e-Science 2008: 4th IEEE International Conference on e-Science

Main Conference Sessions

Reducing Time-to-Solution Using Distributed High-Throughput Mega-Workflows—Experiences from SCEC CyberShake

Authors

  • Scott Callaghan, USC/SCEC
  • Philip Maechling, USC/SCEC
  • Ewa Deelman
  • Karan Vahi
  • Gaurang Mehta
  • Gideon Juve, USC/SCEC
  • Kevin Milner, USC/SCEC
  • Robert Graves, URS Corporation
  • Edward Field, USGS
  • David Okaya, USC
  • Dan Gunter, LBNL
  • Keith Beattie, LBNL
  • Thomas Jordan, USC

Abstract

Researchers at the Southern California Earthquake Center (SCEC) use large-scale grid-based scientific workflows to perform seismic hazard research calculations as part of SCEC's program of earthquake system science research. The scientific goal of the SCEC CyberShake project is to calculate probabilistic seismic hazard curves for sites in Southern California. For each site of interest, the CyberShake processing includes two large-scale MPI calculations followed by approximately 840,000 embarrassingly parallel post-processing jobs. We have recently completed CyberShake seismic hazard curves for nine sites. In this paper, we describe the computational requirements of CyberShake and detail how we meet these requirements using grid-based, high-throughput-oriented scientific workflow tools. We describe the specific challenges we encountered and the solutions we developed while performing CyberShake production runs of very large (~1M-job) workflows distributed across a combination of local, academic, and national computing facilities. We discuss workflow throughput optimizations that reduced our time-to-solution by a factor of three, present runtime statistics, and propose further optimizations.
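
To make the per-site structure described in the abstract concrete (two large MPI stages followed by a very large fan-out of embarrassingly parallel post-processing jobs, assembled into a single hazard curve), the short Python sketch below builds a toy task-dependency graph for one site. It is a hypothetical illustration only: the task names, the split into a mesh job and an SGT job, and the job count are assumptions chosen for demonstration, not the paper's actual workflow specification, which is expressed with the grid-based workflow tools the paper discusses.

    # Hypothetical sketch of a CyberShake-style per-site task graph.
    # Stage 1: two large MPI jobs (here labeled "mesh" and "sgt" as placeholders).
    # Stage 2: a large fan-out of embarrassingly parallel post-processing tasks.
    # Stage 3: a final job that assembles the site's hazard curve.
    # All names and counts below are illustrative, not taken from the paper.

    from collections import defaultdict

    def build_site_workflow(site, n_postproc=1000):
        """Return an adjacency list mapping each task to the tasks that depend on it."""
        deps = defaultdict(list)

        mesh_job = f"{site}_mesh_mpi"        # large MPI calculation #1 (placeholder)
        sgt_job = f"{site}_sgt_mpi"          # large MPI calculation #2 (placeholder)
        curve_job = f"{site}_hazard_curve"   # final hazard-curve assembly (placeholder)

        deps[mesh_job].append(sgt_job)       # second MPI stage needs the first

        # Fan-out: each post-processing task depends on the second MPI stage,
        # and the hazard-curve job depends on every post-processing task.
        for i in range(n_postproc):
            task = f"{site}_postproc_{i:06d}"
            deps[sgt_job].append(task)
            deps[task].append(curve_job)

        return deps

    if __name__ == "__main__":
        graph = build_site_workflow("PAS", n_postproc=10)  # small count for demo
        for task, children in graph.items():
            print(task, "->", len(children), "dependent task(s)")

In the production setting described in the abstract, the fan-out per site is roughly 840,000 tasks rather than the handful used here, which is what makes throughput-oriented workflow management the central engineering concern.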

Date and Time

Wednesday, December 10, 11:30 a.m. to 12:00 p.m.

Room Number

208
