e-Science 2008 4th IEEE International Conference on e-Science

Main Conference Sessions

Provenance in Dynamically Adjusted and Partitioned Workflows

Authors

  • Daniel Goodman, Oxford University

Abstract

In this paper we describe the provenance system built into the distributed Martlet middleware. Scientific reproducibility, and the need to determine exactly what has happened in any given piece of analysis, require this middleware to record detailed, structured provenance data in an easily queryable form. This is achieved through the use of integer clocks and directed graphs. Using these, the system can keep a complete history of the creation of all data, and can store in-depth, task-defined information about the operations performed. This allows the system to continue gathering provenance data regardless of how coarse-grained the functions wrapped by the middleware are.

The middleware was developed to support functions described in Martlet, a workflow language created to address the problem of analyzing the data generated by the ClimatePrediction.net experiment. This data is highly distributed and resides in a dynamic environment where the partitioning of data structures across the distributed nodes may change both in the number of pieces and in their locations, and where resources may come and go. The structure of the workflows must therefore change from execution to execution, and the provenance system must handle this dynamic environment as well.
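
The abstract names integer clocks and directed graphs but does not give the paper's actual design. As a rough illustration only, the Python sketch below shows one way those two ideas can combine: each recorded operation takes the next integer tick as its identifier, and edges point from a data item back to the items it was derived from. The class `ProvenanceLog`, its methods, and the example operation names are all hypothetical and are not taken from Martlet.

```python
import itertools


class ProvenanceLog:
    """Hypothetical sketch: provenance as a directed graph keyed by integer ticks."""

    def __init__(self):
        self._clock = itertools.count(1)  # integer clock, one tick per record
        self.nodes = {}  # tick -> description of the data item
        self.edges = {}  # tick -> ticks of the inputs it was derived from

    def record(self, name, op, inputs=(), details=None):
        """Record the creation of a data item and return its tick."""
        tick = next(self._clock)
        self.nodes[tick] = {"name": name, "op": op, "details": details or {}}
        self.edges[tick] = list(inputs)
        return tick

    def history(self, tick):
        """Return the ticks of every ancestor of `tick`, oldest first."""
        seen = set()
        stack = [tick]
        while stack:
            current = stack.pop()
            for parent in self.edges[current]:
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return sorted(seen)


# Assumed example: a mean computed over two partitions of a data structure.
log = ProvenanceLog()
part_a = log.record("partition_a", "load")
part_b = log.record("partition_b", "load")
mean_a = log.record("mean_a", "partial_mean", [part_a], {"rows": 120})
mean_b = log.record("mean_b", "partial_mean", [part_b], {"rows": 80})
result = log.record("mean", "combine", [mean_a, mean_b])

print(log.history(result))
```

Because the graph is built as operations are recorded, a run over three partitions instead of two would simply produce a differently shaped graph, which is the kind of execution-to-execution variation the abstract describes.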

Date and Time

Friday, December 12, 11 a.m. to 11:30 a.m.

Room Number

206
