SAGA
Abstract
The Simple API for Grid Applications (SAGA) is an OGF standard (http://www.ogf.org) that defines a high-level, application-oriented API for developing first-principle distributed applications. Our SAGA implementation (in C++ and Python, see http://saga.cct.lsu.edu/) can interface to a variety of middleware backends. We also develop application frameworks based on SAGA, such as Master-Worker, MapReduce, AllPairs, and BigJobs.

For all of these components, we intend to use FutureGrid (FG) and the different software environments available on it for extensive portability and interoperability testing, as well as for scale-up and scale-out experiments. The proposed activities will allow us to harden the SAGA components described above.
Intellectual Merit
The Simple API for Grid Applications (SAGA) is an OGF standard (http://www.ogf.org) that defines a high-level, application-oriented API for developing first-principle distributed applications. Our SAGA implementation (in C++ and Python, see http://saga.cct.lsu.edu/) can interface to a variety of middleware backends. We also develop application frameworks based on SAGA, such as Master-Worker, MapReduce, AllPairs, and BigJobs.

For all of these components, we intend to use FutureGrid (FG) and the different software environments available on it for extensive portability and interoperability testing, as well as for scale-up and scale-out experiments. The proposed activities will allow us to harden the SAGA components described above.
Broader Impact
The Simple API for Grid Applications (SAGA) is an OGF standard (http://www.ogf.org) that defines a high-level, application-oriented API for developing first-principle distributed applications. Our SAGA implementation (in C++ and Python, see http://saga.cct.lsu.edu/) can interface to a variety of middleware backends. We also develop application frameworks based on SAGA, such as Master-Worker, MapReduce, AllPairs, and BigJobs.

For all of these components, we intend to use FutureGrid (FG) and the different software environments available on it for extensive portability and interoperability testing, as well as for scale-up and scale-out experiments. The proposed activities will allow us to harden the SAGA components described above.
Use of FutureGrid
- development, porting and testing of SAGA (see web page)
- scale-out and scale-up experiments of SAGA-based applications and frameworks
- interoperability testing for distributed SAGA-based applications (see also http://forge.gridforum.org/sf/projects/gin)
Scale Of Use
In general we have very low scale requirements, but we would like to be able to test scale-up and scale-out occasionally, for very short periods of time (hours to a few days).
Publications
- [fg-1979] Kim, J., S. Maddineni, and S. Jha, "Characterizing Deep Sequencing Analytics using BFAST: Towards a Scalable Distributed Architecture for Next-Generation Sequencing Data", 06/2011
- [fg-1978] Kim, J., S. Maddineni, and S. Jha, "Building Gateways for Life-Science Applications using the Dynamic Application Runtime Environment (DARE) Framework"
- [fg-1977] Luckow, A., and S. Jha, "Abstractions for Loosely-Coupled and Ensemble-Based Simulations on Azure"
- [fg-1976] Luckow, A., L. Lacinski, and S. Jha, "SAGA BigJob: An Extensible and Interoperable Pilot-Job Abstraction for Distributed Applications and Systems"
- [fg-1975] Sehgal, S., M. Erdelyi, A. Merzky, and S. Jha, "Understanding application-level interoperability: Scaling-out MapReduce over high-performance grids and clouds"
Results
Abstract:
Advances in many areas of science and scientific computing are predicated on rapid progress in fundamental computer science and cyberinfrastructure, as well as on their successful uptake by computational scientists. The scope, scale and variety of distributed computing infrastructures (DCIs) currently available to scientists and CS researchers is both an opportunity and a challenge. DCIs present an opportunity because they can support a vast range and number of science requirements and usage modes. The design and implementation of DCIs itself presents a formidable intellectual challenge, not least because of the difficulty of providing interoperable tools and applications given the heterogeneity and diversity of DCIs.
Interoperability, whether standards-based or otherwise, is a necessary (though not sufficient) system and application feature for the effective use of DCIs and their scalability. This project report presents a selection of results from Project No. 42 (SAGA), which makes use of FutureGrid to develop software components and runtime frameworks and to test and verify their usage, as well as initial efforts to incorporate these strands into the graduate curriculum. Specifically, we discuss our work on P*, a model for pilot abstractions, and related implementations that demonstrate (amongst other things) interoperability between different pilot-job frameworks. In addition to the practical benefits of an interoperable and extensible pilot-job framework, P* provides a fundamental shift in the state of the art of tool development: for the first time that we are aware of, there now exists a theoretical and conceptual basis upon which to build the tools and runtime systems. We also discuss standards-based approaches to software interoperability and the related development challenges, including SAGA as a standards-based generic access layer to DCIs. Finally, we conclude by describing how these strands have been brought together in a graduate classroom.
The full version of the report is available here.
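To make the pilot abstraction mentioned in the abstract above more concrete, the following is a minimal sketch of the general pilot-job idea: placeholder jobs ("pilots") hold resource slots, and an application-level scheduler binds individual tasks to whichever pilot is least loaded. It is not the P* model itself, nor the SAGA or BigJob API. The names `Pilot`, `PilotManager`, `add_pilot`, `submit` and the resource labels are hypothetical, and a local thread pool stands in for the cores a real pilot would acquire on a remote resource.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, List


class Pilot:
    """Hypothetical placeholder job that holds `slots` workers on one resource."""

    def __init__(self, resource: str, slots: int) -> None:
        self.resource = resource
        self.slots = slots
        self.assigned = 0  # number of tasks bound to this pilot so far
        # Assumption: a local thread pool stands in for remotely acquired cores.
        self._pool = ThreadPoolExecutor(max_workers=slots)

    def run(self, func: Callable, *args) -> Future:
        """Execute one task inside the resources held by this pilot."""
        self.assigned += 1
        return self._pool.submit(func, *args)

    def shutdown(self) -> None:
        """Wait for outstanding tasks and release the pilot's resources."""
        self._pool.shutdown(wait=True)


class PilotManager:
    """Hypothetical application-level scheduler that binds tasks to pilots."""

    def __init__(self) -> None:
        self.pilots: List[Pilot] = []

    def add_pilot(self, resource: str, slots: int) -> Pilot:
        pilot = Pilot(resource, slots)
        self.pilots.append(pilot)
        return pilot

    def submit(self, func: Callable, *args) -> Future:
        if not self.pilots:
            raise RuntimeError("no pilots available")
        # Choose the pilot with the fewest tasks assigned per slot;
        # ties go to the pilot that was added first.
        pilot = min(self.pilots, key=lambda p: p.assigned / p.slots)
        return pilot.run(func, *args)

    def shutdown(self) -> None:
        for pilot in self.pilots:
            pilot.shutdown()


def square(x: int) -> int:
    """Stand-in for a real compute task."""
    return x * x


manager = PilotManager()
manager.add_pilot("resource-a", slots=2)  # hypothetical resource labels
manager.add_pilot("resource-b", slots=4)

futures = [manager.submit(square, n) for n in range(12)]
results = [f.result() for f in futures]
manager.shutdown()

# Tasks bound to each pilot, keyed by resource label.
load = {p.resource: p.assigned for p in manager.pilots}
print(results)
print(load)
```

The application only talks to the manager, so pilots on different backends could in principle be swapped in without changing the task-submission code. This sketch simplifies in one important way: it binds each task at submission time using a static count, whereas a real pilot-job framework would typically bind tasks as slots actually become free.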