Grid Computing: Making the Global Infrastructure a Reality
- This is a set of abstracts of the collection of articles "Grid
Computing: Making the Global Infrastructure a Reality", edited by Fran
Berman, Geoffrey Fox and Tony Hey. The book (over 1000 pages) was published
in March 2003 by Wiley.
- Overview
Chapters are listed below in the order received; see the Overview for the
order in which they appear in the book.
- C590: The Semantic Grid: A Future
e-Science Infrastructure
- Abstract: e-Science offers a promising vision of how
computer and communication technology can support and enhance the scientific
process. It does this by enabling scientists to generate, analyse, share and
discuss their insights, experiments and results in an effective manner. The
underlying computer infrastructure that provides these facilities is commonly
referred to as the Grid. At this time, there are a number of grid applications
being developed and there is a whole raft of computer technologies that provide
fragments of the necessary functionality. However, there is currently a major
gap between these endeavours and the vision of e-Science in which there is a
high degree of easy-to-use and seamless automation and in which there are
flexible collaborations and computations on a global scale. To bridge this
practice-aspiration divide, this paper presents a research agenda whose aim is
to move from the current state of the art in e-Science infrastructure, to the
future infrastructure that is needed to support the full richness of the
e-Science vision. Here the future e-Science research infrastructure is termed
the Semantic Grid (Semantic Grid to Grid is meant to connote a similar
relationship to the one that exists between the Semantic Web and the Web). In
particular, we present a conceptual architecture for the Semantic Grid. This
architecture adopts a service-oriented perspective in which distinct
stakeholders in the scientific process, represented as software agents, provide
services to one another, under various service level agreements, in various
forms of marketplace. We then focus predominantly on the issues concerned with
the way that knowledge is acquired and used in such environments since we
believe this is the key differentiator between current grid endeavours and
those envisioned for the Semantic Grid.
- David De Roure, Nicholas Jennings and Nigel Shadbolt
- Department of Electronics and Computer Science, University of
Southampton, Southampton SO17 1BJ, UK
- dder@ecs.soton.ac.uk
- Received January 23 2002; Comments to Authors May 29 2002;
Accepted July 8 2002
- C598: Implementing Production Grids
- Abstract: Starting from Section 2, "The Grid Context," we
lay out our view of a Grid architecture, and this definition provides a
structure for the subsequent detailed description. In particular we identify
what it is that differentiates a Grid from other structures for distributed
computing, e.g. hierarchical clusters. The question of what is a minimum set of
Grid services - the Grid Common Services, the neck of the hourglass model - and
what needs to be added to make the Grid usable for particular communities is
stated. Issues of interoperability and heterogeneity are addressed, and these
are perhaps the most important distinguishing features of a Grid.
Section 3, "The Anticipated Grid Usage Model Will Determine
What Gets Deployed, and When," addresses the question of Grid building from the
points of view of various types of Grid usage. This is an important
point because differing usage patterns require different middleware, which is
why the distinction of a minimal common set of Grid services and tools is so
important. The underlying case studies have a supercomputing background, and so
attention is given to the problems of coupling and synchronicity of resources
that are not required in other sorts of Grids, e.g. Data Grids and Grids based
on the SETI@home concept (e.g. Entropia). This is why interoperability is so
important: different usages of Grids will result in different middleware,
scheduling strategies and tools for collaboration. The work of the Global Grid
Forum is vital in ensuring that standards are defined so that these can
interoperate. Nobody is going to be able to produce a commercial product or a
Grid-in-a-box that can address the requirements of all Grid usage patterns;
indeed, much of the strength of the Grid concept is that it clearly recognizes
this. The Globus team, whose software is the basis for the Grid building work described
here, understood this very well and have produced a toolkit of sufficient
flexibility and robustness to allow building of many different types of Grid.
Section 3 also analyses different data usage patterns in Data Grids, and this
highlights the realization in Grid computing that the distribution of data is
even more important than the distribution of computing resources, since the
curation and storage of data is becoming a key issue in tera- and peta- scale
computing. The importance of workflow management has also come to the fore. The
integration of message passing with the Grid is discussed primarily in the
context of MPICH-G2, which provides access to both highly optimized vendor
supplied MPI for intra-machine communication and socket based communication for
the inter-machine communication. It is important that Globus, as core essential
middleware, can interoperate with the best tools from anywhere in the world and
a few examples of this are given.
Section 4, "Grid Support for Collaboration," describes how
the Grid Common Services promote collaboration via the mechanisms for enabling
secure resource sharing in Virtual Organizations. The Access Grid has an
important role in enabling the human side of such collaboration, and in the
building of trust and working relationships in a VO, and is mentioned as an
aside.
Sections 5, "Building an Initial Multi-site, Computational
and Data Grid," and 6, "Cross-Site Trust Management," provide an account of the
detail of Grid building. The interaction of the sociology and working practices
of the administrators and users of a Grid is integrated with the technical
details of Grid deployment and certificate management. Some detail is provided
on the building of an identity Certification Authority and the issues of
interoperability that are raised here.
Section 7, "Transition to a Prototype-Production Grid,"
fills in the essential steps necessary for Grid building. Section 7.3, "The Model
for the Grid Information System," describes the issue of Grid Information
Service mechanisms. The strengths of the Globus model for GIS, which has been
built on top of extensive practical experimentation, are set out. The tools
described here give the ability to build functional and large scale Grids for
particular communities. Whether tools such as X.500 naming will enable the very
complex Grids which can cross national borders and multiple administrative
systems and lead to genuinely Global Grids, is not yet clear, but with the
tools described here many very useful Grids can be built. Section 7.4, "Local
Authorization," provides an account of the features of Globus mapfiles. Section
7.5, "Site Security Issues," highlights a serious issue with Globus and
firewalls, namely the necessity to keep a range of ports open. Sections 7.6 -
7.9 give advice on moving towards getting real users onto the Grid, including
issues such as high performance networking and batch schedulers. Section 7.11,
"Data Management and Your Grid Service Model," provides insights for cases where large
scale data management is an important issue (likely to be the majority of
Grids). Section 7.10, "Grid Systems Administration Tools," discusses the modest
progress made so far in Grid administration. Section 7.12, "Take Good Care of the Users as
Early as Possible," describes some of the things that can be done to ease the
transition of users to a Grid environment. This includes some detail on a proxy
certificate management service (MyProxy), since experience has shown that
certificate handling is one of the big barriers to consumer acceptance of
Grids. MyProxy also allows the flexibility of the Globus proxy delegation model
to be exploited via advanced programming models and problem solving portals.
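The grid-mapfile mentioned in Section 7.4 is, at its core, a plain-text table mapping certificate distinguished names to local accounts. A minimal sketch of parsing such a file follows; the entries are hypothetical, and real deployments rely on the Globus tooling rather than hand-rolled parsers:

```python
import shlex

def parse_gridmap(text):
    """Parse grid-mapfile-style lines of the form: "DN" localuser."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip comments and blank lines
        parts = shlex.split(line)  # shlex honours the quotes around the DN
        if len(parts) >= 2:
            mapping[parts[0]] = parts[1]
    return mapping

sample = '''
# hypothetical entries
"/O=Grid/OU=Example/CN=Alice Scientist" alice
"/O=Grid/OU=Example/CN=Bob Operator" bob
'''
gridmap = parse_gridmap(sample)
print(gridmap["/O=Grid/OU=Example/CN=Alice Scientist"])  # alice
```

The quoting matters: DNs routinely contain spaces, which is why a naive `split()` would not do.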
Section 8 provides some concluding remarks, and section 9
attempts to acknowledge the many people who have helped to make IPG and the DOE
Science Grid successes.
Section 10 is an annotated bibliography that is intended to
provide pointers to a lot of additional information, and to acknowledge that
there is a lot of other work going on in Grids that is only mentioned in
passing in this article.
- William E. Johnston
- The NASA IPG Engineering Team, and The DOE Science Grid Team
- wejohnston@lbl.gov
- Received June 2 2002; Comments to Author June 24 2002; Accepted
July 8 2002
- C599: Grids and the Virtual
Observatory
- Abstract: We consider several projects from astronomy that
benefit from the Grid paradigm and associated technology, many of which involve
either massive datasets or the federation of multiple datasets. We cover image
computation (mosaicking, multi-wavelength images, and synoptic surveys);
database computation (representation through XML, data mining, and
visualization); and semantic interoperability (publishing, ontologies,
directories, and service descriptions).
- Roy Williams
- Caltech Center for Advanced Computing Research
- roy@cacr.caltech.edu
- Received June 5 2002; Comments to Author June 24 2002; Accepted
July 10 2002
- C600: The Open Grid Service Architecture
and Data Grids
- Abstract: Data Grids address the data intensive aspects of
Grid computing, and therefore impose a very specific set of requirements on
Grid Services. In this article, the Data Grid problem is revisited with emphasis
on Data Management. The article investigates how Data Grid Services would need to be
deployed within the Open Grid Services Architecture to fulfill the vision of
the Data Grid.
- Peter Z. Kunszt and Leanne P. Guy
- IT Division - Database Group, CERN, 1211 Geneva Switzerland
- Peter.Kunszt@cern.ch
- Received June 10 2002; Comments to Author June 24 2002; Accepted
July 9 2002
- C601: Peer-to-Peer Grid Databases for
Web Service Discovery
- Abstract: Grids are collaborative distributed Internet
systems characterized by large scale, heterogeneity, lack of central control,
multiple autonomous administrative domains, unreliable components and frequent
dynamic change. In such systems, it is desirable to maintain and query dynamic
and timely information about active participants such as services, resources
and user communities. The web services vision promises that programs are made
more flexible, adaptive and powerful by querying Internet databases
(registries) at runtime in order to discover information and network attached
building blocks, enabling the assembly of distributed higher-level components.
In support of this vision, we introduce the Web Service Discovery Architecture
(WSDA), which subsumes an array of disparate concepts, interfaces and protocols
under a single semi-transparent umbrella. WSDA specifies a small set of
orthogonal multi-purpose communication primitives (building blocks) for
discovery, covering service identification, service description retrieval, data
publication as well as minimal and powerful query support. The individual
primitives can be combined and plugged together by specific clients and
services to yield a wide range of behaviors and emerging synergies. Based on
WSDA, we introduce the hyper registry, which is a centralized database node for
discovery of dynamic distributed content. It supports XQueries over a tuple set
from a dynamic XML data model. We address the problem of maintaining dynamic
and timely information populated from a large variety of unreliable, frequently
changing, autonomous and heterogeneous remote data sources. However, in a large
cross-organizational system, the set of information tuples is partitioned over
many such distributed nodes, for reasons including autonomy, scalability,
availability, performance and security. This suggests the use of Peer-to-Peer
(P2P) query technology. Consequently, we propose the WSDA based Unified
Peer-to-Peer Database Framework (UPDF) and its corresponding Peer Database
Protocol (PDP). They are unified in the sense that they allow one to express
specific discovery applications for a wide range of data types, node topologies
(e.g. ring, tree, graph), query languages (e.g. XQuery, SQL), query response
modes (e.g. Routed, Direct and Referral Response), neighbor selection policies,
pipelining, timeout and scope policies. We describe the first steps towards the
convergence of Grid Computing, Peer-to-Peer Computing, Distributed Databases
and Web Services. The uniformity and wide applicability of our approach
distinguish it from related work, which (1) addresses some but not all problems
and (2) does not propose a unified framework.
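The hyper registry's query support can be illustrated in miniature. Python's standard library has no XQuery engine, so this sketch stands in for it with an ElementTree traversal over a toy tuple set; the schema and URLs here are invented for illustration, not WSDA's actual data model:

```python
import xml.etree.ElementTree as ET

# A toy registry of service-description tuples (schema is hypothetical).
registry_xml = """
<registry>
  <tuple link="http://example.org/replica-catalog" type="service">
    <content><service name="replica-catalog" protocol="http"/></content>
  </tuple>
  <tuple link="http://example.org/job-scheduler" type="service">
    <content><service name="job-scheduler" protocol="https"/></content>
  </tuple>
</registry>
"""

root = ET.fromstring(registry_xml)
# Stand-in for an XQuery: the links of all tuples whose service speaks https
links = [t.get("link") for t in root.findall("tuple")
         if t.find("content/service").get("protocol") == "https"]
print(links)  # ['http://example.org/job-scheduler']
```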
- Wolfgang Hoschek
- CERN IT Division, European Organization for Nuclear Research,
1211 Geneva 23, Switzerland
- wolfgang.hoschek@cern.ch
- Received June 10 2002; Comments to Author June 28 2002; Accepted
July 9 2002
- C602: Unicore and the Open Grid Services
Architecture
- Abstract: This paper describes the design and
implementation of a GridService demonstrator, built around the Unicore grid
environment. Based on the experience gained in this process, the paper
discusses lessons learned about the Open Grid Services Architecture and the
Grid Service Specification. It identifies several weaknesses and redundant
components in the current draft of the Grid Service Specification and makes
recommendations as to its further development. Finally, we provide some
indication of the directions likely to be taken in further developing the
Unicore infrastructure in light of the Open Grid Services Architecture.
- David Snelling
- Fujitsu Laboratories of Europe, Hayes Park, Central Hayes End
Road, Hayes, Middlesex UB4 8FE
- d.snelling@fle.fujitsu.com
- Received June 10 2002; Comments to Author June 26 2002; Accepted
July 9 2002
- C603: The Physiology of the Grid
- Abstract: In both e-business and e-science, we often need
to integrate services across distributed, heterogeneous, dynamic "virtual
organizations" formed from the disparate resources within a single enterprise
and/or from external resource sharing and service provider relationships. This
integration can be technically challenging because of the need to achieve
various qualities of service when running on top of different native platforms.
We present an Open Grid Services Architecture that addresses these challenges.
Building on concepts and technologies from the Grid and Web services
communities, this architecture defines a uniform exposed service semantics (the
Grid service); defines standard mechanisms for creating, naming, and
discovering transient Grid service instances; provides location transparency
and multiple protocol bindings for service instances; and supports integration
with underlying native platform facilities. The Open Grid Services Architecture
also defines, in terms of Web Services Description Language (WSDL) interfaces
and associated conventions, mechanisms required for creating and composing
sophisticated distributed systems, including lifetime management, change
management, and notification. Service bindings can support reliable invocation,
authentication, authorization, and delegation, if required. Our presentation
complements an earlier foundational article, "The Anatomy of the Grid," by
describing how Grid mechanisms can implement a service-oriented architecture,
explaining how Grid functionality can be incorporated into a Web services
framework, and illustrating how our architecture can be applied within
commercial computing as a basis for distributed system integration, within and
across organizational domains.
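The factory pattern with naming, discovery, and lifetime management that the abstract describes can be caricatured in a few lines. This is a local toy, not the OGSA interfaces themselves, and the handle format is invented:

```python
import itertools
import time

class GridServiceFactory:
    """Toy sketch of OGSA-style factory semantics: a client asks the
    factory for a transient service instance with a requested lifetime,
    receives a unique handle (naming), and the registry forgets
    instances whose lifetime has expired (lifetime management)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._instances = {}  # handle -> expiry timestamp

    def create_service(self, lifetime_seconds):
        handle = f"urn:instance-{next(self._ids)}"       # naming
        self._instances[handle] = time.time() + lifetime_seconds
        return handle

    def find_service(self, handle):                      # discovery
        expiry = self._instances.get(handle)
        if expiry is None or expiry < time.time():
            self._instances.pop(handle, None)            # expired: forget it
            return None
        return handle

factory = GridServiceFactory()
h = factory.create_service(lifetime_seconds=60)
print(factory.find_service(h) == h)   # True: instance still within lifetime
```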
- Ian Foster, Carl Kesselman, Jeffrey M. Nick, Steven Tuecke
- Mathematics and Computer Science Division, Argonne National
Laboratory, Argonne, IL 60439; Department of Computer Science, University of
Chicago, Chicago, IL 60637; Information Sciences Institute, University of
Southern California, Marina del Rey, CA 90292; IBM Corporation, Poughkeepsie,
NY 12601
- foster@mcs.anl.gov
- Received June 14 2002
- C604: Databases and The Grid
- Abstract: This paper examines how databases can be
integrated into the Grid. Almost all early Grid applications are file-based,
and so, to date, there has been relatively little effort applied to integrating
databases into the Grid. However, if the Grid is to support a wider range of
applications, both scientific and otherwise, then database integration into the
Grid will become important. Therefore, this paper investigates the requirements
of Grid-enabled databases and considers how these requirements are met by
existing Grid middleware. This shows that support is very limited. The paper
therefore goes on to propose a service-based architecture, and identifies the
key service functionalities needed to meet the requirements. In this
architecture, database systems are wrapped within a Grid-enabled service
interface that simplifies the task of building applications that access their
contents. The ability to federate data from multiple databases is likely to be
a very powerful facility for Grid users wishing to collate and analyse
information distributed over the Grid. The paper describes how the
service-based framework assists in federating database servers over the Grid by
supporting the creation of virtual databases that present the same service
interface to applications as do the individual, unfederated, database
systems.
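A rough sketch of the virtual-database idea, assuming a trivially simple fan-out-and-collate strategy over two in-memory SQLite databases (the real architecture wraps full database services behind Grid-enabled interfaces, not `sqlite3` connections):

```python
import sqlite3

# Two independent "database services" (in-memory for the sketch)
def make_db(rows):
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE observations (star TEXT, magnitude REAL)")
    db.executemany("INSERT INTO observations VALUES (?, ?)", rows)
    return db

site_a = make_db([("Vega", 0.03)])
site_b = make_db([("Sirius", -1.46)])

class VirtualDatabase:
    """Presents the same query interface as a single database while
    fanning the query out to every federated member and collating rows."""
    def __init__(self, members):
        self.members = members
    def query(self, sql):
        rows = []
        for db in self.members:
            rows.extend(db.execute(sql).fetchall())
        return rows

vdb = VirtualDatabase([site_a, site_b])
print(sorted(vdb.query("SELECT star, magnitude FROM observations")))
# [('Sirius', -1.46), ('Vega', 0.03)]
```

The point is that the application issues one query against the virtual database exactly as it would against any unfederated member.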
- Paul Watson
- Department of Computing Science, University of Newcastle,
Newcastle-upon-Tyne, UK
- Paul.Watson@newcastle.ac.uk
- Received June 14 2002; Comments to Author June 24 2002; Accepted
July 9 2002
- C606: The New Biology and the Grid
- Abstract: Biological and biochemical science has entered a
new era defined by large multi-scale (from atom to organism to populations)
efforts conducted by large teams with fast-evolving biotechnology. A product of
this change is a vast amount of data, doubling every 6 months or
so, which needs to be processed and turned into information and, hopefully,
knowledge. Traditional depth computing - large jobs on a relatively small
number of processors in a large shared-memory environment, where the tasks are
tightly coupled and operate on a relatively small amount of data - is being
accompanied by breadth computing - relatively short calculations performed
identically and independently on a very large number of data points. The grid
is ideally suited for breadth computing and this chapter highlights several
applications that are either using or will shortly benefit from the grid. The
first compares the 3D structures of all known proteins in an effort to
characterize protein fold space and illustrates the importance of attention to
data organization and data flow. The second, a genomic pipeline, the
Encyclopedia of Life, which provides putative annotation and 3D models for an
estimated 10^7 proteins encoded in over 800 genomes, illustrates the sheer volume
of processing that is needed. The third, Chemport, illustrates a fully
integrated system where the grid is part of an extended workflow infrastructure
supporting not just high performance computation, but data access,
visualization and productivity tools. The chapter concludes with an assessment
of what it will take to realize the promise of the grid for this "new biology."
It turns out that the issues have more to do with establishing synergy between
biologists and grid experts than with limitations in the technology.
- Kim Baldridge and Philip E. Bourne
- San Diego Supercomputer Center and University of California San
Diego
- bourne@sdsc.edu
- Received June 21 2002; Accepted July 8 2002
- C607: Virtualization Services for Data
Grids
- Abstract: Data Grids provide a set of virtualization
services to enable management and integration of digital entities that are
distributed across multiple sites and storage systems. Virtualization services
include logical name spaces for assigning global, persistent identifiers, and
persistency mechanisms to manage technology obsolescence. Since digital
entities can be represented as combinations of data, information, and
knowledge, the virtualization services provide levels of abstraction for
characterizing operations on data repositories (storage systems), information
repositories (databases), and knowledge repositories. This chapter provides a
survey of concepts that are used for digital entity management and integration
in Data Grids (Section 2). The state of the art in data grid technology is
discussed, including the design of a persistent archive infrastructure, based
upon the convergence of approaches across several different extant Data Grids
(Section 3). Approaches to information integration are also described based on
data warehousing, database integration, and semantic-based data mediation
(Section 4). We conclude in Section 5 with a statement of future research
challenges.
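The logical-name-space idea can be illustrated with a minimal mapping from persistent global identifiers to physical replicas; the `lns://` and `gsiftp://` names below are illustrative, and the resolution policy (first registered replica wins) is deliberately naive:

```python
class LogicalNameSpace:
    """Sketch of a Data Grid virtualization service: a global, persistent
    logical identifier maps onto one or more physical replicas, so
    applications name data independently of where it is stored. When a
    storage technology is retired, replicas are re-registered elsewhere
    and the logical name survives."""
    def __init__(self):
        self._replicas = {}   # logical name -> list of physical locations

    def register(self, logical_name, physical_location):
        self._replicas.setdefault(logical_name, []).append(physical_location)

    def resolve(self, logical_name):
        locations = self._replicas.get(logical_name)
        if not locations:
            raise KeyError(f"no replica registered for {logical_name}")
        return locations[0]   # trivial policy: first registered replica

ns = LogicalNameSpace()
ns.register("lns://survey/image-001", "gsiftp://site-a.example.org/data/img001")
ns.register("lns://survey/image-001", "gsiftp://site-b.example.org/cache/img001")
print(ns.resolve("lns://survey/image-001"))
```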
- Reagan W. Moore and Chaitan Baru
- San Diego Supercomputer Center and University of California, San
Diego
- moore@sdsc.edu
- Received June 21 2002; Accepted July 8 2002
- C611: Parameter Sweeps on the Grid with
APST
- Abstract: Parameter sweep applications consist of large
sets of independent tasks and arise in many fields of science and engineering.
Due to their flexible task synchronization requirements, these applications are
ideally suited to large-scale distributed platforms such as the Computational
Grid. However, for users to readily benefit from such platforms, it is
necessary to provide transparent application deployment and automatic
application scheduling. We present here version 2.0 of the AppLeS Parameter
Sweep Template (APST) software, an application execution environment which
schedules and deploys large-scale parameter sweep applications on Grid
platforms. We describe the main principles behind the design of APST, its
current implementation, and the services and mechanisms it can leverage to
deploy applications. We illustrate APST's usability and explain how its
XML-based interface allows users to easily direct Grid resources to run their
applications. Finally, we briefly discuss applications that are currently using
APST, and we highlight future developments.
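A hedged sketch of what an XML-driven parameter-sweep expansion looks like; the element names here are invented for illustration and do not follow APST's actual schema:

```python
import xml.etree.ElementTree as ET

# Hypothetical parameter-sweep description in the spirit of APST's
# XML interface (element names are illustrative, not APST's).
sweep_xml = """
<sweep executable="simulate">
  <parameter name="temperature" values="280 300 320"/>
  <parameter name="pressure" values="1.0 2.0"/>
</sweep>
"""

def expand_tasks(xml_text):
    """Expand the cross-product of parameter values into the set of
    independent tasks a scheduler would then place on Grid resources."""
    root = ET.fromstring(xml_text)
    tasks = [{}]
    for param in root.findall("parameter"):
        name = param.get("name")
        values = param.get("values").split()
        tasks = [dict(t, **{name: v}) for t in tasks for v in values]
    return tasks

tasks = expand_tasks(sweep_xml)
print(len(tasks))  # 6 independent tasks (3 temperatures x 2 pressures)
```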
- Henri Casanova, Fran Berman
- San Diego Supercomputer Center, University of California, San
Diego, 9500 Gilman Dr., La Jolla, CA 92093-0505; Computer Science and
Engineering Department, University of California, San Diego, 9500 Gilman Dr.,
La Jolla, CA 92093-0114
- casanova@cs.ucsd.edu
- Received June 28 2002; Accepted July 7 2002
- C612: NetSolve: Past, Present, and
Future; A Look at a Grid Enabled Server
- Abstract: NetSolve is grid middleware based on
client-server-agent technology that enables users to solve complex scientific
problems remotely. The system allows users to access both hardware and software
computational resources distributed across the Grid. NetSolve searches for
computational resources on the Grid, chooses the best one available, solves
the problem (using retry for fault tolerance), and returns the answer to the user.
NetSolve binds grid systems and problem-solving environments, while allowing
the flexibility for a user to write their own front-end or embed a call to
NetSolve using a C or Fortran language client API. This paper examines the
current system, some of the applications using NetSolve, and future directions
for the project.
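The select-best-then-retry behaviour described above can be sketched as follows; the ranking scores and solver functions are stand-ins, not NetSolve's API:

```python
def solve_with_retry(resources, problem, max_attempts=3):
    """Sketch of the agent behaviour: rank the candidate resources, try
    the best first, and fall back (retry) on failure. 'resources' maps
    a resource name to a (score, solver_function) pair."""
    ranked = sorted(resources.items(), key=lambda kv: -kv[1][0])
    last_error = None
    for name, (_, solver) in ranked[:max_attempts]:
        try:
            return name, solver(problem)
        except RuntimeError as err:   # resource failed; try the next one
            last_error = err
    raise RuntimeError(f"all resources failed: {last_error}")

def flaky(problem):
    raise RuntimeError("node offline")

def steady(problem):
    return sum(problem)

resources = {"fast-but-flaky": (0.9, flaky), "slow-but-steady": (0.5, steady)}
name, answer = solve_with_retry(resources, [1, 2, 3])
print(name, answer)  # slow-but-steady 6
```

The user sees only the final answer; the fault tolerance is entirely the agent's concern, which is the point of the client-server-agent split.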
- Sudesh Agrawal, Jack Dongarra, Keith Seymour, and Sathish
Vadhiyar
- University of Tennessee
- dongarra@cs.utk.edu
- Received July 1 2002; Accepted July 7 2002
- C613: The Data Deluge: An e-Science
Perspective
- Abstract: This paper previews the imminent flood of
scientific data expected from the next generation of experiments, simulations,
sensors and satellites. In order to be exploited by search engines and data
mining software tools, such experimental data needs to be annotated with
relevant metadata giving information as to provenance, content, conditions and
so on. The paper argues the case for creating new types of digital libraries
for scientific data with the same sort of management services as conventional
digital libraries in addition to other data-specific services. Some likely
implications of the Open Archive Initiative and of e-Science data for the
future role for university libraries is then briefly discussed. A substantial
subset of this e-Science data needs to archived and curated for long-term
preservation. Some of the issues involved in the digital preservation of both
scientific data and of the programs needed to interpret the data are then
reviewed. Finally, the implications of this wealth of e-Science data for the
Grid middleware infrastructure are highlighted.
- Tony Hey and Anne Trefethen
- EPSRC, Polaris House, North Star Avenue, Swindon SN2 1ET, UK;
Department of Electronics and Computer Science, University of Southampton,
Southampton SO17 1BJ, UK
- Tony.Hey@epsrc.ac.uk
- Received July 2 2002; Accepted July 11 2002
- C615: Commodity Grid Kits - Middleware
for Building Grid Computing Environments
- Abstract: Recent Grid projects, such as the Globus Project,
provide a set of useful services such as authentication and remote access to
resources, and information services to discover and query such remote
resources. Unfortunately, these services may not be compatible with the
commodity technologies used for application development by the software
engineers and scientists. Instead, users may prefer accessing the Grid from a
higher level of abstraction than what such toolkits provide. To bridge this
gap, Commodity Grid (CoG) Kits provide the middleware for accessing the
functionality of the Grid from a variety of commodity technologies, frameworks,
and languages. It is important to recognize that these Commodity Grid Kits not
only provide an interface to existing Grid technologies, but also bring Grid
programming to a new level by leveraging the methodologies of the chosen
commodity technology, thus helping the development of the next generation of
Grid services. Based on these Commodity Grid Toolkits, a variety of higher
level Grid services are far easier to design, maintain, and deploy. Several
projects have successfully demonstrated the use of Commodity Grid Kits for the
design of advanced Grid Services and Grid Computing Environments.
- Gregor von Laszewski, Jarek Gawor, Sriram Krishnan and Keith
Jackson
- Argonne National Laboratory, 9700 S. Cass Ave., Argonne, IL
60439, U.S.A., Lawrence Berkeley National Laboratory, 1 Cyclotron Rd.,
Berkeley, CA 94720, U.S.A, Indiana University, 150 S. Woodlawn Ave.,
Bloomington, IN 47405, U.S.A.
- gregor@mcs.anl.gov
- Received July 9 2002; Accepted July 9 2002
- C616: The Evolution of the Grid
- Abstract: In this paper we describe the evolution of grid
systems, identifying three generations: first generation systems which were the
forerunners of the Grid as we recognise it today; second generation systems
with a focus on middleware to support large scale data and computation; and
third generation systems where the emphasis shifts to distributed global
collaboration, a service oriented approach and information layer issues. In
particular, we discuss the relationship between the Grid and the World Wide
Web, and suggest that evolving web technologies will provide the basis for the
next generation of the Grid. The latter aspect - which we define as the
Semantic Grid - is explored in a companion paper.
- David De Roure, Mark A. Baker, Nicholas R. Jennings and Nigel R.
Shadbolt
- Universities of Portsmouth and Southampton
- dder@ecs.soton.ac.uk
- Received July 8 2002; Accepted July 9 2002
- C617: NaradaBrokering: An Event Based
Infrastructure for Building Scaleable Durable Peer-to-Peer Grids
- Abstract: We propose an architecture for building a
scaleable durable P2P grid comprising resources such as relatively static
clients, high-end resources and a dynamic collection of multiple P2P
subsystems. Clients in such systems must be linked together in a flexible fault
tolerant, efficient, high-performance fashion. We investigate an architecture
comprising a distributed brokering system that will support such a hybrid
environment. In this paper, we study the event brokering system -
NaradaBrokering - that links clients
(both users and resources of course) together.
Keywords: Event distribution systems, middleware, P2P systems, grid computing,
durable messaging
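The publish/subscribe core of an event-brokering system can be sketched in a few lines; this toy omits the distribution, durability, and routing that NaradaBrokering actually provides:

```python
from collections import defaultdict

class EventBroker:
    """Minimal topic-based publish/subscribe sketch: clients (users and
    resources alike) register interest in topics, and published events
    are delivered to every matching subscriber."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        for callback in self._subscribers[topic]:
            callback(event)

broker = EventBroker()
received = []
broker.subscribe("jobs/completed", received.append)
broker.publish("jobs/completed", {"job": 42, "status": "ok"})
broker.publish("jobs/failed", {"job": 7})   # no subscriber; silently dropped
print(received)  # [{'job': 42, 'status': 'ok'}]
```

In the durable, peer-to-peer setting of the chapter, the broker itself is distributed and events are persisted for disconnected clients; none of that is attempted here.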
- Geoffrey Fox and Shrideep Pallickara
- PTLIU Labs for Community Grid Computing, Indiana University
- spallick@indiana.edu
- Received July 8 2002; Accepted July 9 2002
- C618: Grid Programming Models: Current
Tools, Issues and Directions
- Abstract: Grid programming must manage computing
environments that are inherently parallel, distributed, heterogeneous and
dynamic, both in terms of the resources involved and their performance.
Furthermore, grid applications will want to dynamically and flexibly compose
resources and services across these dynamic environments. While it may be
possible to build grid applications using established programming tools, they
are not particularly well-suited to effectively manage flexible composition or
deal with heterogeneous hierarchies of machines, data and networks with
heterogeneous performance. This chapter discusses issues, properties and
capabilities of grid programming models and tools to support efficient grid
programs and their effective development. The main issues are outlined and then
current programming paradigms and tools are surveyed, examining their
suitability for grid programming. Clearly no one tool will address all
requirements in all situations. However, paradigms and tools that can
incorporate and provide the widest possible support for grid programming will
come to dominate. Advanced programming support techniques are analyzed,
with discussion of possibilities for their effective implementation in grid
environments.
- Craig Lee and Domenico Talia
- Computer Systems Research Department, The Aerospace Corporation,
P.O. Box 92957, El Segundo, CA USA 87036; DEIS, Università della Calabria,
Rende (CS), Italy
- lee@aero.org
- Received July 5 2002; Accepted July 7 2002
- C619: Ninf-G: a GridRPC system on the
Globus Toolkit
- Abstract: We describe here a GridRPC programming system
implemented on top of the Globus Toolkit, called Ninf-G. GridRPC systems enable
easy implementation of parallel applications on the Grid by providing simple
RPC-style, task-parallel, and largely Grid-transparent programming interfaces,
and serve as middleware that glues together Grid applications and the
lower-level Grid substrates such as Globus. We overview the GridRPC model of
Grid programming, and its implementation with Ninf-G, including the client side
API and the server side IDL, as well as its typical use for Grid programming.
We perform a preliminary evaluation in both WAN and LAN environments,
demonstrating that the overhead of Ninf-G is reasonable after the first-time
overhead of GSI authentication is tolerated.
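The GridRPC call pattern (bind a handle to a named remote function, then invoke it synchronously or asynchronously) can be mimicked locally; the method names below echo the GridRPC style but are not Ninf-G's actual API, and the "remote" functions here run in local threads:

```python
from concurrent.futures import ThreadPoolExecutor

class GridRPCClient:
    """Local stand-in for the GridRPC programming model: a handle binds
    a function name to a routine, call() invokes it synchronously, and
    call_async() returns immediately with a session the caller waits on
    (the task-parallel style). The registry contents are illustrative."""
    def __init__(self, registry):
        self._registry = registry
        self._pool = ThreadPoolExecutor(max_workers=4)

    def function_handle(self, name):
        return self._registry[name]       # bind name -> routine

    def call(self, handle, *args):        # blocking RPC
        return handle(*args)

    def call_async(self, handle, *args):  # task-parallel RPC
        return self._pool.submit(handle, *args)

client = GridRPCClient({"triangle_number": lambda n: n * (n + 1) // 2})
h = client.function_handle("triangle_number")
sessions = [client.call_async(h, n) for n in (10, 100)]
print([s.result() for s in sessions])  # [55, 5050]
```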
- Hidemoto Nakada, Yoshio Tanaka, Satoshi Matsuoka, Satoshi
Sekiguchi
- National Institute of Advanced Industrial Science and
Technology, Grid Technology Research Center, Tsukuba Central 2, 1-1-1 Umezono,
Tsukuba, Ibaraki 305-8568, JAPAN; Tokyo Institute of Technology, Global
Scientific Information and Computing Center, 2-12-1 Ookayama, Meguro-ku, Tokyo,
152-8550, JAPAN
- hide-nakada@aist.go.jp
- Received July 9 2002; Accepted July 10 2002
- C620: Autonomic Computing and GRID
- Abstract: Modern servers face a serious design challenge.
Since a server tends to be a conglomeration of several distributed services,
the complexity of composing a server grows very rapidly. In addition, each
component in the system faces unpredictable variability in its inputs and
demands for quality of service. Unless a component can react quickly to such
variations, it will not be able to perform well. Furthermore, as systems grow
and become more and more sophisticated, one invariably ends up in a distributed
and heterogeneous environment. This makes it impractical to have a centralized
control that monitors input variability and adjusts resources.
To face these challenges, servers are being modularly
designed, with each component providing a standard interface. Components are
no longer expected to provide a rigid and deterministic functional behavior in
all aspects of their interactions with other components. They are expected to
efficiently deal with the varying behaviors of other components, including
faults. As centralized controls become infeasible, components must be prepared
to perceive changes and to negotiate the exchange of resources on a voluntary
basis. In summary, components must be self-governing, self-organizing,
self-stabilizing and self-healing in the surrounding world of unpredictable
components. The attribute "autonomic" is used to capture this design philosophy.
An essential design feature in modern servers is to make each component
autonomic. This is important in order to contain the system-level complexity.
While each component is designed to cope with the idiosyncrasies of other
interacting components, we must also ensure that the system as a whole
coordinates itself to achieve the system-level goals. While there are many
subsystems in the current servers that are self-organizing to various degrees,
it is necessary to formulate a common framework for every autonomic component
and to develop general principles for their stability.
Grid Computing attempts to provide a common platform in
which controlled interactions can take place in a distributed, heterogeneous
environment, and it shares many of these objectives with regard to interactions
with other components. Discovery of other services, negotiating terms for services
from other components, monitoring the service quality rendered and adjusting
the negotiations are some examples. The Grid platform attempts to standardize
the protocols needed for some of these transactions and may also provide some
tools to implement certain flavors of them.
In this paper, we give an operational definition of an
autonomic component of a server and identify the key features it must possess
to be effective in the emerging environment. We then examine a scenario in the
grid environment and argue that design of subsystems within the grid
environment must naturally face the concerns addressed by autonomic designs.
Hence, there is synergy between these two perspectives and one stands to gain
by blending them together.
- Pratap Pattnaik, Kattamuri Ekanadham and Joefon Jann
- T. J. Watson Research Center, Yorktown Heights, NY 10598
- pratap@us.ibm.com
- Received July 9 2002; Accepted July 10 2002
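The autonomic design philosophy described above can be illustrated with a minimal sketch: a self-governing component that monitors its own load and voluntarily grows or releases resources without any centralized control. The thresholds and the doubling/halving rule below are invented purely for illustration.

```python
# Hypothetical autonomic component: it observes its own input load and
# renegotiates (here simply resizes) its capacity on a voluntary basis.
class AutonomicComponent:
    def __init__(self, capacity=4):
        self.capacity = capacity

    def observe_and_adapt(self, load):
        # self-stabilizing rule: keep utilization between 50% and 90%
        if load > 0.9 * self.capacity:
            self.capacity *= 2          # grow to absorb the surge
        elif load < 0.5 * self.capacity and self.capacity > 1:
            self.capacity //= 2         # shrink, releasing resources
        return self.capacity

# Each observed load triggers a local, autonomous adjustment.
c = AutonomicComponent()
for load in [3, 5, 9, 2, 1]:
    c.observe_and_adapt(load)
print(c.capacity)   # 4
```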
- C621: Peer-to-Peer Grids
- Abstract:We describe Peer-to-Peer Grids built around the
integration of technologies from the peer-to-peer and Grid fields. We focus on
the role of Web services linked by a powerful event service using uniform XML
interfaces and application level routing. We describe how a rich synchronous
and asynchronous collaboration environment can support virtual communities
built on top of such infrastructure. Universal access mechanisms are
discussed.
- Geoffrey Fox, Dennis Gannon, Sung-Hoon Ko, Sangmi Lee, Shrideep
Pallickara, Marlon Pierce, Xiaohong Qiu, Xi Rao, Ahmet Uyar, Minjun Wang, Wenjun
Wu
- Pervasive Technology Laboratories, Indiana University
- gcf@indiana.edu
- Received July 7 2002; Accepted July 10 2002
- C622: Grid Web Services and Application
Factories
- Abstract:This paper describes an implementation of a Grid
Application Factory Service that is based on a component architecture that
utilizes the emerging Web Services standards. The factory service is used by
Grid clients to authenticate and authorize a user to configure and launch an
instance of a distributed application. This helps us solve the problem of
building reliable, scalable Grid applications, by separating the process of
deployment and hosting from application execution. The paper also describes how
these component-based applications can be made compatible with the Open Grid
Services Architecture (OGSA) and how OGSA concepts enhance the usability of the
component framework.
- Dennis Gannon, Rachana Ananthakrishnan, Sriram Krishnan,
Madhusudhan Govindaraju, Lavanya Ramakrishnan, Aleksander Slominski
- Department of Computer Science, Indiana University
- gannon@cs.indiana.edu
- Received 11 July 2002, Accepted July 11 2002
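The factory pattern the abstract describes, separating deployment and hosting from application execution, can be sketched as follows. The class names and the authorization check are hypothetical stand-ins for the Web-Services-based factory, not the actual OGSA interfaces.

```python
# Illustrative sketch: a factory service authenticates/authorizes a
# client, then configures and "launches" an application instance,
# returning a handle to it. Names are assumptions, not a real API.
class ApplicationInstance:
    def __init__(self, app_name, config):
        self.app_name, self.config = app_name, config
        self.running = True

class FactoryService:
    def __init__(self, authorized_users):
        self.authorized = set(authorized_users)

    def create_instance(self, user, app_name, config):
        if user not in self.authorized:     # authorize before launching
            raise PermissionError(f"{user} may not launch {app_name}")
        # deployment/hosting concerns live here, apart from execution
        return ApplicationInstance(app_name, config)

factory = FactoryService(authorized_users={"alice"})
inst = factory.create_instance("alice", "fluid-sim", {"nodes": 16})
print(inst.running)   # True
```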
- C623: The Grid Portal Development
Kit
- Abstract:Computational science portals are emerging as
useful and necessary interfaces for performing operations on the Grid. The Grid
Portal Development Kit (GPDK) facilitates the development of Grid portals and
provides several key reusable components for accessing various Grid services. A
Grid Portal provides a customizable interface allowing scientists to perform a
variety of Grid operations including remote program submission, file staging,
and querying of information services from a single, secure gateway. The Grid
Portal Development Kit leverages existing Globus/Grid middleware
infrastructure as well as commodity web technology including Java Server Pages
and servlets. The design and architecture of GPDK are presented, along with a
discussion of its portal-building capabilities, which allow application
developers to build customized portals more effectively by reusing the common
core services provided by GPDK.
- Jason Novotny
- Lawrence Berkeley National Laboratory
- Email: JDNovotny@lbl.gov
- Received 29 June 2001; Revised 30 November 2001; Accepted January
6 2002
- C625: Distributed object-based grid
computing environments
- Abstract:We review the basic architectures and services of
the Gateway and Mississippi Computational Web Portals. These portals are
designed to provide seamless access to remote software, hardware, and data
through the user's web browser. This is accomplished in both cases by
implementing an architecture that follows the classic three-tiered design, with
user environment and backend resources separated by a middle control layer that
is implemented as a set of distributed objects. These middle tier objects act
as service proxies that wrap the backend resources. In this paper we review
basic and advanced portal services, describe our use of application metadata,
and examine security requirements for three-tiered architectures. Finally, we
discuss future directions for portal development, considering the impact of Web
services and portlet technologies.
- Tomasz Haupt, Marlon E. Pierce
- Engineering Research Center, Mississippi State University,
Starkville, MS 39762; Community Grids Lab, Indiana University, Bloomington, IN
47404
- marpierc@indiana.edu
- Received 11 July 2002, Accepted July 11 2002
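The three-tiered design described above can be sketched minimally: the user tier never touches a backend resource directly, but goes through a middle-tier proxy object that wraps the resource and enforces policy. All names in this sketch are illustrative assumptions.

```python
# Tier 3: a backend resource, e.g. a batch queue on a remote host.
class BackendResource:
    def submit(self, job):
        return f"job '{job}' queued"

# Tier 2: a distributed object acting as a service proxy that wraps
# the backend and mediates every request from the user tier.
class ServiceProxy:
    def __init__(self, backend, allowed):
        self.backend, self.allowed = backend, allowed

    def submit(self, user, job):
        if user not in self.allowed:   # security enforced in the middle tier
            raise PermissionError(user)
        return self.backend.submit(job)

# Tier 1 (the user's browser/portal) only ever talks to the proxy.
proxy = ServiceProxy(BackendResource(), allowed={"alice"})
print(proxy.submit("alice", "render"))   # job 'render' queued
```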
- C626: Storage Manager and File Transfer
Web Services
- Abstract:Web services are emerging as an interesting
mechanism for a wide range of grid services, particularly those focused upon
information services and control. When coupled with efficient data transfer
services, they provide a powerful mechanism for building a flexible, open,
extensible data grid for science applications. In this paper we present our
prototype work on a Java Storage Resource Manager (JSRM) web service and a Java
Reliable File Transfer (JRFT) web service. A Java client (Grid File Manager) on
top of JSRM and JRFT has been developed to demonstrate the capabilities of these web
services. The purpose of this work is to show the extent to which SOAP-based
web services are an appropriate direction for building a grid-wide data
management system, and eventually grid-based portals.
- William A. Watson, Ying Chen, Jie Chen, Walt Akers
- Thomas Jefferson National Accelerator Facility Newport News,
Virginia 23606, U.S.A.
- Chip.Watson@jlab.org
- Received 11 July 2002, Accepted July 11 2002
- C627: From Legion to Avaki: The
Persistence of Vision
- Abstract:Grids have metamorphosed from academic projects
to commercial ventures. Avaki, a leading commercial vendor of Grids, has its
roots in Legion, a Grid project at the University of Virginia begun in 1993. In
this chapter, we present fundamental challenges and requirements for Grid
architectures that we believe are universal, our architectural philosophy in
addressing those requirements, an overview of Legion as used in production
systems and a synopsis of the Legion architecture and implementation. We also
describe the history of the transformation from Legion - an academic, research
project - to Avaki, a commercially supported, marketed product. Several of the
design principles as well as the vision underlying Legion have continued to be
employed in Avaki. As a product sold to customers, Avaki has been made more
robust, more easily manageable and easier to configure than Legion, at the
expense of eliminating some features and tools that are of less immediate use
to customers. Finally, we place Legion in the context of OGSI, a standards
effort underway in Global Grid Forum.
- Andrew S. Grimshaw, Anand Natrajan, Marty A. Humphrey, Michael J.
Lewis, Anh Nguyen-Tuong, John F. Karpovich, Mark M. Morgan, Adam J. Ferrari
- University of Virginia, Avaki Corporation
- agrimshaw@avaki.com
- Received 11 July 2002, Accepted July 11 2002
- C628: Classifying and enabling grid
applications
- Abstract:Today's applications need the functionality of the
Grid to break free of single resource limitations, and in turn, the Grid needs
applications in order to properly evolve. Why then are there currently so few
applications using grids? We describe some of the problems faced by application
developers in moving to the Grid, and show how grid application frameworks
should overcome these difficulties. Such frameworks will define the
relationship between the Grid and applications, providing consistent abstract
interfaces to grid operations and allowing applications to include these
operations independently from their actual implementation. These user
interfaces should be application focused, capturing the semantics of the
underlying operations, then driving grid development to support these needs.
Building the right interfaces motivates a detailed classification of
applications and operations for the grid, and we provide a simple taxonomy in
this paper, outlining important new directions in which it should be extended.
We describe the rationale and design of a Grid Application Toolkit in the
GridLab project, which will provide a comprehensive language and implementation
neutral framework for encapsulating grid operations that will greatly simplify
and accelerate future grid application development.
- Gabrielle Allen, Tom Goodale, Michael Russell, Edward Seidel and
John Shalf
- Max-Planck-Institut für Gravitationsphysik, Golm, Germany;
Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- eseidel@aei.mpg.de
- Received 12 July 2002, Accepted July 12 2002
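The idea of consistent abstract interfaces to grid operations, included independently of their actual implementation, can be sketched as below. The `FileTransfer` interface and its trivial adaptor are hypothetical illustrations in the spirit of the GridLab toolkit, not its real API.

```python
from abc import ABC, abstractmethod

# Abstract, implementation-neutral grid operation: the application
# codes against this interface; the concrete mechanism (GridFTP,
# HTTP, local copy, ...) is chosen at run time behind it.
class FileTransfer(ABC):
    @abstractmethod
    def copy(self, src, dst): ...

class LocalCopy(FileTransfer):          # trivial adaptor for testing
    def __init__(self):
        self.log = []
    def copy(self, src, dst):
        self.log.append((src, dst))     # record what would be copied
        return "ok"

def stage_input(transfer, files, dst_dir):
    # application-focused call: it never names a transport directly
    return [transfer.copy(f, dst_dir + "/" + f) for f in files]

t = LocalCopy()
print(stage_input(t, ["a.dat", "b.dat"], "/scratch"))   # ['ok', 'ok']
```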
- C629: Building Grid Computing Portals:
The NPACI Grid Portal Toolkit
- Abstract:Portals have become established as effective
interfaces for enabling scientific users to access information and services in
Grid computing environments, as demonstrated by the success of the NPACI
HotPage and numerous other Grid computing portals. Grid portals hide the
complexities of Grid technologies from the user and present simplified,
intuitive interfaces for harnessing the power of the underlying resources. Grid
portal toolkits that interface to Grid technologies have proven useful,
enabling developers to rapidly build portals. The NPACI Grid Portal Toolkit
(GridPort) facilitates development and utilization of Grid technologies such as
the Globus Toolkit and the Storage Resource Broker from within an integrated,
unified API. GridPort supports a set of centralized services that allow
multiple application portals to share services and a single-login environment.
The advent of new technologies such as Grid Web services has created new
opportunities for making Grid portal toolkits that are even simpler for
developers and GridPort is being enhanced to include these new technologies.
New portal technologies such as customization and portlets are being integrated
into toolkits such as GridPort to make it possible to develop portals that
allow users to meet their personal needs more effectively. The impact of these
new technologies will result in increased development, utilization, and more
effective use of Grid portals. In this paper we describe GridPort, the HotPage,
and GridPort-based application portals, as well as the next version of GridPort
that integrates Grid Web services and supports a variety of Grid Computing
Environments.
- Mary Thomas and Jay Boisseau
- Texas Advanced Computing Center, The University of Texas at
Austin
- mthomas@tacc.utexas.edu
- Received August 2 2002, Accepted August 2 2002
- C630: DISCOVER: A Computational
Collaboratory for Interactive Grid Applications
- Abstract:The growth of the Internet and the advent of the
computational Grid have made it possible to develop and deploy advanced
services and computational collaboratories on the Grid. These systems build on
high-end computational resources and communication technologies, and enable
seamless and collaborative access to resources, applications and data. In this
chapter we present an overview of the DISCOVER computational collaboratory for
enabling interactive applications on the Grid. Its primary goal is to bring
large distributed Grid applications to the scientist's or engineer's desktop
and enable collaborative application monitoring, interaction and
control. DISCOVER is composed of three key components: (1) a middleware substrate
that integrates DISCOVER servers and enables interoperability with external
Grid services, (2) an application control network consisting of sensors,
actuators, and interaction agents that enable monitoring, interaction and
steering of distributed applications, and (3) detachable portals for
collaborative access to grid applications and services. The design,
implementation, operation and evaluation of these components is presented.
- V. Mann and M. Parashar
- The Applied Software Systems Laboratory, Department of Electrical
and Computer Engineering, Rutgers, The State University of New Jersey, 94 Brett
Road, Piscataway, NJ 08854.
- parashar@caip.rutgers.edu
- Received 13 July 2002, Accepted July 13 2002
- C631: Combinatorial Chemistry and the
Grid
- Abstract:Chemistry has always made extensive use of
developing computing technology and available computing power through activities
such as modelling, simulation and chemical structure interpretation,
activities conveniently summarised as computational chemistry. Developing
procedures in chemical synthesis and characterisation, particularly in the
arena of parallel and combinatorial methodology, have generated ever increasing
demands on both Computational Chemistry and Computer Technology. However, and
significantly, the way in which networked services are being conceived to assist
collaborative research pushes the use of data acquisition, remote interaction
and control, computation, and visualisation well beyond the traditional
computational chemistry programmes, towards the basic issue of handling
chemical information and knowledge. The rate at which new chemical data can now
be generated in Combinatorial and Parallel synthesis and screening processes
means that the data can only realistically be handled efficiently through increased
automation of the data analysis as well as of the experimentation and collection.
Without this automation we run the risk of generating information without the
ability to understand it.
- J. G. Frey, M. Bradley, J. W. Essex, M.B. Hursthouse, S. M.
Lewis, M. M. Luck, L. A.V.M. Moreau, D.C. De Roure, M. Surridge, A. H. Welsh
- Department of Chemistry, Department of Electronics & Computer
Science, and Department of Mathematics; University of Southampton, Southampton,
SO17 1BJ, UK
- j.g.frey@soton.ac.uk
- Received 15 July 2002, Accepted July 15 2002
- C632: Data Intensive Grids for High
Energy Physics
- Abstract:The major high energy physics experiments of the
next twenty years will break new ground in our understanding of the fundamental
interactions, structures and symmetries that govern the nature of matter and
space-time. Among the principal goals are to find the mechanism responsible for
mass in the universe, and the "Higgs" particles associated with mass
generation, as well as the fundamental mechanism that led to the predominance
of matter over antimatter in the observable cosmos.
The largest experiments
preparing for CERN's Large Hadron Collider (LHC) program each encompass 2000
physicists from 150 institutions in more than 30 countries. These experiments
plan to collect and analyse several Petabytes of data (1 PB = 10^15 Bytes) in
the first year of operation. Indeed, the current generation of operational
experiments at SLAC (BaBar) and Fermilab (D0 and CDF), as well as the
experiments at the Relativistic Heavy Ion Collider (RHIC) program at BNL, are
beginning to face similar challenges. BaBar in particular has already
accumulated datasets approaching a Petabyte.
Collaborations on the global
scale of the LHC experiments would not have been attempted if physicists could
not plan on excellent networks: to interconnect the physics groups throughout
the lifecycle of the experiment, and to make possible the construction of Data
Grids capable of providing access, processing and analysis of massive datasets.
These datasets will increase in size from Petabytes to Exabytes (1 EB = 10^18
Bytes) within the next decade.
In this Chapter, we explore the computing
challenges posed by the latest and next generations of HEP experiments, and
describe the various Grid projects that have been initiated by the HEP
community in response. Some historically significant projects are reviewed,
followed by an outline of each of the major HEP-related Grid projects currently
underway. Selected examples of how Grid technology is being actively used in
the experiments are presented. This is followed by an analysis of the distinct
differences between "Classical" and "HEP" Grids, and the very significant roles
that R&D in wide area networking and global systems modelling and
simulation have to play. To illustrate some current research work, we describe
the MonALISA architecture for a service-oriented Grid system for HEP computing
task monitoring, scheduling and control. We continue by examining the use of
Grid technology for HEP analysis tasks. The Grid Enabled Analysis environment
(GAE), which makes extensive use of Web Services, is introduced and
described.
In the summary, we suggest various ways in which meeting the HEP
computing challenges will benefit future networks and society.
- Julian J. Bunn, Harvey B Newman
- California Institute of Technology, Pasadena, CA 91125, USA
- julian@cacr.caltech.edu
- Received 19 July 2002, Accepted July 20 2002
- C633: Rationale for Developing with the
Open Grid Services Architecture
- Abstract:The UK e-Science Core Programme established an
Architectural Task Force to map out the UK's policy regarding grid
architectures. A decision had already been taken to work closely with the
Globus team and to use Globus Toolkit Version 2. The emergence of the Open Grid
Services Architecture proposal in late 2001 was greeted warmly as a well-chosen
direction that would support UK e-Science requirements and provide
opportunities for the UK e-Science projects to contribute to future grid
infrastructure. Early contributions are expected to include database access and
integration, and grid markets. A longer-term goal is to lift the level of
discourse when integrating software and data components, and to automate more
of the routine work of assembling and operating e-Science applications and
infrastructure. This paper is based on the first report of that task force,
suggesting a road map that was accepted by the UK e-Science community in April
2002.
- Malcolm Atkinson
- The UK e-Science Architecture Task Force
- mpa@dcs.gla.ac.uk
- Received 20 July 2002, Revised August 7 2002, Accepted August 7
2002
- C634: eDiamond: a grid-enabled
federated database of annotated mammograms
- Abstract: This chapter introduces a project named eDiamond
which aims to develop a grid-enabled federated database of annotated
mammograms, built at a number of sites (initially in the UK), and which ensures
database consistency and reliable image processing. A key feature of eDiamond
is that images are "standardised" prior to storage. We describe what this
means, and why it is a fundamental requirement for numerous grid applications,
particularly in medical image analysis, and especially in mammography. The
eDiamond database will be developed with two particular applications in mind:
teaching and supporting diagnosis. There are several other applications for
such a database, which are the subject of related projects. We also discuss the
ways in which information technology (IT) is impacting on the provision of
health care - a subject which in Europe is called Healthcare Informatics. We
outline some of the issues concerning medical images, and describe mammography
as an important special case. We discuss medical image databases, as a prelude
to the description of the eDiamond e-science project. We relate the eDiamond
project to a number of other efforts currently underway, most notably the US
NDMA project.
- Michael Brady, David Gavaghan, Andrew Simpson, Miguel Mulet
Parada, Ralph Highnam
- Medical Vision Laboratory, Department of Engineering Science,
Oxford University; Computing Laboratory, Wolfson Building, Parks Road, Oxford;
Mirada Solutions Limited, Oxford Centre for Innovation, Mill Street, Oxford
- jmb@robots.ox.ac.uk
- Received 31 July 2002, Accepted July 31 2002
- C644: Condor and the Grid
- Abstract:Since 1984, the Condor project has helped
ordinary users to do extraordinary computing. Today, the project continues to
explore the social and technical problems of cooperative computing on scales
ranging from the desktop to the world-wide computational grid. In this chapter,
we describe the history and philosophy of the Condor project and describe how
it has evolved along with the field of distributed computing. We describe many
of the positive interactions between Condor and other computing projects and
discuss how the transfer of technologies and ideas benefits everyone. We
discuss the distinction between planning and scheduling in an uncertain
environment and the need to construct computing communities that reflect
underlying human organizations. Throughout, we reflect on the lessons of
experience, discussing techniques that failed alongside those that
succeeded.
- Douglas Thain, Todd Tannenbaum, and Miron Livny
- Dept of Computer Sciences, Univ of Wisconsin-Madison, 1210 W.
Dayton St. Madison, WI 53705
- tannenba@cs.wisc.edu
- Received August 26 2002, Accepted August 26 2002
- C645: Education and the Enterprise with
the Grid
- Abstract: The Grid enables collaboration between people
and electronic resources around the globe. This technology can be applied to
education and broad information dissemination in a way that will bring the
possibility of quality education and information for all. We discuss the
implications for enterprise (education) systems -- what should you do to
enable and enhance the integration of Grid systems; the virtual
university -- what does this need and what might it look like; and re-usable
information -- the promise of a universal service architecture for greater
re-usability of (curriculum) components.
- Geoffrey Fox
- Community Grids Laboratory, 501 N. Morton St Suite 224,
Bloomington IN 47404-3730
- gcf@indiana.edu
- Received August 20 2002, Accepted September 7 2002
- C646: Overview of Book
- Abstract:This short preamble gives an overview of the
whole book with a special emphasis on the architecture section. It is designed
to help readers plan their use of the book.
- Fran Berman, Geoffrey Fox, Tony Hey
- SDSC San Diego; Community Grids Laboratory Indiana; EPSRC,
Polaris House, North Star Avenue, Swindon SN2 1 ET, UK
- gcf@indiana.edu
- Received August 11 2002, Accepted September 24 2002
- C647: The Grid: Past, Present,
Future
- Abstract:This chapter describes why the Grid is important and surveys its
technologies and applications to Science and the Enterprise. In section 2 of
this chapter, we highlight some historical and motivational building blocks of
the Grid - described in more detail in chapter 3. Section 3 describes the
current community view of the Grid with its basic architecture. Section 4
contains four building blocks of the Grid. In particular, in section 4.1 we review
the evolution of networking infrastructure, including both the desktop and cross-
continental links, which are expected to reach Gigabit and Terabit performance
respectively over the next 5 years. Section 4.2 presents the corresponding
computing backdrop, with 1 to 40 teraflop performance today moving to petascale
systems at the end of the decade. We use the new NSF TeraGrid as the HPCC (High
Performance Computing and Communications) highlight today. Section 4.3
summarizes many of the regional, national and international activities
designing and deploying Grids. Standards, covered in section 4.4, are a
different but equally critical building block of the Grid. Section 5 covers the
critical area of applications on the Grid covering life sciences, engineering
and the physical sciences. We highlight new approaches to Science including the
importance of collaboration and the e-Science concept driven partly by
increased data. A short section on commercial applications includes the
e-Enterprise/Utility concept of computing power on demand. Applications are
summarized in Section 5.7, which discusses the characteristic features of "good
Grid" applications like the one shown in Figure 2. This shows an instrument
(the photon source at Argonne National Laboratory) linked to computing, data
archiving and visualization facilities in a local Grid. Part D and chapter 35
of the book describe this in more detail. Futures are covered in Section 6 with
the intriguing concept of autonomic computing developed originally by IBM
covered in section 6.1 and chapter 13. Section 6.2 is a brief discussion of
Grid programming covered in depth in chapter 20 and part C of the book. There
are concluding remarks in sections 6.3 to 6.5. We stress that this chapter is
designed to give the high-level motivation for the book; it is not a complete
review. In particular, there are only a few general references to many key
subjects, because the later chapters of the book and its associated web site
will provide these. Further, although book chapters are referenced, this is not a
reader's guide to the book; that is in fact given in the preceding preface.
Chapters 20 and 35 are guides to parts C and D of the book. The reader
will find in another box possibly useful comments on parts A and B of this
book. This overview is based on presentations by Berman and Hey, conferences,
and a collection of presentations from Indiana University on networking.
- Fran Berman, Geoffrey Fox, Tony Hey
- SDSC San Diego; Community Grids Laboratory Indiana; EPSRC,
Polaris House, North Star Avenue, Swindon SN2 1 ET, UK
- gcf@indiana.edu
- Received August 27 2002; Accepted September 28 2002
- C648: Overview of Grid Computing
Environments
- Abstract: This short chapter summarizes the current status
of Grid Computational and Programming environments. It puts the corresponding
section of the book in context and integrates into a survey a set of 28 papers
gathered together by the GCE (Grid Computing Environment) group of the Global
Grid Forum. This set was published in 2002 as a special issue of Concurrency and
Computation: Practice and Experience.
- Geoffrey Fox, Dennis Gannon, Mary Thomas
- Community Grids Laboratory Indiana University; Department of
Computer Science, 215 Lindley Hall,150 S. Woodlawn Ave,. Bloomington IN
47405-7104; Texas Advanced Computing Center, The University of Texas at Austin,
10100 Burnet Road, Austin, Texas 78758
- gcf@indiana.edu
- Received August 6 2002; Accepted September 7 2002
- C649: Application Overview for the Book:
Grid Computing: Making the Global Infrastructure a Reality
- Abstract:This book, Grid Computing: Making the Global
Infrastructure a Reality, is divided into four parts. This short chapter
introduces the last part, Part D, on applications of the Grid. It is not
designed to review all applications but rather to guide the reader interested in
general principles of, or particular examples of, applications that are using or
could use the Grid. All chapters have material relevant for Grid applications,
but in this part the focus is the application itself. Some of the previous
chapters in fact cover applications as part of an overview or to
illustrate a technological issue; we will in particular discuss chapters 1, 2,
11, 12, 16, 23, 24, 28, 30 and 33 in parts A, B and C below. This is in
addition to chapters 37-43, devoted to applications, in this part.
- Fran Berman, Geoffrey Fox, Tony Hey
- SDSC San Diego; Community Grids Laboratory Indiana; EPSRC,
Polaris House, North Star Avenue, Swindon SN2 1 ET, UK
- gcf@indiana.edu
- Received August 27 2002; Accepted September 24 2002
- C654: Metacomputing
- Abstract: This article from 1992 describes the vision of
metacomputing, which was one of the early harbingers of the Grid ten years
later. The metacomputer is defined as a network of heterogeneous computational
resources linked by software in such a way that they can be used as easily as a
personal computer.
This article looks at the three stages of metacomputing,
beginning with the local area metacomputer at the National Center for
Supercomputing Applications (NCSA) as an example of the first stage. The
capabilities to be demonstrated in the SIGGRAPH'92 Showcase environment
represent the beginnings of the second stage in metacomputing. This involves
advanced user interfaces that allow for participatory computing as well as
examples of capabilities that would not be possible without the underlying
stage one metacomputer. The third phase, a national metacomputer, is on the
horizon as these new capabilities are expanded from the local metacomputer out
onto Gbit/sec network testbeds.
Several examples are given that have
striking similarities to those exploiting grids in 2002.
- Larry Smarr, Charles E. Catlett
- National Center for Supercomputing Applications NCSA, 605 East
Springfield Ave., Champaign, IL 61820
- catlett@mcs.anl.gov
- Reprint of L. Smarr and C. E. Catlett. Metacomputing.
Communications of the ACM, 35(6):44--52, June 1992
- C656: Grid Resource Allocation and
Control Using Computational Economies
- Abstract: In this paper, we investigate G-commerce ---
computational economies for controlling resource allocation in Computational
Grid settings. We define hypothetical resource consumers (representing users
and Grid-aware applications) and resource producers (representing resource
owners who ``sell'' their resources to the Grid). We then measure the
efficiency of resource allocation under two different market conditions:
commodities markets and auctions. We compare both market strategies in terms of
price stability, market equilibrium, consumer efficiency, and producer
efficiency. Our results indicate that commodities markets are a better choice
for controlling Grid resources than previously defined auction strategies.
- Rich Wolski, Todd Bryan, James Plank, John Brevik
- Computer Science Department, University of California, Santa
Barbara Santa Barbara, CA 93106; Computer Science Department, University of
Tennessee, Knoxville Knoxville, TN 37996; Mathematics Department, Wheaton
College, Wheaton, Mass.
- rich@cs.ucsb.edu
- Received September 3 2002; Accepted September 3 2002
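The commodities-market mechanism the paper studies can be sketched minimally: a single resource price is adjusted iteratively in proportion to excess demand until supply meets demand (a tatonnement process). The linear supply and demand curves below are invented purely for illustration.

```python
# Toy G-commerce-style commodities market for one Grid resource.
def demand(price):   # consumers buy less as the price rises
    return max(0.0, 100.0 - 2.0 * price)

def supply(price):   # producers offer more as the price rises
    return 3.0 * price

price = 1.0
for _ in range(1000):
    excess = demand(price) - supply(price)
    price += 0.01 * excess          # raise price when demand exceeds supply

# Equilibrium where 100 - 2p = 3p, i.e. p = 20.
print(round(price, 2))              # 20.0
```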
- C657: Architecture of a Commercial
Enterprise Desktop Grid: The Entropia System
- Abstract: Distributed Computing, the exploitation of idle
cycles on desktop PC systems, offers the opportunity to increase the available
computing power by orders of magnitude (10x - 1000x). However, for desktop PC
distributed computing to be widely accepted within the enterprise, the systems
must achieve high levels of efficiency, robustness, security, scalability,
unobtrusiveness, and manageability. We describe the Entropia Distributed
Computing System as a case study, detailing its internal architecture and
philosophy in attacking these key problems. Key aspects of the Entropia system
include the use of: 1) scalable web/database technology for system management,
2) application namespaces for machines, 3) binary sandboxing technology for
security and unobtrusiveness, and 4) an open integration model that allows
applications from many sources to be incorporated. The Entropia system has been
deployed in a wide range of commercial environments and used for a range of
applications from areas including bioinformatics, molecular modeling, and Monte
Carlo financial simulations.
- Andrew A. Chien
- Entropia, Inc 10145 Pacific Heights, Suite 800 San Diego, CA
92121 and University of California, San Diego Department of Computer Science
and Engineering 9500 Gilman Drive La Jolla, CA 92093
- achien@cs.ucsd.edu
- Received September 3 2002; Accepted September 3 2002