Grid Computing: Making the Global Infrastructure a Reality
- This is a set of abstracts of the collection of articles "Grid
Computing: Making the Global Infrastructure a Reality", edited by Fran
Berman, Geoffrey Fox and Tony Hey. The book (over 1000 pages) was published
in March 2003 by Wiley.
- Overview
Chapters are listed below in the order received; see the Overview for the
order in which they appear in the book.
- C590: The Semantic Grid: A Future
e-Science Infrastructure
- Abstract: e-Science offers a promising vision of how
computer and communication technology can support and enhance the scientific
process. It does this by enabling scientists to generate, analyse, share and
discuss their insights, experiments and results in an effective manner. The
underlying computer infrastructure that provides these facilities is commonly
referred to as the Grid. At this time, there are a number of grid applications
being developed and there is a whole raft of computer technologies that provide
fragments of the necessary functionality. However, there is currently a major
gap between these endeavours and the vision of e-Science in which there is a
high degree of easy-to-use and seamless automation and in which there are
flexible collaborations and computations on a global scale. To bridge this
practice-aspiration divide, this paper presents a research agenda whose aim is
to move from the current state of the art in e-Science infrastructure, to the
future infrastructure that is needed to support the full richness of the
e-Science vision. Here the future e-Science research infrastructure is termed
the Semantic Grid (Semantic Grid to Grid is meant to connote a similar
relationship to the one that exists between the Semantic Web and the Web). In
particular, we present a conceptual architecture for the Semantic Grid. This
architecture adopts a service-oriented perspective in which distinct
stakeholders in the scientific process, represented as software agents, provide
services to one another, under various service level agreements, in various
forms of marketplace. We then focus predominantly on the issues concerned with
the way that knowledge is acquired and used in such environments since we
believe this is the key differentiator between current grid endeavours and
those envisioned for the Semantic Grid.
- David De Roure, Nicholas Jennings and Nigel Shadbolt
- Department of Electronics and Computer Science, University of
Southampton, Southampton SO17 1BJ, UK
- dder@ecs.soton.ac.uk
- Received January 23 2002; Comments to Authors May 29 2002;
Accepted July 8 2002
- C598: Implementing Production Grids
- Abstract: Starting from Section 2, "The Grid Context," we
lay out our view of a Grid architecture, and this definition provides a
structure for the subsequent detailed description. In particular we identify
what it is that differentiates a Grid from other structures for distributed
computing, e.g. hierarchical clusters. The question of what is a minimum set of
Grid services - the Grid Common Services, the neck of the hourglass model - and
what needs to be added to make the Grid usable for particular communities is
stated. Issues of interoperability and heterogeneity are addressed, and these
are perhaps the most important distinguishing features of a Grid.
Section 3, "The Anticipated Grid Usage Model Will Determine
What Gets Deployed, and When," addresses the question of Grid building from the
points of view of various types of Grid usage. This is an important
point because differing usage patterns require different middleware, which is
why the distinction of a minimal common set of Grid services and tools is so
important. The underlying case studies have a supercomputing background, and so
attention is given to the problems of coupling and synchronicity of resources
that are not required in other sorts of Grids, e.g. Data Grids and Grids based
on the SETI@home concept (e.g. Entropia). This is why interoperability is so
important: different usages of Grids will result in different middleware,
scheduling strategies and tools for collaboration. The work of the Global Grid
Forum is vital in ensuring that standards are defined so that these can
interoperate. Nobody is going to be able to produce a commercial product or a
Grid-in-a-box that can address the requirements of all Grid usage patterns;
indeed, much of the strength of the Grid concept is that it clearly recognizes
this. The Globus team, whose software is the basis for the Grid building work described
here, understood this very well and have produced a toolkit of sufficient
flexibility and robustness to allow building of many different types of Grid.
Section 3 also analyses different data usage patterns in Data Grids, and this
highlights the realization in Grid computing that the distribution of data is
even more important than the distribution of computing resources, since the
curation and storage of data is becoming a key issue in tera- and peta- scale
computing. The importance of workflow management has also come to the fore. The
integration of message passing with the Grid is discussed primarily in the
context of MPICH-G2, which provides access to both highly optimized vendor
supplied MPI for intra-machine communication and socket based communication for
the inter-machine communication. It is important that Globus, as core essential
middleware, can interoperate with the best tools from anywhere in the world and
a few examples of this are given.
Section 4, "Grid Support for Collaboration," describes how
the Grid Common Services promote collaboration via the mechanisms for enabling
secure resource sharing in Virtual Organizations. The Access Grid has an
important role in enabling the human side of such collaboration, and in the
building of trust and working relationships in a VO, and is mentioned as an
aside.
Sections 5, "Building an Initial Multi-site, Computational
and Data Grid," and 6, "Cross-Site Trust Management," provide an account of the
detail of Grid building. The interaction of the sociology and working practices
of the administrators and users of a Grid is integrated with the technical
details of Grid deployment and certificate management. Some detail is provided
on the building of an identity Certification Authority and the issues of
interoperability that are raised here.
Section 7, "Transition to a Prototype-Production Grid,"
fills in the essential steps necessary for Grid building. Section 7.3, "The Model
for the Grid Information System," describes the issue of Grid Information
Service mechanisms. The strengths of the Globus model for GIS, which has been
built on top of extensive practical experimentation, are set out. The tools
described here give the ability to build functional and large scale Grids for
particular communities. Whether tools such as X.500 naming will enable the very
complex Grids which can cross national borders and multiple administrative
systems and lead to genuinely Global Grids, is not yet clear, but with the
tools described here many very useful Grids can be built. Section 7.4, "Local
Authorization," provides an account of the features of Globus mapfiles. Section
7.5, "Site Security Issues," highlights a serious issue with Globus and
firewalls, namely the necessity to keep a range of ports open. Sections 7.6 -
7.9 give advice on moving towards getting real users onto the Grid, including
issues such as high performance networking and batch schedulers. Section 7.11,
"Data Management and Your Grid Service Model," provides insights for cases where large
scale data management is an important issue (likely to be the majority of
Grids). Section 7.10, "Grid Systems Administration Tools," discusses the modest
progress made so far in Grid administration. Section 7.12, "Take Good Care of the Users as
Early as Possible," describes some of the things that can be done to ease the
transition of users to a Grid environment. This includes some detail on a proxy
certificate management service (MyProxy), since experience has shown that
certificate handling is one of the big barriers to consumer acceptance of
Grids. MyProxy also allows the flexibility of the Globus proxy delegation model
to be exploited via advanced programming models and problem solving portals.
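The grid-mapfile mentioned in Section 7.4 is, at its core, a plain-text table mapping certificate distinguished names to local accounts. A minimal sketch of parsing such a file follows; the entries are hypothetical, and real deployments rely on the Globus tooling rather than hand-rolled parsers:

```python
import shlex

def parse_gridmap(text):
    """Parse grid-mapfile-style lines of the form: "DN" localuser."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip comments and blank lines
        parts = shlex.split(line)  # shlex honours the quotes around the DN
        if len(parts) >= 2:
            mapping[parts[0]] = parts[1]
    return mapping

sample = '''
# hypothetical entries
"/O=Grid/OU=Example/CN=Alice Scientist" alice
"/O=Grid/OU=Example/CN=Bob Operator" bob
'''
gridmap = parse_gridmap(sample)
print(gridmap["/O=Grid/OU=Example/CN=Alice Scientist"])  # alice
```

The quoting matters: DNs routinely contain spaces, which is why a naive `split()` would not do.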
Section 8 provides some concluding remarks, and section 9
attempts to acknowledge the many people who have helped to make IPG and the DOE
Science Grid successes.
Section 10 is an annotated bibliography that is intended to
provide pointers to a lot of additional information, and to acknowledge that
there is a lot of other work going on in Grids that is only mentioned in
passing in this article.
- William E. Johnston
- The NASA IPG Engineering Team, and The DOE Science Grid Team
- wejohnston@lbl.gov
- Received June 2 2002; Comments to Author June 24 2002; Accepted
July 8 2002
- C599: Grids and the Virtual
Observatory
- Abstract: We consider several projects from astronomy that
benefit from the Grid paradigm and associated technology, many of which involve
either massive datasets or the federation of multiple datasets. We cover image
computation (mosaicking, multi-wavelength images, and synoptic surveys);
database computation (representation through XML, data mining, and
visualization); and semantic interoperability (publishing, ontologies,
directories, and service descriptions).
- Roy Williams
- Caltech Center for Advanced Computing Research
- roy@cacr.caltech.edu
- Received June 5 2002; Comments to Author June 24 2002; Accepted
July 10 2002
- C600: The Open Grid Service Architecture
and Data Grids
- Abstract: Data Grids address the data intensive aspects of
Grid computing, and therefore impose a very specific set of requirements on
Grid Services. In this article, the Data Grid problem is revisited with emphasis
on Data Management. The article investigates how Data Grid Services would need to be
deployed within the Open Grid Services Architecture to fulfill the vision of
the Data Grid.
- Peter Z. Kunszt and Leanne P. Guy
- IT Division - Database Group, CERN, 1211 Geneva Switzerland
- Peter.Kunszt@cern.ch
- Received June 10 2002; Comments to Author June 24 2002; Accepted
July 9 2002
- C601: Peer-to-Peer Grid Databases for
Web Service Discovery
- Abstract: Grids are collaborative distributed Internet
systems characterized by large scale, heterogeneity, lack of central control,
multiple autonomous administrative domains, unreliable components and frequent
dynamic change. In such systems, it is desirable to maintain and query dynamic
and timely information about active participants such as services, resources
and user communities. The web services vision promises that programs are made
more flexible, adaptive and powerful by querying Internet databases
(registries) at runtime in order to discover information and network attached
building blocks, enabling the assembly of distributed higher-level components.
In support of this vision, we introduce the Web Service Discovery Architecture
(WSDA), which subsumes an array of disparate concepts, interfaces and protocols
under a single semi-transparent umbrella. WSDA specifies a small set of
orthogonal multi-purpose communication primitives (building blocks) for
discovery, covering service identification, service description retrieval, data
publication as well as minimal and powerful query support. The individual
primitives can be combined and plugged together by specific clients and
services to yield a wide range of behaviors and emerging synergies. Based on
WSDA, we introduce the hyper registry, which is a centralized database node for
discovery of dynamic distributed content. It supports XQueries over a tuple set
from a dynamic XML data model. We address the problem of maintaining dynamic
and timely information populated from a large variety of unreliable, frequently
changing, autonomous and heterogeneous remote data sources. However, in a large
cross-organizational system, the set of information tuples is partitioned over
many such distributed nodes, for reasons including autonomy, scalability,
availability, performance and security. This suggests the use of Peer-to-Peer
(P2P) query technology. Consequently, we propose the WSDA based Unified
Peer-to-Peer Database Framework (UPDF) and its corresponding Peer Database
Protocol (PDP). They are unified in the sense that they allow one to express
specific discovery applications for a wide range of data types, node topologies
(e.g. ring, tree, graph), query languages (e.g. XQuery, SQL), query response
modes (e.g. Routed, Direct and Referral Response), neighbor selection policies,
pipelining, timeout and scope policies. We describe the first steps towards the
convergence of Grid Computing, Peer-to-Peer Computing, Distributed Databases
and Web Services. The uniformity and wide applicability of our approach
distinguish it from related work, which (1) addresses some but not all problems
and (2) does not propose a unified framework.
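The hyper registry's query support can be illustrated in miniature. Python's standard library has no XQuery engine, so this sketch stands in for it with an ElementTree traversal over a toy tuple set; the schema and URLs here are invented for illustration, not WSDA's actual data model:

```python
import xml.etree.ElementTree as ET

# A toy registry of service-description tuples (schema is hypothetical).
registry_xml = """
<registry>
  <tuple link="http://example.org/replica-catalog" type="service">
    <content><service name="replica-catalog" protocol="http"/></content>
  </tuple>
  <tuple link="http://example.org/job-scheduler" type="service">
    <content><service name="job-scheduler" protocol="https"/></content>
  </tuple>
</registry>
"""

root = ET.fromstring(registry_xml)
# Stand-in for an XQuery: the links of all tuples whose service speaks https
links = [t.get("link") for t in root.findall("tuple")
         if t.find("content/service").get("protocol") == "https"]
print(links)  # ['http://example.org/job-scheduler']
```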
- Wolfgang Hoschek
- CERN IT Division, European Organization for Nuclear Research,
1211 Geneva 23, Switzerland
- wolfgang.hoschek@cern.ch
- Received June 10 2002; Comments to Author June 28 2002; Accepted
July 9 2002
- C602: Unicore and the Open Grid Services
Architecture
- Abstract: This paper describes the design and
implementation of a GridService demonstrator, built around the Unicore grid
environment. Based on the experience gained in this process, the paper
discusses lessons learned about the Open Grid Services Architecture and the
Grid Service Specification. It identifies several weaknesses and redundant
components in the current draft of the Grid Service Specification and makes
recommendations as to its further development. Finally, we provide some
indication of the directions likely to be taken in further developing the
Unicore infrastructure in light of the Open Grid Services Architecture.
- David Snelling
- Fujitsu Laboratories of Europe, Hayes Park, Central Hayes End
Road, Hayes, Middlesex UB4 8FE
- d.snelling@fle.fujitsu.com
- Received June 10 2002; Comments to Author June 26 2002; Accepted
July 9 2002
- C603: The Physiology of the Grid
- Abstract: In both e-business and e-science, we often need
to integrate services across distributed, heterogeneous, dynamic "virtual
organizations" formed from the disparate resources within a single enterprise
and/or from external resource sharing and service provider relationships. This
integration can be technically challenging because of the need to achieve
various qualities of service when running on top of different native platforms.
We present an Open Grid Services Architecture that addresses these challenges.
Building on concepts and technologies from the Grid and Web services
communities, this architecture defines a uniform exposed service semantics (the
Grid service); defines standard mechanisms for creating, naming, and
discovering transient Grid service instances; provides location transparency
and multiple protocol bindings for service instances; and supports integration
with underlying native platform facilities. The Open Grid Services Architecture
also defines, in terms of Web Services Description Language (WSDL) interfaces
and associated conventions, mechanisms required for creating and composing
sophisticated distributed systems, including lifetime management, change
management, and notification. Service bindings can support reliable invocation,
authentication, authorization, and delegation, if required. Our presentation
complements an earlier foundational article, "The Anatomy of the Grid," by
describing how Grid mechanisms can implement a service-oriented architecture,
explaining how Grid functionality can be incorporated into a Web services
framework, and illustrating how our architecture can be applied within
commercial computing as a basis for distributed system integration, within and
across organizational domains.
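The factory pattern with naming, discovery, and lifetime management that the abstract describes can be caricatured in a few lines. This is a local toy, not the OGSA interfaces themselves, and the handle format is invented:

```python
import itertools
import time

class GridServiceFactory:
    """Toy sketch of OGSA-style factory semantics: a client asks the
    factory for a transient service instance with a requested lifetime,
    receives a unique handle (naming), and the registry forgets
    instances whose lifetime has expired (lifetime management)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._instances = {}  # handle -> expiry timestamp

    def create_service(self, lifetime_seconds):
        handle = f"urn:instance-{next(self._ids)}"       # naming
        self._instances[handle] = time.time() + lifetime_seconds
        return handle

    def find_service(self, handle):                      # discovery
        expiry = self._instances.get(handle)
        if expiry is None or expiry < time.time():
            self._instances.pop(handle, None)            # expired: forget it
            return None
        return handle

factory = GridServiceFactory()
h = factory.create_service(lifetime_seconds=60)
print(factory.find_service(h) == h)   # True: instance still within lifetime
```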
- Ian Foster, Carl Kesselman, Jeffrey M. Nick, Steven Tuecke
- Mathematics and Computer Science Division, Argonne National
Laboratory, Argonne, IL 60439; Department of Computer Science, University of
Chicago, Chicago, IL 60637; Information Sciences Institute, University of
Southern California, Marina del Rey, CA 90292; IBM Corporation, Poughkeepsie,
NY 12601
- foster@mcs.anl.gov
- Received June 14 2002
- C604: Databases and The Grid
- Abstract: This paper examines how databases can be
integrated into the Grid. Almost all early Grid applications are file-based,
and so, to date, there has been relatively little effort applied to integrating
databases into the Grid. However, if the Grid is to support a wider range of
applications, both scientific and otherwise, then database integration into the
Grid will become important. Therefore, this paper investigates the requirements
of Grid-enabled databases and considers how these requirements are met by
existing Grid middleware. This shows that support is very limited. The paper
therefore goes on to propose a service-based architecture, and identifies the
key service functionalities needed to meet the requirements. In this
architecture, database systems are wrapped within a Grid-enabled service
interface that simplifies the task of building applications that access their
contents. The ability to federate data from multiple databases is likely to be
a very powerful facility for Grid users wishing to collate and analyse
information distributed over the Grid. The paper describes how the
service-based framework assists in federating database servers over the Grid by
supporting the creation of virtual databases that present the same service
interface to applications as do the individual, unfederated, database
systems.
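A rough sketch of the virtual-database idea, assuming a trivially simple fan-out-and-collate strategy over two in-memory SQLite databases (the real architecture wraps full database services behind Grid-enabled interfaces, not `sqlite3` connections):

```python
import sqlite3

# Two independent "database services" (in-memory for the sketch)
def make_db(rows):
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE observations (star TEXT, magnitude REAL)")
    db.executemany("INSERT INTO observations VALUES (?, ?)", rows)
    return db

site_a = make_db([("Vega", 0.03)])
site_b = make_db([("Sirius", -1.46)])

class VirtualDatabase:
    """Presents the same query interface as a single database while
    fanning the query out to every federated member and collating rows."""
    def __init__(self, members):
        self.members = members
    def query(self, sql):
        rows = []
        for db in self.members:
            rows.extend(db.execute(sql).fetchall())
        return rows

vdb = VirtualDatabase([site_a, site_b])
print(sorted(vdb.query("SELECT star, magnitude FROM observations")))
# [('Sirius', -1.46), ('Vega', 0.03)]
```

The point is that the application issues one query against the virtual database exactly as it would against any unfederated member.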
- Paul Watson
- Department of Computing Science, University of Newcastle,
Newcastle-upon-Tyne, UK
- Paul.Watson@newcastle.ac.uk
- Received June 14 2002; Comments to Author June 24 2002; Accepted
July 9 2002
- C606: The New Biology and the Grid
- Abstract: Biological and biochemical science has entered a
new era defined by large multi-scale (from atom to organism to populations)
efforts conducted by large teams with fast-evolving biotechnology. A product of
this change is a vast amount of data, doubling every 6 months or
so, which needs to be processed and turned into information and, hopefully,
knowledge. Traditional depth computing - large jobs on a relatively small
number of processors in a large shared-memory environment, where the tasks are
tightly coupled and operate on a relatively small amount of data - is being
accompanied by breadth computing - relatively short calculations performed
identically and independently on a very large number of data points. The grid
is ideally suited for breadth computing and this chapter highlights several
applications that are either using or will shortly benefit from the grid. The
first compares the 3D structures of all known proteins in an effort to
characterize protein fold space and illustrates the importance of attention to
data organization and data flow. The second, a genomic pipeline, the
Encyclopedia of Life, which provides putative annotation and 3D models for an
estimated 10^7 proteins encoded in over 800 genomes, illustrates the sheer volume
of processing that is needed. The third, Chemport, illustrates a fully
integrated system where the grid is part of an extended workflow infrastructure
supporting not just high performance computation, but data access,
visualization and productivity tools. The chapter concludes with an assessment
of what it will take to realize the promise of the grid for this "new biology."
It turns out that the issues have more to do with establishing synergy between
biologists and grid experts than with limitations in the technology.
- Kim Baldridge and Philip E. Bourne
- San Diego Supercomputer Center and University of California San
Diego
- bourne@sdsc.edu
- Received June 21 2002; Accepted July 8 2002
- C607: Virtualization Services for Data
Grids
- Abstract: Data Grids provide a set of virtualization
services to enable management and integration of digital entities that are
distributed across multiple sites and storage systems. Virtualization services
include logical name spaces for assigning global, persistent identifiers, and
persistency mechanisms to manage technology obsolescence. Since digital
entities can be represented as combinations of data, information, and
knowledge, the virtualization services provide levels of abstraction for
characterizing operations on data repositories (storage systems), information
repositories (databases), and knowledge repositories. This chapter provides a
survey of concepts that are used for digital entity management and integration
in Data Grids (Section 2). The state of the art in data grid technology is
discussed, including the design of a persistent archive infrastructure, based
upon the convergence of approaches across several different extant Data Grids
(Section 3). Approaches to information integration are also described based on
data warehousing, database integration, and semantic-based data mediation
(Section 4). We conclude in Section 5 with a statement of future research
challenges.
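The logical-name-space idea can be illustrated with a minimal mapping from persistent global identifiers to physical replicas; the `lns://` and `gsiftp://` names below are illustrative, and the resolution policy (first registered replica wins) is deliberately naive:

```python
class LogicalNameSpace:
    """Sketch of a Data Grid virtualization service: a global, persistent
    logical identifier maps onto one or more physical replicas, so
    applications name data independently of where it is stored. When a
    storage technology is retired, replicas are re-registered elsewhere
    and the logical name survives."""
    def __init__(self):
        self._replicas = {}   # logical name -> list of physical locations

    def register(self, logical_name, physical_location):
        self._replicas.setdefault(logical_name, []).append(physical_location)

    def resolve(self, logical_name):
        locations = self._replicas.get(logical_name)
        if not locations:
            raise KeyError(f"no replica registered for {logical_name}")
        return locations[0]   # trivial policy: first registered replica

ns = LogicalNameSpace()
ns.register("lns://survey/image-001", "gsiftp://site-a.example.org/data/img001")
ns.register("lns://survey/image-001", "gsiftp://site-b.example.org/cache/img001")
print(ns.resolve("lns://survey/image-001"))
```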
- Reagan W. Moore and Chaitan Baru
- San Diego Supercomputer Center and University of California, San
Diego
- moore@sdsc.edu
- Received June 21 2002; Accepted July 8 2002
- C611: Parameter Sweeps on the Grid with
APST
- Abstract: Parameter sweep applications consist of large
sets of independent tasks and arise in many fields of science and engineering.
Due to their flexible task synchronization requirements, these applications are
ideally suited to large-scale distributed platforms such as the Computational
Grid. However, for users to readily benefit from such platforms, it is
necessary to provide transparent application deployment and automatic
application scheduling. We present here version 2.0 of the AppLeS Parameter
Sweep Template (APST) software, an application execution environment which
schedules and deploys large-scale parameter sweep applications on Grid
platforms. We describe the main principles behind the design of APST, its
current implementation, and the services and mechanisms it can leverage to
deploy applications. We illustrate APST's usability and explain how its
XML-based interface allows users to easily direct Grid resources to run their
applications. Finally, we briefly discuss applications that are currently using
APST, and we highlight future developments.
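A hedged sketch of what an XML-driven parameter-sweep expansion looks like; the element names here are invented for illustration and do not follow APST's actual schema:

```python
import xml.etree.ElementTree as ET

# Hypothetical parameter-sweep description in the spirit of APST's
# XML interface (element names are illustrative, not APST's).
sweep_xml = """
<sweep executable="simulate">
  <parameter name="temperature" values="280 300 320"/>
  <parameter name="pressure" values="1.0 2.0"/>
</sweep>
"""

def expand_tasks(xml_text):
    """Expand the cross-product of parameter values into the set of
    independent tasks a scheduler would then place on Grid resources."""
    root = ET.fromstring(xml_text)
    tasks = [{}]
    for param in root.findall("parameter"):
        name = param.get("name")
        values = param.get("values").split()
        tasks = [dict(t, **{name: v}) for t in tasks for v in values]
    return tasks

tasks = expand_tasks(sweep_xml)
print(len(tasks))  # 6 independent tasks (3 temperatures x 2 pressures)
```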
- Henri Casanova, Fran Berman
- San Diego Supercomputer Center, University of California, San
Diego, 9500 Gilman Dr., La Jolla, CA 92093-0505; Computer Science and
Engineering Department, University of California, San Diego, 9500 Gilman Dr.,
La Jolla, CA 92093-0114
- casanova@cs.ucsd.edu
- Received June 28 2002; Accepted July 7 2002
- C612: NetSolve: Past, Present, and
Future; A Look at a Grid Enabled Server
- Abstract: NetSolve is grid middleware based on
client-server-agent technology that enables users to solve complex scientific
problems remotely. The system allows users to access both hardware and software
computational resources distributed across the Grid. NetSolve searches for
computational resources on the Grid, chooses the best one available, solves
the problem (using retry for fault tolerance), and returns the answer to the user.
NetSolve binds grid systems and problem-solving environments, while allowing
the flexibility for a user to write their own front-end or embed a call to
NetSolve using a C or Fortran language client API. This paper examines the
current system, some of the applications using NetSolve, and future directions
for the project.
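The select-best-then-retry behaviour described above can be sketched as follows; the ranking scores and solver functions are stand-ins, not NetSolve's API:

```python
def solve_with_retry(resources, problem, max_attempts=3):
    """Sketch of the agent behaviour: rank the candidate resources, try
    the best first, and fall back (retry) on failure. 'resources' maps
    a resource name to a (score, solver_function) pair."""
    ranked = sorted(resources.items(), key=lambda kv: -kv[1][0])
    last_error = None
    for name, (_, solver) in ranked[:max_attempts]:
        try:
            return name, solver(problem)
        except RuntimeError as err:   # resource failed; try the next one
            last_error = err
    raise RuntimeError(f"all resources failed: {last_error}")

def flaky(problem):
    raise RuntimeError("node offline")

def steady(problem):
    return sum(problem)

resources = {"fast-but-flaky": (0.9, flaky), "slow-but-steady": (0.5, steady)}
name, answer = solve_with_retry(resources, [1, 2, 3])
print(name, answer)  # slow-but-steady 6
```

The user sees only the final answer; the fault tolerance is entirely the agent's concern, which is the point of the client-server-agent split.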
- Sudesh Agrawal, Jack Dongarra, Keith Seymour, and Sathish
Vadhiyar
- University of Tennessee
- dongarra@cs.utk.edu
- Received July 1 2002; Accepted July 7 2002
- C613: The Data Deluge: An e-Science
Perspective
- Abstract: This paper previews the imminent flood of
scientific data expected from the next generation of experiments, simulations,
sensors and satellites. In order to be exploited by search engines and data
mining software tools, such experimental data needs to be annotated with
relevant metadata giving information as to provenance, content, conditions and
so on. The paper argues the case for creating new types of digital libraries
for scientific data with the same sort of management services as conventional
digital libraries in addition to other data-specific services. Some likely
implications of the Open Archive Initiative and of e-Science data for the
future role for university libraries is then briefly discussed. A substantial
subset of this e-Science data needs to archived and curated for long-term
preservation. Some of the issues involved in the digital preservation of both
scientific data and of the programs needed to interpret the data are then
reviewed. Finally, the implications of this wealth of e-Science data for the
Grid middleware infrastructure are highlighted.
- Tony Hey and Anne Trefethen
- EPSRC, Polaris House, North Star Avenue, Swindon SN2 1ET, UK;
Department of Electronics and Computer Science, University of Southampton,
Southampton SO17 1BJ, UK
- Tony.Hey@epsrc.ac.uk
- Received July 2 2002; Accepted July 11 2002
- C615: Commodity Grid Kits - Middleware
for Building Grid Computing Environments
- Abstract: Recent Grid projects, such as the Globus Project,
provide a set of useful services such as authentication and remote access to
resources, and information services to discover and query such remote
resources. Unfortunately, these services may not be compatible with the
commodity technologies used for application development by the software
engineers and scientists. Instead, users may prefer accessing the Grid from a
higher level of abstraction than what such toolkits provide. To bridge this
gap, Commodity Grid (CoG) Kits provide the middleware for accessing the
functionality of the Grid from a variety of commodity technologies, frameworks,
and languages. It is important to recognize that these Commodity Grid Kits not
only provide an interface to existing Grid technologies, but also bring Grid
programming to a new level by leveraging the methodologies of the chosen
commodity technology, thus helping the development of the next generation of
Grid services. Based on these Commodity Grid Toolkits, a variety of higher
level Grid services are far easier to design, maintain, and deploy. Several
projects have successfully demonstrated the use of Commodity Grid Kits for the
design of advanced Grid Services and Grid Computing Environments.
- Gregor von Laszewski, Jarek Gawor, Sriram Krishnan and Keith
Jackson
- Argonne National Laboratory, 9700 S. Cass Ave., Argonne, IL
60439, U.S.A., Lawrence Berkeley National Laboratory, 1 Cyclotron Rd.,
Berkeley, CA 94720, U.S.A, Indiana University, 150 S. Woodlawn Ave.,
Bloomington, IN 47405, U.S.A.
- gregor@mcs.anl.gov
- Received July 9 2002; Accepted July 9 2002
- C616: The Evolution of the Grid
- Abstract: In this paper we describe the evolution of grid
systems, identifying three generations: first generation systems which were the
forerunners of the Grid as we recognise it today; second generation systems
with a focus on middleware to support large scale data and computation; and
third generation systems where the emphasis shifts to distributed global
collaboration, a service oriented approach and information layer issues. In
particular, we discuss the relationship between the Grid and the World Wide
Web, and suggest that evolving web technologies will provide the basis for the
next generation of the Grid. The latter aspect - which we define as the
Semantic Grid - is explored in a companion paper.
- David De Roure, Mark A. Baker, Nicholas R. Jennings and Nigel R.
Shadbolt
- Universities of Portsmouth and Southampton
- dder@ecs.soton.ac.uk
- Received July 8 2002; Accepted July 9 2002
- C617: NaradaBrokering: An Event Based
Infrastructure for Building Scaleable Durable Peer-to-Peer Grids
- Abstract: We propose an architecture for building a
scaleable durable P2P grid comprising resources such as relatively static
clients, high-end resources and a dynamic collection of multiple P2P
subsystems. Clients in such systems must be linked together in a flexible fault
tolerant, efficient, high-performance fashion. We investigate an architecture
comprising a distributed brokering system that will support such a hybrid
environment. In this paper, we study the event brokering system -
NaradaBrokering - that links clients
(both users and resources of course) together.
Keywords: Event distribution systems, middleware, P2P systems, grid computing,
durable messaging
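The publish/subscribe core of an event-brokering system can be sketched in a few lines; this toy omits the distribution, durability, and routing that NaradaBrokering actually provides:

```python
from collections import defaultdict

class EventBroker:
    """Minimal topic-based publish/subscribe sketch: clients (users and
    resources alike) register interest in topics, and published events
    are delivered to every matching subscriber."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        for callback in self._subscribers[topic]:
            callback(event)

broker = EventBroker()
received = []
broker.subscribe("jobs/completed", received.append)
broker.publish("jobs/completed", {"job": 42, "status": "ok"})
broker.publish("jobs/failed", {"job": 7})   # no subscriber; silently dropped
print(received)  # [{'job': 42, 'status': 'ok'}]
```

In the durable, peer-to-peer setting of the chapter, the broker itself is distributed and events are persisted for disconnected clients; none of that is attempted here.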
- Geoffrey Fox and Shrideep Pallickara
- PTLIU Labs for Community Grid Computing, Indiana University
- spallick@indiana.edu
- Received July 8 2002; Accepted July 9 2002
- C618: Grid Programming Models: Current
Tools, Issues and Directions
- Abstract: Grid programming must manage computing
environments that are inherently parallel, distributed, heterogeneous and
dynamic, both in terms of the resources involved and their performance.
Furthermore, grid applications will want to dynamically and flexibly compose
resources and services across these dynamic environments. While it may be
possible to build grid applications using established programming tools, they
are not particularly well-suited to effectively manage flexible composition or
deal with heterogeneous hierarchies of machines, data and networks with
heterogeneous performance. This chapter discusses issues, properties and
capabilities of grid programming models and tools to support efficient grid
programs and their effective development. The main issues are outlined and then
current programming paradigms and tools are surveyed, examining their
suitability for grid programming. Clearly no one tool will address all
requirements in all situations. However, paradigms and tools that can
incorporate and provide the widest possible support for grid programming will
come to dominate. Advanced programming support techniques are analyzed,
with discussion of possibilities for their effective implementation in grid
environments.
- Craig Lee and Domenico Talia
- Computer Systems Research Department, The Aerospace Corporation,
P.O. Box 92957, El Segundo, CA USA 87036; DEIS, Università della Calabria,
Rende (CS), Italy
- lee@aero.org
- Received July 5 2002; Accepted July 7 2002
- C619: Ninf-G: a GridRPC system on the
Globus Toolkit
- Abstract: We describe here a GridRPC programming system
implemented on top of the Globus Toolkit, called Ninf-G. GridRPC systems enable
easy implementation of parallel applications on the Grid by providing simple
RPC-style, task-parallel, and largely Grid-transparent programming interfaces,
and serve as middleware that glues together Grid applications and the
lower-level Grid substrates such as Globus. We overview the GridRPC model of
Grid programming, and its implementation with Ninf-G, including the client side
API and the server side IDL, as well as its typical use for Grid programming.
We perform a preliminary evaluation in both WAN and LAN environments,
demonstrating that the overhead of Ninf-G is reasonable after the first-time
overhead of GSI authentication is tolerated.
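The GridRPC call pattern (bind a handle to a named remote function, then invoke it synchronously or asynchronously) can be mimicked locally; the method names below echo the GridRPC style but are not Ninf-G's actual API, and the "remote" functions here run in local threads:

```python
from concurrent.futures import ThreadPoolExecutor

class GridRPCClient:
    """Local stand-in for the GridRPC programming model: a handle binds
    a function name to a routine, call() invokes it synchronously, and
    call_async() returns immediately with a session the caller waits on
    (the task-parallel style). The registry contents are illustrative."""
    def __init__(self, registry):
        self._registry = registry
        self._pool = ThreadPoolExecutor(max_workers=4)

    def function_handle(self, name):
        return self._registry[name]       # bind name -> routine

    def call(self, handle, *args):        # blocking RPC
        return handle(*args)

    def call_async(self, handle, *args):  # task-parallel RPC
        return self._pool.submit(handle, *args)

client = GridRPCClient({"triangle_number": lambda n: n * (n + 1) // 2})
h = client.function_handle("triangle_number")
sessions = [client.call_async(h, n) for n in (10, 100)]
print([s.result() for s in sessions])  # [55, 5050]
```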
- Hidemoto Nakada, Yoshio Tanaka, Satoshi Matsuoka, Satoshi
Sekiguchi
- National Institute of Advanced Industrial Science and
Technology, Grid Technology Research Center, Tsukuba Central 2, 1-1-1 Umezono,
Tsukuba, Ibaraki 305-8568, JAPAN; Tokyo Institute of Technology, Global
Scientific Information and Computing Center, 2-12-1 Ookayama, Meguro-ku, Tokyo,
152-8550, JAPAN
- hide-nakada@aist.go.jp
- Received July 9 2002; Accepted July 10 2002
- C620: Autonomic Computing and GRID
- Abstract: Modern servers face a serious design challenge.
Since a server tends to be a conglomeration of several distributed services,
the complexity of composing a server grows very rapidly. In addition, each
component in the system faces unpredictable variability in its inputs and
demands for quality of service. Unless a component can react quickly to such
variations, it will not be able to perform well. Furthermore, as systems grow
and become more and more sophisticated, one invariably ends up in a distributed
and heterogeneous environment. This makes it impractical to have a centralized
control that monitors input variability and adjusts resources.
To face these challenges, servers are being modularly
designed, with each component providing a standard interface. Components are
no longer expected to provide a rigid and deterministic functional behavior in
all aspects of their interactions with other components. They are expected to
efficiently deal with the varying behaviors of other components, including
faults. As centralized controls become infeasible, components must be prepared
to perceive changes and to negotiate the exchange of resources on a voluntary
basis. In summary, components must be self-governing, self-organizing,
self-stabilizing and self-healing in the surrounding world of unpredictable
components. The attribute "autonomic" is used to capture this design philosophy.
An essential design feature in modern servers is to make each component
autonomic. This is important in order to contain the system-level complexity.
While each component is designed to cope with the idiosyncrasies of other
interacting components, we must also ensure that the system as a whole
coordinates itself to achieve the system-level goals. While there are many
subsystems in the current servers that are self-organizing to various degrees,
it is necessary to formulate a common framework for every autonomic component
and to develop general principles for their stability.
Grid Computing attempts to provide a common platform in
which controlled interactions can take place in a distributed, heterogeneous
environment, and it shares many of these objectives with regard to interactions
with other components. Discovery of other services, negotiating terms for services
from other components, monitoring the service quality rendered and adjusting
the negotiations are some examples. The Grid platform attempts to standardize
the protocols needed for some of these transactions and may also provide some
tools to implement certain flavors of them.
In this paper, we give an operational definition of an
autonomic component of a server and identify the key features it must possess
to be effective in the emerging environment. We then examine a scenario in the
grid environment and argue that design of subsystems within the grid
environment must naturally face the concerns addressed by autonomic designs.
Hence, there is synergy between these two perspectives and one stands to gain
by blending them together.
- Pratap Pattnaik, Kattamuri Ekanadham and Joefon Jann
- T. J. Watson Research Center, Yorktown Heights, NY 10598
- pratap@us.ibm.com
- Received July 9 2002; Accepted July 10 2002
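The autonomic design philosophy described above can be illustrated with a minimal sketch: a self-governing component that monitors its own load and voluntarily grows or releases resources without any centralized control. The thresholds and the doubling/halving rule below are invented purely for illustration.

```python
# Hypothetical autonomic component: it observes its own input load and
# renegotiates (here simply resizes) its capacity on a voluntary basis.
class AutonomicComponent:
    def __init__(self, capacity=4):
        self.capacity = capacity

    def observe_and_adapt(self, load):
        # self-stabilizing rule: keep utilization between 50% and 90%
        if load > 0.9 * self.capacity:
            self.capacity *= 2          # grow to absorb the surge
        elif load < 0.5 * self.capacity and self.capacity > 1:
            self.capacity //= 2         # shrink, releasing resources
        return self.capacity

# Each observed load triggers a local, autonomous adjustment.
c = AutonomicComponent()
for load in [3, 5, 9, 2, 1]:
    c.observe_and_adapt(load)
print(c.capacity)   # 4
```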
- C621: Peer-to-Peer Grids
- Abstract:We describe Peer-to-Peer Grids built around the
integration of technologies from the peer-to-peer and Grid fields. We focus on
the role of Web services linked by a powerful event service using uniform XML
interfaces and application level routing. We describe how a rich synchronous
and asynchronous collaboration environment can support virtual communities
built on top of such infrastructure. Universal access mechanisms are
discussed.
- Geoffrey Fox, Dennis Gannon, Sung-Hoon Ko, Sangmi Lee, Shrideep
Pallickara, Marlon Pierce, Xiaohong Qiu, Xi Rao, Ahmet Uyar, Minjun Wang, Wenjun
Wu
- Pervasive Technology Laboratories, Indiana University
- gcf@indiana.edu
- Received July 7 2002; Accepted July 10 2002
- C622: Grid Web Services and Application
Factories
- Abstract:This paper describes an implementation of a Grid
Application Factory Service that is based on a component architecture that
utilizes the emerging Web Services standards. The factory service is used by
Grid clients to authenticate and authorize a user to configure and launch an
instance of a distributed application. This helps us solve the problem of
building reliable, scalable Grid applications, by separating the process of
deployment and hosting from application execution. The paper also describes how
these component-based applications can be made compatible with the Open Grid
Services Architecture (OGSA) and how OGSA concepts enhance the usability of the
component framework.
- Dennis Gannon, Rachana Ananthakrishnan, Sriram Krishnan,
Madhusudhan Govindaraju, Lavanya Ramakrishnan, Aleksander Slominski
- Department of Computer Science, Indiana University
- gannon@cs.indiana.edu
- Received 11 July 2002, Accepted July 11 2002
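The factory pattern the abstract describes, separating deployment and hosting from application execution, can be sketched as follows. The class names and the authorization check are hypothetical stand-ins for the Web-Services-based factory, not the actual OGSA interfaces.

```python
# Illustrative sketch: a factory service authenticates/authorizes a
# client, then configures and "launches" an application instance,
# returning a handle to it. Names are assumptions, not a real API.
class ApplicationInstance:
    def __init__(self, app_name, config):
        self.app_name, self.config = app_name, config
        self.running = True

class FactoryService:
    def __init__(self, authorized_users):
        self.authorized = set(authorized_users)

    def create_instance(self, user, app_name, config):
        if user not in self.authorized:     # authorize before launching
            raise PermissionError(f"{user} may not launch {app_name}")
        # deployment/hosting concerns live here, apart from execution
        return ApplicationInstance(app_name, config)

factory = FactoryService(authorized_users={"alice"})
inst = factory.create_instance("alice", "fluid-sim", {"nodes": 16})
print(inst.running)   # True
```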
- C623: The Grid Portal Development
Kit
- Abstract:Computational science portals are emerging as
useful and necessary interfaces for performing operations on the Grid. The Grid
Portal Development Kit (GPDK) facilitates the development of Grid portals and
provides several key reusable components for accessing various Grid services. A
Grid Portal provides a customizable interface allowing scientists to perform a
variety of Grid operations including remote program submission, file staging,
and querying of information services from a single, secure gateway. The Grid
Portal Development Kit leverages existing Globus/Grid middleware
infrastructure as well as commodity web technology including Java Server Pages
and servlets. The design and architecture of GPDK are presented, along with a
discussion of its portal-building capabilities, which allow application
developers to build customized portals more effectively by reusing the common
core services provided by GPDK.
- Jason Novotny
- Lawrence Berkeley National Laboratory
- Email: JDNovotny@lbl.gov
- Received 29 June 2001; Revised 30 November 2001; Accepted January
6 2002
- C625: Distributed object-based grid
computing environments
- Abstract:We review the basic architectures and services of
the Gateway and Mississippi Computational Web Portals. These portals are
designed to provide seamless access to remote software, hardware, and data
through the user's web browser. This is accomplished in both cases by
implementing an architecture that follows the classic three-tiered design, with
user environment and backend resources separated by a middle control layer that
is implemented as a set of distributed objects. These middle tier objects act
as service proxies that wrap the backend resources. In this paper we review
basic and advanced portal services, describe our use of application metadata,
and examine security requirements for three-tiered architectures. Finally, we
discuss future directions for portal development, considering the impact of Web
services and portlet technologies.
- Tomasz Haupt, Marlon E. Pierce
- Engineering Research Center, Mississippi State University,
Starkville, MS 39762; Community Grids Lab, Indiana University, Bloomington, IN
47404
- marpierc@indiana.edu
- Received 11 July 2002, Accepted July 11 2002
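The three-tiered design described above can be sketched minimally: the user tier never touches a backend resource directly, but goes through a middle-tier proxy object that wraps the resource and enforces policy. All names in this sketch are illustrative assumptions.

```python
# Tier 3: a backend resource, e.g. a batch queue on a remote host.
class BackendResource:
    def submit(self, job):
        return f"job '{job}' queued"

# Tier 2: a distributed object acting as a service proxy that wraps
# the backend and mediates every request from the user tier.
class ServiceProxy:
    def __init__(self, backend, allowed):
        self.backend, self.allowed = backend, allowed

    def submit(self, user, job):
        if user not in self.allowed:   # security enforced in the middle tier
            raise PermissionError(user)
        return self.backend.submit(job)

# Tier 1 (the user's browser/portal) only ever talks to the proxy.
proxy = ServiceProxy(BackendResource(), allowed={"alice"})
print(proxy.submit("alice", "render"))   # job 'render' queued
```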
- C626: Storage Manager and File Transfer
Web Services
- Abstract:Web services are emerging as an interesting
mechanism for a wide range of grid services, particularly those focused upon
information services and control. When coupled with efficient data transfer
services, they provide a powerful mechanism for building a flexible, open,
extensible data grid for science applications. In this paper we present our
prototype work on a Java Storage Resource Manager (JSRM) web service and a Java
Reliable File Transfer (JRFT) web service. A Java client (Grid File Manager) on
top of JSRM and JRFT has been developed to demonstrate the capabilities of these web
services. The purpose of this work is to show the extent to which SOAP-based
web services are an appropriate direction for building a grid-wide data
management system, and eventually grid-based portals.
- William A. Watson, Ying Chen, Jie Chen, Walt Akers
- Thomas Jefferson National Accelerator Facility Newport News,
Virginia 23606, U.S.A.
- Chip.Watson@jlab.org
- Received 11 July 2002, Accepted July 11 2002
- C627: From Legion to Avaki: The
Persistence of Vision
- Abstract:Grids have metamorphosed from academic projects
to commercial ventures. Avaki, a leading commercial vendor of Grids, has its
roots in Legion, a Grid project at the University of Virginia begun in 1993. In
this chapter, we present fundamental challenges and requirements for Grid
architectures that we believe are universal, our architectural philosophy in
addressing those requirements, an overview of Legion as used in production
systems and a synopsis of the Legion architecture and implementation. We also
describe the history of the transformation from Legion - an academic, research
project - to Avaki, a commercially supported, marketed product. Several of the
design principles as well as the vision underlying Legion have continued to be
employed in Avaki. As a product sold to customers, Avaki has been made more
robust, more easily manageable and easier to configure than Legion, at the
expense of eliminating some features and tools that are of less immediate use
to customers. Finally, we place Legion in the context of OGSI, a standards
effort underway in Global Grid Forum.
- Andrew S. Grimshaw, Anand Natrajan, Marty A. Humphrey, Michael J.
Lewis, Anh Nguyen-Tuong, John F. Karpovich, Mark M. Morgan, Adam J. Ferrari
- University of Virginia, Avaki Corporation
- agrimshaw@avaki.com
- Received 11 July 2002, Accepted July 11 2002
- C628: Classifying and enabling grid
applications
- Abstract:Today's applications need the functionality of the
Grid to break free of single resource limitations, and in turn, the Grid needs
applications in order to properly evolve. Why then are there currently so few
applications using grids? We describe some of the problems faced by application
developers in moving to the Grid, and show how grid application frameworks
should overcome these difficulties. Such frameworks will define the
relationship between the Grid and applications, providing consistent abstract
interfaces to grid operations and allowing applications to include these
operations independently from their actual implementation. These user
interfaces should be application focused, capturing the semantics of the
underlying operations, then driving grid development to support these needs.
Building the right interfaces motivates a detailed classification of
applications and operations for the grid, and we provide a simple taxonomy in
this paper, outlining important new directions in which it should be extended.
We describe the rationale and design of a Grid Application Toolkit in the
GridLab project, which will provide a comprehensive language and implementation
neutral framework for encapsulating grid operations that will greatly simplify
and accelerate future grid application development.
- Gabrielle Allen, Tom Goodale, Michael Russell, Edward Seidel and
John Shalf
- Max-Planck-Institut für Gravitationsphysik, Golm, Germany;
Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- eseidel@aei.mpg.de
- Received 12 July 2002, Accepted July 12 2002
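The idea of consistent abstract interfaces to grid operations, included independently of their actual implementation, can be sketched as below. The `FileTransfer` interface and its trivial adaptor are hypothetical illustrations in the spirit of the GridLab toolkit, not its real API.

```python
from abc import ABC, abstractmethod

# Abstract, implementation-neutral grid operation: the application
# codes against this interface; the concrete mechanism (GridFTP,
# HTTP, local copy, ...) is chosen at run time behind it.
class FileTransfer(ABC):
    @abstractmethod
    def copy(self, src, dst): ...

class LocalCopy(FileTransfer):          # trivial adaptor for testing
    def __init__(self):
        self.log = []
    def copy(self, src, dst):
        self.log.append((src, dst))     # record what would be copied
        return "ok"

def stage_input(transfer, files, dst_dir):
    # application-focused call: it never names a transport directly
    return [transfer.copy(f, dst_dir + "/" + f) for f in files]

t = LocalCopy()
print(stage_input(t, ["a.dat", "b.dat"], "/scratch"))   # ['ok', 'ok']
```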
- C629: Building Grid Computing Portals:
The NPACI Grid Portal Toolkit
- Abstract:Portals have become established as effective
interfaces for enabling scientific users to access information and services in
Grid computing environments, as demonstrated by the success of the NPACI
HotPage and numerous other Grid computing portals. Grid portals hide the
complexities of Grid technologies from the user and present simplified,
intuitive interfaces for harnessing the power of the underlying resources. Grid
portal toolkits that interface to Grid technologies have proven useful,
enabling developers to rapidly build portals. The NPACI Grid Portal Toolkit
(GridPort) facilitates development and utilization of Grid technologies such as
the Globus Toolkit and the Storage Resource Broker from within an integrated,
unified API. GridPort supports a set of centralized services that allow
multiple application portals to share services and a single-login environment.
The advent of new technologies such as Grid Web services has created new
opportunities for making Grid portal toolkits that are even simpler for
developers and GridPort is being enhanced to include these new technologies.
New portal technologies such as customization and portlets are being integrated
into toolkits such as GridPort to make it possible to develop portals that
allow users to meet their personal needs more effectively. The impact of these
new technologies will result in increased development, utilization, and more
effective use of Grid portals. In this paper we describe GridPort, the HotPage,
and GridPort-based application portals, as well as the next version of GridPort
that integrates Grid Web services and supports a variety of Grid Computing
Environments.
- Mary Thomas and Jay Boisseau
- Texas Advanced Computing Center, The University of Texas at
Austin
- mthomas@tacc.utexas.edu
- Received August 2 2002, Accepted August 2 2002
- C630: DISCOVER: A Computational
Collaboratory for Interactive Grid Applications
- Abstract:The growth of the Internet and the advent of the
computational Grid have made it possible to develop and deploy advanced
services and computational collaboratories on the Grid. These systems build on
high-end computational resources and communication technologies, and enable
seamless and collaborative access to resources, applications and data. In this
chapter we present an overview of the DISCOVER computational collaboratory for
enabling interactive applications on the Grid. Its primary goal is to bring
large distributed Grid applications to the scientist's or engineer's desktop
and enable collaborative application monitoring, interaction and
control. DISCOVER is composed of three key components: (1) a middleware substrate
that integrates DISCOVER servers and enables interoperability with external
Grid services, (2) an application control network consisting of sensors,
actuators, and interaction agents that enable monitoring, interaction and
steering of distributed applications, and (3) detachable portals for
collaborative access to grid applications and services. The design,
implementation, operation and evaluation of these components is presented.
- V. Mann and M. Parashar
- The Applied Software Systems Laboratory, Department of Electrical
and Computer Engineering, Rutgers, The State University of New Jersey, 94 Brett
Road, Piscataway, NJ 08854.
- parashar@caip.rutgers.edu
- Received 13 July 2002, Accepted July 13 2002
- C631: Combinatorial Chemistry and the
Grid
- Abstract:Chemistry has always made extensive use of
developing computing technology and available computing power through activities
such as modelling, simulation and chemical structure interpretation,
activities conveniently summarised as computational chemistry. Developing
procedures in chemical synthesis and characterisation, particularly in the
arena of parallel and combinatorial methodology, have generated ever increasing
demands on both Computational Chemistry and Computer Technology. However, and
significantly, the way in which networked services are being conceived to assist
collaborative research pushes the use of data acquisition, remote interaction
and control, computation, and visualisation well beyond the traditional
computational chemistry programmes, towards the basic issue of handling
chemical information and knowledge. The rate at which new chemical data can now
be generated in Combinatorial and Parallel synthesis and screening processes
means that the data can only realistically be handled efficiently through increased
automation of the data analysis as well as of the experimentation and collection.
Without this automation we run the risk of generating information without the
ability to understand it.
- J. G. Frey, M. Bradley, J. W. Essex, M.B. Hursthouse, S. M.
Lewis, M. M. Luck, L. A.V.M. Moreau, D.C. De Roure, M. Surridge, A. H. Welsh
- Department of Chemistry, Department of Electronics & Computer
Science, and Department of Mathematics; University of Southampton, Southampton,
SO17 1BJ, UK
- j.g.frey@soton.ac.uk
- Received 15 July 2002, Accepted July 15 2002
- C632: Data Intensive Grids for High
Energy Physics
- Abstract:The major high energy physics experiments of the
next twenty years will break new ground in our understanding of the fundamental
interactions, structures and symmetries that govern the nature of matter and
space-time. Among the principal goals are to find the mechanism responsible for
mass in the universe, and the "Higgs" particles associated with mass
generation, as well as the fundamental mechanism that led to the predominance
of matter over antimatter in the observable cosmos.
The largest experiments
preparing for CERN's Large Hadron Collider (LHC) program each encompass 2000
physicists from 150 institutions in more than 30 countries. These experiments
plan to collect and analyse several Petabytes of data (1 PB = 10^15 Bytes) in
the first year of operation. Indeed, the current generation of operational
experiments at SLAC (BaBar) and Fermilab (D0 and CDF), as well as the
experiments at the Relativistic Heavy Ion Collider (RHIC) program at BNL, are
beginning to face similar challenges. BaBar in particular has already
accumulated datasets approaching a Petabyte.
Collaborations on the global
scale of the LHC experiments would not have been attempted if physicists could
not plan on excellent networks: to interconnect the physics groups throughout
the lifecycle of the experiment, and to make possible the construction of Data
Grids capable of providing access, processing and analysis of massive datasets.
These datasets will increase in size from Petabytes to Exabytes (1 EB = 10^18
Bytes) within the next decade.
In this Chapter, we explore the computing
challenges posed by the latest and next generations of HEP experiments, and
describe the various Grid projects that have been initiated by the HEP
community in response. Some historically significant projects are reviewed,
followed by an outline of each of the major HEP-related Grid projects currently
underway. Selected examples of how Grid technology is being actively used in
the experiments are presented. This is followed by an analysis of the distinct
differences between "Classical" and "HEP" Grids, and the very significant roles
that R&D in wide area networking and global systems modelling and
simulation have to play. To illustrate some current research work, we describe
the MonALISA architecture for a service-oriented Grid system for HEP computing
task monitoring, scheduling and control. We continue by examining the use of
Grid technology for HEP analysis tasks. The Grid Enabled Analysis environment
(GAE), which makes extensive use of Web Services, is introduced and
described.
In the summary, we suggest various ways in which meeting the HEP
computing challenges will benefit future networks and society.
- Julian J. Bunn, Harvey B Newman
- California Institute of Technology, Pasadena, CA 91125, USA
- julian@cacr.caltech.edu
- Received 19 July 2002, Accepted July 20 2002
- C633: Rationale for Developing with the
Open Grid Services Architecture
- Abstract:The UK e-Science Core Programme established an
Architectural Task Force to map out the UK's policy regarding grid
architectures. A decision had already been taken to work closely with the
Globus team and to use Globus Toolkit Version 2. The emergence of the Open Grid
Services Architecture proposal in late 2001 was greeted warmly as a well-chosen
direction that would support UK e-Science requirements and provide
opportunities for the UK e-Science projects to contribute to future grid
infrastructure. Early contributions are expected to include database access and
integration, and grid markets. A longer-term goal is to lift the level of
discourse when integrating software and data components, and to automate more
of the routine work of assembling and operating e-Science applications and
infrastructure. This paper is based on the first report of that task force,
suggesting a road map that was accepted by the UK e-Science community in April
2002.
- Malcolm Atkinson
- The UK e-Science Architecture Task Force
- mpa@dcs.gla.ac.uk
- Received 20 July 2002, Revised August 7 2002, Accepted August 7
2002
- C634: eDiamond: a grid-enabled
federated database of annotated mammograms
- Abstract: This chapter introduces a project named eDiamond
which aims to develop a grid-enabled federated database of annotated
mammograms, built at a number of sites (initially in the UK), and which ensures
database consistency and reliable image processing. A key feature of eDiamond
is that images are "standardised" prior to storage. We describe what this
means, and why it is a fundamental requirement for numerous grid applications,
particularly in medical image analysis, and especially in mammography. The
eDiamond database will be developed with two particular applications in mind:
teaching and supporting diagnosis. There are several other applications for
such a database, which are the subject of related projects. We also discuss the
ways in which information technology (IT) is impacting on the provision of
health care - a subject which in Europe is called Healthcare Informatics. We
outline some of the issues concerning medical images, and describe mammography
as an important special case. We discuss medical image databases, as a prelude
to the description of the eDiamond e-science project. We relate the eDiamond
project to a number of other efforts currently underway, most notably the US
NDMA project.
- Michael Brady, David Gavaghan, Andrew Simpson, Miguel Mulet
Parada, Ralph Highnam
- Medical Vision Laboratory, Department of Engineering Science,
Oxford University; Computing Laboratory, Wolfson Building, Parks Road, Oxford;
Mirada Solutions Limited, Oxford Centre for Innovation, Mill Street, Oxford
- jmb@robots.ox.ac.uk
- Received 31 July 2002, Accepted July 31 2002
- C644: Condor and the Grid
- Abstract:Since 1984, the Condor project has helped
ordinary users to do extraordinary computing. Today, the project continues to
explore the social and technical problems of cooperative computing on scales
ranging from the desktop to the world-wide computational grid. In this chapter,
we describe the history and philosophy of the Condor project and describe how
it has evolved along with the field of distributed computing. We describe many
of the positive interactions between Condor and other computing projects and
discuss how the transfer of technologies and ideas benefits everyone. We
discuss the distinction between planning and scheduling in an uncertain
environment and the need to construct computing communities that reflect
underlying human organizations. Throughout, we reflect on the lessons of
experience, discussing techniques that failed alongside those that
succeeded.
- Douglas Thain, Todd Tannenbaum, and Miron Livny
- Dept of Computer Sciences, Univ of Wisconsin-Madison, 1210 W.
Dayton St. Madison, WI 53705
- tannenba@cs.wisc.edu
- Received August 26 2002, Accepted August 26 2002
- C645: Education and the Enterprise with
the Grid
- Abstract: The Grid enables collaboration between people
and electronic resources around the globe. This technology can be applied to
education and broad information dissemination in a way that will bring the
possibility of quality education and information for all. We discuss the
implications for enterprise (education) systems -- what should you do to
enable and enhance the integration of Grid systems; the virtual
university -- what does this need and what might it look like; and re-usable
information -- the promise of a universal service architecture for greater
re-usability of (curriculum) components.
- Geoffrey Fox
- Community Grids Laboratory, 501 N. Morton St Suite 224,
Bloomington IN 47404-3730
- gcf@indiana.edu
- Received August 20 2002, Accepted September 7 2002
- C646: Overview of Book
- Abstract:This short preamble gives an overview of the
whole book with a special emphasis on the architecture section. It is designed
to help readers plan their use of the book.
- Fran Berman, Geoffrey Fox, Tony Hey
- SDSC San Diego; Community Grids Laboratory Indiana; EPSRC,
Polaris House, North Star Avenue, Swindon SN2 1 ET, UK
- gcf@indiana.edu
- Received August 11 2002, Accepted September 24 2002
- C647: The Grid: Past, Present,
Future
- Abstract:This chapter describes why the Grid is important and surveys its
technologies and applications to Science and the Enterprise. In section 2 of
this chapter, we highlight some historical and motivational building blocks of
the Grid - described in more detail in chapter 3. Section 3 describes the
current community view of the Grid with its basic architecture. Section 4
contains four building blocks of the Grid. In particular, in section 4.1 we review
the evolution of networking infrastructure, including both the desktop and cross-
continental links, which are expected to reach Gigabit and Terabit performance
respectively over the next 5 years. Section 4.2 presents the corresponding
computing backdrop, with 1 to 40 teraflop performance today moving to petascale
systems at the end of the decade. We use the new NSF TeraGrid as the HPCC (High
Performance Computing and Communications) highlight today. Section 4.3
summarizes many of the regional, national and international activities
designing and deploying Grids. Standards, covered in section 4.4, are a
different but equally critical building block of the Grid. Section 5 covers the
critical area of applications on the Grid covering life sciences, engineering
and the physical sciences. We highlight new approaches to Science including the
importance of collaboration and the e-Science concept driven partly by
increased data. A short section on commercial applications includes the
e-Enterprise/Utility concept of computing power on demand. Applications are
summarized in Section 5.7, which discusses the characteristic features of "good
Grid" applications like the one shown in Figure 2. This shows an instrument
(the photon source at Argonne National Laboratory) linked to computing, data
archiving and visualization facilities in a local Grid. Part D and chapter 35
of the book describe this in more detail. Futures are covered in Section 6 with
the intriguing concept of autonomic computing developed originally by IBM
covered in section 6.1 and chapter 13. Section 6.2 is a brief discussion of
Grid programming covered in depth in chapter 20 and part C of the book. There
are concluding remarks in sections 6.3 to 6.5. We stress that this chapter is
designed to give the high-level motivation for the book; it is not a complete
review. In particular, there are only a few general references to many key
subjects, because the later chapters of the book and its associated web site
will provide these. Further, although book chapters are referenced, this is not a
reader's guide to the book; that is in fact given in the preceding preface.
Chapters 20 and 35 are guides to parts C and D of the book. The reader
will find in another box possibly useful comments on parts A and B of this
book. This overview is based on presentations by Berman and Hey, conferences,
and a collection of presentations from Indiana University on networking.
- Fran Berman, Geoffrey Fox, Tony Hey
- SDSC San Diego; Community Grids Laboratory Indiana; EPSRC,
Polaris House, North Star Avenue, Swindon SN2 1 ET, UK
- gcf@indiana.edu
- Received August 27 2002; Accepted September 28 2002
- C648: Overview of Grid Computing
Environments
- Abstract: This short chapter summarizes the current status
of Grid Computational and Programming environments. It puts the corresponding
section of the book in context and integrates into a survey a set of 28 papers
gathered together by the GCE (Grid Computing Environment) group of the Global
Grid Forum. This set was published in 2002 as a special issue of Concurrency and
Computation: Practice and Experience.
- Geoffrey Fox, Dennis Gannon, Mary Thomas
- Community Grids Laboratory Indiana University; Department of
Computer Science, 215 Lindley Hall,150 S. Woodlawn Ave,. Bloomington IN
47405-7104; Texas Advanced Computing Center, The University of Texas at Austin,
10100 Burnet Road, Austin, Texas 78758
- gcf@indiana.edu
- Received August 6 2002; Accepted September 7 2002
- C649: Application Overview for the Book:
Grid Computing: Making the Global Infrastructure a Reality
- Abstract:This book, Grid Computing: Making the Global
Infrastructure a Reality, is divided into four parts. This short chapter
introduces the last part, Part D, on applications of the Grid. It is not
designed to review all applications but rather to guide the reader interested in
general principles of, or particular examples of, applications that are using or
could use the Grid. All chapters have material relevant for Grid applications,
but in this part the focus is the application itself. Some of the previous
chapters in fact cover applications as part of an overview or to
illustrate a technological issue; we will in particular discuss chapters 1, 2,
11, 12, 16, 23, 24, 28, 30 and 33 in parts A, B and C below. This is in
addition to chapters 37-43, devoted to applications, in this part.
- Fran Berman, Geoffrey Fox, Tony Hey
- SDSC San Diego; Community Grids Laboratory Indiana; EPSRC,
Polaris House, North Star Avenue, Swindon SN2 1 ET, UK
- gcf@indiana.edu
- Received August 27 2002; Accepted September 24 2002
- C654: Metacomputing
- Abstract: This article from 1992 describes the vision of
metacomputing, which was one of the early harbingers of the Grid ten years
later. The metacomputer is defined as a network of heterogeneous computational
resources linked by software in such a way that they can be used as easily as a
personal computer.
This article looks at the three stages of metacomputing,
beginning with the local area metacomputer at the National Center for
Supercomputing Applications (NCSA) as an example of the first stage. The
capabilities to be demonstrated in the SIGGRAPH'92 Showcase environment
represent the beginnings of the second stage in metacomputing. This involves
advanced user interfaces that allow for participatory computing as well as
examples of capabilities that would not be possible without the underlying
stage one metacomputer. The third phase, a national metacomputer, is on the
horizon as these new capabilities are expanded from the local metacomputer out
onto Gbit/sec network testbeds.
Several examples are given that have
striking similarities to those exploiting grids in 2002.
- Larry Smarr, Charles E. Catlett
- National Center for Supercomputing Applications NCSA, 605 East
Springfield Ave., Champaign, IL 61820
- catlett@mcs.anl.gov
- Reprint of L. Smarr and C. E. Catlett. Metacomputing.
Communications of the ACM, 35(6):44--52, June 1992
- C656: Grid Resource Allocation and
Control Using Computational Economies
- Abstract: In this paper, we investigate G-commerce ---
computational economies for controlling resource allocation in Computational
Grid settings. We define hypothetical resource consumers (representing users
and Grid-aware applications) and resource producers (representing resource
owners who ``sell'' their resources to the Grid). We then measure the
efficiency of resource allocation under two different market conditions:
commodities markets and auctions. We compare both market strategies in terms of
price stability, market equilibrium, consumer efficiency, and producer
efficiency. Our results indicate that commodities markets are a better choice
for controlling Grid resources than previously defined auction strategies.
- Rich Wolski, Todd Bryan, James Plank, John Brevik
- Computer Science Department, University of California, Santa
Barbara Santa Barbara, CA 93106; Computer Science Department, University of
Tennessee, Knoxville Knoxville, TN 37996; Mathematics Department, Wheaton
College, Wheaton, Mass.
- rich@cs.ucsb.edu
- Received September 3 2002; Accepted September 3 2002
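The commodities-market mechanism the paper studies can be sketched minimally: a single resource price is adjusted iteratively in proportion to excess demand until supply meets demand (a tatonnement process). The linear supply and demand curves below are invented purely for illustration.

```python
# Toy G-commerce-style commodities market for one Grid resource.
def demand(price):   # consumers buy less as the price rises
    return max(0.0, 100.0 - 2.0 * price)

def supply(price):   # producers offer more as the price rises
    return 3.0 * price

price = 1.0
for _ in range(1000):
    excess = demand(price) - supply(price)
    price += 0.01 * excess          # raise price when demand exceeds supply

# Equilibrium where 100 - 2p = 3p, i.e. p = 20.
print(round(price, 2))              # 20.0
```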
- C657: Architecture of a Commercial
Enterprise Desktop Grid: The Entropia System
- Abstract: Distributed Computing, the exploitation of idle
cycles on desktop PC systems, offers the opportunity to increase the available
computing power by orders of magnitude (10x - 1000x). However, for desktop PC
distributed computing to be widely accepted within the enterprise, the systems
must achieve high levels of efficiency, robustness, security, scalability,
unobtrusiveness, and manageability. We describe the Entropia Distributed
Computing System as a case study, detailing its internal architecture and
philosophy in attacking these key problems. Key aspects of the Entropia system
include the use of: 1) scalable web/database technology for system management,
2) application namespaces for machines, 3) binary sandboxing technology for
security and unobtrusiveness, and 4) an open integration model that allows
applications from many sources to be incorporated. The Entropia system has been
deployed in a wide range of commercial environments and used for a range of
applications from areas including bioinformatics, molecular modeling, and Monte
Carlo financial simulations.
- Andrew A. Chien
- Entropia, Inc 10145 Pacific Heights, Suite 800 San Diego, CA
92121 and University of California, San Diego Department of Computer Science
and Engineering 9500 Gilman Drive La Jolla, CA 92093
- achien@cs.ucsd.edu
- Received September 3 2002; Accepted September 3 2002