CPE708:A stochastic load balancing algorithm for Internet computing
- Author(s): Yuk Yin Wong, Kwong Sak Leung, Kin Hong Lee
- Computing Services Centre, City University of Hong Kong, Tat Chee Avenue, Kln., Hong Kong; Department of Computer Science and Engineering, Chinese University of Hong Kong, Shatin, N.T., Hong Kong
- http://dx.doi.org/10.1002/cpe.708
- light wiley document
- wiley 2003 full documents
- Abstract: This paper presents a stochastic
dynamic load balancing algorithm for Internet computing,
which is a new type of distributed computing involving
heterogeneous workstations from different organizations on
the Internet. To realize the practical environment, we
assume the system to be comprised of heterogeneous, untrusted
and non-dedicated workstations connected by a non-dedicated
network. Our algorithm uses the product of the average
processing time and the queue length of system jobs as the
load index. Dynamic communication delay is included in the
execution cost calculation. The transfer policy and the
location policy are combined in a stochastic algorithm.
State information exchange is done via information feedback
and mutual updating. Simulations demonstrate that our
algorithm outperforms conventional approaches over a wide
range of system parameters. These results are reconfirmed by
empirical experiments after we have implemented the
algorithms on the Distributed Java Machine global virtual
machine. Copyright © 2003 John Wiley & Sons, Ltd.
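The load index described in this abstract (average processing time multiplied by queue length) combined with a stochastic location policy can be sketched as follows. This is an illustrative reconstruction, not the authors' algorithm; all names and the weighting scheme are invented.

```java
import java.util.Random;

// Sketch: compute a per-node load index and route a job to a node
// chosen stochastically, with probability inversely proportional to
// that index. Illustrative only; not the paper's actual algorithm.
public class StochasticBalancer {
    // Load index = average processing time * queue length (per the abstract).
    static double loadIndex(double avgProcessingTime, int queueLength) {
        return avgProcessingTime * Math.max(queueLength, 1);
    }

    // Pick a node index with probability proportional to 1 / loadIndex.
    static int pickNode(double[] indices, Random rng) {
        double[] weights = new double[indices.length];
        double total = 0.0;
        for (int i = 0; i < indices.length; i++) {
            weights[i] = 1.0 / indices[i];
            total += weights[i];
        }
        double r = rng.nextDouble() * total;
        for (int i = 0; i < indices.length; i++) {
            r -= weights[i];
            if (r <= 0) return i;
        }
        return indices.length - 1;
    }

    public static void main(String[] args) {
        double[] idx = { loadIndex(2.0, 5), loadIndex(1.0, 1), loadIndex(4.0, 8) };
        int target = pickNode(idx, new Random());
        System.out.println("dispatching to node " + target);
    }
}
```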
CPE706:A comparison of concurrent programming and
cooperative multithreading
- Author(s): Ronald A. Olsson, Tiejun Li, Eugene F. Fodor, Aaron W. Keen
- Department of Computer
Science, University of California, Davis, CA 95616, U.S.A.,
- http://dx.doi.org/10.1002/cpe.706
- Abstract: This paper presents a comparison of the
cooperative multithreading model with the general concurrent
programming model. It focuses on the execution time
performance of a range of standard concurrent programming
applications. The overall results are mixed. In some cases,
programs written in the cooperative multithreading model
outperform those written in the general concurrent
programming model. The contributions of this paper are
twofold. First, it presents a thorough analysis of the
performances of applications in the different models, i.e.
to explain the criteria that determine when a program in
one model will outperform an equivalent program in the
other. Second, it examines the tradeoffs in writing programs
in the different programming styles. In some cases, better
performance comes at the cost of more complicated code.
Copyright © 2003 John Wiley & Sons, Ltd.
CPE704:The Virtual Laboratory: a toolset to enable
distributed molecular modelling for drug design on the World
Wide Grid
- Author(s): Rajkumar Buyya, Kim Branson, Jon Giddy, David A. Abramson
- Grid Computing and Distributed Systems Laboratory, Department of Computer Science and Software Engineering; Structural Biology, Walter and Eliza Hall Institute, Royal Parade, Parkville, Melbourne, Australia; Welsh e-Science Centre, Department of Computer Science, Cardiff University, Cardiff, U.K.; School of Computer Science and Software Engineering, Monash University, Caulfield Campus, Melbourne
- http://dx.doi.org/10.1002/cpe.704
- Abstract: Computational Grids are emerging as a
new paradigm for sharing and aggregation of geographically
distributed resources for solving large scale compute and
data intensive problems in science, engineering and
commerce. However, application development, resource
management and scheduling in these environments is a complex
undertaking. In this paper, we illustrate the development
of a Virtual Laboratory environment by leveraging existing
Grid technologies to enable molecular modelling for drug
design on geographically distributed resources. It involves
screening millions of compounds in the chemical database CDB
against a protein target to identify those with potential
use for drug design. We have used the Nimrod-G parameter
specification language to transform the existing molecular
docking application into a parameter sweep application for
executing on distributed systems. We have developed new
tools for enabling access to ligand records (molecules) in
the CDB from remote resources. The Nimrod-G resource broker
along with the molecule CDB data broker is used for scheduling
and on-demand processing of docking jobs on the World Wide
Grid (WWG) resources. The results demonstrate the ease of
use and power of the Nimrod-G and virtual laboratory tools
for grid computing. Copyright © 2003 John Wiley & Sons, Ltd.
CPE709:Object oriented distributed computing based on
remote class reference
- Author(s): David W. Walker, Yan Huang, Omer F. Rana
- Department of Computer
Science, Cardiff University, P.O. Box 916, Cardiff
CF24 3XF, U.K.,
- http://dx.doi.org/10.1002/cpe.709
- Abstract: Java RMI, Jini and CORBA provide
effective mechanisms for implementing a distributed
computing system. Recently many numerical libraries have been
developed that take advantage of Java as an object-oriented
and portable language. The widely used client-server method
limits the extent to which the benefits of the
object-oriented approach can be exploited because of the
difficulties arising when a remote object is the argument or
return value of a remote or local method. In this paper this
problem is solved by introducing a data object that stores
the data structure of the remote object and related access
methods. By using this data object, the client can easily
instantiate a remote object, and use it as the argument or
return value of either a local or remote method. Copyright
© 2003 John Wiley & Sons, Ltd.
CPE713:Modular specification of frame properties in JML
- Author(s): Gary T. Leavens, Arnd Poetzsch-Heffter, Peter Müller
- Department of Computer
Science, Iowa State University, 229 Atanasoff Hall,
Ames, IA 50011-1040, U.S.A.; Universität
Kaiserslautern, 67653 Kaiserslautern, Germany; Deutsche Bank AG,
Frankfurt, Germany
- http://dx.doi.org/10.1002/cpe.713
- Abstract: We present a modular specification
technique for frame properties. The technique uses modifies
clauses and abstract fields with declared dependencies.
Modularity is guaranteed by a programming model that
enforces data abstraction by preventing representation and
argument exposure, a semantics of modifies clauses that uses
a notion of 'relevant location', and by modularity
rules for dependencies. For concreteness, we adapt this
technique to the Java Modeling Language, JML. Copyright ©
2003 John Wiley & Sons, Ltd.
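A minimal sketch of the specification style this abstract describes: an abstract model field with a declared dependency on a concrete field, and a modifies (assignable) clause framing what a method may change. JML specifications live in Java comments, so the class compiles as plain Java; the class and annotations here are invented for illustration and simplify real JML syntax.

```java
// Illustrative JML-style sketch (annotations are comments, as in JML).
// The abstract field 'amount' depends on the concrete field 'cents';
// deposit()'s assignable (modifies) clause licenses changes to it.
public class Purse {
    private int cents;            //@ in amount;              (declared dependency)

    //@ model int amount;                                     (abstract field)
    //@ represents amount = cents;

    //@ assignable amount;                                    (frame property)
    public void deposit(int c) { cents += c; }

    public int getCents() { return cents; }

    public static void main(String[] args) {
        Purse p = new Purse();
        p.deposit(150);
        System.out.println(p.getCents());
    }
}
```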
CPE714:Improving the official specification of Java
bytecode verification
- Author(s): Alessandro Coglio
- Kestrel Institute,
3260 Hillview Avenue, Palo Alto, CA 94304, U.S.A.
- http://dx.doi.org/10.1002/cpe.714
- Abstract: Bytecode verification is the main
mechanism to ensure type safety in the Java Virtual Machine.
Inadequacies in its official specification may lead to
incorrect implementations where security can be broken
and/or certain legal programs are rejected. This paper
provides a comprehensive analysis of the specification,
along with concrete suggestions for improvement. Copyright ©
2003 John Wiley & Sons, Ltd.
CPE711:Clustering revealed in high-resolution simulations
and visualization of multi-resolution features in
fluid-particle models
- Author(s): David A. Yuen, Krzysztof Boryczko, Witold Dzwinel
- Minnesota
Supercomputing Institute, University of Minnesota,
Minneapolis, MN 55455-1227, U.S.A.; AGH Institute of
Computer Science, al. Mickiewicza 30, 30-059
Kraków, Poland,
- http://dx.doi.org/10.1002/cpe.711
- Abstract: Simulating natural phenomena at greater
accuracy results in an explosive growth of data. Large-scale
simulations with particles currently involve ensembles
consisting of between 10^6 and
10^9 particles, which cover
10^5-10^6 time
steps. Thus, the data files produced in a single run can
reach from tens of gigabytes to hundreds of terabytes. This
data bank allows one to reconstruct the spatio-temporal
evolution of both the particle system as a whole and each
particle separately. Realistically, for one to look at a
large data set at full resolution at all times is not
possible and, in fact, not necessary. We have developed an
agglomerative clustering technique, based on the concept of
a mutual nearest neighbor (MNN). This procedure can be
easily adapted for efficient visualization of extremely
large data sets from simulations with particles at various
resolution levels. We present the parallel algorithm for MNN
clustering and its timings on the IBM SP and SGI/Origin 3800
multiprocessor systems for up to 16 million fluid particles.
The high efficiency obtained is mainly due to the similarity
in the algorithmic structure of MNN clustering and particle
methods. We show various examples drawn from MNN
applications in visualization and analysis of the order of a
few hundred gigabytes of data from discrete particle
simulations, using dissipative particle dynamics and fluid
particle models. Because data clustering is the first step
in this concept extraction procedure, we may employ this
clustering procedure to many other fields such as data
mining, earthquake events and stellar populations in nebula
clusters. Copyright © 2003 John Wiley & Sons, Ltd.
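The mutual nearest neighbor (MNN) criterion at the heart of this clustering procedure can be illustrated on 1-D points: each point finds its nearest neighbor, and two points are paired only when each is the other's nearest. A toy single-step sketch, not the authors' parallel algorithm:

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch of one MNN pairing step on 1-D points. Illustrative only.
public class MnnStep {
    // Index of the nearest neighbour of point i.
    static int nearest(double[] p, int i) {
        int best = -1;
        double bestD = Double.MAX_VALUE;
        for (int j = 0; j < p.length; j++) {
            if (j == i) continue;
            double d = Math.abs(p[i] - p[j]);
            if (d < bestD) { bestD = d; best = j; }
        }
        return best;
    }

    // Pairs (i, j), i < j, that are mutual nearest neighbours; in full
    // agglomerative MNN clustering each pair would be merged and the
    // step repeated.
    static List<int[]> mutualPairs(double[] p) {
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < p.length; i++) {
            int j = nearest(p, i);
            if (j > i && nearest(p, j) == i) pairs.add(new int[] { i, j });
        }
        return pairs;
    }

    public static void main(String[] args) {
        double[] pts = { 0.0, 0.1, 1.0, 1.05, 5.0 };
        for (int[] pr : mutualPairs(pts))
            System.out.println(pr[0] + " <-> " + pr[1]);
    }
}
```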
CPE705:MPI-CHECK: a tool for checking Fortran 90 MPI programs
- Author(s): Glenn Luecke, Hua Chen, James Coyle, Jim Hoekstra, Yan Zou, Marina Kraeva
- High Performance
Computing Group, Iowa State University, Ames, IA
50011, U.S.A.,
- http://dx.doi.org/10.1002/cpe.705
- Abstract: MPI is commonly used to write parallel
programs for distributed memory parallel computers.
MPI-CHECK is a tool developed to aid in the debugging of MPI
programs that are written in free or fixed format Fortran 90
and Fortran 77. MPI-CHECK provides automatic compile-time
and run-time checking of MPI programs. MPI-CHECK
automatically detects the following problems in the use of
MPI routines: (i) mismatch in argument type, kind, rank or
number; (ii) messages which exceed the bounds of the
source/destination array; (iii) negative message lengths;
(iv) illegal MPI calls before MPI_INIT or after
MPI_FINALIZE; (v) inconsistencies between the declared type
of a message and its associated DATATYPE argument; and (vi)
actual arguments which violate the INTENT attribute.
Copyright © 2003 John Wiley & Sons, Ltd.
CPE653:Kava: a Java dialect with a uniform object model
for lightweight classes
- Author(s): David F. Bacon
- IBM T.J. Watson
Research Center, P.O. Box 704, Yorktown Heights, NY
10598, U.S.A.
- http://dx.doi.org/10.1002/cpe.653
- Abstract: Object-oriented programming languages
have always distinguished between 'primitive' and
'user-defined' data types, and in the case of
languages like C++ and Java the primitives are not even
treated as objects, further fragmenting the programming
model. The distinction is especially problematic when a
particular programming community requires primitive-level
support for a new data type, as for complex, intervals,
fixed-point numbers, and so on. We present Kava, a design
for a backward-compatible version of Java that solves the
problem of programmable lightweight objects in a much more
aggressive and uniform manner than previous proposals. In
Kava, there are no primitive types; instead, object-oriented
programming is provided down to the level of single bits,
and types such as int can be explicitly programmed within
the language. While the language maintains a uniform object
reference semantics, efficiency is obtained by making heavy
use of unboxing and semantic
expansion. We describe Kava as a dialect of the
Java language, show how it can be used to define various
primitive types, describe how it can be translated into
Java, and compare it to other approaches to lightweight
objects. Copyright © 2003 John Wiley & Sons, Ltd.
CPE654:Framework for testing multi-threaded Java programs
- Author(s): Shmuel Ur, Orit Edelstein, Eitan Farchi, Evgeny Goldin, Gil Ratsaby
- IBM Research Laboratory in Haifa, Israel,
- http://dx.doi.org/10.1002/cpe.654
- Abstract: Finding bugs due to race conditions in
multi-threaded programs is difficult, mainly because there
are many possible interleavings, any of which may contain a
fault. In this work we present a methodology for testing
multi-threaded programs which has minimal impact on the user
and is likely to find interleaving bugs. Our method reruns
existing tests in order to detect synchronization faults. We
find that a single test executed a number of times in a
controlled environment may be as effective in finding
synchronization faults as many different tests. A great deal
of resources is saved since tests are very expensive to
write and maintain. We observe that simply rerunning tests,
without ensuring in some way that the interleaving will
change, yields almost no benefits. We implement the
methodology in our test generation tool - ConTest. ConTest
combines the replay algorithm, which is essential for
debugging, with our interleaving test generation heuristics.
ConTest also contains an instrumentation engine, a coverage
analyzer, and a race detector (not finished yet) that
enhance bug detection capabilities. The greatest advantage
of ConTest, besides finding bugs of course, is its minimal
effect on the user. When ConTest is combined into the test
harness, the user may not even be aware that ConTest is
being used. Copyright © 2003 John Wiley & Sons, Ltd.
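The abstract's core observation, that merely rerunning a test yields almost nothing unless the interleaving is perturbed, can be sketched with a deliberately racy counter and random yields injected between the read and the write. Illustrative only; this is not ConTest's instrumentation or heuristics.

```java
import java.util.concurrent.ThreadLocalRandom;

// Sketch: rerun the same two-thread test several times, perturbing the
// schedule with random yields so a lost-update race becomes observable.
public class InterleavingDemo {
    static int counter;

    // One execution of the test: two threads each perform 1000
    // unsynchronized read-modify-write increments.
    static int runOnce() {
        counter = 0;
        Runnable inc = () -> {
            for (int i = 0; i < 1000; i++) {
                int tmp = counter;                        // racy read
                if (ThreadLocalRandom.current().nextBoolean())
                    Thread.yield();                       // perturb the interleaving
                counter = tmp + 1;                        // racy write-back
            }
        };
        Thread a = new Thread(inc), b = new Thread(inc);
        a.start(); b.start();
        try { a.join(); b.join(); } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return counter;
    }

    public static void main(String[] args) {
        boolean lostUpdate = false;
        for (int run = 0; run < 20; run++)          // rerun the same test
            if (runOnce() != 2000) lostUpdate = true;
        System.out.println("lost update observed: " + lostUpdate);
    }
}
```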
CPE655:Integration and application of TAU in parallel
Java environments
- Author(s): Sameer Shende, Allen D. Malony
- Department of Computer
and Information Science, 1202 University of Oregon,
Eugene, OR 97403, U.S.A.
- http://dx.doi.org/10.1002/cpe.655
- Abstract: Parallel Java environments present
challenging problems for performance tools because of
Java's rich language system and its multi-level
execution platform combined with the integration of
native-code application libraries and parallel run-time
software. In addition to the desire to provide robust
performance measurement and analysis capabilities for the
Java language itself, the coupling of different software
execution contexts under a uniform performance model needs
careful consideration of how events of interest are observed
and how cross-context parallel execution information is
linked. This paper relates our experience in extending the
TAU (Tuning and Analysis Utilities) performance system to a
parallel Java environment based on mpiJava. We describe the
complexities of the instrumentation model used, how
performance measurements are made, and the overhead
incurred. A parallel Java application simulating the game of
Life is used to show the performance system's
capabilities. Copyright © 2003 John Wiley & Sons, Ltd.
CPE666:Platform independent dynamic Java virtual machine
analysis: the Java Grande Forum benchmark suite
- Author(s): John Waldron, James Power, David Gregg
- Department of Computer
Science, Trinity College, Dublin 2, Ireland; Department of Computer
Science, National University of Ireland, Maynooth,
Co. Kildare, Ireland,
- http://dx.doi.org/10.1002/cpe.666
- Abstract: In this paper we present a
platform-independent analysis of the dynamic profiles of Java
programs when executing on the Java Virtual Machine. The
Java programs selected are taken from the Java Grande Forum
benchmark suite and five different Java-to-bytecode
compilers are analysed. The results presented describe the
dynamic instruction usage frequencies, as well as the sizes
of the local variable, parameter and operand stacks during
execution on the JVM. These results, presenting a
picture of the actual (rather than presumed) behaviour of
the JVM, have implications for the coverage aspects of
the Java Grande benchmark suites, for the performance of the
Java-to-bytecode compilers and for the design of the JVM.
Copyright © 2003 John Wiley & Sons, Ltd.
CPE656:High-performance Java codes for computational
fluid dynamics
- Author(s): Christopher J. Riley, Siddhartha Chatterjee, Rupak Biswas
- Blue Ridge Numerics,
Inc., Charlottesville, VA 22901, U.S.A.; IBM T. J. Watson Research
Center, Yorktown Heights, NY 10598, U.S.A.; NASA Advanced Supercomputing
Division, MS T27A-1, NASA Ames Research Center,
Moffett Field, CA 94035
- http://dx.doi.org/10.1002/cpe.656
- Abstract: The computational science community is
reluctant to write large-scale computationally-intensive
applications in Java due to concerns over Java's poor
performance, despite the claimed software engineering
advantages of its object-oriented features. Naive Java
implementations of numerical algorithms can perform poorly
compared to corresponding Fortran or C implementations. To
achieve high performance, Java applications must be designed
with good performance as a primary goal. This paper presents
the object-oriented design and implementation of two
real-world applications from the field of computational
fluid dynamics (CFD): a finite-volume fluid flow solver
(LAURA, from NASA Langley Research Center) and an
unstructured mesh adaptation algorithm (2D_TAG, from NASA
Ames Research Center). This work builds on our previous
experience with the design of high-performance numerical
libraries in Java. We examine the performance of the
applications using the currently available Java
infrastructure and show that the Java version of the flow
solver LAURA performs almost within a factor of 2 of the
original procedural version. Our Java version of the mesh
adaptation algorithm 2D_TAG performs within a factor of 1.5
of its original procedural version on certain platforms. Our
results demonstrate that object-oriented software design
principles are not necessarily inimical to high performance.
Copyright © 2003 John Wiley & Sons, Ltd.
CPE667:Java Virtual Machine support for object serialization
- Author(s): Fabian Breg, Constantine D. Polychronopoulos
- University of
Illinois, Coordinated Science Laboratory, room 444,
1308 W. Main Street, Urbana, IL 61,
- http://dx.doi.org/10.1002/cpe.667
- Abstract: Distributed computing has become
increasingly popular in the high-performance community.
Java's remote method invocation (RMI) provides a
simple, yet powerful method for implementing parallel
algorithms. The performance of RMI has been less than
adequate, however, and object serialization is often
identified as a major performance inhibitor. We believe that
object serialization is best performed in the Java Virtual
Machine (JVM), where information regarding object layout and
hardware communication resources is readily available. We
implement a subset of Java's object serialization
protocol in native code, using the Java Native Interface
(JNI) and JVM internals. Experiments show that our approach
is up to eight times faster than Java's original object
serialization protocol for array objects. Also, for linked
data structures our approach obtains a moderate speedup and
better scalability. Evaluation of our object serialization
implementation in an RMI framework indicates that a higher
throughput can be obtained. Parallel applications, written
using RMI, obtain better speedups and scalability when this
more efficient object serialization is used. Copyright ©
2003 John Wiley & Sons, Ltd.
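For reference, the pure-Java serialization path that the paper's in-JVM native approach is measured against reduces to a round-trip through ObjectOutputStream and ObjectInputStream:

```java
import java.io.*;

// Baseline sketch: standard JDK object serialization of an array,
// written to a byte stream and read back as an identical copy.
public class SerializationRoundTrip {
    static byte[] serialize(Object o) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);               // pure-Java serialization path
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        int[] data = { 1, 2, 3, 4 };
        int[] copy = (int[]) deserialize(serialize(data));
        System.out.println(java.util.Arrays.toString(copy)); // [1, 2, 3, 4]
    }
}
```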
CPE657:Automatic translation of Fortran to JVM bytecode
- Author(s): Keith Seymour, Jack Dongarra
- Department of Computer
Science, University of Tennessee,
Knoxville, TN 37996, U.S.A.,
- http://dx.doi.org/10.1002/cpe.657
- Abstract: This paper reports on the design of a
Fortran-to-Java translator whose target language is the
instruction set of the Java Virtual Machine. The goal of the
translator is to generate Java implementations of legacy
Fortran numerical codes in a consistent and reliable
fashion. The benefits of directly generating bytecode are
twofold. First, compared with generating Java source code,
it provides a much more straightforward and efficient
mechanism for translating Fortran GOTO statements. Second,
it provides a framework for pursuing various compiler
optimizations, which could be beneficial not only to our
project, but to the Java community as a whole. Copyright ©
2003 John Wiley & Sons, Ltd.
CPE658:Benchmarking Java against C and Fortran for
scientific applications
- Author(s): J. M. Bull, L. A. Smith, L. Pottage, R. Freeman
- Edinburgh Parallel
Computer Centre, James Clerk Maxwell Building, The
King's Buildings, The University of Edinburgh,
Mayfield Road, Edinburgh EH9 3JZ, U.K.,
- http://dx.doi.org/10.1002/cpe.658
- Abstract: Increasing interest is being shown in
the use of Java for scientific applications. The Java Grande
benchmark suite was designed with such applications
primarily in mind. The perceived lack of performance of Java
still deters many potential users, despite recent advances
in just-in-time and adaptive compilers. There are, however,
few benchmark results available comparing Java to more
traditional languages such as C and Fortran. To address this
issue, a subset of the Java Grande benchmarks has been
re-written in C and Fortran allowing direct performance
comparisons between the three languages. The performance of
a range of Java execution environments and C and Fortran
compilers has been tested across a number of platforms
using the suite. These results demonstrate that on some platforms
(notably Intel Pentium) the performance gap is now quite
small. Copyright © 2003 John Wiley & Sons, Ltd.
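A Java Grande-style measurement reduces to timing a numeric kernel after a warm-up run that lets the JIT compile the loop. A minimal sketch; the kernel and names here are invented and are not part of the actual suite:

```java
// Sketch of a kernel-timing microbenchmark in the Java Grande spirit.
public class KernelTimer {
    // Simple numeric kernel: partial sum of 1/i^2 (converges to pi^2/6).
    static double sumKernel(int n) {
        double s = 0.0;
        for (int i = 1; i <= n; i++) s += 1.0 / ((double) i * i);
        return s;
    }

    public static void main(String[] args) {
        sumKernel(1_000_000);                 // warm-up so the JIT compiles the loop
        long t0 = System.nanoTime();
        double s = sumKernel(10_000_000);
        long t1 = System.nanoTime();
        System.out.printf("sum = %.6f in %.2f ms%n", s, (t1 - t0) / 1e6);
    }
}
```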
CPE660:Run-time optimizations for a Java DSM implementation
- Author(s): R. Veldema, R. F. H. Hofman, R. A. F. Bhoedjang, H. E. Bal
- Department of Computer
Science, Vrije Universiteit, Amsterdam, The Netherlands; Department of Computer
Science, Cornell University, Ithaca, NY, U.S.A.,
- http://dx.doi.org/10.1002/cpe.660
- Abstract: Jackal is a fine-grained distributed
shared memory implementation of the Java programming
language. Jackal implements Java's memory model and
allows multithreaded Java programs to run unmodified on
distributed-memory systems.
This paper focuses on
Jackal's run-time system, which implements a
multiple-writer, home-based consistency protocol. Protocol
actions are triggered by software access checks that
Jackal's compiler inserts before object and array
references. To reduce access-check overhead, the compiler
exploits source-level information and performs extensive
static analysis to optimize, lift, and remove access checks.
We describe optimizations for Jackal's run-time system,
which mainly consists of discovering opportunities to
dispense with flushing of cached data. We give performance
results for different run-time optimizations, and compare
their impact with the impact of one compiler optimization.
We find that our run-time optimizations are necessary for
good Jackal performance, but only in conjunction with the
Jackal compiler optimizations described by Veldema et al. As
a yardstick, we compare the performance of Java applications
run on Jackal with the performance of equivalent
applications that use a fast implementation of Java's
Remote Method Invocation instead of shared memory. Copyright
© 2003 John Wiley & Sons, Ltd.
CPE661:Supporting multidimensional arrays in Java
- Author(s): José E. Moreira, Samuel P. Midkiff, Manish Gupta
- IBM T. J. Watson
Research Center, Yorktown Heights, NY 10598-0218, U.S.A.,
- http://dx.doi.org/10.1002/cpe.661
- Abstract: The lack of direct support for
multidimensional arrays in Java™ has
been recognized as a major deficiency in the language's
applicability to numerical computing. It has been shown
that, when augmented with multidimensional arrays, Java can
achieve very high performance for numerical computing
through the use of compiler techniques and efficient
implementations of aggregate array operations. Three
approaches have been discussed in the literature for
extending Java with support for multidimensional arrays:
class libraries that implement these structures; extending
the Java language with new syntactic constructs for
multidimensional arrays that are directly translated to
bytecode; and relying on the Java Virtual Machine to
recognize those arrays of arrays that are being used to
simulate multidimensional arrays. This paper presents a
balanced presentation of the pros and cons of each technique
in the areas of functionality, language and virtual machine
impact, implementation effort, and effect on performance. We
show that the best choice depends on the relative importance
attached to the different metrics, and thereby provide a
common ground for a rational discussion and comparison of
the techniques. Copyright © 2003 John Wiley & Sons, Ltd.
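The first approach the abstract lists (a class library) is commonly realized as a flat 1-D array with row-major index arithmetic, which avoids Java's arrays-of-arrays indirection. A minimal sketch under that assumption; this is not the API discussed in the paper:

```java
// Sketch of the class-library approach: a dense 2-D array of doubles
// stored in one flat 1-D array with row-major indexing.
public class DoubleArray2D {
    private final double[] data;
    private final int rows, cols;

    public DoubleArray2D(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.data = new double[rows * cols];   // one contiguous block
    }

    public double get(int i, int j) { return data[i * cols + j]; }
    public void set(int i, int j, double v) { data[i * cols + j] = v; }

    public static void main(String[] args) {
        DoubleArray2D a = new DoubleArray2D(3, 4);
        a.set(1, 2, 7.5);
        System.out.println(a.get(1, 2)); // 7.5
    }
}
```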
CPE662:CartaBlanca - a pure-Java, component-based systems
simulation tool for coupled nonlinear physics on
unstructured grids - an update
- Author(s): W. B. VanderHeyden, E. D. Dendy, N. T. Padial-Collins
- Los Alamos National
Laboratory, Theoretical Division and Los Alamos
Computer Science Institute, Los Alamos, NM 87545, U.S.A.,
- http://dx.doi.org/10.1002/cpe.662
- Abstract: This paper describes a component-based
nonlinear physical system simulation prototyping package
written entirely in Java using object-oriented design. The
package provides scientists and engineers with a
'developer-friendly' software environment for
large-scale computational algorithm and physical model
development. The software design centers on the
Jacobian-free Newton-Krylov solution method surrounding a
finite-volume treatment of conservation equations. This
enables a clean component-like implementation. We first
provide motivation for the development of the software and
then discuss software structure. The discussion includes a
description of the use of Java's built-in thread
facility that enables parallel, shared-memory computations
on a wide variety of unstructured grids with triangular,
quadrilateral, tetrahedral and hexahedral elements. We also
discuss the use of Java's inheritance mechanism in the
construction of a hierarchy of physics systems objects and
linear and nonlinear solver objects that simplify
development and foster software re-use. We provide a brief
review of the Jacobian-free Newton-Krylov nonlinear system
solution method and discuss how it fits into our design.
Following this, we show results from example calculations
and then discuss plans including the extension of the
software to distributed-memory computer systems. Copyright ©
2003 John Wiley & Sons, Ltd.
CPE663:Providing soft real-time quality of service
guarantees for Java threads
- Author(s): James C. Pang, Gholamali C. Shoja, Eric G. Manning
- Department of Computer
Science, University of Victoria, Victoria, BC,
Canada V8W 3P6; Department of
Electrical and Computer Engineering, University of
Victoria, Victoria, BC, Canada V8W 3P6
- http://dx.doi.org/10.1002/cpe.663
- Abstract: The Java platform has many
characteristics that make it very desirable for integrated
continuous media processing. Unfortunately, it lacks the
necessary CPU resource management facilities to support
quality of service (QoS) guarantees for soft real-time
multimedia tasks. In this paper, we present our new Java
Virtual Machine, Q-JVM, which brings CPU resource management
to the Java platform. Q-JVM is based on Sun's JVM
version 1.1.5. It implements an enhanced version of the
MTR-LS algorithm in its thread scheduler. Combined with
admission control that could be implemented in an
application-level resource manager, it is able to support
QoS parameters such as fairness, bandwidth partitioning and
delay bound guarantees, as well as the cumulative service
guarantee. Our test results show that Q-JVM is backward
compatible with the standard JVM from Sun, has low
scheduling overhead, and is able to provide QoS guarantees
as specified. Copyright © 2003 John Wiley & Sons, Ltd.
CPE664:CCJ: object-based message passing and collective
communication in Java
- Author(s): Jason Maassen, Arnold Nelisse, Thilo Kielmann, Henri E. Bal
- Division of Mathematics
and Computer Science, Vrije Universiteit, Amsterdam,
The Netherlands,
- http://dx.doi.org/10.1002/cpe.664
- Abstract: CCJ is a communication library that
adds MPI-like message passing and collective operations to
Java. Rather than trying to adhere to the precise MPI
syntax, CCJ aims at a clean integration of communication
into Java's object-oriented framework. For example, CCJ
uses thread groups to support Java's multithreading
model and it allows any data structure (not just arrays) to
be communicated. CCJ is implemented entirely in Java, on top
of RMI, so it can be used with any Java virtual machine. The
paper discusses three parallel Java applications that use
collective communication. It compares the performance (on
top of a Myrinet cluster) of CCJ, RMI and mpiJava versions
of these applications and also compares their code
complexity. A detailed performance comparison between CCJ
and mpiJava is given using the Java Grande Forum MPJ
benchmark suite. The results show that neither CCJ's
object-oriented design nor its implementation on top of RMI
imposes a performance penalty on applications compared to
their mpiJava counterparts. The source of CCJ is available
from our Web site http://www.cs.vu.nl/manta. Copyright ©
2003 John Wiley & Sons, Ltd.
CPE665:Supporting dynamic parallel object arrays
- Author(s): Orion S. Lawlor, Laxmikant V. Kalé
- Computer Science
Department, University of Illinois at
Urbana-Champaign, 1304 W. Springfield, 3315 DCL,
Urbana, IL 61801-2987, U.S.A.,
- http://dx.doi.org/10.1002/cpe.665
- Abstract: We present efficient support for
generalized arrays of parallel data driven objects. Array
elements are regular C++ objects, and are scattered across
the parallel machine. An individual element is addressed by
its 'index', which can be an arbitrary object
rather than a simple integer. For example, an array index
can be a series of numbers, supporting multidimensional
sparse arrays; a bit vector, supporting collections of
quadtree nodes; or a string. Methods can be invoked on any
individual array element from any processor, and the
elements can participate in reductions and broadcasts.
Individual elements can be created or deleted dynamically at
any time. Most importantly, the elements can migrate from
processor to processor at any time. We discuss support for
message delivery and collective operations in the face of
such dynamic behavior. The migration capabilities of array
elements have proven extremely useful, for example, in
implementing flexible load balancing strategies and for
exploiting workstation clusters adaptively. We present the
design, an implementation, and performance results.
Copyright © 2003 John Wiley & Sons, Ltd.
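The "arbitrary index object" idea can be approximated in Java with a map keyed by any object that has value equality, so a bit-vector path, a tuple, or a string can all address elements. A toy sketch; the paper's system is C++-based, and these names are invented:

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch of a generalized array: elements addressed by arbitrary
// index objects (a bit-path string, a tuple, ...) via equals/hashCode.
public class GeneralArray<E> {
    private final Map<Object, E> elements = new HashMap<>();

    public void insert(Object index, E element) { elements.put(index, element); }
    public E lookup(Object index) { return elements.get(index); }

    public static void main(String[] args) {
        GeneralArray<String> arr = new GeneralArray<>();
        arr.insert("0110", "quadtree node");              // bit-vector path as index
        arr.insert(java.util.List.of(3, 1, 4), "sparse entry (3,1,4)");
        System.out.println(arr.lookup("0110"));
    }
}
```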
CPE712:Sapphire: copying garbage collection without
stopping the world
- Author(s): Eliot B. Moss, Richard L. Hudson
- Department of Computer
Science, University of Massachusetts, Amherst, MA
01003-4610, U.S.A.; Intel Corporation,
2200 Mission College Boulevard, Santa Clara, CA
95052-8119, U.S.A.
- http://dx.doi.org/10.1002/cpe.712
- Abstract: The growing use in concurrent systems
of languages that require garbage collection (GC), such as
Java, is raising practical interest in concurrent GC.
Sapphire is a new algorithm for concurrent copying GC for
Java. It stresses minimizing the amount of time any given
application thread may need to block to support the
collector. In particular, Sapphire is intended to work well
in the presence of a large number of application threads, on
small- to medium-scale shared memory
multiprocessors. Sapphire extends previous
concurrent copying algorithms, and is most closely related
to replicating copying collection, a GC technique in which
application threads observe and update primarily the old
copies of objects. The key innovations of Sapphire are: (1)
the ability to 'flip' one thread at a time
(changing the thread's view from the old copies of
objects to the new copies), as opposed to needing to stop
all threads and flip them at the same time; (2) exploiting
Java semantics and assuming any data races occur on volatile
fields, to avoid a barrier on reads of non-volatile fields;
and (3) working in concert with virtually any underlying
(non-concurrent) copying collection algorithm. Copyright ©
2003 John Wiley & Sons, Ltd.
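The replicating-copy idea, with Sapphire's per-thread flip, might be modeled as in the sketch below. This Python model is illustrative only: names are invented, the real collector is concurrent, and the real write barrier is compiled into the mutator code. The point shown is that a write barrier keeps old and new copies coherent, so each thread can be flipped to the new copies independently.

```python
class Obj:
    def __init__(self, value):
        self.old = {"value": value}   # copy the application starts with
        self.new = None               # set once the collector replicates it

class Heap:
    def __init__(self):
        self.objs = []
        self.flipped = set()          # threads already reading new copies

    def copy_phase(self):
        # the collector replicates every live object
        # (concurrently with the application, in the paper)
        for o in self.objs:
            o.new = dict(o.old)

    def write(self, thread, o, field, v):
        # replicating write barrier: keep both copies coherent
        o.old[field] = v
        if o.new is not None:
            o.new[field] = v

    def read(self, thread, o, field):
        # unflipped threads still see old copies; flipped ones see new
        copy = o.new if thread in self.flipped else o.old
        return copy[field]

    def flip(self, thread):
        # Sapphire's key idea: flip one thread at a time,
        # instead of stopping the world to flip all threads at once
        self.flipped.add(thread)
```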
CPE735:Editorial: ACM 2001 Java Grande-ISCOPE (JGI2001) Conference
- Author(s):Geoffrey C. Fox
http://dx.doi.org/10.1002/cpe.735
light wiley document
wiley 2003 full documents
Abstract:No abstract
CPE659:Spar: a set of extensions to Java for scientific computation
CPE715:Multi-wavelength image space: another Grid-enabled science
- Author(s):Roy Williams,Bruce Berriman,Ewa Deelman,John Good,Joseph Jacob,Carl Kesselman,Carol Lonsdale,Seb Oliver,Thomas A. Prince
- Center for Advanced
Computing Research, California Institute of
Technology 158-79, Pasadena, CA 91125, U.S.A.,Infrared Processing
and Analysis Center, California Institute of
Technology 100-22, Pasadena, CA 91125, U.S.A.,Information Sciences
Institute, University of Southern California, 4676
Admiralty Way, Marina del Rey, CA 90292, U.S.A.,Jet Propulsion
Laboratory, California Institute of Technology,
Pasadena, CA 91109, U.S.A.,Astronomy Centre,
University of Sussex, Falmer, Brighton BN1 9R, U.K.,
- http://dx.doi.org/10.1002/cpe.715
- light wiley document
- wiley 2003 full documents
- Abstract:We describe how the Grid enables new
research possibilities in astronomy through multi-wavelength
images. To see sky images in the same pixel space, they must
be projected to that space, a computer-intensive process.
There is thus a virtual data space induced that is defined
by an image and the applied projection. This virtual data
can be created and replicated with Planners and Replica
catalog technology developed under the GriPhyN project. We
plan to deploy our system (MONTAGE) on the U.S. Teragrid.
Grid computing is also needed for ingesting data - computing
background correction on each image - which forms a separate
virtual data space. Multi-wavelength images can be used for
pushing source detection and statistics by an order of
magnitude from current techniques; for optimization of
multi-wavelength image registration for detection and
characterization of extended sources; and for detection of
new classes of essentially multi-wavelength astronomical
phenomena. The paper discusses both the Grid architecture
and the scientific goals. Copyright © 2003 John Wiley &
Sons, Ltd.
CPE716:CORBA request portable interceptors: analysis and applications
- Author(s):C. Marchetti,R. Baldoni,L. Verde
- Dipartimento di
Informatica e Sistemistica, Università di Roma La
Sapienza, Via Salaria 113, 00198, Roma, Italy,
- http://dx.doi.org/10.1002/cpe.716
- light wiley document
- wiley 2003 full documents
- Abstract:Interceptors are an emerging middleware
technology enabling the addition of specific
network-oriented capabilities to distributed applications.
By exploiting interceptors, developers can register code
within interception points, extending the basic middleware
mechanisms with specific functionality, e.g. authentication,
flow control, caching, etc. Notably, these extensions can be
achieved without modifying either the
application or the middleware code.
In this paper we
report the results of our experiences with CORBA request
portable interceptors. In particular, we point out (i) the
basic mechanisms implementable by these interceptors, i.e.
request redirection and piggybacking and (ii) we analyze
their limitations. We then propose a proxy-based technique
to overcome the interceptors' limitations.
Successively, we present a performance analysis carried out
on three Java-CORBA platforms currently implementing the
portable interceptors specification. Finally, we conclude
our work with a case study in which portable interceptors
are used to implement the fault-tolerant CORBA client
invocation semantic without impacting on the client
application code and on the CORBA ORB. We also release
fragments of Java code for implementing the described
techniques. Copyright © 2003 John Wiley & Sons, Ltd.
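The piggybacking mechanism described in this abstract can be illustrated with a small sketch. The Python below is hypothetical and much simpler than the CORBA Portable Interceptors API: a client-side interceptor attaches a service context to each outgoing request and a server-side interceptor reads it before the servant runs, without modifying either the application or the middleware core.

```python
class Request:
    def __init__(self, op, args):
        self.op = op
        self.args = args
        self.contexts = {}            # slot id -> piggybacked payload

class ClientInterceptor:
    """Attaches a context to each outgoing request (piggybacking)."""
    def __init__(self, slot, payload):
        self.slot, self.payload = slot, payload
    def send_request(self, req):
        req.contexts[self.slot] = self.payload

class ServerInterceptor:
    """Reads the piggybacked context before the servant runs,
    e.g. for authentication or flow control."""
    def __init__(self, slot):
        self.slot = slot
        self.seen = []
    def receive_request(self, req):
        self.seen.append(req.contexts.get(self.slot))

def invoke(req, client_chain, server_chain, servant):
    # interception points wrap the ordinary invocation path
    for i in client_chain:
        i.send_request(req)
    for i in server_chain:
        i.receive_request(req)
    return servant(req)

ci = ClientInterceptor("auth", b"token")
si = ServerInterceptor("auth")
result = invoke(Request("ping", ()), [ci], [si], lambda r: r.op)
```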
CPE717:An adaptive parallel genetic algorithm system for
i-Computing environment
- Author(s):Yuk-Yin Wong,Kin-Hong Lee,Kwong-Sak Leung
- Computing Services
Centre, City University of Hong Kong, Tat Chee
Avenue, Kowloon, Hong Kong,Department of Computer
Science and Engineering, Chinese University of Hong
Kong, Shatin, N.T., Hong Kong,
- http://dx.doi.org/10.1002/cpe.717
- light wiley document
- wiley 2003 full documents
- Abstract:Many real-world optimization problems
in the scientific and engineering fields can be solved by
genetic algorithms (GAs) but it still requires a long
execution time for complex problems. At the same time, there
are many under-utilized workstations on the Internet. In
this paper, we present a self-adaptive parallel GA system
named APGAIN, which utilizes the spare power of the
heterogeneous workstations on the Internet to solve complex
optimization problems. In order to maintain a balance
between exploitation and exploration, we have devised a
novel probabilistic rule-driven adaptive model (PRDAM) to
adapt the GA parameters automatically. APGAIN is implemented
on an Internet Computing system called DJM. In the
implementation, we discover that DJM's original load
balancing strategy is insufficient. Hence the strategy is
extended with the job migration capability. The performance
of the system is evaluated by solving the traveling salesman
problem with data from a public database. Copyright © 2003
John Wiley & Sons, Ltd.
CPE718:A scalable HPF implementation of a finite-volume
computational electromagnetics application on a CRAY T3E
parallel system
- Author(s):Yi Pan,Joseph J. S. Shang,Minyi Guo
- Department of
Computer Science, Georgia State University, Atlanta,
GA 30303, U.S.A.,Air Vehicle
Directorate, Air Force Research Laboratory, WPAFB,
OH 45433-7521, U.S.A,Department of Computer
Software, The University of Aizu, Aizu-Wakamatsu
City, Fukushima, 965-8580 Japan
http://dx.doi.org/10.1002/cpe.718
light wiley document
wiley 2003 full documents
Abstract:The time-dependent Maxwell equations
are one of the most important approaches to describing
dynamic or wide-band frequency electromagnetic phenomena. A
sequential finite-volume, characteristic-based procedure for
solving the time-dependent, three-dimensional Maxwell
equations has been successfully implemented in Fortran
before. Due to its need for a large memory space and high
demand on CPU time, it is impossible to test the code for a
large array. Hence, it is essential to implement the code on
a parallel computing system. In this paper, we discuss an
efficient and scalable parallelization of the sequential
Fortran time-dependent Maxwell equations solver using High
Performance Fortran (HPF). The background to the project,
the theory behind the efficiency being achieved, the
parallelization methodologies employed and the experimental
results obtained on the Cray T3E massively parallel
computing system will be described in detail. Experimental
runs show that the execution time is reduced drastically
through parallel computing. The code is scalable up to 98
processors on the Cray T3E and has a performance similar to
that of an MPI implementation. Based on the experimentation
carried out in this research, we believe that a high-level
parallel programming language such as HPF is a fast, viable
and economical approach to parallelizing many existing
sequential codes which exhibit a lot of parallelism.
Copyright © 2003 John Wiley & Sons, Ltd.
CPE720:Analysis of a prototype intelligent network interface
- Author(s):Keith D. Underwood,Walter B. Ligon III,Ron R. Sass
- Parallel Architecture
Research Laboratory, Holcombe Department of
Electrical and Computer Engineering, Clemson
University, 105 Riggs Hall, Clemson, SC 29634-0915, U.S.A.,
- http://dx.doi.org/10.1002/cpe.720
- light wiley document
- wiley 2003 full documents
- Abstract:With a focus on commodity PC systems,
Beowulf clusters traditionally lack the cutting edge network
architectures, memory subsystems, and processor technologies
found in their more expensive supercomputer counterparts.
Many users find that what Beowulf clusters lack in
technology, they more than make up for with their
significant cost advantage. In this paper, an architectural
extension that adds reconfigurable computing to the network
interface of Beowulf clusters is proposed. The proposed
extension, called an intelligent network interface card (or
INIC), enhances both the network and processor capabilities
of the cluster, which has a significant impact on the
performance of a crucial class of applications. Furthermore,
for some applications, the proposed extension partially
compensates for weaknesses in the PC memory subsystem. A
prototype of the proposed architecture was constructed and
analyzed. In addition, two applications, the 2D Fast Fourier
Transform (2D-FFT) and integer sorting, which benefit from
the resulting architecture, are discussed and analyzed on
the proposed architecture. Specifically, results indicate
that the 2D-FFT is performed 15-50% faster when using the
prototype INIC rather than the comparable Gigabit Ethernet.
Early simulation results also indicate that integer sorting
will receive a 21-32% performance boost from the prototype.
The significant improvements seen with a relatively limited
prototype lead to the conclusion that cluster network
interfaces enhanced with reconfigurable computing could
significantly improve the Beowulf architecture. Copyright ©
2003 John Wiley & Sons, Ltd.
CPE721:Design and implementation of a user-level Sockets
layer over Virtual Interface Architecture
- Author(s):Jin-Soo Kim,Kangho Kim,Sung-In Jung,Soonhoi Ha
- Division of Computer
Science, Korea Advanced Institute of Science and
Technology, Daejeon 305-701, South Korea,Computer Systems
Research Department, Electronics and
Telecommunications Research Institute, Daejeon
305-350, South Korea,School of Computer
Science and Engineering, Seoul National University,
Seoul 151-742, South Korea
http://dx.doi.org/10.1002/cpe.721
light wiley document
wiley 2003 full documents
Abstract:The Virtual Interface Architecture
(VIA) is an industry standard user-level communication
architecture for system area networks. The VIA provides a
protected, directly-accessible interface to a network
hardware, removing the operating system from the critical
communication path. In this paper, we design and implement a
user-level Sockets layer over VIA, named SOVIA (Sockets Over
VIA). Our objective is to use the SOVIA layer to accelerate
the existing Sockets-based applications with a reasonable
effort and to provide a portable and high-performance
communication library based on VIA to application
developers. SOVIA realizes comparable performance to
native VIA, showing a minimum one-way latency of 10.5 μs
and a peak bandwidth of 814 Mbps on Giganet's cLAN. We
have shown the functional compatibility with the existing
Sockets API by porting File Transfer Protocol (FTP) and
Remote Procedure Call (RPC) applications over the SOVIA
layer. Compared to the Giganet's LAN Emulation (LANE)
driver which emulates TCP/IP inside the kernel, SOVIA easily
doubles the file transfer bandwidth in FTP and reduces the
latency of calling an empty remote procedure by 77% in RPC
applications. Copyright © 2003 John Wiley & Sons, Ltd.
CPE722:NPACI Rocks: tools and techniques for easily
deploying manageable Linux clusters
- Author(s):Greg Bruno,Philip M. Papadopoulos,Mason J. Katz
- San Diego
Supercomputer Center, University of California San
Diego, La Jolla, CA 92093-0505, U.S.A.,
- http://dx.doi.org/10.1002/cpe.722
- light wiley document
- wiley 2003 full documents
- Abstract:High-performance computing clusters
(commodity hardware with low-latency, high-bandwidth
interconnects) based on Linux are rapidly becoming the
dominant computing platform for a wide range of scientific
disciplines. Yet, straightforward software installation,
maintenance, and health monitoring for large-scale clusters
has been a consistent and nagging problem for non-cluster
experts. The NPACI Rocks distribution takes a fresh
perspective on management and installation of clusters to
dramatically simplify software version tracking and cluster
integration. NPACI Rocks incorporates the latest Red Hat
distribution (including security patches) with additional
cluster-specific software. Using the identical software
tools used to create the base distribution, users can
customize and localize Rocks for their site. Strong
adherence to widely-used (de facto) tools
allows Rocks to move with the rapid pace of Linux
development. Version 2.2.1 of the toolkit is available for
download and installation. Over 100 Rocks clusters have been
built by non-cluster experts at multiple institutions
(residing in various countries) providing a peak aggregate
of 2 TFLOPS of clustered computing. Copyright © 2003 John
Wiley & Sons, Ltd.
CPE723:p-Jigsaw: a cluster-based Web server with
cooperative caching support
- Author(s):Ge Chen,Cho-Li Wang,Francis C. M. Lau
- Department of Computer
Science and Information Systems, The University of
Hong Kong, Hong Kong,
- http://dx.doi.org/10.1002/cpe.723
- light wiley document
- wiley 2003 full documents
- Abstract:Clustering provides a viable approach
to building scalable Web systems with increased computing
power and abundant storage space for data and contents. In
this paper, we present a pure-Java-based parallel Web server
system, p-Jigsaw, which operates on a
cluster and uses the technique of cooperative
caching to achieve high performance. We introduce
the design of an in-memory cache layer, called
Global Object Space (GOS), for dynamic
caching of frequently requested Web objects. The GOS
provides a unified view of cluster-wide memory resources for
achieving location-transparent Web object accesses. The GOS
relies on cooperative caching to minimize disk accesses. A
requested Web object can be fetched from a server
node's local cache or a peer node's local cache,
with the disk serving only as the last resort. A prototype
system based on the W3C Jigsaw server has been implemented
on a 16-node PC cluster. Three cluster-aware cache
replacement algorithms were tested and evaluated. The
benchmark results show good speedups with a real-life access
log, proving that cooperative caching can have significant
positive impacts on the performance of cluster-based
parallel Web servers. Copyright © 2003 John Wiley &
Sons, Ltd.
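The fetch order at the heart of cooperative caching (local cache, then a peer node's cache, with disk only as the last resort) can be sketched as follows. This is an illustrative Python model, not the GOS implementation; names are invented.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.cache = {}                # URL -> Web object

class CooperativeCache:
    """Fetch order: local cache, then any peer's cache, disk last."""
    def __init__(self, nodes, disk):
        self.nodes = nodes
        self.disk = disk               # URL -> object: the slow path

    def fetch(self, node, url):
        # 1. local in-memory cache
        if url in node.cache:
            return node.cache[url], "local"
        # 2. a peer node's in-memory cache
        for peer in self.nodes:
            if peer is not node and url in peer.cache:
                node.cache[url] = peer.cache[url]   # replicate locally
                return node.cache[url], "peer"
        # 3. disk, the last resort; populate the local cache
        obj = self.disk[url]
        node.cache[url] = obj
        return obj, "disk"

a, b = Node("a"), Node("b")
gos = CooperativeCache([a, b], {"/index.html": "<html>...</html>"})
obj, source = gos.fetch(a, "/index.html")   # miss everywhere -> disk
obj, source = gos.fetch(b, "/index.html")   # found in a's cache -> peer
```

A real system would add a cluster-aware replacement policy; the abstract reports three such algorithms were evaluated.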
CPE724:Clusterfile: a flexible physical layout parallel
file system
- Author(s):Florin Isaila,Walter F. Tichy
- Fakultät für
Informatik, Universität Karlsruhe, Postfach 6980,
76128 Karlsruhe, Germany,
- http://dx.doi.org/10.1002/cpe.724
- light wiley document
- wiley 2003 full documents
- Abstract:This paper presents Clusterfile, a
parallel file system that provides parallel file access on a
cluster of computers. We introduce a file partitioning model
that has been used in the design of Clusterfile. The model
uses a data representation that is optimized for
multidimensional array partitioning while allowing arbitrary
partitions. The paper shows how the file model can be
employed for file partitioning into both physical subfiles
and logical views. We also present how the conversion
between two partitions of the same file is implemented using
a general memory redistribution algorithm. We show how we
use the algorithm to optimize non-contiguous read and write
operations. The experimental results include performance
comparisons with the Parallel Virtual File System (PVFS) and
an MPI-IO implementation for PVFS. Copyright © 2003 John
Wiley & Sons, Ltd.
CPE725:Using multirail networks in high-performance clusters
- Author(s):Fabrizio Petrini,Salvador Coll,Eitan Frachtenberg,Adolfy Hoisie,Leonid Gurvits
- CCS-3 Modeling,
Algorithms and Informatics Group, Computer and
Computational Sciences Division, Los Alamos National
Laboratory, Los Alamos, NM, U.S.A.,
- http://dx.doi.org/10.1002/cpe.725
- light wiley document
- wiley 2003 full documents
- Abstract:Using multiple independent networks
(also known as rails) is an emerging technique which is
being used to overcome bandwidth limitations and enhance
fault tolerance of current high-performance parallel
computers. In this paper, we present and analyze various
algorithms to allocate multiple communication rails,
including static and dynamic allocation schemes. An
analytical lower bound on the number of rails required for
static rail allocation is shown. We also present an
extensive experimental comparison of the behavior of various
algorithms in terms of bandwidth and latency. We show that
striping messages over multiple rails can substantially
reduce network latency, depending on average message size,
network load and allocation scheme. The methods compared
include a static rail allocation, a basic round-robin rail
allocation, a local-dynamic allocation based on local
knowledge and a dynamic rail allocation that reserves both
communication endpoints of a message before sending it. The
last method is shown to perform better than the others at
higher loads: up to 49% better than local-knowledge
allocation and 37% better than the round-robin allocation.
This allocation scheme also shows lower latency and it
saturates at higher loads (for long enough messages). Most
importantly, this proposed allocation scheme scales well
with the number of rails and message size. In addition we
propose a hybrid algorithm that combines the benefits of the
local-dynamic allocation for short messages with those of
the dynamic algorithm for large messages. Copyright © 2003
John Wiley & Sons, Ltd.
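Two of the allocation policies compared above can be sketched in a few lines. The Python below is an illustrative model, not the paper's implementation: round-robin cycles through the rails regardless of load, while local-dynamic picks the rail that is lightest at the sender.

```python
import itertools

class RoundRobinAllocator:
    """Cycles through rails, ignoring their current load."""
    def __init__(self, nrails):
        self._next = itertools.cycle(range(nrails))
    def allocate(self, loads):
        return next(self._next)

class LocalDynamicAllocator:
    """Picks the rail with the lightest load as seen by the sender."""
    def allocate(self, loads):
        return min(range(len(loads)), key=lambda r: loads[r])

def send(allocator, loads, size):
    # choose a rail, then account for the message occupying it
    rail = allocator.allocate(loads)
    loads[rail] += size
    return rail

loads = [30, 5]                        # rail 1 is currently lighter
rail = send(LocalDynamicAllocator(), loads, 10)   # chooses rail 1
```

The paper's best-performing scheme goes further, reserving both endpoints of a message before sending; that requires coordination between sender and receiver and is not modeled here.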
CPE750:Editorial
http://dx.doi.org/10.1002/cpe.750
light wiley document
wiley 2003 full documents
Abstract:No abstract
CPE719:Comparing the performance of MPICH with
Cray's MPI and with SGI's MPI
- Author(s):Glenn R. Luecke,Marina Kraeva,Lili Ju
- High Performance
Computing Group, Iowa State University, Ames, Iowa
50011-2251, U.S.A.,
- http://dx.doi.org/10.1002/cpe.719
- light wiley document
- wiley 2003 full documents
- Abstract:The purpose of this paper is to compare
the performance of MPICH with the vendor Message Passing
Interface (MPI) on a Cray T3E-900 and an SGI Origin 3000.
Seven basic communication tests which include basic
point-to-point and collective MPI communication routines
were chosen to represent commonly-used communication
patterns. Cray's MPI performed better (and sometimes
significantly better) than Mississippi State
University's (MSU's) MPICH for small and medium
messages. They both performed about the same for large
messages, however for three tests MSU's MPICH was about
20% faster than Cray's MPI. SGI's MPI performed
and scaled better (and sometimes significantly better) than
MPICH for all messages, except for the scatter test where
MPICH outperformed SGI's MPI for 1 kbyte messages. The
poor scalability of MPICH on the Origin 3000 suggests there
may be scalability problems with MPICH. Copyright © 2003
John Wiley & Sons, Ltd.
CPE728:The LINPACK Benchmark: past, present and future
- Author(s):Jack J. Dongarra,Piotr Luszczek,Antoine Petitet
- University of
Tennessee, Department of Computer Science,
Knoxville, TN 37996-3450, U.S.A.,Sun Microsystems, Inc.,
Paris, France
http://dx.doi.org/10.1002/cpe.728
light wiley document
wiley 2003 full documents
Abstract:This paper describes the LINPACK
Benchmark and some of its variations commonly used to assess
the performance of computer systems. Aside from the LINPACK
Benchmark suite, the TOP500 and the HPL codes are presented.
The latter is frequently used to obtain results for TOP500
submissions. Information is also given on how to interpret
the results of the benchmark and how the results fit into
the performance evaluation process. Copyright © 2003 John
Wiley & Sons, Ltd.
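The benchmark's headline number comes from a fixed nominal operation count for solving a dense n-by-n system Ax = b by LU factorization, 2/3 n^3 + 2 n^2 floating-point operations, divided by the measured time. A toy sketch (pure Python, unrelated to the HPL code) of both the flop count and a tiny solver:

```python
def linpack_flops(n):
    # LINPACK's nominal operation count for solving Ax = b:
    # 2/3 n^3 for the LU factorization, 2 n^2 for the two
    # triangular solves.
    return 2.0 * n**3 / 3.0 + 2.0 * n**2

def solve(a, b):
    """Tiny dense solver: Gaussian elimination with partial pivoting."""
    n = len(b)
    a = [row[:] for row in a]          # work on copies
    b = b[:]
    for k in range(n):
        # pivot on the largest entry in column k
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
            b[i] -= m * b[k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):     # back substitution
        s = b[i] - sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / a[i][i]
    return x

# Reported performance would be linpack_flops(n) / elapsed_seconds.
x = solve([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0])   # x = [0.8, 1.4]
```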
CPE732:Enhancing OODB semantics to support browsing in an
OODB vocabulary representation
- Author(s):James Geller,Li-min Liu,Yehoshua Perl
- Computer Science
Department, New Jersey Institute of Technology,
Newark, NJ 07102, U.S.A.,Department of Applied
Mathematics, Chung Yuan Christian University, 22,
Pu-Zen, Pu-Chung Li, Chung-Li, Taiwan, Republic of China,
- http://dx.doi.org/10.1002/cpe.732
- light wiley document
- wiley 2003 full documents
- Abstract:In previous work, we have modeled a
vocabulary given as a semantic network by an object-oriented
database (OODB). The OODB schema thus obtained provides a
compact abstract view of the vocabulary. This enables the
fast traversal of the vocabulary by a user. In the semantic
network vocabulary, the IS-A relationships express the
specialization hierarchy. In our OODB modeling of the
vocabulary, the SUBCLASS relationship expresses the
specialization hierarchy of the classes and supports the
inheritance of their properties. A typical IS-A path in the
vocabulary has a corresponding shorter SUBCLASS path in the
OODB schema.
In this paper we expose several
cases where the SUBCLASS hierarchy fails to fully correspond
to the IS-A hierarchy of the vocabulary. In these cases
there exist traversal paths in the semantic network for
which there are no corresponding traversal paths in the OODB
schema. The reason for this failure is the existence of some
IS-A relationships between concepts of two classes, which
are not connected by a SUBCLASS relationship. This
phenomenon weakens the accuracy of our modeling.
To rectify the situation we introduce a new OODB semantic
relationship IS-A to represent the existence of IS-A
relationships between concepts of a pair of classes which
are not connected via a SUBCLASS relationship. The resulting
schema contains both SUBCLASS relationships and IS-A
relationships which completely model the IS-A hierarchy of
the vocabulary. We define a mixed-class level traversal path
to contain either SUBCLASS or IS-A relationships.
Consequently, each traversal path in the semantic network
has a corresponding mixed traversal path in the OODB schema.
Hence the introduction of the semantic OODB IS-A
relationship improves the modeling of semantic network
vocabularies by OODBs. Copyright © 2003 John Wiley &
Sons, Ltd.
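The mixed-class traversal can be sketched as a search over the union of the two edge sets. The Python below is an illustrative model, not the authors' OODB machinery: a traversal path exists in the schema whenever it can be formed from SUBCLASS edges, schema-level IS-A edges, or a mix of both.

```python
from collections import deque

def mixed_path(subclass, isa, src, dst):
    """BFS over the union of SUBCLASS and schema-level IS-A edges,
    so every vocabulary traversal has a corresponding schema path."""
    edges = {}
    for rel in (subclass, isa):
        for a, b in rel:
            edges.setdefault(a, []).append(b)
    seen, queue = {src}, deque([[src]])
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in edges.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None                        # no mixed-class path exists

# Hypothetical classes: SUBCLASS alone cannot reach Entity, but the
# added IS-A edge completes the path.
subclass = [("Finding", "Observation")]
isa = [("Observation", "Entity")]
path = mixed_path(subclass, isa, "Finding", "Entity")
```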
CPE729:Implementation of the EARTH programming model on SMP clusters: a multi-threaded language and runtime system
- Author(s):G. Tremblay,C. J. Morrone,J. N. Amaral,G. R. Gao
- Département d'informatique, Université du Québec à Montréal, Montréal, QC, Canada,Computer Architecture and Parallel Systems Laboratory (CAPSL), Department of Electrical and Computer Engineering, University of Delaware, Newark, DE, U.S.A.,Department of Computing Science, University of Alberta, Edmonton, AB, Canada,
- http://dx.doi.org/10.1002/cpe.729
- light wiley document
- wiley 2003 full documents
- Abstract:
This paper describes the design and implementation of an Efficient Architecture for Running THreads (EARTH) runtime system for a multi-processor/multi-node cluster. The EARTH model was designed to support the efficient execution of parallel (multi-threaded) programs with irregular fine-grain parallelism using off-the-shelf computers. Implementing an EARTH runtime system requires an explicitly threaded runtime system. For portability, we built this runtime system on top of Pthreads under Linux and used sockets for inter-node communication. Moreover, in order to make the best use of the resources available on a cluster of symmetric multi-processors (SMP), this implementation enables the overlapping of communication and computation.
We used Threaded-C, a language designed to implement the programming model supported by the EARTH architecture. This language allows the expression of various levels of parallelism and provides the primitives needed to manage the required communication and synchronization. The Threaded-C programming language supports irregular fine-grain parallelism through a two-level hierarchy of threads and fibers. It also provides various synchronization and communication constructs that reflect the nature of EARTH's fibers (non-preemptive execution with data-driven scheduling) as well as the extensive use of split-phase transactions on EARTH to execute long-latency operations. Copyright © 2003 John Wiley & Sons, Ltd.
CPE736:Editorial
- Author(s):Rizos Sakellariou
http://dx.doi.org/10.1002/cpe.736
light wiley document
wiley 2003 full documents
Abstract:No abstract
CPE737:Scalable causal message logging for wide-area environments
- Author(s):Keith Marzullo,Karan Bhatia,Lorenzo Alvisi
- Department of Computer
Science and Engineering, University of California,
La Jolla, CA 92093-0114, U.S.A.,NPACI, UCSD, MC 0505,
9500 Gilman Drive, La Jolla, CA 92093-0505, U.S.A.,Department of Computer
Sciences, The University of Texas at Austin, 1
University Station C0500, Austin, TX 78712-0233, U.S.A.
http://dx.doi.org/10.1002/cpe.737
light wiley document
wiley 2003 full documents
Abstract:Wide-area systems are gaining in
popularity as an infrastructure for running scientific
applications. From a fault tolerance perspective, these
environments are challenging because of their scale and
their variability. Causal message logging protocols have
attractive properties that make them suitable for these
environments. They spread fault tolerance information around
in the system providing high availability. This information
can also be used to replicate objects that are otherwise
inaccessible because of network partitions.
However, current causal message logging protocols do not
scale to thousands or millions of processes. We describe the
Hierarchical Causal Message Logging Protocol (HCML) that
uses a hierarchy of shared logging sites, or proxies, to
significantly reduce the space requirements as compared with
existing protocols. These proxies also act as caches for
fault tolerance information and reduce the overall message
overhead by as much as 50%. HCML also leverages differences
in bandwidth between processes that reduces overall message
latency by as much as 97%. Copyright © 2003 John Wiley &
Sons, Ltd.
CPE738:Achieving portable and efficient parallel CORBA objects
- Author(s):Christian Pérez,Alexandre Denis,Thierry Priol
- IRISA/IFSIC, Campus de
Beaulieu, 35042 Rennes Cedex, France,IRISA/INRIA, Campus de
Beaulieu, 35042 Rennes Cedex, France,
- http://dx.doi.org/10.1002/cpe.738
- light wiley document
- wiley 2003 full documents
- Abstract:With the availability of Computational
Grids, new kinds of applications are emerging. They raise
the problem of how to program them on such computing
systems. In this paper, we advocate a programming model
based on a combination of parallel and distributed
programming models. Compared to previous approaches, this
work aims at bringing single program multiple data (SPMD)
programming into CORBA in a portable way. For example, we
want to interconnect two parallel codes by CORBA without
modifying either CORBA or the parallel communication API. We
show that such an approach does not entail any loss of
performance compared to previous approaches that required
modification to the CORBA standard. Moreover, using an ORB
that is able to exploit high-performance networks, we show
that portable parallel CORBA objects can efficiently make
use of such networks. Copyright © 2003 John Wiley &
Sons, Ltd.
CPE739:HA-PSLS: a highly available parallel single-level
store system
- Author(s):Christine Morin,Anne-Marie Kermarrec
- IRISA/INRIA, Campus
universitaire de Beaulieu, 35042 Rennes Cedex, France,Microsoft Research, 7 J
J Thomson Avenue, Cambridge CB3 0FB, U.K.
http://dx.doi.org/10.1002/cpe.739
light wiley document
wiley 2003 full documents
Abstract:Parallel single-level store (PSLS)
systems integrate a shared virtual memory and a parallel
file system. They provide programmers with a global address
space including both memory and file data. PSLS systems
implemented in a cluster thus represent a natural support
for long-running parallel applications, combining both the
natural shared memory programming model and a large and
efficient file system. However, the need to
tolerate failures in such a system increases with the size
of applications. In this paper we present a highly-available
parallel single-level store system (HA-PSLS), which smoothly
integrates a backward error recovery high-availability
mechanism into a PSLS system. Our system is able to tolerate
multiple transient failures, a single permanent failure, and
power cut failures affecting the whole cluster, without
requiring any specialized hardware. For this purpose,
HA-PSLS relies on a high degree of integration (and
reusability) of high-availability and standard features. A
prototype integrating our high-availability support has been
implemented and we show some performance results in the
paper. Copyright © 2003 John Wiley & Sons, Ltd.
CPE740:Parallel application of a novel domain
decomposition preconditioner for the adaptive finite-element
solution of three-dimensional convection-dominated PDEs
- Author(s):P. K. Jimack,S. A. Nadeem
- Computational PDEs
Unit, School of Computing, University of Leeds,
Leeds LS2 9JT, U.K.,
- http://dx.doi.org/10.1002/cpe.740
- light wiley document
- wiley 2003 full documents
- Abstract:We describe and analyse the parallel
implementation of a novel domain decomposition
preconditioner for the fast iterative solution of linear
systems of algebraic equations arising from the
discretization of elliptic partial differential equations
(PDEs) in three dimensions. In previous theoretical work,
this preconditioner has been proved to be optimal for
symmetric positive-definite (SPD) linear systems. In this
paper, we provide details of our three-dimensional parallel
implementation and demonstrate that the technique may be
generalized to the solution of non-symmetric algebraic
systems, such as those arising when convection-diffusion
problems are discretized using either Galerkin or stabilized
finite-element methods (FEMs). Furthermore, we illustrate
the potential of the preconditioner for use within an
adaptive finite-element framework by successfully solving
convection-dominated problems on locally, rather than
globally, refined meshes. Copyright © 2003 John Wiley &
Sons, Ltd.
CPE741:A distributed object infrastructure for
interaction and steering
- Author(s):Manish Parashar,Rajeev Muralidhar
- The Applied Software
Systems Laboratory, Department of Electrical and
Computer Engineering, Rutgers, The State University
of New Jersey, 94 Brett Road, Piscataway, NJ 08854, U.S.A.,
- http://dx.doi.org/10.1002/cpe.741
- light wiley document
- wiley 2003 full documents
- Abstract:This paper presents the design,
implementation and experimental evaluation of DIOS
(Distributed Interactive Object Substrate), an interactive
object infrastructure to enable the runtime monitoring,
interaction and computational steering of parallel and
distributed applications. DIOS enables application objects
(data structures, algorithms) to be enhanced with sensors
and actuators so that they can be interrogated and
controlled. Application objects may be distributed (spanning
many processors) and dynamic (be created, deleted, changed
or migrated). Furthermore, DIOS provides a control network
that interconnects the interactive objects in a
parallel/distributed application and enables external
discovery, interrogation, monitoring and manipulation of
these objects at runtime. DIOS is currently being used to
enable interactive visualization, monitoring and steering of
a wide range of scientific applications, including oil
reservoir, compressible turbulence and numerical relativity
simulations. Copyright © 2003 John Wiley & Sons, Ltd.
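A minimal sketch of the sensor/actuator idea the abstract describes, with invented names throughout (this is not the actual DIOS API):

```python
class InteractiveObject:
    """Wraps an application object with named sensors (read-only views)
    and actuators (controlled mutations) for runtime steering."""

    def __init__(self, name):
        self.name = name
        self._sensors = {}
        self._actuators = {}

    def add_sensor(self, key, getter):
        self._sensors[key] = getter

    def add_actuator(self, key, setter):
        self._actuators[key] = setter

    def interrogate(self, key):
        return self._sensors[key]()

    def steer(self, key, value):
        self._actuators[key](value)


class Simulation:
    def __init__(self):
        self.timestep = 0.01
        self.iteration = 0

sim = Simulation()
obj = InteractiveObject("solver")
obj.add_sensor("iteration", lambda: sim.iteration)
obj.add_actuator("timestep", lambda v: setattr(sim, "timestep", v))

sim.iteration = 42
print(obj.interrogate("iteration"))  # 42
obj.steer("timestep", 0.005)         # externally steer the solver
print(sim.timestep)                  # 0.005
```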
CPE742:Load redundancy elimination on executable code
- Author(s):Manel Fernández,Roger Espasa,Saumya Debray
- Computer Architecture
Department, Universitat Politècnica de Catalunya,
Barcelona, Spain,Department of Computer
Science, University of Arizona, Tucson, AZ 08034, U.S.A.
- http://dx.doi.org/10.1002/cpe.742
- Abstract:Optimizations performed at link time or
directly applied to final program executables have received
increased attention in recent years. This paper discusses
the discovery and elimination of redundant load operations
in the context of a link-time optimizer, an optimization
that we call Load Redundancy Elimination
(LRE). Our experiments show that between 50% and
75% of a program's memory references can be considered
redundant because they are accessing memory locations that
have been referenced less than 200-400 instructions away. We
then present three profile-based LRE algorithms targeted at
optimizing away these redundancies. Our results show that
between 5% and 30% of the redundancy detected can indeed be
eliminated, which translates into program speedups of around
8%. We also test our algorithm assuming different cache
latencies, and show that, if latencies continue to grow,
load redundancy elimination will become even more important.
Copyright © 2003 John Wiley & Sons, Ltd.
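A toy version of the redundancy detection described above, under the simplifying assumptions of exact addresses (no aliasing) and a fixed window; it is not the paper's link-time analysis:

```python
WINDOW = 200  # redundancy window in instructions, per the 200-400 range cited

def find_redundant_loads(trace):
    """trace: list of (op, addr) with op in {'load', 'store'}.
    A load is flagged redundant when the same address was loaded or
    stored within the last WINDOW instructions, so its value is
    already available."""
    last_access = {}   # addr -> index of the last load/store to it
    redundant = []
    for i, (op, addr) in enumerate(trace):
        if op == "load":
            j = last_access.get(addr)
            if j is not None and i - j <= WINDOW:
                redundant.append(i)
        last_access[addr] = i
    return redundant

trace = [("load", 0x10), ("store", 0x20), ("load", 0x10), ("load", 0x20)]
print(find_redundant_loads(trace))  # [2, 3]
```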
CPE778:SCALEA: a performance analysis tool for parallel programs
- Author(s):Hong-Linh Truong,Thomas Fahringer
- Institute for Software
Science, University of Vienna, Liechtensteinstrasse
22, A-1090 Vienna, Austria,
- http://dx.doi.org/10.1002/cpe.778
- Abstract:Many existing performance analysis
tools lack the flexibility to control instrumentation and
performance measurement for code regions and performance
metrics of interest. Performance analysis is commonly
restricted to single experiments.
In this paper
we present SCALEA, which is a performance instrumentation,
measurement, analysis, and visualization tool for parallel
programs that supports post-mortem performance analysis.
SCALEA currently focuses on performance analysis for OpenMP,
MPI, HPF, and mixed parallel programs. It computes a variety
of performance metrics based on a novel classification of
overhead. SCALEA also supports multi-experiment performance
analysis that allows one to compare and to evaluate the
performance outcome of several experiments. A highly
flexible instrumentation and measurement system is provided
which can be controlled by command-line options and program
directives. SCALEA can be interfaced by external tools
through the provision of a full Fortran90 OpenMP/MPI/HPF
frontend that allows one to instrument an abstract syntax
tree at a very high level with C-function calls and to
generate source code. A graphical user interface is provided
to view a large variety of performance metrics at the level
of arbitrary code regions, threads, processes, and
computational nodes for single- and multi-experiment
performance analysis. Copyright © 2003 John Wiley &
Sons, Ltd.
CPE779:Deep Start: a hybrid strategy for automated
performance problem searches
- Author(s):Philip C. Roth,Barton P. Miller
- Computer Sciences
Department, University of Wisconsin-Madison,
Madison, WI 53706, U.S.A.,
- http://dx.doi.org/10.1002/cpe.779
- Abstract:To attack the problem of scalability of
performance diagnosis tools with respect to application code
size, we have developed the Deep Start search strategy - a
new technique that uses stack sampling to augment an
automated search for application performance problems. Our
hybrid approach locates performance problems more quickly
and finds performance problems hidden from a more
straightforward search strategy. The Deep Start strategy
uses stack samples collected as a by-product of normal
search instrumentation to select deep
starters, functions that are likely to be
application bottlenecks. With priorities and careful control
of the search refinement, our strategy gives preference to
experiments on the deep starters and their callees. This
approach enables the Deep Start strategy to find application
bottlenecks more efficiently and more effectively than a
more straightforward search strategy. We implemented the
Deep Start search strategy in the Performance Consultant,
Paradyn's automated bottleneck detection component. In
our tests, Deep Start found half of our test
applications' known bottlenecks between 32% and 59%
faster than the Performance Consultant's current search
strategy, and finished finding bottlenecks between 10% and
61% faster. In addition to improving the search time, Deep
Start often found more bottlenecks than the call graph
search strategy. Copyright © 2003 John Wiley & Sons, Ltd.
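The deep-starter selection idea can be sketched as follows; the sample format and ranking heuristic are invented for illustration and are not Paradyn's actual implementation:

```python
from collections import Counter

def select_deep_starters(stack_samples, top_k=3):
    """stack_samples: list of call stacks, each a list of function names
    from main() down to the currently executing function. Functions that
    appear often in samples are promising bottleneck candidates."""
    counts = Counter()
    depth = {}
    for stack in stack_samples:
        for d, fn in enumerate(stack):
            counts[fn] += 1
            depth[fn] = max(depth.get(fn, 0), d)
    # Prefer frequently sampled functions; break ties toward deeper
    # frames, which a top-down call-graph search would reach late.
    ranked = sorted(counts, key=lambda fn: (-counts[fn], -depth[fn]))
    return ranked[:top_k]

samples = [
    ["main", "solve", "sparse_matvec"],
    ["main", "solve", "sparse_matvec"],
    ["main", "io_dump"],
]
print(select_deep_starters(samples, top_k=2))  # ['main', 'sparse_matvec']
```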
CPE781:A self-stabilizing token-based k-out-of-ℓ exclusion algorithm
- Author(s):A. K. Datta,R. Hadid,V. Villain
- School of Computer
Science, University of Nevada Las Vegas, Las Vegas,
NV 89154-4019, U.S.A.,LaRIA, Université de
Picardie Jules Verne, 33 rue Saint Leu, 80039 Amiens
Cedex, France,
- http://dx.doi.org/10.1002/cpe.781
- Abstract:In this paper, we present the first
self-stabilizing solution to the k-out-of-ℓ exclusion
problem on a ring. The k-out-of-ℓ exclusion problem is a
generalization of the well-known mutual exclusion
problem - there are ℓ units of the shared resources, any
process can request k units of the shared resources, and no
resource unit can be allocated to more than one process at
one time. The space requirement of the proposed algorithm is
independent of ℓ for all processors except a special
processor, called Root. The stabilization time is only 5n,
where n is the size of the ring. Copyright © 2003 John Wiley
& Sons, Ltd.
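The specification of k-out-of-ℓ exclusion (not the self-stabilizing ring algorithm itself) can be captured by a small resource manager; the interface is illustrative:

```python
import threading

class KOutOfL:
    """l shared resource units; a process may request at most k of them,
    and no unit is allocated to more than one process at a time."""

    def __init__(self, l, k):
        self.l, self.k = l, k
        self.free = l
        self.held = {}                      # process id -> units held
        self.cond = threading.Condition()

    def acquire(self, pid, units):
        assert units <= self.k, "a process may request at most k units"
        with self.cond:
            while self.free < units:        # block until enough units free
                self.cond.wait()
            self.free -= units
            self.held[pid] = self.held.get(pid, 0) + units

    def release(self, pid, units):
        with self.cond:
            self.held[pid] -= units
            self.free += units
            self.cond.notify_all()

res = KOutOfL(l=5, k=3)
res.acquire("p1", 3)
res.acquire("p2", 2)   # 3 + 2 = 5 units allocated, none shared
print(res.free)        # 0
res.release("p1", 3)
print(res.free)        # 3
```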
CPE782:Fairness in systems based on multiparty interactions
- Author(s):David Ruiz,Rafael Corchuelo,Miguel Toro
- ETSI Informática, Avda.
de la Reina Mercedes s/n, Sevilla E-41012, Spain,
- http://dx.doi.org/10.1002/cpe.782
- Abstract:In the context of the Multiparty
Interaction Model, fairness is used to ensure that an
interaction that is enabled sufficiently often in a
concurrent program will eventually be selected for
execution. Unfortunately, this notion does not take
conspiracies into account, i.e. situations in which an
interaction never becomes enabled because of an unfortunate
interleaving of independent actions; furthermore, eventual
execution is usually too weak for practical purposes since
this concept can only be used in the context of infinite
executions. In this article, we present a new fairness
notion, k-conspiracy-free fairness, that improves on others
because it takes finite executions into account, alleviates
conspiracies that are not inherent to a program, and k may
be set a priori to control how effectively it addresses the
above-mentioned problems. Copyright © 2003 John Wiley &
Sons, Ltd.
CPE783:Weak communication in single-hop radio networks:
adjusting algorithms to industrial standards
- Author(s):Tomasz Jurdziński,Mirosław Kutyłowski,Jan Zatopiański
- Institute of Computer
Science, Wrocław University, Wrocław, Poland,
- http://dx.doi.org/10.1002/cpe.783
- Abstract:Quite often algorithms designed for
no-collision-detection radio networks use a hidden form of
collision detection: it is assumed that a station can
simultaneously send and listen. If it cannot hear its own
message, apparently the message has been scrambled by
another station sending at the same time.
Industrial standard IEEE 802.11 says that a station can
either send or listen to a radio channel at a given time,
but not both. In order to relate the industrial standard and
theoretical algorithms we consider a weak
radio network model with no collision detection in which a
station cannot simultaneously send and receive signals.
Otherwise we talk about a strong
model.
In this paper we consider a
measure called energy cost (or
'power consumption') which is equal to the maximum
over all stations of the number of steps in which the
station is sending or listening.
We show that
computational power of weak and strong single-hop radio
networks differ substantially in the deterministic case:
deterministic leader election requires
energy cost in the weak model and can be solved by a
practical algorithm with
energy cost in the strong model. By contrast, we present a
very efficient randomized simulation of strong radio
networks by weak ones, with preprocessing that requires
steps and has energy cost
. Copyright © 2003 John Wiley & Sons, Ltd.
CPE784:Performance analysis and qualitative results of an
efficient parallel stochastic simulator for a marine
host-parasite system
- Author(s):G. Latu,M. Langlais,J. Roman,P. Silan
- LaBRI, UMR CNRS 5800,
Université Bordeaux 1 et ENSEIRB, 351, cours de la
Libération, 33405 Talence, France,MAB, UMR CNRS 5466,
Université Bordeaux 2, 146 Léo Saignat, 33076
Bordeaux Cedex, France,UMR CNRS 5000,
Université Montpellier II, Station Méditerranéenne
de l'Environnement Littoral, 1 Quai de la Daurade,
34200 Sète, France
- http://dx.doi.org/10.1002/cpe.784
- Abstract:We are interested in a host-parasite
system, i.e. the sea bass-Diplectanum
aequans system. A discrete mathematical model is used to
describe the dynamics of both populations. Our goal is
notably to validate the model in the context of aquaculture.
A deterministic numerical simulator and, recently, a
stochastic simulator were developed to study this biological
system. Parallelization is required because the execution
times are too long. The Monte Carlo algorithm of the
stochastic simulator and its three levels of parallelism are
described. Analysis and performances, up to 256 processors,
of a hybrid MPI/OpenMP code are then presented for a cluster
of symmetric multi-processor (SMP) nodes. Qualitative
results are given for the host-macroparasite system
simulation. Copyright © 2003 John Wiley & Sons, Ltd.
CPE777:Special Issue: Euro-Par 2002
CPE785:Parallel iterative methods for Navier-Stokes equations and application to eigenvalue computation
- Author(s):Ivan G. Graham,Alastair Spence,Eero Vainikko
- Department of Mathematical Sciences, University of Bath, U.K.,Institute of Technology, University of Tartu, Estonia
- http://dx.doi.org/10.1002/cpe.785
- Abstract:
We describe the construction of parallel iterative solvers for finite-element approximations of the Navier–Stokes equations on unstructured grids using domain decomposition methods. The iterative method used is FGMRES, preconditioned by a parallel adaptation of a block preconditioner recently proposed by Kay et al. The parallelization is achieved by adapting the technology of our domain decomposition solver DOUG (previously used for scalar problems) to block-systems. The iterative solver is applied to shifted linear systems that arise in eigenvalue calculations. To illustrate the performance of the solver, we compare several strategies both theoretically and practically for the calculation of the eigenvalues of large sparse non-symmetric matrices arising in the assessment of the stability of flow past a cylinder. Copyright © 2003 John Wiley & Sons, Ltd.
CPE786:Dynamic query scheduling in parallel data warehouses
- Author(s):Holger Märtens,Erhard Rahm,Thomas Stöhr
- University of Applied Sciences Braunschweig/Wolfenbüttel, Germany,University of Leipzig, Institute of Computer Science, Augustusplatz 10-11, D-04109 Leipzig, Germany,
- http://dx.doi.org/10.1002/cpe.786
- Abstract:
Parallel processing is a key to high performance in very large data warehouse applications that execute complex analytical queries on huge amounts of data. Although parallel database systems (PDBSs) have been studied extensively in the past decades, the specifics of load balancing in parallel data warehouses have not been addressed in detail.
In this study, we investigate how the load balancing potential of a Shared Disk (SD) architecture can be utilized for data warehouse applications. We propose an integrated scheduling strategy that simultaneously considers both processors and disks, regarding not only the total workload on each resource but also the distribution of load over time. We evaluate the performance of the new method in a comprehensive simulation study and compare it to several other approaches. The analysis incorporates skew aspects and considers typical data warehouse features such as star schemas with large fact tables and bitmap indices. Copyright © 2003 John Wiley & Sons, Ltd.
CPE730:Resource management in open Linda systems
- Author(s):Ronaldo Menezes
- Florida Institute of
Technology, Department of Computer Sciences, 150
West University Boulevard, Melbourne, FL 32901, U.S.A.
- http://dx.doi.org/10.1002/cpe.730
- Abstract:Coordination systems, in particular
Linda, have established themselves as important tools for
the development of applications to open systems such as the
Internet. This paper shows how to tackle a forgotten but
crucial problem in open coordination systems:
memory management. As with any system which intends to be of
wide use and because memory is a finite resource,
coordination systems must address the problems of memory
exhaustion. This paper first explores the orthogonality
between coordination and computation in order to make it
clear that the problem of memory exhaustion in coordination
systems cannot be solved by garbage collection schemes
implemented at the computation-language level alone - a
garbage collection scheme must exist in the coordination
environment as well.
Following the explanation on
orthogonality, the paper will focus on describing a garbage
collection scheme for the Linda family of coordination
systems. It is expected that the solution in Linda can be
adapted to other coordination systems as long as they are
based on tuple space communication. Copyright © 2003 John
Wiley & Sons, Ltd.
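One simple reclamation policy for a Linda-like tuple space can be sketched as follows; this time-to-live scheme is an invented stand-in, not the garbage collection scheme the paper proposes:

```python
class TupleSpace:
    """Linda-like tuple space whose entries expire after a logical
    time-to-live, so tuples no process ever retrieves are reclaimed."""

    def __init__(self, default_ttl=100):
        self.tuples = []                 # list of [remaining_ttl, tuple]
        self.default_ttl = default_ttl

    def out(self, t, ttl=None):
        self.tuples.append([ttl or self.default_ttl, t])

    def inp(self, pattern):
        """Non-blocking 'in': remove and return the first tuple matching
        pattern, where None in the pattern is a wildcard."""
        for entry in self.tuples:
            _, t = entry
            if len(t) == len(pattern) and all(
                    p is None or p == v for p, v in zip(pattern, t)):
                self.tuples.remove(entry)
                return t
        return None

    def tick(self):
        """Advance logical time; collect tuples whose ttl ran out."""
        for entry in self.tuples:
            entry[0] -= 1
        self.tuples = [e for e in self.tuples if e[0] > 0]

ts = TupleSpace(default_ttl=2)
ts.out(("job", 1))
ts.out(("job", 2))
first = ts.inp(("job", None))
print(first)                  # ('job', 1)
ts.tick(); ts.tick()          # remaining tuple expires, memory reclaimed
print(ts.inp(("job", None)))  # None
```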
CPE726:A parallel data assimilation model for
oceanographic observations
- Author(s):Aad J. van der Steen,Fons van Hees,Peter Jan van Leeuwen
- High Performance
Computing Group, Utrecht University, P.O. Box 80195,
3508 TD Utrecht, The Netherlands,Institute for Marine
and Atmospheric Research Utrecht, Utrecht
University, P.O. Box 80005, 3508 TA Utrecht, The Netherlands
- http://dx.doi.org/10.1002/cpe.726
- Abstract:In this paper we describe the
development of a program that aims at achieving the optimal
integration of observed data in an oceanographic model
describing the water transport phenomena in the Agulhas area
at the tip of South Africa. Two parallel implementations,
MPI and OpenMP, are described and experiments with respect
to speed and scalability on a Compaq AlphaServer SC and an
SGI Origin3000 are reported. Copyright © 2003 John Wiley
& Sons, Ltd.
CPE727:Coordinating components in middleware systems
- Author(s):Matthias Radestock,Susan Eisenbach
- LShift, Burbage House, 83-85 Curtain Road, London EC2A 3BS, U.K.,Department of Computing, Imperial College London, London SW7 2BZ, U.K.
- http://dx.doi.org/10.1002/cpe.727
- Abstract:
Configuration
coordination
CPE731:A set-oriented method definition language for
object databases and its semantics
- Author(s):Giovanna Guerrini,Elisa Bertino,Isabella Merlo
- Dipartimento di
Informatica e Scienze dell'Informazione,
Università degli Studi di Genova, Via Dodecaneso 35
- I16146 Genova, Italy,Dipartimento di Scienze
dell'Informazione, Università degli Studi di
Milano, Via Comelico 39 - I20135 Milano, Italy,
- http://dx.doi.org/10.1002/cpe.731
- Abstract:In this paper we propose a set-oriented
rule-based method definition language for object-oriented
databases. Most existing object-oriented database systems
exploit a general-purpose imperative object-oriented
programming language as the method definition language.
Because methods are written in a general-purpose imperative
language, it is difficult to analyze their properties and to
optimize them. Optimization is important when dealing with a
large amount of objects as in databases. We therefore
believe that the use of an ad hoc,
set-oriented language can offer some advantages, at least at
the specification level. In particular, such a language can
offer an appropriate framework to reason about method
properties.
In this paper, besides defining a
set-oriented rule-based language for method definition, we
formally define its semantics, addressing the problems of
inconsistency and non-determinism in set-oriented updates.
Moreover, we characterize some relevant properties of
methods, such as conflicts among method specifications in
sibling classes and behavioral refinement in subclasses.
Copyright © 2003 John Wiley & Sons, Ltd.
CPE788:Optimizing operating system performance for CC-NUMA architectures
- Author(s):Moon Seok Chang
- ETRI, 161 Gajeong-Dong, Yuseong-Gu, Daejon 305-350, Korea
- http://dx.doi.org/10.1002/cpe.788
- Abstract:
The CC-NUMA (cache-coherent non-uniform memory access) architecture is an attractive solution to scalable servers. The performance of a CC-NUMA system heavily depends on the number of accesses to remote memory through an interconnection network. To reduce the number of remote accesses, an operating system needs to exploit the potential locality of the architecture. This paper describes the design and implementation of a UNIX-based operating system supporting the CC-NUMA architecture. The operating system implements various enhancements by revising kernel algorithms and data structures. This paper also analyzes the performance of the enhanced operating system by running commercial benchmarks on a real CC-NUMA system. The performance analysis shows that the operating system can achieve improved performance and scalability for CC-NUMA by implementing kernel data striping, localization and load balancing. Copyright © 2003 John Wiley & Sons, Ltd.
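Why locality-aware placement pays off on CC-NUMA can be seen in a toy model that merely counts remote accesses under two page placements; the access pattern and placement policies are invented for illustration:

```python
def remote_accesses(page_home, accesses):
    """page_home: page id -> home node; accesses: list of
    (cpu node, page id). Returns the number of remote accesses,
    i.e. accesses that must cross the interconnection network."""
    return sum(1 for node, page in accesses if page_home[page] != node)

# 4 pages, 2 nodes; each node mostly touches its own working set.
accesses = [(0, 0), (0, 1), (1, 2), (1, 3), (0, 0), (1, 2)]

naive = {0: 0, 1: 0, 2: 0, 3: 0}         # all pages placed on node 0
local = {0: 0, 1: 0, 2: 1, 3: 1}         # localized (first-touch) layout

print(remote_accesses(naive, accesses))  # 3
print(remote_accesses(local, accesses))  # 0
```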
CPE733:Frameworks for incorporating semantic
relationships into object-oriented database systems
- Author(s):Michael Halper,Li-min Liu,James Geller,Yehoshua Perl
- Department of
Mathematics and Computer Science, Kean University,
Union, NJ 07083, U.S.A.,Department of Applied
Mathematics, Chung Yuan Christian University,
Chung-Li, Taiwan, Republic of China,Computer Science
Department, New Jersey Institute of Technology,
Newark, NJ 07102, U.S.A.,
- http://dx.doi.org/10.1002/cpe.733
- Abstract:A semantic relationship is a data
modeling construct that connects a pair of classes or
categories and has inherent constraints and other
functionalities that precisely reflect the characteristics
of the specific relationship in an application domain.
Examples of semantic relationships include part-whole,
ownership, materialization and role-of. Such relationships
are important in the construction of information models for
advanced applications, whether one is employing traditional
data-modeling techniques, knowledge-representation languages
or object-oriented modeling methodologies. This paper
focuses on the issue of providing built-in support for such
constructs in the context of object-oriented database (OODB)
systems. Most of the popular object-oriented modeling
approaches include some semantic relationships in their
repertoire of data-modeling primitives. However, commercial
OODB systems, which are frequently used as implementation
vehicles, tend not to do the same. We will present two
frameworks by which a semantic relationship can be
incorporated into an existing OODB system. The first only
requires that the OODB system support manifest type with
respect to its instances. The second assumes that the OODB
system has a special kind of metaclass facility. The two
frameworks are compared and contrasted. In order to ground
our work in existing systems, we show the addition of a
part-whole semantic relationship both to the ONTOS
DB/Explorer OODB system and the VODAK Model Language.
Copyright © 2003 John Wiley & Sons, Ltd.
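A part-whole relationship with one inherent constraint (exclusive containment) can be sketched in a few lines; the class names are illustrative and unrelated to the ONTOS or VODAK interfaces:

```python
class PartWholeError(Exception):
    pass

class Whole:
    def __init__(self, name):
        self.name = name
        self.parts = []

    def add_part(self, part):
        # Inherent constraint of this part-whole relationship:
        # a part belongs to at most one whole.
        if part.owner is not None:
            raise PartWholeError(f"{part.name} already belongs to "
                                 f"{part.owner.name}")
        part.owner = self
        self.parts.append(part)

class Part:
    def __init__(self, name):
        self.name = name
        self.owner = None

engine = Part("engine")
car, truck = Whole("car"), Whole("truck")
car.add_part(engine)
try:
    truck.add_part(engine)       # violates exclusive containment
except PartWholeError as e:
    print(e)                     # engine already belongs to car
```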
CPE743:Performance comparison of checkpoint and recovery protocols
- Author(s):Himadri Sekhar Paul,Arobinda Gupta,R. Badrinath
- Department of Computer
Science and Engineering, Indian Institute of
Technology, Kharagpur 721302, India,
- http://dx.doi.org/10.1002/cpe.743
- Abstract:Checkpoint and rollback recovery is a
well-known technique for providing fault tolerance to
long-running distributed applications. Performance of a
checkpoint and recovery protocol depends on the
characteristics of the application and the system on which
it runs. However, given an application and system
environment, there is no easy way to identify which
checkpoint and recovery protocol will be most suitable for
it. Conventional approaches require implementing the
application with all the protocols under consideration,
running them on the desired system, and comparing their
performances. This process can be very tedious and time
consuming. This paper first presents the design and
implementation of a simulation environment, distributed
process simulation or dPSIM, which enables easy
implementation and evaluation of checkpoint and recovery
protocols. The tool enables the protocols to be simulated
under a wide variety of application, system, and network
characteristics. The paper then presents performance
evaluation of five checkpoint and recovery protocols. These
protocols are implemented and executed in dPSIM under
different simulated application, system, and network
characteristics. Copyright © 2003 John Wiley & Sons, Ltd.
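The basic checkpoint-and-rollback mechanism that such protocols build on can be sketched for a single process; real protocols additionally keep a consistent global state across communicating processes, which this sketch omits:

```python
import copy

class Process:
    def __init__(self):
        self.state = {"step": 0, "acc": 0}
        self._checkpoint = None

    def work(self):
        self.state["step"] += 1
        self.state["acc"] += self.state["step"]

    def checkpoint(self):
        # Save a deep copy so later mutations cannot corrupt it.
        self._checkpoint = copy.deepcopy(self.state)

    def rollback(self):
        assert self._checkpoint is not None, "no checkpoint taken"
        self.state = copy.deepcopy(self._checkpoint)

p = Process()
p.work(); p.work()
p.checkpoint()            # save state after 2 steps
p.work(); p.work()        # this progress is lost on failure
p.rollback()              # simulate a crash and recovery
print(p.state)            # {'step': 2, 'acc': 3}
```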