CPE544:Special Issue: High Performance Agent Systems
CPE545:Dynamic configuration of access control for mobile components in FarGo
- Author(s):Yoad Gidron,Israel Ben-Shaul,Ophir Holder,Yariv Aridor
- Department of Electrical Engineering, Technion-Israel Institute of Technology, Haifa, 32000, Israel,IBM Research Laboratory in Haifa, MATAM, 31905 Haifa, Israel
http://dx.doi.org/10.1002/cpe.545
light wiley document
wiley 2002 full documents
Abstract:
Component mobility is an important enabling technology for the design of wide area pervasive applications, but it introduces new challenges in the critical aspect of access control. In particular, when mobility is used for dynamic relocation of distributed components, access from both remote and local mobile components needs to be uniformly controlled. The dynamic determination of execution location, possibly crossing multiple administrative authorities, requires dynamic establishment and enforcement of access control. The deployment over widely heterogeneous hosts and devices requires integration of access control with dynamic probing of resource availability so as to influence the relocation process.
This paper presents a model for dynamic specification and enforcement of access control in the context of dynamically relocatable components, and an implementation in the Java-based FarGo framework. The specification follows a negotiation-based protocol that enables dynamic matching of available and required resources by providers and consumers, respectively. Enforcement is provided through a capability-based secure component reference architecture, which uniformly applies to both local and remote references, and through instance-level, as opposed to type-level (supported in Java), access control. Finally, access control is integrated into the programming model in a non-intrusive fashion, by separating the encoding of access control from the encoding of the logic of the application. Copyright © 2001 John Wiley & Sons, Ltd.
CPE546:Performance investigation of an on-line auction system
- Author(s):Jane Hillston,Leïla Kloul
- LFCS, University of Edinburgh, Kings Buildings, Edinburgh EH9 3JZ, U.K.,PRiSM, Université de Versailles, 45, Av. des Etats-Unis, 78035 Versailles Cedex, France
http://dx.doi.org/10.1002/cpe.546
Abstract:
The standard design of on-line auction systems places most of the computational load on the server and its adjacent links, resulting in a bottleneck in the system. In this paper, we investigate the impact, in terms of the performance of the server and its adjacent links, of introducing active nodes into the network. The performance study of the system is done using the stochastic process algebra formalism PEPA. Copyright © 2001 John Wiley & Sons, Ltd.
CPE547:Fault-tolerant holonic manufacturing systems
- Author(s):Martyn Fletcher,S. Misbah Deen
- Data and Knowledge Engineering Group, Department of Computer Science, Keele University, Keele, Staffordshire ST5 5BG, U.K.,
http://dx.doi.org/10.1002/cpe.547
Abstract:
This paper presents a model of fault-tolerant holonic manufacturing systems (HMS) where each holon's activities are controlled by an intelligent software agent. Multiple agents schedule actions, resolve conflicts and manage information to produce, transport, assemble, inspect and store customized products. Our model provides robustness and distribution transparency across a shop-floor where unpredictable failures occur with machines, control software and communication networks. Each autonomous holon is composed of a hierarchy of large-grain functional components where interaction is carried out by user-defined cooperation strategies. These strategies enable holons to coordinate their behaviour through exchanging messages and sensing/actuating of their shared environment. Therefore, holonic agents can select suitable rescheduling and recovery mechanisms to tolerate faults and keep the manufacturing system working. We also propose how the IEC 1499 standard (Function Block Architecture) for distributed control systems could be used to implement our model. The model presented here is a crystallization of some abstract concepts from a generic cooperating agent system, with suitable extensions to meet the criteria of the ongoing HMS project. Copyright © 2001 John Wiley & Sons, Ltd.
CPE548:Which paradigm should I use? An analytical comparison of the client-server, remote evaluation and mobile agent paradigms
- Author(s):Antonio Puliafito,Salvatore Riccobene,Marco Scarpa
- Dipartimento di Matematica - Contrada Papardo, S. Speroni, 98166 Messina, Italy,Dipartimento di Matematica - V. le A. Doria 6, 95125, Catania, Italy,Istituto di Informatica e Telecomunicazioni - V. le A. Doria 6, 95125, Catania, Italy
http://dx.doi.org/10.1002/cpe.548
Abstract:
In this paper we study the actual convenience of using the agent programming paradigm for accessing distributed services. We try to point out the benefits of this communication paradigm by providing an analytical study of its basic features in comparison with the client-server approach and remote evaluation. The aim of the paper is to show how Petri net analysis techniques can be used for deciding whether to use the traditional client/server, remote evaluation or mobile agent paradigm in designing a particular application. We present several models of non-Markovian Petri nets, which have been solved through the WebSPN tool, and we provide a close comparison between the agent, client-server and remote evaluation communication paradigms. The results we have obtained show that agents must not always be considered the only solution to any communication issue, since in several cases their use might even prove a drawback. We also focus our attention on providing some practical remarks which can help the developer, during the design, to select the communication paradigm which best suits the features of the application to be developed. Copyright © 2001 John Wiley & Sons, Ltd.
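As a toy companion to this kind of analysis, the three paradigms can be contrasted with a back-of-envelope traffic model counting bytes moved over the network for a session of n queries. All message sizes and formulas below are invented for illustration; the paper's actual comparison uses non-Markovian Petri net models solved with the WebSPN tool.

```python
# Illustrative network-traffic cost (in bytes) for a session of n queries.
# Parameter defaults are made-up assumptions, not figures from the paper.

def client_server(n, req=200, rep=2000):
    # one request/reply round trip per query
    return n * (req + rep)

def remote_evaluation(n, code=5000, rep=2000):
    # ship the query code to the server once, then one reply per query
    return code + n * rep

def mobile_agent(n, agent=8000, result=500):
    # the agent migrates out and back once, filters data locally,
    # and carries home only a compact result per query
    return 2 * agent + n * result
```

Under these (assumed) sizes, client-server wins for a single query, while beyond a break-even point the agent's local filtering pays for its migration cost, which is exactly the kind of trade-off the paper's models quantify rigorously.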
CPE582:Editorial: Concurrency and Computation: Practice and Experience
CPE539:A parallel Viterbi decoding algorithm
- Author(s):J. S. Reeve
- Department of Electronics and Computer Science, University of Southampton, Southampton SO17 1BJ, U.K.
http://dx.doi.org/10.1002/cpe.539
Abstract:
In this paper we express the Viterbi algorithm as a matrix-vector reduction in which multiplication is replaced by addition and addition by minimization. The resulting algorithm is then readily parallelized in a form suitable for implementation on a systolic processor array. We describe the algorithm for Bose-Chaudhuri-Hocquenghem (BCH) codes which have a task graph with its valence restricted to four inputs and four outputs. The method is also applicable to convolution codes, but the complexity of the task graph increases with the number of input bits for these codes. Results for BCH codes are given for two general purpose parallel machines, an IBM SP2 and a Meiko CS2. Copyright © 2001 John Wiley & Sons, Ltd.
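The reformulation described above can be sketched directly: an ordinary matrix-vector product with multiplication replaced by addition and addition by minimization (the (min, +) semiring), folded once per trellis stage. This is an illustrative serial sketch only (the systolic parallelization is the paper's contribution), and the two-state trellis costs below are made up.

```python
INF = float("inf")

def minplus_matvec(m, v):
    # matrix-'vector' product over the (min, +) semiring:
    # semiring 'multiply' is +, semiring 'add' is min
    n = len(v)
    return [min(m[i][j] + v[j] for j in range(n)) for i in range(n)]

def viterbi_metrics(stage_costs, initial):
    # one (min, +) product per trellis stage; the result holds the cheapest
    # path metric into each state -- exactly the Viterbi recursion
    p = initial
    for m in stage_costs:
        p = minplus_matvec(m, p)
    return p

# hypothetical 2-state trellis: m[i][j] is the branch cost from state j to i
stage1 = [[1, 4], [2, 7]]
stage2 = [[3, 1], [5, 2]]
metrics = viterbi_metrics([stage1, stage2], [0.0, INF])  # start in state 0
```

Because each stage is just a semiring matrix-vector reduction, the parallelization then reduces to the familiar problem of parallelizing matrix-vector products.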
CPE549:Emmerald: a fast matrix-matrix multiply using Intel's SSE instructions
- Author(s):Douglas Aberdeen,Jonathan Baxter
- Research School of Information Sciences and Engineering, Australian National University, Canberra, Australia,
http://dx.doi.org/10.1002/cpe.549
Abstract:
Generalized matrix-matrix multiplication forms the kernel of many mathematical algorithms, hence a faster matrix-matrix multiply immediately benefits these algorithms. In this paper we implement efficient matrix multiplication for large matrices using the Intel Pentium single instruction multiple data (SIMD) floating point architecture. The main difficulty with the Pentium and other commodity processors is the need to efficiently utilize the cache hierarchy, particularly given the growing gap between main-memory and CPU clock speeds. We give a detailed description of the register allocation, Level 1 and Level 2 cache blocking strategies that yield the best performance for the Pentium III family. Our results demonstrate an average performance of 2.09 times faster than the leading public domain matrix-matrix multiply routines and comparable performance with Intel's SIMD small matrix-matrix multiply routines. Copyright © 2001 John Wiley & Sons, Ltd.
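The cache-blocking idea itself can be illustrated in a few lines (this is not Emmerald's actual SSE kernel, which relies on hand-scheduled SIMD code and register allocation): the triple loop is tiled so that the three working tiles fit in the L1 cache, keeping operands resident while they are reused.

```python
def blocked_matmul(a, b, n, bs):
    # C = A * B for n x n lists-of-lists, tiled with block size bs; bs stands
    # in for the tile size that lets three bs x bs tiles fit in L1 cache.
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, bs):
        for kk in range(0, n, bs):
            for jj in range(0, n, bs):
                # work entirely within one tile of A, B and C
                for i in range(ii, min(ii + bs, n)):
                    for k in range(kk, min(kk + bs, n)):
                        aik = a[i][k]
                        for j in range(jj, min(jj + bs, n)):
                            c[i][j] += aik * b[k][j]
    return c
```

Each scalar multiply-add is performed exactly once per (i, k, j) triple, so the result equals the naive product; only the traversal order (and hence the cache behavior) changes.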
CPE550:Experiences from integrating algorithmic and systemic load balancing strategies
- Author(s):Ioana Banicescu,Sheikh Ghafoor,Vijay Velusamy,Samuel H. Russ,Mark Bilderback
- Mississippi State University, Department of Computer Science, NSF Engineering Research Center for Computational Field Simulation, MS 39762, U.S.A.,
http://dx.doi.org/10.1002/cpe.550
Abstract:
Load balancing increases the efficient use of existing resources for parallel and distributed applications. At a coarse level of granularity, advances in runtime systems for parallel programs have been proposed in order to control available resources as efficiently as possible by utilizing idle resources and using task migration. Simultaneously, at a finer granularity level, advances in algorithmic strategies for dynamically balancing computational loads by data redistribution have been proposed in order to respond to variations in processor performance during the execution of a given parallel application. Combining strategies from each level of granularity can result in a system which delivers advantages of both. The resulting integration is systemic in nature and transfers the responsibility of efficient resource utilization from the application programmer to the runtime system. This paper presents the design and implementation of a system that combines an algorithmic fine-grained data-parallel load balancing strategy with a systemic coarse-grained task-parallel load balancing strategy, and reports on recent experimental results of running a computationally intensive scientific application under this integrated system. The experimental results indicate that a distributed runtime environment which combines both task and data migration can provide performance advantages with little overhead. The paper also presents proposals for performance enhancements of the implementation, as well as future explorations for effective resource management. Copyright © 2001 John Wiley & Sons, Ltd.
CPE551:Analysis and measurement of the effect of kernel locks in SMP systems
- Author(s):Akihiro Kaieda,Yasuichi Nakayama,Atsuhiro Tanaka,Takashi Horikawa,Toshiyasu Kurasugi,Issei Kino
- Department of Computer Science, The University of Electro-Communications, Chofu, Tokyo 182-8585, Japan,NEC Corporation, Kawasaki 216-8555, Japan,
http://dx.doi.org/10.1002/cpe.551
Abstract:
This article reports the use of case studies to evaluate the performance degradation caused by kernel-level locks. We define the lock ratio as the ratio of the execution time spent in critical sections to the total execution time of a parallel program. The kernel-level lock ratio determines how effectively programs run on symmetric multiprocessor (SMP) systems. We have measured the lock ratios and the performance of three types of parallel programs on SMP systems running Linux 2.0: matrix multiplication, parallel make, and WWW server programs. Experimental results show that the higher the lock ratio of a parallel program, the worse its performance becomes. Copyright © 2001 John Wiley & Sons, Ltd.
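The lock ratio defined above can be written out directly, together with the Amdahl-style speedup bound it implies; the bound is our illustrative addition (it treats the critical sections, which execute serially under the kernel lock, as the sequential fraction of the program), not a formula from the article.

```python
def lock_ratio(critical_time, total_time):
    # fraction of a parallel program's total execution time spent
    # inside kernel critical sections
    return critical_time / total_time

def bounded_speedup(r, p):
    # Amdahl-style bound on speedup with p processors when the serialized
    # fraction (the lock ratio) is r: 1 / (r + (1 - r) / p)
    return 1.0 / (r + (1.0 - r) / p)
```

For example, a lock ratio of 0.5 caps the speedup on any number of processors at 2, which matches the paper's observation that programs with high lock ratios scale poorly on SMP systems.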
CPE552:Parallel solvers for discrete-time algebraic Riccati equations
- Author(s):Rafael Mayo,Enrique S. Quintana-Ortí,Gregorio Quintana-Ortí,Vicente Hernández
- Depto. de Informática, Universidad Jaume I, 12080-Castellón, Spain,Depto. de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, 46071-Valencia, Spain
http://dx.doi.org/10.1002/cpe.552
Abstract:
We investigate the numerical solution of discrete-time algebraic Riccati equations on a parallel distributed architecture. Our solvers obtain an initial solution of the Riccati equation via the disc function method, and then refine this solution using Newton's method. The Smith iteration is employed to solve the Stein equation that arises at each step of Newton's method. The numerical experiments on an Intel Pentium-II cluster, connected via a Myrinet switch, report the performance and scalability of the new algorithms. Copyright © 2001 John Wiley & Sons, Ltd.
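As a dependency-free illustration of the inner solve, here is the basic Smith-type fixed-point iteration for the Stein equation X = A X A^T + Q on small dense matrices, using plain list-of-lists arithmetic. This is only a sketch of the serial kernel under the assumption that the spectral radius of A is below one; the paper's solvers are distributed and use the disc function method for the outer Riccati solution, and the matrices below are made up.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    n = len(a)
    return [[a[j][i] for j in range(n)] for i in range(n)]

def add(a, b):
    n = len(a)
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]

def smith_stein(a, q, iters=60):
    # fixed point of X -> A X A^T + Q; converges when rho(A) < 1
    x = q
    for _ in range(iters):
        x = add(matmul(matmul(a, x), transpose(a)), q)
    return x
```

In the paper's setting a Stein equation of this form is solved at every Newton step, so the efficiency of this inner loop (and its squared, doubling variants) dominates the overall solver cost.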
CPE554:A parallel Navier-Stokes solver for the rotating flow problem
- Author(s):Rudnei Dias da Cunha,Álvaro Luiz de Bortoli
- Postgraduate Programme on Applied Mathematics, Institute of Mathematics, Federal University of Rio Grande do Sul, Brazil,
http://dx.doi.org/10.1002/cpe.554
Abstract:
In this paper, we investigate the parallel solution of rotating internal flow problems, using the Navier-Stokes equations as proposed by Speziale and Thangam (in 1983) and Speziale (in 1985). A Runge-Kutta time-stepping scheme was applied to the equations and both sequential and message-passing implementations were developed, the latter using MPI, and were tested on a four-processor SGI Origin200 distributed, global shared memory parallel computer and on a 32-processor IBM 9076 SP/2 distributed memory parallel computer. The results show that our approach to parallelize the sequential implementation requires little effort whilst providing good results even for medium-sized problems. Copyright © 2001 John Wiley & Sons, Ltd.
CPE563:Using knowledge-based systems for research on parallelizing compilers
- Author(s):Chao-Tung Yang,Shian-Shyong Tseng,Yun-Woei Fann,Ting-Ku Tsai,Ming-Huei Hsieh,Cheng-Tien Wu
- Ground System Section, National Space Program Office, 8F, No. 9 Prosperity 1st Road, Science-Based Industrial Park, Hsinchu, Taiwan 300, ROC,Department of Computer & Information Science, National Chiao Tung University, Hsinchu, Taiwan 300, ROC,
http://dx.doi.org/10.1002/cpe.563
Abstract:
The main function of parallelizing compilers is to analyze sequential programs, in particular their loop structure, to detect hidden parallelism and automatically restructure sequential programs into parallel subtasks that are executed on a multiprocessor. This article describes the design and implementation of an efficient parallelizing compiler to parallelize loops and achieve high speedup rates on multiprocessor systems. It is well known that the execution efficiency of a loop can be enhanced if the loop is executed in parallel or partially in parallel, as in a DOALL or DOACROSS loop. This article also reviews a practical parallel loop detector (PPD), implemented in our PFPC, for finding parallelism in loops. The PPD can extract the potential DOALL and DOACROSS loops in a program by verifying array subscripts. In addition, a new model using a knowledge-based approach is proposed in this paper to exploit more loop parallelism. The knowledge-based approach integrates existing loop transformations and loop scheduling algorithms to make good use of their ability to extract loop parallelism. Two rule-based systems, called KPLT and IPLS, are then developed using repertory grid analysis and attribute-ordering tables, respectively, to construct the knowledge bases. These systems can choose an appropriate transformation and loop schedule, and then apply the resulting methods to perform loop parallelization and obtain a high speedup rate. For example, the IPLS system can choose an appropriate loop schedule for running on multiprocessor systems. Finally, a runtime technique based on the inspector/executor scheme is proposed in this article for finding available parallelism in loops. Our inspector can determine the wavefronts of a loop with any complex indirect array-indexing pattern by building a DEF-USE table. The inspector is fully parallel, without any synchronization. 
Experimental results show that the new method can resolve complex data dependence patterns that no previous research could. One of our ultimate goals is to construct a high-performance, portable FORTRAN parallelizing compiler for shared-memory multiprocessors. We believe that our research may provide more insight into the development of high-performance parallelizing compilers. Copyright © 2001 John Wiley & Sons, Ltd.
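The inspector idea can be sketched as follows: given the (possibly indirect) read and write index arrays of a loop, a DEF-USE table per array element assigns each iteration to the earliest wavefront consistent with the flow, anti and output dependences; the executor then runs each wavefront's iterations in parallel. This is an illustrative serial reconstruction, not the paper's exact (fully parallel) algorithm, and all names are made up.

```python
def inspect_wavefronts(writes, reads):
    # Iteration i writes x[writes[i]] and reads x[reads[i]].
    last_def = {}   # array element -> wavefront of its last write (DEF)
    last_use = {}   # array element -> latest wavefront that read it (USE)
    stages = []
    for w, r in zip(writes, reads):
        stage = max(last_def.get(r, -1),      # flow dep: follow the write we read
                    last_def.get(w, -1),      # output dep: writes stay ordered
                    last_use.get(w, -1)) + 1  # anti dep: write after earlier reads
        stages.append(stage)
        last_use[r] = max(last_use.get(r, -1), stage)
        last_def[w] = stage
    return stages
```

Iterations assigned the same stage touch no conflicting elements, so they can safely execute concurrently; a loop with fully independent iterations collapses into a single wavefront.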
CPE564:Redistribution strategies for portable parallel FFT: a case study
- Author(s):Anshu Dubey,Daniele Tessera
- Astronomy & Astrophysics, University of Chicago, 5640 S. Ellis Ave., Chicago, IL 60637, U.S.A.,Dipartimento di Informatica e Sistemistica, Università degli Studi di Pavia, I-27100 Pavia, Italy
http://dx.doi.org/10.1002/cpe.564
Abstract:
The best approach to parallelize multidimensional FFT algorithms has long been under debate. Distributed transposes are widely used, but they also vary in communication policies and hence performance. In this work we analyze the impact of different redistribution strategies on the performance of parallel FFT, on various machine architectures. We found that some redistribution strategies were consistently superior, while some others were unexpectedly inferior. An in-depth investigation into the reasons for this behavior is included in this work. Copyright © 2001 John Wiley & Sons, Ltd.
CPE565:A distributed implementation of a virtual machine for Java
- Author(s):Yariv Aridor,Michael Factor,Avi Teperman
- IBM Haifa Research Laboratory, Matam, Advanced Technology Center, Haifa 31905, Israel,
http://dx.doi.org/10.1002/cpe.565
Abstract:
The cluster virtual machine (VM) for Java provides a single system image of a traditional Java Virtual Machine (JVM) while executing in a distributed fashion on the nodes of a cluster. The cluster VM for Java virtualizes the cluster, supporting any pure Java application without requiring that the application be tailored specifically for it. The aim of our cluster VM is to obtain improved scalability for a class of Java Server Applications by distributing the application's work among the cluster's computing resources. The implementation of the cluster VM for Java is based on a novel object model which distinguishes between an application's view of an object (e.g. every object is a unique data structure) and its implementation (e.g. objects may have consistent replications on different nodes). This enables us to exploit knowledge of the use of individual objects to improve performance (e.g. using object replication to increase locality of access to objects).
We have already completed a prototype that runs pure Java applications on a cluster of NT workstations connected by a Myrinet fast switch. The prototype provides a single system image to applications, distributing the application's threads and objects over the cluster. We used the cluster VM to run, without change, a real Java Server Application of over 10 Kloc (kilo lines of source code) and achieved high scalability for it on a cluster. We also achieved linear speedup for another application with a large number of independent threads. This paper discusses the architecture and implementation of the cluster VM. It focuses on achieving a single system image for a traditional JVM on a cluster while describing, briefly, how we aim to obtain scalability. Copyright © 2001 John Wiley & Sons, Ltd.
CPE556:Object-oriented analysis and design of the Message Passing Interface
- Author(s):Anthony Skjellum,Diane G. Wooley,Ziyang Lu,Michael Wolf,Purushotham V. Bangalore,Andrew Lumsdaine,Jeffrey M. Squyres,Brian McCandless
- Engineering Research Center and Department of Computer Science, Mississippi State University, Mississippi State, MS 39762, U.S.A.,Laboratory for Scientific Computing, Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, U.S.A.,
http://dx.doi.org/10.1002/cpe.556
Abstract:
The major contribution of this paper is the application of modern analysis techniques to the important Message Passing Interface standard, work done in order to obtain information useful in designing both application programmer interfaces for object-oriented languages and message passing systems. Recognition of ‘Design Patterns’ within MPI is an important insight of this work. A further contribution is a comparative discussion of the design and evolution of three actual object-oriented designs for the Message Passing Interface (MPI-1) application programmer interface (API), two of which have influenced the standardization of C++ explicit parallel programming with MPI-2, and which strongly indicate the value of a priori object-oriented design and analysis of such APIs. Knowledge of design patterns is assumed herein.
Discussion provided here includes systems developed at Mississippi State University (MPI++), the University of Notre Dame (OOMPI), and the merger of these systems that results in a standard binding within the MPI-2 standard. Commentary concerning additional opportunities for further object-oriented analysis and design of message passing systems and APIs, such as MPI-2 and MPI/RT, are mentioned in conclusion.
Connection of modern software design and engineering principles to high performance computing programming approaches is a new and important further contribution of this work. Copyright © 2001 John Wiley & Sons, Ltd.
CPE570:Strong types for coordinating active objects
- Author(s):Franz Puntigam
- Technische Universität Wien, Institut für Computersprachen, Argentinierstraße 8, A-1040 Vienna, Austria
http://dx.doi.org/10.1002/cpe.570
Abstract:
An object type is usually regarded as a contract between an object and each of its clients. However, in concurrent (and sometimes also in sequential) systems it is more useful to regard a type as a contract between an object and the set of all clients: when the object acts as a shared resource, the clients must be coordinated before sending messages to the object. Process types of active objects specify constraints on the ordering of messages. Static type checking ensures proper coordination of clients so that objects receive messages only in acceptable orderings. As presented in this article, the process type system is based on a simple computation model where active objects communicate via asynchronous message passing. Copyright © 2001 John Wiley & Sons, Ltd.
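A process type in this sense can be pictured as a finite protocol over message orderings: the type names the states of the object and the messages acceptable in each state. The sketch below performs that check at run time in Python, purely for illustration; the article's contribution is a *static* type system that guarantees such violations cannot occur, and all names here are invented.

```python
class ProcessType:
    """A process type as a finite protocol: which messages an active object
    accepts in each state. Sending a message outside the protocol is a
    coordination error among the object's clients."""

    def __init__(self, transitions, start):
        # transitions: dict mapping (state, message) -> next state
        self.transitions = transitions
        self.state = start

    def send(self, message):
        key = (self.state, message)
        if key not in self.transitions:
            raise TypeError(
                f"message {message!r} not acceptable in state {self.state!r}")
        self.state = self.transitions[key]

# hypothetical file-like protocol: open, then read any number of times, then close
file_protocol = {("closed", "open"): "open",
                 ("open", "read"): "open",
                 ("open", "close"): "closed"}
```

With static checking, as in the article, the equivalent of the `TypeError` is reported at compile time, so a shared object is guaranteed to receive messages only in acceptable orderings.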
CPE555:Algorithmic support for model transformation in object-oriented software development
- Author(s):Siegfried Schönberger,Rudolf K. Keller,Ismail Khriss
- Département d'informatique et de recherche opérationnelle, Université de Montréal, C.P. 6128, succursale Centre-ville, Montréal, Québec H3C 3JZ, Canada,
http://dx.doi.org/10.1002/cpe.555
Abstract:
Current methods for object-oriented software development provide notation for the specification of models, yet do not sufficiently relate the different model types to each other, nor do they provide support for transformations from one model type to another. This makes transformations a manual activity, which increases the risk of inconsistencies among models and may lead to a loss of information. We have developed and implemented an algorithm supporting one of the transitions from analysis to design, the transformation of scenario models into behavior models. This algorithm supports the Unified Modelling Language (UML), mapping the UML's collaboration diagrams into state transition diagrams. We believe that CASE tools implementing such algorithms will be highly beneficial in object-oriented software development. In this paper, we provide an overview of our algorithm and discuss all its major steps. The algorithm is detailed in semi-formal English and illustrated with a number of examples. Furthermore, the algorithm is assessed from different perspectives, such as scope and role in the overall development process, issues in the design of the algorithm, complexity, implementation and experimentation, and related work. Copyright © 2001 John Wiley & Sons, Ltd.
CPE569:A new parallel domain decomposition method for the adaptive finite element solution of elliptic partial differential equations
- Author(s):Randolph E. Bank,Peter K. Jimack
- Department of Mathematics, University of California at San Diego, La Jolla, CA 92093, U.S.A.,School of Computing, University of Leeds, Leeds LS2 9JT, U.K.
http://dx.doi.org/10.1002/cpe.569
Abstract:
We present a new domain decomposition algorithm for the parallel finite element solution of elliptic partial differential equations. As with most parallel domain decomposition methods each processor is assigned one or more subdomains and an iteration is devised which allows the processors to solve their own subproblem(s) concurrently. The novel feature of this algorithm however is that each of these subproblems is defined over the entire domain, although the vast majority of the degrees of freedom for each subproblem are associated with a single subdomain (owned by the corresponding processor). This ensures that a global mechanism is contained within each of the subproblems tackled and so no separate coarse grid solve is required in order to achieve rapid convergence of the overall iteration. Furthermore, by following the paradigm introduced in [15], it is demonstrated that this domain decomposition solver may be coupled easily with a conventional mesh refinement code, thus allowing the accuracy, reliability and efficiency of mesh adaptivity to be utilized in a well load-balanced manner. Finally, numerical evidence is presented which suggests that this technique has significant potential, both in terms of the rapid convergence properties and the efficiency of the parallel implementation. Copyright © 2001 John Wiley & Sons, Ltd.
CPE571:Modeling and testing object-oriented distributed systems with linear-time temporal logic
- Author(s):F. Dietrich,X. Logean,J.-P. Hubaux
- Institute for Computer Communications and Applications (ICA), Swiss Federal Institute of Technology, CH-1015 Lausanne, Switzerland,
http://dx.doi.org/10.1002/cpe.571
Abstract:
We present a framework for constructing formal models of object-oriented distributed systems and a property language to express behavioral constraints in such models. Most of the existing models have their origin in specific mathematical notations and/or concepts. In contrast, we have developed our model such that it accounts for a large set of phenomena associated with industrial implementations of object-oriented distributed systems. The model that we propose, while closer to industrial concerns and practice, still has the powerful features of formal approaches. It also offers the possibility to automatically check at service run-time that the final service implementation has not violated and is not violating properties expressed at the abstraction level of our model. In our model, which relies on event-based behavioral abstraction, we use linear-time temporal logic as the underlying formalism for the specification of properties. We introduce two novel operators which are especially useful for object-oriented systems and which provide a number of advantages over the well-known temporal logic operators. A recent decision of one of our industrial partners to adopt our proposal into one of their development platforms can be seen as strong evidence of the relevance of our work and as a promising step towards a better understanding between the academic formal methods community and industry. Copyright © 2001 John Wiley & Sons, Ltd.
CPE583:Feature-oriented programming: A new way of object composition
- A preliminary version of this article appeared under the title ‘Feature-Oriented Programming: A Fresh Look at Objects’ at ECOOP 1997.
- Author(s):Christian Prehofer
- TU München, 80290 Munich, Germany
http://dx.doi.org/10.1002/cpe.583
Abstract:
We propose a new model for flexible composition of objects from a set of features. Features are services of an object and are similar to classes in object-oriented languages. In many cases, features have to be adapted in the presence of other features, which is also called the feature interaction problem. We introduce explicit interaction handlers which can adapt features to other features by overriding methods. When features are composed, the appropriate interaction handling is added in a way which generalizes inheritance and aggregation. For a set of features, an exponential number of different feature combinations is possible, based on a quadratic number of interaction resolutions. We present the feature model as an extension of Java and give two translations to Java, one via inheritance and the other via aggregation. We show that the feature model interacts nicely with several common language extensions such as type parameters, exceptions, and higher-order functions. Copyright © 2001 John Wiley & Sons, Ltd.
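The composition idea can be illustrated in Python (the article itself extends Java, with translations via inheritance and via aggregation): a base feature, plus an interaction handler that adapts the base feature when another feature is composed with it, by overriding a method at composition time. Every name below is invented for illustration.

```python
class Counter:
    """Base feature: one service offered by the composed object."""

    def __init__(self):
        self.value = 0

    def inc(self):
        self.value += 1

def add_undo(cls):
    """Interaction handler for composing an Undo feature with any feature
    that has inc(): inc() must be adapted to snapshot the state first,
    resolving the feature interaction in a way that generalizes inheritance."""
    class WithUndo(cls):
        def __init__(self):
            super().__init__()
            self._saved = None

        def inc(self):
            self._saved = self.value   # snapshot before the base behaviour
            super().inc()

        def undo(self):
            if self._saved is not None:
                self.value = self._saved
    return WithUndo

UndoCounter = add_undo(Counter)
```

As in the article's model, a quadratic number of such pairwise interaction resolutions suffices to generate an exponential number of valid feature combinations, since handlers are applied one composition step at a time.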
CPE584:Effective multicast programming in large scale distributed systems
- Author(s):Patrick Th. Eugster,Romain Boichat,Rachid Guerraoui,Joe Sventek
- Swiss Federal Institute of Technology, Lausanne,Agilent Laboratories Scotland, Edinburgh, U.K.
http://dx.doi.org/10.1002/cpe.584
Abstract:
Many distributed applications have a strong requirement for efficient dissemination of large amounts of information to widely spread consumers in large networks. These include applications in e-commerce and telecommunication. Publish/subscribe is considered one of the most important interaction styles with which to model communication on a large scale. Producers publish information on a topic and consumers subscribe to the topics they wish to be informed of. The decoupling of producers and consumers in time, space, and flow makes the publish/subscribe paradigm very attractive for large scale distribution, especially in environments like the Internet.
This paper describes the architecture and implementation of DACE (Distributed Asynchronous Computing Environment), a framework for publish/subscribe communication based on an object-oriented programming abstraction in the form of Distributed Asynchronous Collection (DAC). DACs capture the variants of publish/subscribe, without blurring their respective advantages. The architecture we present is tolerant of network partitions and crash failures. The underlying model is based on the notion of Topic Membership: a weak membership for the parties involved in a topic. We present how Topic Membership enables the realization of a robust and efficient reliable multicast on a large scale. The protocol ensures that, inside a topic, even a subscriber who is temporarily partitioned away eventually receives a published message. Copyright © 2001 John Wiley & Sons, Ltd.
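The basic topic decoupling underlying the paradigm can be sketched in a few lines (illustrative only; DACE's contribution is the Distributed Asynchronous Collection abstraction and partition-tolerant, reliable multicast layered on top of this pattern):

```python
class TopicBus:
    """Minimal topic-based publish/subscribe: producers publish on a topic
    without knowing the consumers subscribed to it, giving the decoupling
    in space described above (time and flow decoupling would additionally
    require queuing and asynchronous delivery)."""

    def __init__(self):
        self.subscribers = {}   # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        # deliver to every current subscriber of the topic
        for cb in self.subscribers.get(topic, []):
            cb(message)
```

Topic Membership, in the paper's model, strengthens exactly the delivery step: even a subscriber temporarily partitioned away must eventually receive a published message, which a naive synchronous loop like the one above cannot guarantee.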
CPE585:Implementing a dynamic processor allocation policy for multiprogrammed parallel applications in the Solaris™ operating system
- Author(s):Kelvin K. Yue,David J. Lilja
- Sun Microsystems Inc., 901 San Antonio Road, MS MPK17-302, Palo Alto, CA 94303, U.S.A.,Department of Electrical and Computer Engineering, Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN 55455, U.S.A.
http://dx.doi.org/10.1002/cpe.585
Abstract:
Parallel applications typically do not perform well in a multiprogrammed environment that uses time-sharing to allocate processor resources to the applications' parallel threads. Co-scheduling related parallel threads, or statically partitioning the system, often can reduce the applications' execution times, but at the expense of reducing the overall system utilization. To address this problem, there has been increasing interest in dynamically allocating processors to applications based on their resource demands and the dynamically varying system load. The Loop-Level Process Control (LLPC) policy (Yue K, Lilja D. Efficient execution of parallel applications in multiprogrammed multiprocessor systems. 10th International Parallel Processing Symposium, 1996; 448-456) dynamically adjusts the number of threads an application is allowed to execute based on the application's available parallelism and the overall system load. This study demonstrates the feasibility of incorporating the LLPC strategy into an existing commercial operating system and parallelizing compiler, and provides further evidence of the performance improvement that is possible using this dynamic allocation strategy. In this implementation, applications are automatically parallelized and enhanced with the appropriate LLPC hooks so that each application interacts with the modified version of the Solaris operating system. The parallelism of the applications is then adjusted automatically and dynamically when they are executed in a multiprogrammed environment, so that all applications obtain a fair share of the total processing resources. Copyright © 2001 John Wiley & Sons, Ltd.
CPE562:Access to SAP's Business Framework from Java-based applications
- Author(s):M. Aleksy,A. Korthaus
- University of Mannheim, Lehrstuhl für Wirtschaftsinformatik 3, Schloß, D-68131 Mannheim, Germany,
- http://dx.doi.org/10.1002/cpe.562
- Abstract:
As the leading vendor of enterprise business standard software, SAP has recognized the need to adapt their R/3 system to current trends in software development and to meet market needs for speed of development, flexibility, openness, and interoperability. In this paper, we first present SAP's approach to object-oriented and component-based technology by describing the Business Framework, the concepts of Business Objects, BAPIs, and the Business Object Repository. On this basis, we then analyze current communication architectures and products enabling the interaction of external Java-based software applications with SAP R/3, point out the advantages and disadvantages of different solutions, and finally elaborate on potential strategies and steps for driving the evolution of SAP R/3 in order to further increase interoperability, openness, and flexibility. Copyright © 2001 John Wiley & Sons, Ltd.
CPE561:Experience with using CORBA to implement a file caching coordination system
- Author(s):A. Sim,H. Nordberg,L. M. Bernardo,A. Shoshani,D. Rotem
- National Energy Research Scientific Computing Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, U.S.A.,
- http://dx.doi.org/10.1002/cpe.561
- Abstract:
We describe our experience with using several CORBA products to interconnect the software modules of a fairly complex system for coordinating file caching from tertiary storage. The application area that the system was designed for is High Energy and Nuclear Physics (HENP). In this application area, the volume of data reaches hundreds of terabytes per year, and therefore it is impractical to store the data on disk systems. Rather, the data are organized into large files and stored on robotic tape systems that are managed by a Mass Storage System (MSS). The role of the Storage Access Coordination System (STACS) that we developed is to manage the caching of files from the MSS to a large disk cache that is shared by multiple HENP analysis programs. The system design involved multiple components developed by different people at different sites, and the modules could potentially be distributed as well. In this paper we describe the architecture and implementation of STACS, emphasizing the inter-module communication requirements. We describe the use of CORBA interfaces between system components, and our experience with using multi-threaded CORBA and hundreds of concurrent CORBA connections. STACS development was recently completed and is being incorporated in an operational environment that started to produce data in the summer of 2000 [1]. Copyright © 2001 John Wiley & Sons, Ltd.
CPE557:Special Issue: Distributed Objects and Applications '99
- Author(s):Zahir Tari,Robert Meersman
- RMIT University, Australia,Free University of Brussels, Belgium
http://dx.doi.org/10.1002/cpe.557
Abstract:
No abstract
CPE558:Evaluating policies and mechanisms to support distributed real-time applications with CORBA
- Author(s):Carlos O'Ryan,Douglas C. Schmidt,Fred Kuhns,Marina Spivak,Jeff Parsons,Irfan Pyarali,David L. Levine
- Electrical & Computer Engineering Department, University of California, Irvine, CA 92697, U.S.A.,Department of Computer Science, Washington University, St. Louis, MO 63130, U.S.A.,
- http://dx.doi.org/10.1002/cpe.558
- Abstract:
To be an effective platform for performance-sensitive real-time systems, commodity-off-the-shelf (COTS) distributed object computing (DOC) middleware must support application quality of service (QoS) requirements end-to-end. However, conventional COTS DOC middleware does not provide this support, which makes it unsuited for applications with stringent latency, determinism, and priority preservation requirements. It is essential, therefore, to develop standards-based, COTS DOC middleware that permits the specification, allocation, and enforcement of application QoS requirements end-to-end.
The real-time CORBA and messaging specifications in the CORBA 2.4 standard are important steps towards defining standards-based, COTS DOC middleware that can deliver end-to-end QoS support at multiple levels in distributed and embedded real-time systems. These specifications still lack sufficient detail, however, to portably configure and control processor, communication, and memory resources for applications with stringent QoS requirements.
This paper provides four contributions to research on real-time DOC middleware. First, we illustrate how the CORBA 2.4 real-time and messaging specifications provide a starting point to address the needs of an important class of applications with stringent real-time requirements. Second, we illustrate how the CORBA 2.4 specifications are not sufficient to solve all the issues within this application domain. Third, we describe how we have implemented portions of these specifications, as well as several enhancements, using TAO, which is our open-source real-time CORBA ORB. Finally, we evaluate the performance of TAO empirically to illustrate how its features address the QoS requirements for certain classes of real-time applications. Copyright © 2001 John Wiley & Sons, Ltd.
CPE559:The COIL data mediator definition language
- Author(s):Christian Och,Richard M. Osborne,John Todd,Roger King
- Database Research Laboratory, Department of Computer Science, University of Colorado at Boulder, Boulder, CO 80308 0430, U.S.A.,
- http://dx.doi.org/10.1002/cpe.559
- Abstract:
Modern persistent applications typically run on top of many (often hundreds of) distributed, heterogeneous databases. Integrating data from various data sources in order to provide uniform, homogeneous, and consistent access to the data from the underlying data sources, while preserving the integrity and autonomy of preexisting data sources, is a huge problem, especially in evolving environments. The main issue that a global information sharing system has to address is heterogeneity at all levels of the participating data sources: the data might be stored in heterogeneous data sources, using different data models and access/middleware technologies. Another major type of heterogeneity that has to be addressed by a global information sharing system is due to differences in the semantics of the data from the data sources participating in the integration process. In particular, the resolution of potential conflicts due to structural and semantic heterogeneity of the integrated data is an essential part of the integration process performed by a data integration environment.
The COIL data mediator definition language supports the definition of the integration process and its semantics on a high level of abstraction. The language provides powerful language components which can be used to resolve potential conflicts due to structural and semantic heterogeneity, and to define the integration process in general. The semantics of a specific integration process are defined in a COIL data mediator definition (COIL program). The COIL compiler uses those specifications expressed in the COIL language to generate a standard Java object which captures the semantics of the integration process defined in the COIL mediator definition. The generated Java object is called a COIL data mediator. In general, a COIL data mediator is a user-defined, customized object/component which can be used like any other normal object in the Java development environment. A COIL data mediator presents a well-defined interface which may be used to access and configure the mediator at runtime. Therefore it is possible to change the data mediator's behavior, and the semantics of the integration process in general, at runtime. We present a detailed description of the COIL data mediator definition language and its language components, along with a detailed discussion of how COIL data mediators are used to overcome different heterogeneity problems. Copyright © 2001 John Wiley & Sons, Ltd.
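The mediator pattern the abstract describes can be illustrated with a small sketch. This is not COIL's actual generated code (which is Java produced by the COIL compiler); it is a hypothetical, language-neutral reconstruction of a mediator object that presents one query interface over two heterogeneous sources and whose conflict-resolution semantics can be reconfigured at runtime.

```python
# Illustrative sketch of a data mediator; class and method names are
# invented for this example, not COIL's interface.

class DataMediator:
    def __init__(self, source_a, source_b, resolve=lambda a, b: a):
        self.source_a, self.source_b = source_a, source_b
        self.resolve = resolve          # runtime-configurable semantics

    def get(self, key):
        """Return an integrated value, resolving conflicts between sources."""
        a, b = self.source_a.get(key), self.source_b.get(key)
        if a is not None and b is not None:
            return self.resolve(a, b)   # structural/semantic conflict
        return a if a is not None else b
```

Reassigning `resolve` changes the integration semantics without touching the application code that uses the mediator, which mirrors the runtime reconfigurability the abstract highlights.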
CPE560:A multicast group communication protocol, engine, and bridge for CORBA
- Author(s):L. E. Moser,P. M. Melliar-Smith,P. Narasimhan,R. R. Koch,K. Berket
- Department of Electrical and Computer Engineering, University of California, Santa Barbara, CA 93106, U.S.A.,
- http://dx.doi.org/10.1002/cpe.560
- Abstract:
Multicast group communication is needed for fault-tolerant distributed systems and, in particular, for the new Fault Tolerant CORBA standard, to maintain strong replica consistency. However, different multicast group communication protocols are appropriate for different environments, which makes it difficult to define a single standard multicast protocol. In this paper, we present a multicast group communication engine and bridge for CORBA that allows multiple group communication protocols to be used concurrently. We also present the Fault Tolerant Multicast Protocol, a group communication protocol that allows different fault tolerance systems to interoperate.
The group communication engine and bridge places Lamport timestamps on messages, and multicasts messages to groups, using one or more group communication protocols. The group communication protocols reliably deliver the timestamped messages in timestamp order to the group communication engine, which integrates these streams of messages into a single stream for delivery in timestamp order. The Fault Tolerant Multicast Protocol operates over IP Multicast, and consists of the Reliable Multicast Protocol which provides reliable source-ordered message delivery, the Reliable Ordered Multicast Protocol which provides reliable totally ordered message delivery, and the Host Group Membership Protocol which provides host group membership services. Copyright © 2001 John Wiley & Sons, Ltd.
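The engine's merge step is essentially a k-way merge of already-ordered streams. The sketch below is an illustrative reconstruction, not the paper's implementation: each protocol delivers `(timestamp, message)` pairs in Lamport-timestamp order, and the engine combines them into one totally ordered stream.

```python
# Illustrative sketch of merging per-protocol message streams that are
# each already sorted by Lamport timestamp.
import heapq

def merge_streams(*streams):
    """Merge streams of (timestamp, message) pairs into one ordered stream."""
    return list(heapq.merge(*streams, key=lambda m: m[0]))
```

A real protocol must also break timestamp ties deterministically (for example by source identifier) so that every machine produces the identical total order.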
CPE572:A Java commodity grid kit
- Author(s):Gregor von Laszewski,Ian Foster,Jarek Gawor,Peter Lane
- Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, U.S.A.,
- http://dx.doi.org/10.1002/cpe.572
- Abstract:
Developing advanced applications for the emerging national-scale ‘Computational Grid’ infrastructures is still a difficult task. Although Grid services are available that assist the application developers in authentication, remote access to computers, resource management, and infrastructure discovery, they provide a challenge because these services may not be compatible with the commodity distributed-computing technologies and frameworks used previously.
The Commodity Grid project is working to overcome this difficulty by creating what we call Commodity Grid Toolkits (CoG Kits) that define mappings and interfaces between Grid and particular commodity frameworks. In this paper, we explain why CoG Kits are important, describe the design and implementation of a Java CoG Kit, and use examples to illustrate how CoG Kits can enable new approaches to application development based on the integrated use of commodity and Grid technologies. Copyright © 2001 John Wiley & Sons, Ltd.
CPE573:Strategies for the efficient exploitation of loop-level parallelism in Java
- Preliminary versions of the material presented in this paper appeared in the proceedings of the ACM 2000 Java Grande Conference and The Second Workshop on Java for High Performance Computing (ICS'2000).
- Author(s):José Oliver,Jordi Guitart,Eduard Ayguadé,Nacho Navarro,Jordi Torres
- iSOCO, Intelligent Software Components, S.A.,Computer Architecture Department, Technical University of Catalunya, Barcelona, Spain,
- http://dx.doi.org/10.1002/cpe.573
- Abstract:
This paper analyzes the overheads incurred in the exploitation of loop-level parallelism using Java Threads and proposes some code transformations that minimize them. The transformations avoid the intensive use of Java Threads and reduce the number of classes used to specify the parallelism in the application (which reduces the time for class loading). The use of such transformations results in promising performance gains that may encourage the use of Java for exploiting loop-level parallelism in the framework of OpenMP. On average, the execution time for our synthetic benchmarks is reduced by 50% relative to the simplest transformation when eight threads are used. The paper explores some possible enhancements to the Java threading API oriented towards improving the application-runtime interaction. Copyright © 2001 John Wiley & Sons, Ltd.
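The spirit of the key transformation can be shown compactly. The paper targets Java Threads; Python is used here only for brevity, and the names are illustrative. Instead of creating fresh threads for every parallel loop, a long-lived pool of workers is created once and reused, so per-loop thread-creation overhead is paid a single time.

```python
# Illustrative sketch: reuse one worker pool across all parallel loops
# rather than spawning threads per loop.
from concurrent.futures import ThreadPoolExecutor

POOL = ThreadPoolExecutor(max_workers=8)      # created once, reused

def parallel_sum(data, workers=8):
    """Split a loop's data into per-worker chunks and reduce the results."""
    chunks = [data[i::workers] for i in range(workers)]
    return sum(POOL.map(sum, chunks))
```

Every call to `parallel_sum` reuses the same eight workers, which is the analogue of the paper's avoidance of intensive Java Thread creation.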
CPE574:Special Issue: ACM 2000 Java Grande Conference
- Author(s):Geoffrey Fox
- School of Computational Science and Information Technology, Florida State University
http://dx.doi.org/10.1002/cpe.574
Abstract:
No abstract
CPE575:Interceptors for Java Remote Method Invocation
- Author(s):N. Narasimhan,L. E. Moser,P. M. Melliar-Smith
- Department of Electrical and Computer Engineering, University of California, Santa Barbara, CA 93106, U.S.A.,
- http://dx.doi.org/10.1002/cpe.575
- Abstract:
An interceptor is a software mechanism that provides the hooks that are needed to introduce additional code dynamically into the execution path of an application. By exploiting interceptors, developers can enhance and potentially modify the behavior of an application at runtime without having to revise or recompile the application code. We have identified three distinct interception points for the Java Remote Method Invocation (JavaRMI) model, at the proxy level, the transport level and the shared library level of the JavaRMI model. The interceptors implemented at these interception points employ the DynamicProxy API, the RMISocketFactory API, and a library mediation approach, respectively. Our interceptor implementations are novel in that they are transparent to the application, add nominal latency overheads and are easy to deploy, requiring only minimal modifications to the application. We describe how the interceptors can be exploited to introduce additional services (such as logging and profiling mechanisms) to the JavaRMI runtime. In particular, we describe the use of interceptors in the Aroma System to enhance the existing JavaRMI model with support for fault-tolerance through the consistent replication of JavaRMI objects. Copyright © 2001 John Wiley & Sons, Ltd.
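The proxy-level interception technique is easy to illustrate. The sketch below is analogous in spirit to the paper's use of Java's DynamicProxy API, rendered in Python for brevity: a wrapper introduces an additional service (here, call logging) around every method invocation without revising or recompiling the target object's code.

```python
# Illustrative proxy-level interceptor: all names are invented for this
# example, not the paper's implementation.

class LoggingProxy:
    def __init__(self, target, log):
        self._target, self._log = target, log

    def __getattr__(self, name):
        attr = getattr(self._target, name)
        if not callable(attr):
            return attr
        def intercepted(*args, **kwargs):
            self._log.append(name)       # injected service: record the call
            return attr(*args, **kwargs) # then forward to the real object
        return intercepted
```

The caller uses the proxy exactly as it would use the target, which is the transparency property the abstract emphasizes.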
CPE576:High-performance file I/O in Java: Existing approaches and bulk I/O extensions
- Author(s):Dan Bonachea,Phillip Dickens,Rajeev Thakur
- EECS Department, University of California at Berkeley, Berkeley, CA 94720, U.S.A.,Dept. of Computer Science, Illinois Institute of Technology, 10 West 31st Street, Chicago, IL 60616, U.S.A.,Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, U.S.A.
http://dx.doi.org/10.1002/cpe.576
Abstract:
There is a growing interest in using Java as the language for developing high-performance computing applications. To be successful in the high-performance computing domain, however, Java must not only be able to provide high computational performance, but also high-performance I/O. In this paper, we first examine several approaches that attempt to provide high-performance I/O in Java (many of which are not obvious at first glance) and evaluate their performance on two parallel machines, the IBM SP and the SGI Origin2000. We then propose extensions to the Java I/O library that address the deficiencies in the Java I/O API and improve performance dramatically. The extensions add bulk (array) I/O operations to Java, thereby removing much of the overhead currently associated with array I/O in Java. We have implemented the extensions in two ways: in a standard JVM using the Java Native Interface (JNI) and in a high-performance parallel dialect of Java called Titanium. We describe the two implementations and present performance results that demonstrate the benefits of the proposed extensions. Copyright © 2001 John Wiley & Sons, Ltd.
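The bulk-I/O idea can be sketched with Python's `array` module (used here only as an analogue of the paper's proposed Java extensions; the function names are invented): one bulk call moves the entire array between memory and the file, instead of one call per element.

```python
# Illustrative sketch of bulk array I/O: a single read/write transfers the
# whole array, avoiding per-element call overhead.
from array import array
import io

def write_bulk(f, values):
    """Write a list of doubles to a binary stream in one bulk operation."""
    f.write(array("d", values).tobytes())

def read_bulk(f, n):
    """Read n doubles from a binary stream in one bulk operation."""
    a = array("d")
    a.frombytes(f.read(n * a.itemsize))
    return list(a)
```

In the Java setting the same contrast is between one native bulk transfer and a loop of per-element `readDouble`/`writeDouble` calls, each of which crosses the library boundary.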
CPE577:DISCOVER: An environment for Web-based interaction and steering of high-performance scientific applications
- The DISCOVER collaboratory can be accessed at http://www.discoverportal.org/
- Author(s):V. Mann,V. Matossian,R. Muralidhar,M. Parashar
- Department of Electrical and Computer Engineering, Rutgers University, 94 Brett Road, Piscataway, NJ 08854, U.S.A.,
- http://dx.doi.org/10.1002/cpe.577
- Abstract:
This paper presents the design, implementation, and deployment of the DISCOVER Web-based computational collaboratory. Its primary goal is to bring large distributed simulations to the scientists'/engineers' desktop by providing collaborative Web-based portals for monitoring, interaction and control. DISCOVER supports a three-tier architecture composed of detachable thin-clients at the front-end, a network of interaction servers in the middle, and a control network of sensors, actuators, and interaction agents at the back-end. The interaction servers enable clients to connect and collaboratively interact with registered applications using a browser. The application control network enables sensors and actuators to be encapsulated within, and directly deployed with the computational objects. The application interaction gateway manages overall interaction. It uses Java Native Interface to create Java proxies that mirror computational objects and allow them to be directly accessed at the interaction server. Security and authentication are provided using customizable access control lists and SSL-based secure servers. Copyright © 2001 John Wiley & Sons, Ltd.
CPE578:HBench:Java: An application-specific benchmarking framework for Java Virtual Machines
- Author(s):Xiaolan Zhang,Margo Seltzer
- Division of Engineering and Applied Sciences, Harvard University, 33 Oxford Street, Cambridge, MA 02138, U.S.A.,
- http://dx.doi.org/10.1002/cpe.578
- Abstract:
Java applications represent a broad class of programs, ranging from programs running on embedded products to high-performance server applications. Standard Java benchmarks ignore this fact and assume a fixed workload. When an actual application's behavior differs from that included in a standard benchmark, the benchmark results are useless, if not misleading. In this paper, we present HBench:Java, an application-specific benchmarking framework, based on the concept that a system's performance must be measured in the context of the application of interest. HBench:Java employs a methodology that uses vectors to characterize the application and the underlying Java Virtual Machine (JVM) and carefully combines the two vectors to form a single metric that reflects a specific application's performance on a particular JVM such that the performance of multiple JVMs can be realistically compared. Our performance results demonstrate HBench:Java's superiority over traditional benchmarking approaches in predicting relative performance of real applications and its ability to pinpoint performance problems, even with a simplified vector. Copyright © 2001 John Wiley & Sons, Ltd.
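The vector methodology reduces, in its simplest form, to a dot product: the application vector records how often each primitive operation is performed, the JVM vector records the cost of each primitive on a given JVM, and combining them yields one application-specific metric per JVM. The sketch below is an illustrative simplification with made-up numbers, not HBench:Java's actual vectors.

```python
# Illustrative sketch: combine an application's demand vector with a JVM's
# cost vector into a single predicted-performance metric.

def predicted_time(app_vector, jvm_vector):
    """Dot product of per-primitive operation counts and per-primitive costs."""
    return sum(count * cost for count, cost in zip(app_vector, jvm_vector))
```

Given the same application vector, the JVM with the smaller predicted time is the better match for that specific application, even if a fixed-workload benchmark would rank the JVMs the other way.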
CPE579:An OpenMP-like interface for parallel programming in Java
- Author(s):M. E. Kambites,J. Obdržálek,J. M. Bull
- Department of Mathematics, University of York, Heslington, York YO10 5DD, England, U.K.,Faculty of Informatics, Masaryk University, Botanická 68a, 602 00 Brno, Czech Republic,Edinburgh Parallel Computing Centre, University of Edinburgh, Mayfield Road, Edinburgh EH9 3JZ, Scotland, U.K.
http://dx.doi.org/10.1002/cpe.579
Abstract:
This paper describes the definition and implementation of an OpenMP-like set of directives and library routines for shared memory parallel programming in Java. A specification of the directives and routines is proposed and discussed. A prototype implementation, consisting of a compiler and a runtime library, both written entirely in Java, is presented, which implements most of the proposed specification. Some preliminary performance results are reported. Copyright © 2001 John Wiley & Sons, Ltd.
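The scheduling work such a runtime library performs for a parallel loop directive can be sketched as follows (an illustrative reconstruction in Python, not the paper's Java library): the iteration space is statically split into near-equal contiguous blocks, one per thread.

```python
# Illustrative static scheduling of a parallel loop's iteration space.

def static_schedule(n_iterations, n_threads):
    """Return half-open (lo, hi) iteration bounds for each thread id."""
    return [(tid * n_iterations // n_threads,
             (tid + 1) * n_iterations // n_threads)
            for tid in range(n_threads)]
```

The integer-division formula guarantees the blocks tile the iteration space exactly and differ in size by at most one iteration, which is the usual default schedule of OpenMP-style systems.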
CPE580:Using JavaNws to compare C and Java TCP-Socket performance
- Author(s):Chandra Krintz,Rich Wolski
- Department of Computer Science and Engineering, University of California, San Diego, U.S.A.,Computer Science Department, University of Tennessee, Knoxville, U.S.A.
http://dx.doi.org/10.1002/cpe.580
Abstract:
As research and implementation continue to facilitate high-performance computing in Java, applications can benefit from resource management and prediction tools. In this work, we present such a tool for network round-trip time and bandwidth between a user's desktop and any machine running a Web server (this assumes that the user's browser is capable of supporting Java 1.1 and above). JavaNws is a Java implementation and extension of a powerful subset of the Network Weather Service (NWS), a performance prediction toolkit that dynamically characterizes and forecasts the performance available to an application. However, due to the Java language implementation and functionality (portability, security, etc.), it is unclear whether a Java program is able to measure and predict the network performance experienced by C applications with the same accuracy as an equivalent C program. We provide a quantitative equivalence study of the Java and C TCP-socket interface and show that the data collected by JavaNws is as predictable as that collected by the NWS (using C). Copyright © 2001 John Wiley & Sons, Ltd.
CPE581:Parallel application experience with replicated method invocation
- Author(s):Jason Maassen,Thilo Kielmann,Henri E. Bal
- Division of Mathematics and Computer Science, Vrije Universiteit, Amsterdam, The Netherlands,
- http://dx.doi.org/10.1002/cpe.581
- Abstract:
We describe and evaluate a new approach to object replication in Java, aimed at improving the performance of parallel programs. Our programming model allows the programmer to define groups of objects that can be replicated and updated as a whole, using reliable, totally-ordered broadcast to send update methods to all machines containing a copy. The model has been implemented in the Manta high-performance Java system. We evaluate system performance both with microbenchmarks and with a set of five parallel applications. For the applications, we also evaluate ease of programming, compared to RMI implementations. We present performance results for a Myrinet-based workstation cluster as well as for a wide-area distributed system consisting of four such clusters. The microbenchmarks show that updating a replicated object on 64 machines only takes about three times the RMI latency in Manta. Applications using Manta's object replication mechanism perform at least as fast as manually optimized versions based on RMI, while keeping the application code as simple as with naive versions that use shared objects without taking locality into account. Using a replication mechanism in Manta's runtime system enables several unmodified applications to run efficiently even on the wide-area system. Copyright © 2001 John Wiley & Sons, Ltd.
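The consistency argument behind this model is that if every replica applies the same update methods in the same total order, all copies converge to an identical state. The sketch below is an illustrative reconstruction of that invariant, not Manta's implementation, which performs the broadcast beneath ordinary method invocation.

```python
# Illustrative sketch: replicas stay consistent because updates are applied
# in one agreed total order at every copy.

class Replica:
    def __init__(self):
        self.state = 0
    def apply(self, update):
        self.state = update(self.state)

def ordered_broadcast(replicas, updates):
    """Deliver each update to every replica, in the same order everywhere."""
    for u in updates:            # the totally ordered update sequence
        for r in replicas:
            r.apply(u)
```

Note that the order matters: `+5` then `*2` yields 10 from an initial state of 0, whereas `*2` then `+5` would yield 5, which is why a merely reliable (unordered) broadcast is not sufficient.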
CPE586:Development and performance analysis of real-world applications for distributed and parallel architectures
- Author(s):T. Fahringer,P. Blaha,A. Hössinger,J. Luitz,E. Mehofer,H. Moritsch,B. Scholz
- Institute for Software Science, University of Vienna, Liechtensteinstrasse 22, A-1090 Vienna, Austria,Institute of Physical and Theoretical Chemistry, Vienna University of Technology, Getreidemarkt 9/156, A-1060 Vienna, Austria,Institute for Microelectronics, Vienna University of Technology, Gusshausstr. 27-29/E 360, A-1040 Vienna, Austria,Department of Business Administration, University of Vienna, Brünner Strasse 72, A-1210 Vienna, Austria,
- http://dx.doi.org/10.1002/cpe.586
- Abstract:
Several large real-world applications have been developed for distributed and parallel architectures. We examine two different program development approaches. The first is the use of a high-level programming paradigm, which reduces the time to create a parallel program dramatically, but sometimes at the cost of reduced performance; a source-to-source compiler has been employed to automatically compile programs written in a high-level programming paradigm into message-passing codes. The second is manual program development using a low-level programming paradigm, such as message passing, which enables the programmer to fully exploit a given architecture at the cost of a time-consuming and error-prone effort. Performance tools play a central role in supporting the performance-oriented development of applications for distributed and parallel architectures. SCALA (a portable instrumentation, measurement, and post-execution performance analysis system for distributed and parallel programs) has been used to analyze and to guide the application development, by selectively instrumenting and measuring the code versions, by comparing performance information of several program executions, by computing a variety of important performance metrics, by detecting performance bottlenecks, and by relating performance information back to the input program. We show several experiments of SCALA when applied to real-world applications. These experiments are conducted for a NEC Cenju-4 distributed-memory machine and a cluster of heterogeneous workstations and networks. Copyright © 2001 John Wiley & Sons, Ltd.
CPE587:Simulating asynchronous hardware on multiprocessor platforms: the case of AMULET1
- Author(s):Georgios K. Theodoropoulos
- School of Computer Science, The University of Birmingham, Birmingham B15 2TT, U.K.
http://dx.doi.org/10.1002/cpe.587
Abstract:
Synchronous VLSI design is approaching a critical point, with clock distribution becoming an increasingly costly and complicated issue and power consumption rapidly emerging as a major concern. Hence, recently, there has been a resurgence of interest in asynchronous digital design techniques which promise to liberate digital design from the inherent problems of synchronous systems. This activity has revealed a need for modelling and simulation techniques suitable for the asynchronous design style. The concurrent process algebra communicating sequential processes (CSP) and its executable counterpart, occam, are increasingly advocated as particularly suitable for this purpose. This paper focuses on issues related to the execution of CSP/occam models of asynchronous hardware on multiprocessor machines, including I/O, monitoring and debugging, partition, mapping and load balancing. These issues are addressed in the context of occarm, an occam simulation model of the AMULET1 asynchronous microprocessor; however, the solutions devised are more general and may be applied to other systems too. Copyright © 2001 John Wiley & Sons, Ltd.
CPE588:Scalability and performance of OpenMP and MPI on a 128-processor SGI Origin 2000
- Author(s):Glenn R. Luecke,Wei-Hua Lin
- 291 Durham Center, Iowa State University, Ames, Iowa 50011-2251, U.S.A.,
- http://dx.doi.org/10.1002/cpe.588
- Abstract:
The purpose of this paper is to investigate the scalability and performance of seven simple OpenMP test programs and to compare their performance with equivalent MPI programs on an SGI Origin 2000. Data distribution directives were used to make sure that the OpenMP implementation had the same data distribution as the MPI implementation. For the matrix-times-vector (test 5) and the matrix-times-matrix (test 7) tests, the syntax allowed in OpenMP 1.1 prevents OpenMP compilers from generating efficient code, since the reduction clause is not currently allowed for arrays. (This problem is corrected in OpenMP 2.0.) For the remaining five tests, the OpenMP version performed and scaled significantly better than the corresponding MPI implementation, except for the right shift test (test 2) for a small message. Copyright © 2001 John Wiley & Sons, Ltd.
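The standard workaround for the missing array reduction (before OpenMP 2.0 allowed `reduction` on arrays) is for each thread to accumulate into a private copy of the result array and then combine the copies element-wise in a final step. The sketch below shows only that combining step, in Python for brevity; the tests themselves are OpenMP programs.

```python
# Illustrative sketch of the array-reduction workaround: combine per-thread
# private partial-result arrays element-wise.

def reduce_private_arrays(private_copies):
    """Sum a list of equal-length per-thread partial arrays element-wise."""
    n = len(private_copies[0])
    return [sum(copy[i] for copy in private_copies) for i in range(n)]
```

The extra pass over the private copies, plus their memory footprint, is exactly the inefficiency the abstract attributes to the OpenMP 1.1 restriction.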
CPE606:Special Issue: Object-oriented Databases
CPE607:Distributed data integration by object-oriented mediator servers
- Author(s):Tore Risch,Vanja Josifovski
- Department of Information Science, Uppsala University, P.O. Box 513, SE-751 20 Uppsala, Sweden,
- http://dx.doi.org/10.1002/cpe.607
- Abstract:
Integration of data from autonomous, distributed and heterogeneous data sources poses several technical challenges. This paper overviews the data integration system AMOS II based on the wrapper-mediator approach. AMOS II consists of: (i) a mediator database engine that can process and execute queries over data stored locally and in several external data sources, and (ii) object-oriented (OO) multi-database views for reconciliation of data and schema heterogeneities among sources with various capabilities. The data stored in different types of data sources is translated and integrated using OO mediation primitives, providing the user with a consistent view of the data in all the sources. Through its multi-database facilities many distributed AMOS II systems can interoperate in a federation. Since most data reside in the data sources, and to achieve high performance, the core of the system is a main-memory DBMS having a storage manager, query optimizer, transactions, client-server interface, disk backup, etc. The AMOS II data manager is optimized for main-memory access and is extensible so that new data types and query operators can be added or implemented in some external programming language. The extensibility is essential for providing seamless access to a variety of data sources. Copyright © 2001 John Wiley & Sons, Ltd.
CPE608:Consistency management in object-oriented databases
- Author(s):H. Oakasha,S. Conrad,G. Saake
- Cairo University, Fayoum Branch, Faculty of Science, Fayoum, Egypt,Universität München, Institut für Informatik, Oettingenstrasse 67, 80538 München, Germany,Universität Magdeburg, Fakultät für Informatik, Postfach 4120, 39016 Magdeburg, Germany
http://dx.doi.org/10.1002/cpe.608
Abstract:
The paper presents concepts and ideas underlying an approach for consistency management in object-oriented (OO) databases. In this approach constraints are considered as first-class citizens and stored in a meta-database called the constraints catalog. When an object is created, the constraints of this object are retrieved from the constraints catalog and relationships between these constraints and the object are established. The structure of constraints has several features that enhance consistency management in OO database management systems in ways that conventional approaches do not satisfactorily support. These include: monitoring object consistency at different levels of update granularity, integrity independence, and efficiency of constraint maintenance; controlling inconsistent objects; enabling and disabling constraints, globally for all objects or locally for individual objects; and declaring constraints on individual objects. All these features are provided by means of basic notations of OO data models. Copyright © 2001 John Wiley & Sons, Ltd.
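The catalog idea can be sketched briefly. This is an illustrative reconstruction with invented names, not the paper's design: constraints are objects in their own right, attached to instances at creation time, and can be individually enabled or disabled at runtime.

```python
# Illustrative sketch: constraints as first-class objects retrieved from a
# catalog and linked to each new instance.

class Constraint:
    def __init__(self, name, predicate):
        self.name, self.predicate = name, predicate
        self.enabled = True                  # can be toggled at runtime

class ManagedObject:
    def __init__(self, value, catalog):
        self.value = value
        self.constraints = list(catalog)     # retrieved on creation

    def consistent(self):
        """An object is consistent if every enabled constraint holds."""
        return all(c.predicate(self.value)
                   for c in self.constraints if c.enabled)
```

Because the object holds references to shared constraint instances, disabling a constraint in the catalog takes effect for every object linked to it, which corresponds to the global enabling/disabling the abstract lists.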
CPE609:A temporal object-oriented model for digital libraries of documents
- Author(s):M. J. Aramburu Cabo,R. Berlanga Llavori
- Departamento de Informática, Universitat Jaume I, Castellón, Spain
http://dx.doi.org/10.1002/cpe.609
light wiley document
wiley 2002 full documents
Abstract:
This paper presents a model for the representation and retrieval of structured documents that takes their temporal properties into account. The purpose of this model is to serve as a platform for the development of digital library applications. Thus, it consists of both a new data model and a query language that are specially adapted to the requirements of these applications. The main elements of the data model are a flexible type system for structured documents, and two temporal dimensions that represent the temporal properties of documents and the evolution of the database schema. The query language allows the retrieval of documents by specifying conditions on their structure, contents and temporal properties. It has been designed to exploit the temporal information stored in a large digital library, making it possible to relate document contents in time, as well as to analyse the evolution of topics. The paper also includes some guidelines for the efficient implementation of databases of structured documents that adopt the proposed data and query models. Copyright © 2001 John Wiley & Sons, Ltd.
CPE610:Compensation methods to support cooperative applications: A case study in automated verification of schema requirements for an advanced transaction model
- Author(s):David Spelt,Susan Even
- Centre for Telematics and Information Technology, University of Twente, Enschede, The Netherlands
http://dx.doi.org/10.1002/cpe.610
light wiley document
wiley 2002 full documents
Abstract:
Compensation plays an important role in advanced transaction models, cooperative work and workflow systems. A schema designer is typically required to supply for each transaction $T$ another transaction $T^{-1}$ to semantically undo the effects of $T$. Little attention has been paid to the verification of the desirable properties of such operations, however. This paper demonstrates the use of a higher-order logic theorem prover for verifying that compensating transactions return a database to its original state. It is shown how an OODB schema is translated to the language of the theorem prover so that proofs can be performed on the compensating transactions. Copyright © 2001 John Wiley & Sons, Ltd.
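The property being verified above — that $T^{-1}$ semantically undoes $T$ — can be illustrated without a theorem prover by checking it on sample states. The paper itself uses a higher-order logic prover over a translated OODB schema; the sketch below is only a testing-based analogue with invented transaction names, checking that `t_inv(t(s)) == s` for every sample state `s`.

```python
# Testing-based analogue of compensation verification; all names invented.
import copy

def deposit(db, acct, amount):
    # Transactions are modelled as pure functions from state to state.
    db = copy.deepcopy(db)
    db[acct] += amount
    return db

def withdraw(db, acct, amount):
    db = copy.deepcopy(db)
    db[acct] -= amount
    return db

def compensates(t, t_inv, states):
    """Check that t_inv semantically undoes t on the given sample states."""
    return all(t_inv(t(s)) == s for s in states)

states = [{"alice": 100, "bob": 0}, {"alice": 0, "bob": 50}]
t = lambda s: deposit(s, "alice", 10)
t_inv = lambda s: withdraw(s, "alice", 10)
print(compensates(t, t_inv, states))   # True: the withdrawal compensates the deposit
```

A theorem prover establishes this equality for *all* states, not just sampled ones, which is the point of the paper's approach.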
CPE589:PoLAPACK: parallel factorization routines with algorithmic blocking
- Author(s):Jaeyoung Choi
- School of Computing, Soongsil University, 1-1, Sangdo-Dong, Dongjak-Ku, Seoul 156-743, Korea
http://dx.doi.org/10.1002/cpe.589
light wiley document
wiley 2002 full documents
Abstract:
LU, QR, and Cholesky factorizations are the most widely used methods for solving dense linear systems of equations, and have been extensively studied and implemented on vector and parallel computers. Most of these factorization routines are implemented with block-partitioned algorithms in order to perform matrix-matrix operations, that is, to obtain the highest performance by maximizing the reuse of data in the upper levels of memory, such as cache. Since parallel computers have different ratios of computation to communication performance, the optimal computational block size differs from machine to machine. Therefore, the data matrix should be distributed with the machine-specific optimal block size before the computation. Too small or too large a block size makes achieving good performance on a machine nearly impossible. In such a case, obtaining better performance may require a complete redistribution of the data matrix.
In this paper, we present parallel LU, QR, and Cholesky factorization routines with 'algorithmic blocking' on a two-dimensional block cyclic data distribution. With algorithmic blocking, it is possible to obtain near-optimal performance irrespective of the physical block size. The routines are implemented on the Intel Paragon and the SGI/Cray T3E and compared with the corresponding ScaLAPACK factorization routines. Copyright © 2001 John Wiley & Sons, Ltd.
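The two-dimensional block-cyclic distribution underlying these routines has a simple ownership rule: the matrix is cut into nb × nb blocks, and block (I, J) lives on process (I mod P, J mod Q) of a P × Q process grid. A minimal sketch, with an illustrative `owner` function (the point of algorithmic blocking is that the *computational* block size need not equal this physical nb):

```python
# Ownership rule of the 2D block-cyclic distribution; function name invented.
def owner(i, j, nb, P, Q):
    """Process-grid coordinates owning global matrix entry (i, j),
    for physical block size nb on a P x Q process grid."""
    return ((i // nb) % P, (j // nb) % Q)

# With nb = 2 on a 2 x 2 grid, entries cycle over the processes block-wise:
print(owner(0, 0, 2, 2, 2))   # (0, 0)
print(owner(0, 2, 2, 2, 2))   # (0, 1): next block column, next process column
print(owner(4, 4, 2, 2, 2))   # (0, 0): wraps around cyclically
```

Because nb is fixed at distribution time, changing it normally means redistributing the whole matrix — the situation the paper's algorithmic blocking avoids.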
CPE590:Parallel versions of Stone's strongly implicit algorithm
- Author(s):J. S. Reeve,A. D. Scurr,J. H. Merlin
- Department of Electronics and Computer Science, University of Southampton, Southampton SO17 1BJ, U.K.
http://dx.doi.org/10.1002/cpe.590
light wiley document
wiley 2002 full documents
Abstract:
In this paper, we describe various methods of deriving a parallel version of Stone's Strongly Implicit Procedure (SIP) for solving the sparse linear equations that arise from finite difference approximations to partial differential equations (PDEs). Sequential versions of this algorithm have been very successful in solving semiconductor, heat-conduction and flow-simulation problems, and an efficient parallel version would enable much larger simulations to be run. An initial investigation of various parallelizing strategies was undertaken using a version of High Performance Fortran (HPF), and the best methods were reprogrammed using the MPI message-passing libraries for increased efficiency. Early attempts concentrated on developing a parallel version of the characteristic wavefront computation pattern of the existing sequential SIP code. However, a red-black ordering of grid points, similar to that used in parallel versions of the Gauss-Seidel algorithm, is shown to be far more efficient. The results of both the wavefront and red-black MPI-based algorithms are reported for various problem sizes and numbers of processors on a sixteen-node IBM SP2. Copyright © 2001 John Wiley & Sons, Ltd.
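The red-black ordering this abstract refers to can be sketched on a simpler problem than the full SIP factorization. Grid points are coloured like a chessboard; every red point depends only on black neighbours and vice versa, so all points of one colour can be updated concurrently — no wavefront dependency chain. The sketch below applies the idea to Gauss-Seidel on the 2D Laplace equation, which is only an illustration of the ordering, not of SIP itself.

```python
# Red-black ordered Gauss-Seidel sweep for Laplace's equation (illustration).
import numpy as np

def red_black_sweep(u):
    """One sweep: update all points of one colour, then the other.
    Within a colour, updates are independent and could run in parallel."""
    for colour in (0, 1):
        for i in range(1, u.shape[0] - 1):
            for j in range(1, u.shape[1] - 1):
                if (i + j) % 2 == colour:
                    u[i, j] = 0.25 * (u[i-1, j] + u[i+1, j]
                                      + u[i, j-1] + u[i, j+1])
    return u

u = np.zeros((8, 8))
u[0, :] = 1.0                  # fixed boundary: top edge held at 1
for _ in range(200):
    u = red_black_sweep(u)
print(round(float(u[4, 4]), 3))
```

In an MPI implementation each processor owns a subgrid, exchanges only the halo of the opposite colour between half-sweeps, and never waits on a diagonal wavefront — which is why the paper finds this ordering far more efficient.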
CPE591:Real-time multi-spectral image fusion
- Author(s):Tiranee Achalakul,Stephen Taylor
- 5568 Airport Road, Roanoke, VA 24012, U.S.A.,2-106 CST building, Syracuse University, Syracuse, NY 13244, U.S.A.
http://dx.doi.org/10.1002/cpe.591
light wiley document
wiley 2002 full documents
Abstract:
This paper describes a novel real-time multi-spectral imaging capability for surveillance applications. The capability combines a new high-performance multi-spectral camera system with a distributed algorithm that computes a spectral-screening principal component transform (PCT). The camera system uses a novel filter wheel design together with a high-bandwidth CCD camera to allow image cubes to be delivered at 110 frames s$^{-1}$ with a spectral coverage between 400 and 1000 nm. The filters used in a particular application are selected to highlight a particular object based on its spectral signature. The distributed algorithm allows image streams from a dispersed collection of cameras to be disseminated, viewed, and interpreted by a distributed group of analysts in real-time. It operates on networks of commercial-off-the-shelf multiprocessors connected with high-performance (e.g. gigabit) networking, taking advantage of multi-threading where appropriate. The algorithm uses a concurrent formulation of the PCT to de-correlate and compress a multi-spectral image cube. Spectral screening is used to give features that occur infrequently (e.g. mechanized vehicles in a forest) equal importance to those that occur frequently (e.g. trees in the forest). A human-centered color-mapping scheme is used to maximize the impact of spectral contrast on the human visual system.
To demonstrate the efficacy of the multi-spectral system, plant-life scenes with both real and artificial foliage are used. These scenes demonstrate the system's ability to distinguish elements of a scene that cannot be distinguished with the naked eye. The capability is evaluated in terms of visual performance, scalability, and real-time throughput. Our previous work on predictive analytical modeling is extended to answer practical design questions such as 'For a specified cost, what system can be constructed and what performance will it attain?' Copyright © 2001 John Wiley & Sons, Ltd.
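The PCT step at the heart of this system can be sketched compactly: treat every pixel's spectrum as a sample vector, diagonalize the band-to-band covariance matrix, and project onto its eigenvectors, so the output bands are de-correlated and most energy is compacted into the leading components. The sketch below shows only that core transform on synthetic data; the spectral-screening and distributed-processing aspects of the paper are omitted.

```python
# Principal component transform of a multi-spectral image cube (sketch).
import numpy as np

rng = np.random.default_rng(0)
bands, h, w = 6, 16, 16
cube = rng.normal(size=(bands, h, w))   # synthetic stand-in for camera data
cube[1] += 0.8 * cube[0]                # introduce inter-band correlation

X = cube.reshape(bands, -1)             # one column per pixel spectrum
X = X - X.mean(axis=1, keepdims=True)   # center each band
cov = X @ X.T / (X.shape[1] - 1)        # band-to-band covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)  # symmetric eigendecomposition
Y = eigvecs.T @ X                       # transformed, de-correlated bands

out_cov = Y @ Y.T / (Y.shape[1] - 1)
off_diag = out_cov - np.diag(np.diag(out_cov))
print(np.max(np.abs(off_diag)) < 1e-9)  # True: output bands are uncorrelated
```

Compression follows by keeping only the components with the largest eigenvalues; the paper's spectral screening biases the sample set so that rare spectra influence `cov` as strongly as common ones.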
CPE592:Space-time tradeoffs for parallel 3D reconstruction algorithms for virus-structure determination
- Author(s):Dan C. Marinescu,Yongchang Ji,Robert E. Lynch
- Department of Computer Sciences, Purdue University, West Lafayette, IN 47907, U.S.A.
http://dx.doi.org/10.1002/cpe.592
light wiley document
wiley 2002 full documents
Abstract:
The 3D electron-density determination of viruses, from experimental data provided by electron microscopy, is a data-intensive computation that requires the use of clusters of PCs or parallel computers. In this paper we report on three parallel algorithms used for 3D reconstruction of asymmetric objects from their 2D projections. We discuss their computational, communication, I/O, and space requirements and present some performance data. The algorithms are general and can be used for 3D reconstruction of asymmetric objects for applications other than structural biology. Copyright © 2001 John Wiley & Sons, Ltd.
CPE593:Lesser Bear-a light-weight process library for SMP computers
- Author(s):Hisashi Oguma,Yasuichi Nakayama
- Department of Computer Science, The University of Electro-Communications, Chofu, Tokyo 182-8585, Japan
http://dx.doi.org/10.1002/cpe.593
light wiley document
wiley 2002 full documents
Abstract:
We have designed and implemented a light-weight process (thread) library called 'Lesser Bear' for SMP computers. Lesser Bear has high portability and thread-level parallelism. Creating UNIX processes as virtual processors and a memory-mapped file as a huge shared-memory space enables Lesser Bear to execute threads in parallel. Lesser Bear requires exclusive operation between peer virtual processors, and treats the shared-memory space as a critical section for synchronization of threads. As a result, the thread functions of the previous version of Lesser Bear are serialized. In this paper, we present a scheduling mechanism to execute thread functions in parallel. In the design of the proposed mechanism, we divide the entire shared-memory space into partial spaces for the virtual processors, and prepare two queues (a Protect Queue and a Waiver Queue) for each partial space. We adopt an algorithm in which lock operations are not necessary for enqueueing. This algorithm allows us to propose a scheduling mechanism that can reduce the scheduling overhead. The mechanism is applied to Lesser Bear and evaluated experimentally. Copyright © 2001 John Wiley & Sons, Ltd.
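The two-queue arrangement described in this abstract can be sketched as follows. Each virtual processor owns a Protect Queue that only it dequeues from, plus a Waiver Queue that peer processors append ready threads to; the owner periodically drains the Waiver Queue into its Protect Queue. This is only an illustration of the structure, with invented names — `collections.deque` here merely stands in for the paper's lock-free enqueue algorithm.

```python
# Per-virtual-processor two-queue scheduling structure (illustrative sketch).
from collections import deque

class VirtualProcessor:
    def __init__(self, vp_id):
        self.vp_id = vp_id
        self.protect_q = deque()   # private: only this VP dequeues from it
        self.waiver_q = deque()    # peers hand threads over through this queue

    def submit(self, thread):
        # Called by *other* virtual processors; in the paper this enqueue
        # needs no lock operations.
        self.waiver_q.append(thread)

    def next_thread(self):
        # Owner-only path: drain handed-over work, then schedule FIFO.
        while self.waiver_q:
            self.protect_q.append(self.waiver_q.popleft())
        return self.protect_q.popleft() if self.protect_q else None

vp = VirtualProcessor(0)
vp.submit("t1")
vp.submit("t2")
print(vp.next_thread())   # t1
print(vp.next_thread())   # t2
print(vp.next_thread())   # None: both queues are empty
```

Splitting the shared space per virtual processor is what removes contention: the only cross-processor traffic is the lock-free append to a Waiver Queue.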
CPE595:Special issue: formal techniques for Java programs
- Author(s):Susan Eisenbach,Gary T. Leavens
- Imperial College, London,Iowa State University
http://dx.doi.org/10.1002/cpe.595
light wiley document
wiley 2002 full documents
Abstract:
No abstract
CPE596:Type safety in the JVM: some problems in Java 2 SDK 1.2 and proposed solutions
- Author(s):Alessandro Coglio,Allen Goldberg
- Kestrel Institute, 3260 Hillview Avenue, Palo Alto, CA 94304, U.S.A.,Shoulders Corporation, 800 El Camino Real, Mountain View, CA 94040, U.S.A.
http://dx.doi.org/10.1002/cpe.596
light wiley document
wiley 2002 full documents
Abstract:
In the course of our work in developing formal specifications for components of the Java Virtual Machine (JVM), we have uncovered subtle bugs in the bytecode verifier of Sun's Java 2 SDK 1.2. These bugs, which lead to type safety violations, relate to the naming of reference types. Under certain circumstances, these names can be spoofed through delegating class loaders. These flaws expose some inaccuracies and ambiguities in the JVM specification.
We propose several solutions to all of these bugs. In particular, we propose a general solution that makes use of subtype loading constraints. Such constraints complement the equality loading constraints introduced in the Java 2 Platform, and are posted by the bytecode verifier when checking assignment compatibility of class types. By posting constraints instead of resolving and loading classes, the bytecode verifier in our solution has a cleaner interface with the rest of the JVM, and allows lazier loading. We sketch some excerpts of our mathematical formalization of this approach and of its type safety results. Copyright © 2001 John Wiley & Sons, Ltd.
CPE597:Verified lightweight bytecode verification
- Author(s):Gerwin Klein,Tobias Nipkow
- Department of Informatics, Technische Universität München, 80290 Munich, Germany
http://dx.doi.org/10.1002/cpe.597
light wiley document
wiley 2002 full documents
Abstract:
Eva and Kristoffer Rose proposed a (sparse) annotation of Java Virtual Machine code with types to enable a one-pass verification of well-typedness. We have formalized a variant of their proposal in the theorem prover Isabelle/HOL and proved soundness and completeness. Copyright © 2001 John Wiley & Sons, Ltd.
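The idea behind the lightweight verification formalized above can be sketched on a toy stack machine. Because the code ships with type annotations at jump targets, the checker needs only a single forward pass: at each annotated point it compares the inferred stack type against the annotation instead of iterating to a dataflow fixpoint. The instruction set and function below are invented for illustration and are far simpler than JVM bytecode.

```python
# One-pass type checking of annotated stack code (toy illustration).
def verify(code, annotations):
    """code: list of (op, arg) pairs for a toy stack machine;
    annotations: {target_pc: expected stack type, as a list of type names}."""
    stack = []
    for pc, (op, arg) in enumerate(code):
        if pc in annotations:                 # merge point: check, then adopt
            if stack != annotations[pc]:
                return False
            stack = list(annotations[pc])
        if op == "push_int":
            stack.append("int")
        elif op == "push_ref":
            stack.append("ref")
        elif op == "add":                     # requires two ints on top
            if stack[-2:] != ["int", "int"]:
                return False
            stack = stack[:-2] + ["int"]
        elif op == "goto":                    # one pass: just check the
            if stack != annotations[arg]:     # target's annotation
                return False
    return True

ok = [("push_int", None), ("push_int", None), ("add", None)]
bad = [("push_int", None), ("push_ref", None), ("add", None)]
print(verify(ok, {}))    # True: well-typed straight-line code
print(verify(bad, {}))   # False: 'add' applied to a reference
```

A conventional verifier would instead compute the stack type at every merge point itself, revisiting instructions until a fixpoint is reached; the annotations trade a little code size for that iteration, which is what makes the scheme attractive for resource-constrained JVMs.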
CPE598:Hoare logic for Java in Isabelle/HOL
- Author(s):David von Oheimb
- Institut für Informatik, Technische Universität München, D 80290 München, Germany
http://dx.doi.org/10.1002/cpe.598
light wiley document
wiley 2002 full documents
Abstract:
This article presents a Hoare-style calculus for a substantial subset of Java Card, which we call Java$^{\ell ight}$. In particular, the language includes side-effecting expressions, mutual recursion, dynamic method binding, full exception handling, and static class initialization.
The Hoare logic of partial correctness is proved not only sound (w.r.t. our operational semantics of Java$^{\ell ight}$, described in detail elsewhere) but even complete. It is the first logic for an object-oriented language that is provably complete. The completeness proof uses a refinement of the Most General Formula approach. The proof of soundness gives new insights into the role of type safety. Further by-products of this work are a new general methodology for handling side-effecting expressions and their results, the discovery of the strongest possible rule of consequence, and a flexible Call rule for mutual recursion. We also give a small but non-trivial application example.
All definitions and proofs have been done formally with the interactive theorem prover Isabelle/HOL. This guarantees not only rigorous definitions, but also gives maximal confidence in the results obtained. Copyright © 2001 John Wiley & Sons, Ltd.
CPE599:Java access protection through typing
- Author(s):Eva Rose,Kristoffer Høgsbro Rose
- GIE Dyade, INRIA-Rocquencourt, Domaine de Voluceau, Rocquencourt B.P.105, 78153 Le Chesnay, France,IBM T. J. Watson Research Center, 30 Saw Mill River Road, Hawthorne, NY 10532, U.S.A.
http://dx.doi.org/10.1002/cpe.599
light wiley document
wiley 2002 full documents
Abstract:
We propose an integration of field access rights into the Java type system such that those access permission checks which are now performed dynamically (at run time) can instead be done statically, i.e. checked by the Java compiler and rechecked (at link time) by the bytecode verifier. We explain how this can be extended to remove all dynamic checks of field read access rights, completely eliminating the overhead of get methods for reading the value of a field. The improvements include fast static lookup instead of dynamic dispatch for field access (without requiring a sophisticated inlining analysis), avoidance of the space required by get methods, and prevention of denial-of-service attacks on field access. We sketch a formalization of adding field access to the bytecode verifier which will make it possible to prove that the change is safe and backwards compatible. Copyright © 2001 John Wiley & Sons, Ltd.
CPE594:Open consensus
- Author(s):Romain Boichat,Svend Frølund,Rachid Guerraoui
- Swiss Federal Institute of Technology, Lausanne, Switzerland,Hewlett-Packard Laboratories, Palo Alto, CA, U.S.A.
http://dx.doi.org/10.1002/cpe.594
light wiley document
wiley 2002 full documents
Abstract:
This paper presents the abstraction of open consensus and argues for its use as an effective component for building reliable agreement protocols in practical asynchronous systems where processes and links can crash and recover. The specification of open consensus has a decoupled, on-demand and re-entrant flavour that makes its use very efficient, especially in terms of forced logs, which are known to be major sources of overhead in distributed systems. We illustrate the use of open consensus as a basic building block to develop a modular, yet efficient, total-order broadcast protocol. Finally, we describe our Java implementation of the open-consensus abstraction and support our efficiency claims with practical performance measures. Copyright © 2001 John Wiley & Sons, Ltd.
CPE611:Deferring elimination of design alternatives in object-oriented methods
- Author(s):Mehmet Aksit,Francesco Marcelloni
- TRESE project, Department of Computer Science, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands,Dipartimento di Ingegneria della Informazione, Università di Pisa, Via Diotisalvi, 2-56126, Pisa, Italy
http://dx.doi.org/10.1002/cpe.611
light wiley document
wiley 2002 full documents
Abstract:
While developing systems, software engineers generally have to deal with a large number of design alternatives. Current object-oriented methods aim to eliminate design alternatives whenever they are generated. Alternatives, however, should be eliminated only when sufficient information to take such a decision is available. Otherwise, alternatives have to be preserved to allow further refinements along the development process. Too early elimination of alternatives results in loss of information and excessive restriction of the design space. This paper aims to enhance the current object-oriented methods by modeling and controlling the design alternatives through the application of fuzzy-logic-based techniques. By using an example method, it is shown that the proposed approach increases the adaptability and reusability of design models. The method has been implemented and tested in our experimental CASE environment. Copyright © 2001 John Wiley & Sons, Ltd.
CPE612:Parallel dynamic water supply scheduling in a cluster of computers
- Author(s):M. Damas,M. Salmerón,J. Ortega,G. Olivares,H. Pomares
- Department of Computer Architecture and Computer Technology, Facultad de Ciencias, University of Granada, Campus Fuentenueva s/n. E-18071, Granada, Spain
http://dx.doi.org/10.1002/cpe.612
light wiley document
wiley 2002 full documents
Abstract:
Parallelizing the complex planning and control problems that arise in diverse industrial, service and commercial environments not only allows the control variables to be determined within the required times, but also improves the performance of the control procedure as more processors are involved in the execution of the parallel program. In this paper we describe a scheduling application in a water supply network that demonstrates the benefits of parallel processing. The procedure we propose combines dynamic programming with genetic algorithms and time-series prediction in order to solve problems in which decisions are made in stages, and the states and controls belong to a continuous space. Taking into account the computational complexity of these applications and the time constraints that are usually imposed, the procedure has been implemented as a parallel program on a cluster of computers, an inexpensive and widespread platform that can make parallelism a practical means of tackling complex problems in many different environments. Copyright © 2001 John Wiley & Sons, Ltd.
CPE613:Comparing Windows NT, Linux, and QNX as the basis for cluster systems
- Author(s):Avi Kavas,Dror G. Feitelson
- School of Computer Science and Engineering, The Hebrew University, 91904 Jerusalem, Israel
http://dx.doi.org/10.1002/cpe.613
light wiley document
wiley 2002 full documents
Abstract:
Clusters use commodity hardware and software components to provide an environment for high-performance parallel processing. A major issue in the development of a cluster system is the choice of the operating system that will run on each node. We compare three alternatives: Windows NT, Linux, and QNX, a real-time microkernel. The comparison is based on expressive power, performance, and ease-of-use metrics. The result is that none of these systems has a clear advantage over the others in all the metrics, but each has its strong and weak points. Thus any choice of a base system will involve some technical compromises, but not major ones. Copyright © 2001 John Wiley & Sons, Ltd.