Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Autonomic oil reservoir optimization on the Grid | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 1 | ||||||||
Date | 01 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.871 | ||||||||
Article ID | CPE871 | ||||||||
Author Name(s) | Manish Parashar1Vincent Matossian2Viraj Bhat3Magorzata Peszyska4Mrinal Sen5Paul Stoffa6Mary F. Wheeler7 | ||||||||
Author Email(s) | parashar@caip.rutgers.edu1 | ||||||||
Affiliation(s) | The Applied Software Systems Laboratory, Rutgers University, Piscataway, NJ 08855, U.S.A. 1 2 3 Department of Mathematics, Oregon State University, Corvallis, OR 97331, U.S.A. 4Institute for Geophysics, University of Texas at Austin, Austin, TX 78759, U.S.A. 5 6 Center for Subsurface Modeling, University of Texas at Austin, Austin, TX 78712, U.S.A. 7 | ||||||||
Keyword(s) | autonomic computing, oil reservoir optimization, peer-to-peer, Grid services, | ||||||||
Abstract |
|||||||||
The emerging Grid infrastructure and its support for seamless and secure interactions is enabling a new generation of autonomic applications where the application components, Grid services, resources, and data interact as peers to manage, adapt and optimize themselves and the overall application. In this paper we describe the design, development and operation of a prototype of such an application that uses peer-to-peer interactions between distributed services and data on the Grid to enable the autonomic optimization of an oil reservoir. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | A role for Pareto optimality in mining performance data | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 1 | ||||||||
Date | 01 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.877 | ||||||||
Article ID | CPE877 | ||||||||
Author Name(s) | Joël M. Malard1 | ||||||||
Author Email(s) | jm.malard@pnl.gov1 | ||||||||
Affiliation(s) | Pacific Northwest National Laboratory, Battelle Boulevard, P.O. Box 999, Richland, WA 99352, U.S.A. 1 | ||||||||
Keyword(s) | Pareto efficiency, dendogram, multiobjective optimization, software performance, hardware events, | ||||||||
Abstract |
|||||||||
Improvements in performance modeling and identification of computational regimes within software libraries is a critical first step in developing software libraries that are truly agile with respect to the application as well as to the hardware. It is shown here that Pareto ranking, a concept from multi-objective optimization, can be an effective tool for mining large performance datasets. The approach is illustrated using software performance data gathered using both the public domain LAPACK library and an asynchronous communication library based on IBM LAPI active message library. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Solving the block-Toeplitz least-squares problem in parallel | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 1 | ||||||||
Date | 01 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.883 | ||||||||
Article ID | CPE883 | ||||||||
Author Name(s) | P. Alonso1J. M. Badía2A. M. Vidal3 | ||||||||
Author Email(s) | palonso@dsic.upv.es1 | ||||||||
Affiliation(s) | Departamento de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, 46071 Valencia, Spain 1Departamento de Ingenierá y Ciencia de los Computadores, Universidad Jaume I, 12080 Castellón, Spain 2 3 | ||||||||
Keyword(s) | least-squares problem, block-Toeplitz matrices, Toeplitz-block matrices, displacement structure, Generalized Schur Algorithm, parallel algorithms, clusters of personal computers, | ||||||||
Abstract |
|||||||||
In this paper we present two versions of a parallel algorithm to solve the block-Toeplitz least-squares problem on distributed-memory architectures. We derive a parallel algorithm based on the seminormal equations arising from the triangular decomposition of the product TTT. Our parallel algorithm exploits the displacement structure of the Toeplitz-like matrices using the Generalized Schur Algorithm to obtain the solution in O(mn) flops instead of O(mn2) flops of the algorithms for non-structured matrices. The strong regularity of the previous product of matrices and an appropriate computation of the hyperbolic rotations improve the stability of the algorithms. We have reduced the communication cost of previous versions, and have also reduced the memory access cost by appropriately arranging the elements of the matrices. Furthermore, the second version of the algorithm has a very low spatial cost, because it does not store the triangular factor of the decomposition. The experimental results show a good scalability of the parallel algorithm on two different clusters of personal computers. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Performance evaluation of the SX-6 vector architecture for scientific computations | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 1 | ||||||||
Date | 01 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.884 | ||||||||
Article ID | CPE884 | ||||||||
Author Name(s) | Leonid Oliker1Andrew Canning2Jonathan Carter3John Shalf4David Skinner5Stéphane Ethier6Rupak Biswas7Jahed Djomehri8Rob Van der Wijngaart9 | ||||||||
Author Email(s) | rbiswas@nas.nasa.gov7 | ||||||||
Affiliation(s) | CRD/NERSC, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, U.S.A. 1 2 3 4 5 Princeton Plasma Physics Laboratory, Princeton University, Princeton, NJ 08453, U.S.A. 6NAS Division, NASA Ames Research Center, Moffett Field, CA 94035, U.S.A. 7 8 9 | ||||||||
Keyword(s) | microbenchmarks, NAS Parallel Benchmarks, scientific applications, vectorization, superscalar performance, | ||||||||
Abstract |
|||||||||
The growing gap between sustained and peak performance for scientific applications is a well-known problem in high-performance computing. The recent development of parallel vector systems offers the potential to reduce this gap for many computational science codes and deliver a substantial increase in computing capabilities. This paper examines the intranode performance of the NEC SX-6 vector processor, and compares it against the cache-based IBM Power3 and Power4 superscalar architectures, across a number of key scientific computing areas. First, we present the performance of a microbenchmark suite that examines many low-level machine characteristics. Next, we study the behavior of the NAS Parallel Benchmarks. Finally, we evaluate the performance of several scientific computing codes. Overall results demonstrate that the SX-6 achieves high performance on a large fraction of our application suite and often significantly outperforms the cache-based architectures. However, certain classes of applications are not easily amenable to vectorization and would require extensive algorithm and implementation reengineering to utilize the SX-6 effectively. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Special Issue: Grid Performance | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 2-4 | ||||||||
Date | 01 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.922 | ||||||||
Article ID | CPE922 | ||||||||
Author Name(s) | John Gurd1 | ||||||||
Author Email(s) | |||||||||
Affiliation(s) | 1 | ||||||||
Keyword(s) | ., | ||||||||
Abstract |
|||||||||
. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Performance engineering in data Grids | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 2-4 | ||||||||
Date | 02 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.923 | ||||||||
Article ID | CPE923 | ||||||||
Author Name(s) | Erwin Laure1Heinz Stockinger2Kurt Stockinger3 | ||||||||
Author Email(s) | erwin.laure@cern.ch1 | ||||||||
Affiliation(s) | CERN, European Organization for Nuclear Research, Geneva, Switzerland 1 2 3 | ||||||||
Keyword(s) | data Grid, performance, scheduling, replica management, | ||||||||
Abstract |
|||||||||
The vision of Grid computing is to facilitate worldwide resource sharing among distributed collaborations. With the help of numerous national and international Grid projects, this vision is becoming reality and Grid systems are attracting an ever increasing user base. However, Grids are still quite complex software systems whose efficient use is a difficult and error-prone task. In this paper we present performance engineering techniques that aim to facilitate an efficient use of Grid systems, in particular systems that deal with the management of large-scale data sets in the tera- and petabyte range (also referred to as data Grids). These techniques are applicable at different layers of a Grid architecture and we discuss the tools required at each of these layers to implement them. Having discussed important performance engineering techniques, we investigate how major Grid projects deal with performance issues particularly related to data Grids and how they implement the techniques presented. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Performance of a possible Grid message infrastructure | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 2-4 | ||||||||
Date | 02 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.924 | ||||||||
Article ID | CPE924 | ||||||||
Author Name(s) | Shrideep Pallickara1Geoffrey Fox2Ahmet Uyar3Hongbin Liu4Xi Rao5David Walker6Beytullah Yildiz7 | ||||||||
Author Email(s) | gcf@indiana.edu2 | ||||||||
Affiliation(s) | Community Grid Laboratory, Indiana University, Lindley Hall 215, Bloomington, IN 47405, U.S.A. 1 2 3 4 5 Department of Computer Science, Cardiff University, U.K. 6 7 | ||||||||
Keyword(s) | middleware, distributed messaging, P2P systems, Grid systems, audio/video conferencing, | ||||||||
Abstract |
|||||||||
In this paper we present the results pertaining to the NaradaBrokering middleware infrastructure. NaradaBrokering is designed to run on a large network of cooperating broker nodes. NaradaBrokering capabilities include, among other things, support for a wide variety of transport protocols, Java Message Service compliance, support for routing JXTA interactions, support for audio/video conferencing applications and, finally, support for multiple constraint specification formats such as XPath, SQL and regular expression queries. This paper demonstrates the suitability of NaradaBrokering to a wide variety of applications and scenarios. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Performance-based middleware for Grid computing | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 2-4 | ||||||||
Date | 02 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.925 | ||||||||
Article ID | CPE925 | ||||||||
Author Name(s) | G. R. Nudd1S. A. Jarvis2 | ||||||||
Author Email(s) | stephen.jarvis@warwick.ac.uk2 | ||||||||
Affiliation(s) | High Performance Systems Group, Department of Computer Science, University of Warwick, Coventry CV4 7AL, U.K. 1 2 | ||||||||
Keyword(s) | Grid computing, middleware, performance prediction, scheduling, quality of service, | ||||||||
Abstract |
|||||||||
This paper describes a stateful service-oriented middleware infrastructure for the management of scientific tasks running on multi-domain heterogeneous distributed architectures. Allocating scientific workload across multiple administrative boundaries is a key issue in Grid computing and as a result a number of supporting services including match-making, scheduling and staging have been developed. Each of these services allows the scientist to utilize the available resources, although a sustainable level of service in such shared environments cannot always be guaranteed. A performance-based middleware infrastructure is described in which prediction data for each scientific task are calculated, stored and published through a Globus-based performance information service. Distributing these data allows additional performance-based middleware services to be built, two of which are described in this paper: an intra-domain predictive co-scheduler and a multi-domain workload steering system. These additional facilities significantly improve the ability of the system to meet task deadlines, as well as enhancing inter-domain load-balancing and system-wide resource utilization. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Performance control of scientific coupled models in Grid environments | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 2-4 | ||||||||
Date | 02 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.926 | ||||||||
Article ID | CPE926 | ||||||||
Author Name(s) | Christopher Armstrong1Rupert W. Ford2John R. Gurd3Mikel Luján4Kenneth R. Mayes5Graham D. Riley6 | ||||||||
Author Email(s) | griley@cs.man.ac.uk6 | ||||||||
Affiliation(s) | Centre for Novel Computing, University of Manchester, Oxford Road, Manchester M13 9PL, U.K. 1 2 3 4 5 6 | ||||||||
Keyword(s) | Grid computing, performance control, component-oriented programming, coupled models, distributed computing, | ||||||||
Abstract |
|||||||||
In recent years, there has been increasing interest in the development of computer simulations of complex biological systems, and of multi-physics and multi-scale physical phenomena. Applications have been developed that involve the coupling together of separate executable models of individual systems, where these models may have been developed in isolation. A lightweight yet general solution is required to problems of linking coupled models, and of handling the incompatibilities between interacting models that arise from their diverse origins and natures. Many such models require high-performance computers to provide acceptable execution times, and there is increasing interest in utilizing Grid technologies. However, Grid applications need the ability to cope with heterogeneous and dynamically changing execution environments, particularly where run-time changes can affect application performance. A general coupling framework (GCF) is described that allows the construction of flexible coupled models. This approach results in a component-based implementation of a coupled model application. A semi-formal presentation of GCF is given. Components under GCF are separately deployable and coupled by simple data flows, making them appropriate structures for dynamic execution platforms such as the Grid. The design and initial implementation of a performance control system (PERCO) is reviewed. PERCO acts by redeploying components, and is thus appropriate for controlling GCF coupled model applications. Redeployment decisions in PERCO require performance prediction capabilities. A proof-of-concept performance prediction algorithm is presented, based on the descriptions of GCF and PERCO. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | The role of performance engineering techniques in the context of the Grid | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 2-4 | ||||||||
Date | 02 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.928 | ||||||||
Article ID | CPE928 | ||||||||
Author Name(s) | A. J. G. Hey1J. Papay2M. Surridge3 | ||||||||
Author Email(s) | jp@ecs.soton.ac.uk2 | ||||||||
Affiliation(s) | EPSRC, Polaris House, North Star Avenue, Swindon SN2 1ET, U.K. 1ECS, University of Southampton, Southampton SO17 1BJ, U.K. 2IT Innovation Centre, 2 Venture Road, Chilworth Science Park, Southampton SO16 7NP, U.K. 3 | ||||||||
Keyword(s) | performance engineering, Grid, scheduling, benchmarking, meta-applications, neuro-fuzzy models, financial transaction processing, | ||||||||
Abstract |
|||||||||
Performance engineering can be described as a collection of techniques and methodologies whose aim is to provide reliable prediction, measurement and validation of the performance of applications on a variety of computing platforms. This paper reviews techniques for performance estimation and performance engineering developed at the University of Southampton and presents application case studies in task scheduling for engineering meta-applications, and capacity engineering for a financial transaction processing system. These show that it is important to describe performance in terms of a resource model, and that the choice of models may have to trade accuracy for utility in addressing the operational issues. We then present work from the on-going EU funded Grid project GRIA, and show how lessons learned from the earlier work have been applied to support a viable business model for Grid service delivery to a specified quality of service level. The key in this case is to accept the limitations of performance estimation methods, and design business models that take these limitations into account rather than attempting to provide hard guarantees over performance. We conclude by identifying some of the key lessons learned in the course of our work over many years and suggest possible directions for future investigations. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | ASKALON: a tool set for cluster and Grid computing | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 2-4 | ||||||||
Date | 02 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.929 | ||||||||
Article ID | CPE929 | ||||||||
Author Name(s) | Thomas Fahringer1Alexandru Jugravu2Sabri Pllana3Radu Prodan4Clovis Seragiotto5Hong-Linh Truong6 | ||||||||
Author Email(s) | thomas.fahringer@uibk.ac.at1 | ||||||||
Affiliation(s) | Institute for Computer Science, University of Innsbruck, Technikerstrasse 25/7, A-6020 Innsbruck, Austria 1Institute for Software Science, University of Vienna, Liechtensteinstrasse 22, A-1090 Vienna, Austria 2 3 4 5 6 | ||||||||
Keyword(s) | cluster computing, Grid computing, parallel and distributed applications, performance prediction, measurement and analysis, bottleneck detection, experiment management, | ||||||||
Abstract |
|||||||||
Performance engineering of parallel and distributed applications is a complex task that iterates through various phases, ranging from modeling and prediction, to performance measurement, experiment management, data collection, and bottleneck analysis. There is no evidence so far that all of these phases should/can be integrated into a single monolithic tool. Moreover, the emergence of computational Grids as a common single wide-area platform for high-performance computing raises the idea to provide tools as interacting Grid services that share resources, support interoperability among different users and tools, and, most importantly, provide omnipresent services over the Grid. We have developed the ASKALON tool set to support performance-oriented development of parallel and distributed (Grid) applications. ASKALON comprises four tools, coherently integrated into a service-oriented architecture. SCALEA is a performance instrumentation, measurement, and analysis tool of parallel and distributed applications. ZENTURIO is a general purpose experiment management tool with advanced support for multi-experiment performance analysis and parameter studies. AKSUM provides semi-automatic high-level performance bottleneck detection through a special-purpose performance property specification language. The PerformanceProphet enables the user to model and predict the performance of parallel applications at the early stages of development. In this paper we describe the overall architecture of the ASKALON tool set and outline the basic functionality of the four constituent tools. The structure of each tool is based on the composition and sharing of remote Grid services, thus enabling tool interoperability. In addition, a data repository allows the tools to share the common application performance and output data that have been derived by the individual tools. A service repository is used to store common portable Grid service implementations. A general-purpose Factory service is employed to create service instances on arbitrary remote Grid sites. Discovering and dynamically binding to existing remote services is achieved through registry services. The ASKALON visualization diagrams support both online and post-mortem visualization of performance and output data. We demonstrate the usefulness and effectiveness of ASKALON by applying the tools to real-world applications. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Automatic performance analysis tools for the Grid | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 2-4 | ||||||||
Date | 02 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.930 | ||||||||
Article ID | CPE930 | ||||||||
Author Name(s) | M. Gerndt1 | ||||||||
Author Email(s) | m.gerndt@in.tum.de1 | ||||||||
Affiliation(s) | Institut für Informatik X, Technische Universität München, 85748 Garching, Germany 1 | ||||||||
Keyword(s) | performance analysis, Grid computing, parallel/distributed systems, | ||||||||
Abstract |
|||||||||
Applications on Grids require scalable and online performance analysis tools. The execution environment of such applications includes a large number of processors. In addition, some of the resources such as the network will be shared with other applications. This requires applications to adapt dynamically to resource changes. The article presents the requirements of Grid application classes for performance analysis tools. It introduces a new analysis environment currently being developed for teraflop computers within the Peridot project at Technische Universität München. This environment applies a distributed automatic performance analysis approach. It is based on a formal specification of performance properties in the APART specification language. It uses a hierarchy of analysis agents that obtain performance data from a configurable monitoring system. This scalable design allows performance analysis for large Grid applications. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Distributed computing in practice: the Condor experience | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 2-4 | ||||||||
Date | 02 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.938 | ||||||||
Article ID | CPE938 | ||||||||
Author Name(s) | Douglas Thain1Todd Tannenbaum2Miron Livny3 | ||||||||
Author Email(s) | thain@cs.wisc.edu1 | ||||||||
Affiliation(s) | Computer Sciences Department, University of Wisconsin-Madison, 1210 West Dayton Street, Madison, WI 53706, U.S.A. 1 2 3 | ||||||||
Keyword(s) | Condor, Grid, history, community, planning, scheduling, split execution, | ||||||||
Abstract |
|||||||||
Since 1984, the Condor project has enabled ordinary users to do extraordinary computing. Today, the project continues to explore the social and technical problems of cooperative computing on scales ranging from the desktop to the world-wide computational Grid. In this paper, we provide the history and philosophy of the Condor project and describe how it has interacted with other projects and evolved along with the field of distributed computing. We outline the core components of the Condor system and describe how the technology of computing must correspond to social structures. Throughout, we reflect on the lessons of experience and chart the course travelled by research ideas as they grow into production systems. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | The design and implementation of Grid database services in OGSA-DAI | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 2-4 | ||||||||
Date | 02 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.939 | ||||||||
Article ID | CPE939 | ||||||||
Author Name(s) | Mario Antonioletti1Malcolm Atkinson2Rob Baxter3Andrew Borley4Neil P. Chue Hong5Brian Collins6Neil Hardman7Alastair C. Hume8Alan Knox9Mike Jackson10Amy Krause11Simon Laws12James Magowan13Norman W. Paton14Dave Pearson15Tom Sugden16Paul Watson17Martin Westhead18 | ||||||||
Author Email(s) | paul.watson@newcastle.ac.uk17 | ||||||||
Affiliation(s) | EPCC, University of Edinburgh, James Clerk Maxwell Building, Mayfield Road, Edinburgh EH9 3JZ, U.K. 1National e-Science Centre, 15 South College Street, Edinburgh EH8 9AA, U.K. 2 3 IBM United Kingdom Ltd, Hursley Park, Winchester SO21 2JN, U.K. 4 5 6 7 8 9 10 11 12 13 Department of Computer Science, University of Manchester, Oxford Road, Manchester M13 9PL, U.K. 14Oracle UK Ltd, Thames Valley Park, Reading RG6 1RA, U.K. 15 16 School of Computing Science, University of Newcastle, Newcastle-upon-Tyne NE1 7RU, U.K. 17 18 | ||||||||
Keyword(s) | Grid, databases, e-Science, Web Services, | ||||||||
Abstract |
|||||||||
http://www.ogsadai.org.uk | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Adding tuples to Java: a study in lightweight data structures | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 5-6 | ||||||||
Date | 04 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.835 | ||||||||
Article ID | CPE835 | ||||||||
Author Name(s) | C. van Reeuwijk1H. J. Sips2 | ||||||||
Author Email(s) | c.vanreeuwijk@cs.tudelft.nl1 | ||||||||
Affiliation(s) | Delft University of Technology, Mekelweg 4, 2628 CD Delft, The Netherlands 1 2 | ||||||||
Keyword(s) | tuple, Java, lightweight data structures, | ||||||||
Abstract |
|||||||||
lightweight tuples | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Object combining: a new aggressive optimization for object intensive programs | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 5-6 | ||||||||
Date | 04 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.836 | ||||||||
Article ID | CPE836 | ||||||||
Author Name(s) | Ronald Veldema1Ceriel J. H. Jacobs2Rutger F. H. Hofman3Henri E. Bal4 | ||||||||
Author Email(s) | veldema@cs.fau.de1 | ||||||||
Affiliation(s) | Department of Mathematics and Computer Science, Vrije Universiteit, Amsterdam, The Netherlands 1 2 3 4 | ||||||||
Keyword(s) | Java, garbage collection, object management, | ||||||||
Abstract |
|||||||||
Object combining tries to put objects together that have roughly the same life times in order to reduce strain on the memory manager and to reduce the number of pointer indirections during a program's execution. Object combining works by appending the fields of one object to another, allowing allocation and freeing of multiple objects with a single heap (de)allocation. Unlike object inlining, which will only optimize objects where one has a (unique) pointer to another, our optimization also works if there is no such relation. Object inlining also directly replaces the pointer by the inlined object's fields. Object combining leaves the pointer in place to allow more combining. Elimination of the pointer accesses is implemented in a separate compiler optimization pass. Unlike previous object inlining systems, reference field overwrites are allowed and handled, resulting in much more aggressive optimization. Our object combining heuristics also allow unrelated objects to be combined, for example, those allocated inside a loop; recursive data structures (linked lists, trees) can be allocated several at a time and objects that are always used together can be combined. As Java explicitly permits code to be loaded at runtime and allows the new code to contribute to a running computation, we do not require a closed-world assumption to enable these optimizations (but it will increase performance). The main focus of object combining in this paper is on reducing object (de)allocation overhead, by reducing both garbage collection work and the number of object allocations. Reduction of memory management overhead causes execution time to be reduced by up to 35%. Indirection removal further reduces execution time by up to 6%. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | UMM: an operational memory model specification framework with integrated model checking capability | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 5-6 | ||||||||
Date | 04 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.837 | ||||||||
Article ID | CPE837 | ||||||||
Author Name(s) | Yue Yang1Ganesh Gopalakrishnan2Gary Lindstrom3 | ||||||||
Author Email(s) | yyang@cs.utah.edu1 | ||||||||
Affiliation(s) | School of Computing, University of Utah, Salt Lake City, UT 84112, U.S.A. 1 2 3 | ||||||||
Keyword(s) | memory model, operational specification, Java thread, formal verification, | ||||||||
Abstract |
|||||||||
Given the complicated nature of modern shared memory systems, it is vital to have a systematic approach to specifying and analyzing memory consistency requirements. In this paper, we present the UMM specification framework, which integrates two key features to support memory model verification: (i) it employs a simple and generic memory abstraction that can capture a large collection of memory models as guarded commands with a uniform notation, and (ii) it provides built-in model checking capability to enable formal reasoning about thread behaviors. Using this framework, memory models can be specified in a parameterized style - designers can simply redefine a few bypassing rules and visibility ordering rules to obtain an executable specification of another memory model. We formalize several classical memory models, including Sequential Consistency, Coherence, and PRAM, to illustrate the general techniques of applying this framework. We then provide an alternative specification of the Java memory model, based on a proposal from Manson and Pugh, and demonstrate how to analyze Java thread semantics using model checking. We also compare our operational specification style with axiomatic specification styles and explore a mechanism that converts a memory model definition from one style to the other.Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Ravenscar-Java: a high-integrity profile for real-time Java | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 5-6 | ||||||||
Date | 04 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.843 | ||||||||
Article ID | CPE843 | ||||||||
Author Name(s) | Jagun Kwon1Andy Wellings2Steve King3 | ||||||||
Author Email(s) | jagun@cs.york.ac.uk1 | ||||||||
Affiliation(s) | Real-Time Systems Research Group, Department of Computer Science, University of York, Heslington, York YO10 5DD, U.K. 1 2 3 | ||||||||
Keyword(s) | high-integrity systems, real-time Java, profile, | ||||||||
Abstract |
|||||||||
For many, Java is the antithesis of a high-integrity programming language. Its combination of object-oriented programming features, its automatic garbage collection, and its poor support for real-time multi-threading are all seen as particular impediments. The Real-Time Specification for Java has introduced many new features that help in the real-time domain. However, the expressive power of these features means that very complex programming models can be created, necessitating complexity in the supporting real-time virtual machine. Consequently, Java, with the real-time extensions as they stand, seems too complex for confident use in high-integrity systems. This paper presents a Java profile for the development of software-intensive high-integrity real-time systems. This restricted programming model removes language features with high overheads and complex semantics, on which it is hard to perform timing and functional analyses. The profile fits within the J2ME framework and is consistent with well-known guidelines for high-integrity software development, such as those defined by the U.S. Nuclear Regulatory Commission.Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Improving the memory management performance of RTSJ | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 5-6 | ||||||||
Date | 04 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.845 | ||||||||
Article ID | CPE845 | ||||||||
Author Name(s) | M. Teresa Higuera-Toledano1Valerie Issarny2 | ||||||||
Author Email(s) | mthiguer@dacya.ucm.es1 | ||||||||
Affiliation(s) | Universidad Complutense de Madrid, Ciudad Universitaria, Madrid 28040, Spain 1INRIA-Rocquencourt, Domaine de Voluceau, BP 105, 78153 Le Chesnay Cedex, France 2 | ||||||||
Keyword(s) | Java, real-time, embedded systems, garbage collection, memory regions, Java microprocessor, write barriers, performance, | ||||||||
Abstract |
|||||||||
From a real-time perspective, the garbage collector (GC) introduces unpredictable pauses that are not tolerated by real-time tasks. Real-time collectors eliminate this problem but introduce a high overhead. Another approach is to use memory regions (MRs) within which allocation and deallocation is customized. This facility is supported by the memory model of the Real-Time Specification for Java (RTSJ). RTSJ imposes strict access and assignment rules to avoid both the dangling inter-region references and the delays of critical tasks of the GC. A dynamic check solution can incur high overhead, which can be reduced by taking advantage of hardware features. This paper provides an in-depth analytical investigation of the overhead introduced by dynamic assignments checks in RTSJ, describing and analysing several solutions to reduce the introduced overhead. Copyright ? 2005 John Wiley © Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Elimination of Java array bounds checks in the presence of indirection | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 5-6 | ||||||||
Date | 04 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.847 | ||||||||
Article ID | CPE847 | ||||||||
Author Name(s) | Mikel Luj?n1John R. Gurd2T. L. Freeman3Jos? Miguel4 | ||||||||
Author Email(s) | jagun@cs.york.ac.uk1 | ||||||||
Affiliation(s) | Centre for Novel Computing, University of Manchester, Oxford Road, Manchester M13 9PL, U.K. 1 2 3 Departamento de Arquitectura y Tecnolog?a de Computadores, UPV-EHU, P.O. Box 649, 20080 Donostia-San Sebastian, Spain 4 | ||||||||
Keyword(s) | array bounds check, Java, array indirection, | ||||||||
Abstract |
|||||||||
The Java language specification states that every access to an array needs to be within the bounds of that array, i.e. between 0 and array length - 1. Different techniques for different programming languages have been proposed to eliminate explicit bounds checks. Some of these techniques are implemented in off-the-shelf Java Virtual Machines (JVMs). The underlying principle of these techniques is that bounds checks can be removed when a JVM/compiler has enough information to guarantee that a sequence of accesses (e.g. inside a for-loop) is safe (within the bounds). Most of the techniques for the elimination of array bounds checks have been developed for programming languages that do not support multi-threading and/or enable dynamic class loading. These two characteristics make most of these techniques unsuitable for Java. Techniques developed specifically for Java have not addressed the elimination of array bounds checks in the presence of indirection; that is, when the index is stored in another array (indirection array). With the objective of optimizing applications with array indirection, this paper proposes and evaluates three implementation strategies, each implemented as a Java class. The classes provide the functionality of Java arrays of type int so that objects of the classes can be used instead of indirection arrays. Each strategy enables JVMs, when examining only one of these classes at a time, to obtain enough information to remove array bounds checks.Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Run-time evaluation of opportunities for object inlining in Java | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 5-6 | ||||||||
Date | 04 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.848 | ||||||||
Article ID | CPE848 | ||||||||
Author Name(s) | Ondrej Lhot?k1Laurie Hendren2 | ||||||||
Author Email(s) | olhotak@sable.mcgill.ca1 | ||||||||
Affiliation(s) | Sable Research Group, School of Computer Science, McGill University, Montreal, Canada 1 2 | ||||||||
Keyword(s) | object inlining, Java, compilers, optimization, | ||||||||
Abstract |
|||||||||
Object-oriented languages, such as Java, encourage the use of many small objects linked together by field references, instead of a few monolithic structures. While this practice is beneficial from a program design perspective, it can slow down program execution by incurring many pointer indirections. One solution to this problem is object inlining: when the compiler can safely do so, it fuses small objects together, thus removing the reads/writes to the removed field, saving the memory needed to store the field and object header, and reducing the number of object allocations. The objective of this paper is to measure the potential for object inlining by studying the run-time behaviour of a comprehensive set of Java programs. We study the traces of program executions in order to determine which fields behave like inlinable fields. Since we are using dynamic information instead of a static analysis, our results give an upper bound on what could be achieved via a static compiler-based approach. Our experimental results measure the potential improvements attainable with object inlining, including reductions in the numbers of field reads and writes, and reduced memory usage. Our study shows that some Java programs can benefit significantly from object inlining, with close to a 10% speedup. Somewhat to our surprise, our study found one case, the db benchmark, where the most important inlinable field was the result of unusual program design, and fixing this small flaw led to both better performance and clearer program design. However, the opportunities for object inlining are highly dependent on the individual program being considered, and are in many cases very limited. Furthermore, fields that are inlinable also have properties that make them potential candidates for other optimizations such as removing redundant memory accesses. The memory savings possible through object inlining are moderate. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Jeeg: temporal constraints for the synchronization of concurrent objects | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 5-6 | ||||||||
Date | 04 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.849 | ||||||||
Article ID | CPE849 | ||||||||
Author Name(s) | Giuseppe Milicia1Vladimiro Sassone2 | ||||||||
Author Email(s) | milicia@brics.dk1 | ||||||||
Affiliation(s) | Basic Research in Computer Science (BRICS), University of Aarhus, IT-parken Aabogade 34, DK-8200 Aarhus, Denmark 1Informatics, University of Sussex, Brighton BN1 9QH, U.K. 2 | ||||||||
Keyword(s) | Java, inheritance anomaly, temporal logic, | ||||||||
Abstract |
|||||||||
We introduce Jeeg, a dialect of Java based on a declarative replacement of the synchronization mechanisms of Java that results in a complete decoupling of the ‘business’ and the ‘synchronization’ code of classes. Synchronization constraints in Jeeg are expressed in a linear temporal logic, which allows one to effectively limit the occurrence of the inheritance anomaly that commonly affects concurrent object-oriented languages. Jeeg is inspired by the current trend in aspect-oriented languages. In a Jeeg program the sequential and concurrent aspects of object behaviors are decoupled: specified separately by the programmer, these are then weaved together by the Jeeg compiler. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Compiling almost-whole Java programs | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 5-6 | ||||||||
Date | 04 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.850 | ||||||||
Article ID | CPE850 | ||||||||
Author Name(s) | Zoran Budimlic1Ken Kennedy2 | ||||||||
Author Email(s) | zoran@cs.rice.edu1 | ||||||||
Affiliation(s) | Center for High Performance Software Research, Rice University, Houston, TX 77005, U.S.A. 1 2 | ||||||||
Keyword(s) | whole-program complition, almost-whole-program compilation, Java, optimization, | ||||||||
Abstract |
|||||||||
This paper presents a strategy, called almost-whole-program compilation, for extending the benefits of whole-program optimization to large collections of Java components that are packaged as a group after the development phase. This strategy has been implemented in a framework that uses Java visibility and scoping rules to transform a collection of classes into a package that is amenable to whole-program optimizations, without precluding extensions to the optimized and compiled code. Thus, it enables the Java developer to balance performance against flexibility of the program after the development phase, without compromising the design process. The transformation is shown to incur only modest performance penalties, which are more than compensated for by the interprocedural optimizations it enables. The paper concludes with experimental results showing the benefits that can be achieved using this approach. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Recurrence analysis for effective array prefetching in Java | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 5-6 | ||||||||
Date | 04 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.851 | ||||||||
Article ID | CPE851 | ||||||||
Author Name(s) | Brendon Cahoon1Kathryn S. McKinley2 | ||||||||
Author Email(s) | mckinley@cs.utexas.edu2 | ||||||||
Affiliation(s) | Department of Computer Sciences, University of Texas at Austin, Austin, TX 78757, U.S.A. 1 2 | ||||||||
Keyword(s) | memory optimization, array prefetching, static analysis, | ||||||||
Abstract |
|||||||||
Java is an attractive choice for numerical, as well as other, algorithms due to the software engineering benefits of object-oriented programming. Because numerical programs often use large arrays that do not fit in the cache, they suffer from poor memory performance. To hide memory latency, we describe a new unified compile-time analysis for software prefetching arrays and linked structures in Java. Our previous work used data-flow analysis to discover linked data structure accesses. We generalize our prior approach to identify loop induction variables as well, which we call recurrence analysis. Our algorithm schedules prefetches for all array references that contain induction variables. We evaluate our technique using a simulator of an out-of-order superscalar processor running a set of array-based Java programs. Across all of our programs, prefetching reduces execution time by a geometric mean of 23%, and the largest improvement is 58%. We also evaluate prefetching on a PowerPC processor, and we show that prefetching reduces execution time by a geometric mean of 17%. Because our analysis is much simpler and quicker than previous techniques, it is suitable for including in a just-in-time compiler. Traditional software prefetching algorithms for C and Fortran use locality analysis and sophisticated loop transformations. We further show that the additional loop transformations and careful scheduling of prefetches from previous work are not always necessary for modern architectures and Java programs. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | The Open Runtime Platform: a flexible high-performance managed runtime environment An earlier version of this paper was published online 1 | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 5-6 | ||||||||
Date | 04 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.852 | ||||||||
Article ID | CPE852 | ||||||||
Author Name(s) | Michal Cierniak1Marsha Eng2Neal Glew3Brian Lewis4James Stichnoth5 | ||||||||
Author Email(s) | michaljc@microsoft.com1 | ||||||||
Affiliation(s) | Microsoft Corporation, 1 Microsoft Way, Redmond, WA 98052, U.S.A. 1Microprocessor Technology Laboratory, Intel Corporation, 2200 Mission College Boulevard, Santa Clara, CA, U.S.A. 2 3 4 5 | ||||||||
Keyword(s) | MRTE, Java, CLI, virtual machine, interface design, | ||||||||
Abstract |
|||||||||
The Open Runtime Platform (ORP) is a high-performance managed runtime environment (MRTE) that features exact generational garbage collection, fast thread synchronization, and multiple coexisting just-in-time compilers (JITs). ORP was designed for flexibility in order to support experiments in dynamic compilation, garbage collection, synchronization, and other technologies. It can be built to run either Java or Common Language Infrastructure (CLI) applications, to run under the Windows or Linux operating systems, and to run on the IA-32 or Itanium processor family (IPF) architectures. Achieving high performance in a MRTE presents many challenges, particularly when flexibility is a major goal. First, to enable the use of different garbage collectors and JITs, each component must be isolated from the rest of the environment through a well-defined software interface. Without careful attention, this isolation could easily harm performance. Second, MRTEs have correctness and safety requirements that traditional languages such as C++ lack. These requirements, including null pointer checks, array bounds checks, and type checks, impose additional runtime overhead. Finally, the dynamic nature of MRTEs makes some traditional compiler optimizations, such as devirtualization of method calls, more difficult to implement or more limited in applicability. To get full performance, JITs and the core virtual machine (VM) must cooperate to reduce or eliminate (where possible) these MRTE-specific overheads. In this paper, we describe the structure of ORP in detail, paying particular attention to how it supports flexibility while preserving high performance. We describe the interfaces between the garbage collector, the JIT, and the core VM; how these interfaces enable multiple garbage collectors and JITs without sacrificing performance; and how they allow the JIT and the core VM to reduce or eliminate MRTE-specific performance issues. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Immutability specification and its applications | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 5-6 | ||||||||
Date | 04 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.853 | ||||||||
Article ID | CPE853 | ||||||||
Author Name(s) | Igor Pechtchanski1Vivek Sarkar2 | ||||||||
Author Email(s) | igor@watson.ibm.com1 | ||||||||
Affiliation(s) | BM T. J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598, U.S.A 1 2 | ||||||||
Keyword(s) | immutability, annotations, optimization, | ||||||||
Abstract |
|||||||||
A location is said to be immutable if its value and the values of selected locations reachable from it are guaranteed to remain unchanged during a specified time interval. We introduce a framework for immutability specification, and discuss its application to code optimization. Compared with a final declaration, an immutability assertion in our framework can express a richer set of immutability properties along three dimensions - lifetime, reachability and context. We present a framework for processing and verifying immutability annotations in Java, as well as extending optimizations so as to exploit immutability information. Preliminary experimental results show that a significant number (61%) of read accesses could potentially be classified as immutable in our framework. Further, use of immutability information yields substantial reductions (33-99%) in the number of dynamic read accesses, and also measurable speedups in the range of 5-10% for certain benchmark programs.Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Experience in integrating Java with C# and .NET | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 5-6 | ||||||||
Date | 04 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.858 | ||||||||
Article ID | CPE858 | ||||||||
Author Name(s) | Judith Bishop1R. Nigel Horspool2Basil Worrall3 | ||||||||
Author Email(s) | jbishop@cs.up.ac.za1 | ||||||||
Affiliation(s) | Computer Science Department, University of Pretoria, Pretoria 0002, South Africa 1Computer Science Department, University of Victoria, Victoria, BC, Canada V8W 3P6 2 3 | ||||||||
Keyword(s) | integration, Java, C#, XML, native code, Web services, GUI, | ||||||||
Abstract |
|||||||||
Java programmers cannot help but be aware of the advent of C#, the .NET network environment, and a host of new supporting technologies, such as Web services. Before taking the big step of moving all development to a new environment, programmers will want to know what are the advantages of C# as a language over Java, and whether the new and interesting features of C# and .NET can be incorporated into existing Java software. This paper surveys the advantages of C# and then presents and evaluates experience with connecting it to Java in a variety of ways. The first way provides evidence that Java can be linked to C# at the native code level, albeit through C++ wrappers. The second is a means for retaining the useful applet feature of Java in the server-side architecture of Web services written in C#. The third is by providing a common XML-based class for the development of graphical user interfaces (GUIs), which can be incorporated into Java or C#. An added advantage of this system, called Views, is that it can run independently of the resource-intensive development environment that would otherwise be needed for using C#. A major advantage of the methods described in this paper is that in all cases the Java program is not affected by the fact that it is interfacing with C#. The paper concludes that there are many common shared technologies that bring Java and C# close together, and that innovative ways of using others can open up opportunities not hitherto imagined. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Editorial | ||||||||
Article Title | Special Issue: ACM 2002 Java Grande-ISCOPE Conference | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 5-6 | ||||||||
Date | 04 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.933 | ||||||||
Article ID | CPE933 | ||||||||
Author Name(s) | Geoffrey Fox1 | ||||||||
Author Email(s) | gcf@indiana.edu1 | ||||||||
Affiliation(s) | 1 | ||||||||
Keyword(s) | ., | ||||||||
Abstract |
|||||||||
No abstract | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Support and optimization of Java RMI over a Bluetooth environment | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 7-8 | ||||||||
Date | 06 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.838 | ||||||||
Article ID | CPE838 | ||||||||
Author Name(s) | Pu-Chen Wei1Chung-Hsin Chen2Cheng-Wei Chen3Jenq-Kuen Lee4 | ||||||||
Author Email(s) | jklee@pllab.cs.nthu.edu.tw1 | ||||||||
Affiliation(s) | Department of Computer Science, National Tsing Hua University, HsinChu 30055, Taiwan 1 2 3 4 | ||||||||
Keyword(s) | distributed object-oriented computing, wireless computing, Java RMI, Bluetooth architectures, collaborative computing, | ||||||||
Abstract |
|||||||||
Distributed object-oriented platforms are increasingly important over wireless environments for providing frameworks for collaborative computations and for managing a large pool of distributed resources. Due to limited bandwidths and heterogeneous architectures of wireless devices, studies are needed into supporting object-oriented frameworks over heterogeneous wireless environments and optimizing system performance. In our research work, we are working towards efficiently supporting object-oriented environments over heterogeneous wireless environments. In this paper, we report the issues and our research results related to the efficient support of Java RMI over a Bluetooth environment. In our work, we first implement support for Java RMI over Bluetooth protocol stacks, by incorporating a set of protocol stack layers for Bluetooth developed by us (which we call JavaBT) and by supporting the L2CAP layer with sockets that support the RMI socket. In addition, we model the cost for the access patterns of Java RMI communications. This cost model is used to guide the formation and optimizations of the scatternets of a Java RMI Bluetooth environment. In our approach, we employ the well-known BTCP algorithm to observe initial configurations for the number of piconets. Using the communication-access cost as a criterion, we then employ a spectral-bisection method to cluster the nodes in a piconet and then use a bipartite matching scheme to form the scatternet. Experimental results with the prototypes of Java RMI support over a Bluetooth environment show that our scatternet-formation algorithm incorporating an access-cost model can further optimize the performances of such as system.Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | CSFS: a Java enabled network file storage system | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 7-8 | ||||||||
Date | 06 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.839 | ||||||||
Article ID | CPE839 | ||||||||
Author Name(s) | Han Hua1Dai Yafei2Li Xiaoming3 | ||||||||
Author Email(s) | hh@net.pku.edu.cn1 | ||||||||
Affiliation(s) | Computer Networks and Distributed Systems Laboratory, Department of Computer Science and Technology, Peking University, Beijing 100871, People's Republic of China 1 2 3 | ||||||||
Keyword(s) | distributed file system, network file storage, namespace, | ||||||||
Abstract |
|||||||||
The CSFS (cryptographic storage file system) is a network file storage system that is suitable for a small-to-medium sized networks such as a campus network. In a CSFS, a lot of distributed file servers are organized in a star-liked architecture. File names are stored in a name server and file data are stored in distributed file servers. A CSFS has good scalability and is able to accommodate hundreds of file servers. We implemented a CSFS in pure Java and tested the system with a benchmark. The test results show that CSFS delivers acceptable performance for general file operations. Its performance of file upload and download is as efficient as FTP. It can support more than 450 concurrent online users.Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | JavaSymphony: a new programming paradigm to control and synchronize locality, parallelism and load balancing for parallel and distributed computing | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 7-8 | ||||||||
Date | 06 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.840 | ||||||||
Article ID | CPE840 | ||||||||
Author Name(s) | Thomas Fahringer1Alexandru Jugravu2 | ||||||||
Author Email(s) | thomas.fahringer@uibk.ac.at1 | ||||||||
Affiliation(s) | Institute for Software Science, University ofVienna, Liechtensteinstrasse 22, A-1090 Vienna, Austria 1 2 | ||||||||
Keyword(s) | distributed Java, parallel and concurrent computing, locality, load balancing, parallelism, | ||||||||
Abstract |
|||||||||
There has been an increasing research interest in extending the use of Java towards performance-oriented programming for distributed and concurrent applications. JavaSymphony is a Java-based programming paradigm that allows the programmer to control parallelism, load balancing, and locality at a high level of abstraction. Objects can be explicitly distributed and migrated based on a high-level API to static/dynamic system parameters and dynamic virtual distributed architectures, which impose a virtual hierarchy on a distributed system of physical computing nodes. In this paper we describe various extensions to the original JavaSymphony API, which includes a generalization of virtual architectures that can be used to specify and to request arbitrary heterogeneous distributed and concurrent architectures inside of a JavaSymphony program. The number of threads that execute an object's methods can be controlled dynamically through single- and multi-threaded objects. Conventional Java objects can be dynamically converted to JavaSymphony objects. A (un)lock mechanism has been introduced in order to avoid inconsistent modifications of objects or virtual architectures. A sophisticated event mechanism for asynchronous communication, coordination, and interaction is provided. Several synchronization constructs including distributed barrier synchronization and synchronization for asynchronous method invocations have been included. Several experiments are presented to demonstrate the effectiveness and efficiency of JavaSymphony.Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Enterprise JavaBeans caching in clustered environments | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 7-8 | ||||||||
Date | 06 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.841 | ||||||||
Article ID | CPE841 | ||||||||
Author Name(s) | Avraham Leff1James T. Rayfield2 | ||||||||
Author Email(s) | avraham@us.ibm.com1 | ||||||||
Affiliation(s) | IBM T. J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598, U.S.A. 1 2 | ||||||||
Keyword(s) | Enterprise JavaBeans (EJBs), EJB caching, performance analysis, transactional caching, database offload, | ||||||||
Abstract |
|||||||||
We present the design of an Enterprise JavaBeans (EJBs) caching architecture, and show that EJB caching can greatly improve application throughput in clustered environments. Throughput is improved because data serving is offloaded from the database server to the cache-enabled application server. EJB caching is successful because it exploits the low-latency connections between the database server and application servers that exist in a clustered environment. An important feature of our architecture is that the caching function is transparent to applications that use it. The cache-enabled application server uses the same (EJB) programming model, and the same transactional semantics, as provided by non-caching architectures. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Object-oriented programming in peer-to-peer system | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 7-8 | ||||||||
Date | 06 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.842 | ||||||||
Article ID | CPE842 | ||||||||
Author Name(s) | Patrick Th. Eugster1Sebastien Baehni2 | ||||||||
Author Email(s) | patrick.eugster@epfl.ch1 | ||||||||
Affiliation(s) | Distributed Programming Laboratory, Swiss Federal Institute of Technology, Lausanne, Switzerland 1 2 | ||||||||
Keyword(s) | borrow/lend, peer-to-peer, abstraction, type, Java, | ||||||||
Abstract |
|||||||||
Leveraged by the success of applications aiming at the free sharing of data in the Internet, the paradigm of peer-to-peer (P2P) computing has had substantial consideration devoted to it recently. This paper presents a high-level abstraction for remote object interaction in a P2P environment, called borrow/lend (BL). We present the principles underlying our BL abstraction, and illustrate how this abstraction can be used to program P2P applications in Java. We contrast our abstraction with established abstractions for distributed programming such as the remote method invocation or the tuple space, illustrating how the BL abstraction, obviously influenced by such previous abstractions, unifies flavors of these, but also how it captures the constraints specific to P2P environments.Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Features from functional programming for a C++ skeleton library | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 7-8 | ||||||||
Date | 06 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.844 | ||||||||
Article ID | CPE844 | ||||||||
Author Name(s) | J. Striegnitz1H. Kuchen2 | ||||||||
Author Email(s) | j.striegnitz@fz-juelich.de1 | ||||||||
Affiliation(s) | Research Center J?lich, Central Institute for Applied Mathematics, 52425 J?lich, Germany 1University of M?nster, Department of Information Systems, Leonardo Campus 3, 48149 M?nster, Germany 2 | ||||||||
Keyword(s) | algorithmic skeletons, C++, | ||||||||
Abstract |
|||||||||
Message passing based on libraries such as MPI is typically used to program parallel machines with distributed memory. This is efficient, but error prone. Algorithmic skeletons are intended to simplify parallel programming by increasing expressive power. The idea is to offer typical parallel programming patterns as polymorphic higher-order functions which are efficiently implemented in parallel. The present paper describes how C++ templates and operator overloading can be used in order to provide the main features needed for algorithmic skeletons, namely higher-order functions, partial applications and parametric polymorphism. Experimental results based on a draft implementation of our C++ skeleton library show that higher expressive power can be gained without a significant performance penalty.Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | A method-level comparison of the Java Grande and SPEC JVM98 benchmark suites | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 7-8 | ||||||||
Date | 06 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.846 | ||||||||
Article ID | CPE846 | ||||||||
Author Name(s) | David Gregg1James Power2John Waldron3 | ||||||||
Author Email(s) | james.power@may.ie2 | ||||||||
Affiliation(s) | Department of Computer Science, Trinity College, Dublin 2, Ireland 1Department of Computer Science, National University of Ireland, Maynooth, Co. Kildare, Ireland 2 3 | ||||||||
Keyword(s) | benchmark suites, Java Virtual Machine, dynamic profiling, | ||||||||
Abstract |
|||||||||
In this paper we seek to provide a foundation for the study of the level of use of object-oriented techniques in Java programs in general, and scientific applications in particular. Specifically, we investigate the profiles of Java programs from a number of perspectives, including the use of class library methods, the size of methods called, the mode of invoke instruction used and the polymorphicity of call sites. We also present a categorization of the nature of small methods used in Java programs. We compare the Java Grande and SPEC JVM98 benchmark suites, and note a significant difference in the nature and composition of these suites, with the programs from the Java Grande suite demonstrating a less object-oriented approach. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | JOPI: a Java object-passing interface | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 7-8 | ||||||||
Date | 06 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.854 | ||||||||
Article ID | CPE854 | ||||||||
Author Name(s) | Jameela Al-Jaroodi1Nader Mohamed2Hong Jiang3David Swanson4 | ||||||||
Author Email(s) | nmohamed@stevens.edu2 | ||||||||
Affiliation(s) | Department of Electrical and Computer Engineering, Stevens Institute of Technology, Burchard 216, Castle Point on Hudson, Hoboken, NJ 07030, U.S.A. 1 2 Department of Computer Science and Engineering, University of Nebraska-Lincoln, Lincoln, NE 68588-0115, U.S.A. 3 4 | ||||||||
Keyword(s) | Java, heterogeneous systems, cluster, parallel programming, object-passing, object-oriented systems, | ||||||||
Abstract |
|||||||||
Recently there has been an increasing interest in developing parallel programming capabilities in Java to harness the vast resources available in clusters, grids and heterogeneous networked systems. In this paper, we introduce a Java object-passing interface (JOPI) library. JOPI provides Java programmers with the necessary functionality to write object-passing parallel programs in distributed heterogeneous systems. JOPI provides a Message Passing Interface (MPI)-like interface that can be used to exchange objects among processes. In addition to the well-known benefits of the object-oriented development model, using objects to exchange information in JOPI is advantageous because it facilitates passing complex structures and enables the programmer to isolate the problem space from the parallelization problem. The run-time environment for JOPI is portable, efficient and provides the necessary functionality to deploy and execute parallel Java programs. Experiments were conducted on a cluster system and a collection of heterogeneous platforms to measure JOPI's performance and compare it with MPI. The results show good performance gains using JOPI. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Advanced eager scheduling for Java-based adaptive parallel computing | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 7-8 | ||||||||
Date | 06 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.855 | ||||||||
Article ID | CPE855 | ||||||||
Author Name(s) | Michael O. Neary1Peter Cappello2 | ||||||||
Author Email(s) | cappello@cs.ucsb.edu2 | ||||||||
Affiliation(s) | Department of Computer Science, University of California, Santa Barbara, CA 93106, U.S.A. 1 2 | ||||||||
Keyword(s) | concurrent programming, branch-and-bound, eager scheduling, fault tolerance, Grid computing, parallel computing, | ||||||||
Abstract |
|||||||||
Javelin 3 is a software system for developing large-scale, fault-tolerant, adaptively parallel applications. When all or part of their application can be cast as a master-worker or branch-and-bound computation, Javelin 3 frees application developers from concerns about inter-processor communication and fault tolerance among networked hosts, allowing them to focus on the underlying application. The paper describes a fault-tolerant task scheduler and its performance analysis. The task scheduler integrates work stealing with an advanced form of eager scheduling. It enables dynamic task decomposition, which improves host load-balancing in the presence of tasks whose non-uniform computational load is evident only at execution time. Speedup measurements are presented of actual performance on up to 1000 hosts. We analyze the expected performance degradation due to unresponsive hosts, and measure actual performance degradation due to unresponsive hosts. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Implementation and performance of a particle-in-cell code written in Java | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 7-8 | ||||||||
Date | 06 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.856 | ||||||||
Article ID | CPE856 | ||||||||
Author Name(s) | G. Lapenta1S. Markidis2W. B. VanderHeyden3Z. Budimlic4 | ||||||||
Author Email(s) | lapenta@lanl.gov1 | ||||||||
Affiliation(s) | Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, U.S.A. 1 2 3 4 | ||||||||
Keyword(s) | PIC method, object-oriented programming, high-performance computing, Java, | ||||||||
Abstract |
|||||||||
Plasma simulation is an important example of a high-performance computing application where computer science issues are of great relevance. In a plasma, each particle, electron or ion, interacts with the external fields and with other particles in ways that can be readily and effectively emulated using object-oriented programming. However, the great cost of plasma simulations has traditionally discouraged object-oriented implementations due to their perceived inferior performance compared with classic procedural FORTRAN or C. In the present paper, we revisit this issue. We have developed a Java particle-in-cell code for plasma simulation, called Parsek. The paper considers different choices for the object orientation and tests their performance. We find that coarse-grained object orientation is faster and practically immune from any degradation compared with a standard procedural implementation (with static classes). The loss in performance for a fine-grained object orientation is a factor of about 50%, which can be almost completely eliminated using advanced Java compilation techniques. The Java code Parsek also provides an interesting realistic application of high-performance computing to compare the performance of Java with FORTRAN. We have conducted a series of tests considering various Java implementations and various FORTRAN implementations. We have also considered different computer architectures and different Java Virtual Machines and FORTRAN compilers. The conclusion is that with Parsek, object-oriented Java can reach CPU speed performances more or less comparable with procedural FORTRAN. This conclusion is remarkable and it is in agreement with the most recent benchmarks, but is at variance with widely held misconceptions about the alleged slowness of Java.Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | On the conditions necessary for removing abstraction penalties in OOLALA | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 7-8 | ||||||||
Date | 06 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.857 | ||||||||
Article ID | CPE857 | ||||||||
Author Name(s) | Mikel Luján1T. L. Freeman2John R. Gurd3 | ||||||||
Author Email(s) | mlujan@cs.man.ac.uk1 | ||||||||
Affiliation(s) | Centre for Novel Computing, University of Manchester, Oxford Road, Manchester M13 9PL, U.K. 1 2 3 | ||||||||
Keyword(s) | object-oriented programming, numerical linear algebra, Java, abstraction penalty, | ||||||||
Abstract |
|||||||||
OOLALA is an object-oriented linear algebra library designed to reduce the effort of software development and maintenance. In contrast with traditional (Fortran-based) libraries, it provides two high abstraction levels that significantly reduce the number of implementations necessary for particular linear algebra operations. Initial performance evaluations of a Java implementation of OOLALA show that the two high abstraction levels are not competitive with the low abstraction level of traditional libraries. These initial performance results motivate the present contribution - the characterization of a set of storage formats (data structures) and matrix properties (special features) for which implementations at the two high abstraction levels can be transformed into implementations at the low (more efficient) abstraction level.Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Collective communication for the HPJava programming language | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 7-8 | ||||||||
Date | 06 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.859 | ||||||||
Article ID | CPE859 | ||||||||
Author Name(s) | Han-ku Lee1Sang Boem Lim2Bryan Carpenter3Geoffrey Fox4 | ||||||||
Author Email(s) | hlee@konkuk.ac.kr1 | ||||||||
Affiliation(s) | School of Internet and Multimedia Engineering, Konkuk University, Seoul 143-701, Korea 1Pervasive Technology Labs, Indiana University, Bloomington, IN 47404, U.S.A. 2 3 4 | ||||||||
Keyword(s) | HPspmd programming model, HPJava, Adlib, Java, | ||||||||
Abstract |
|||||||||
This paper addresses functionality and implementation of a HPJava version of the Adlib collective communication library for data parallel programming. We begin by illustrating typical use of the library, through an example multigrid application. Then we describe implementation issues for the high-level library. At a software engineering level, we illustrate how the primitives of the HPJava language assist in writing library methods whose implementation can be largely independent of the distribution format of the argument arrays. We also describe a low-level API called mpjdev, which handles basic communication underlying the Adlib implementation. Finally we present some benchmark results, and some conclusions.Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Ibis: a flexible and efficient Java-based Grid programming environment | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 7-8 | ||||||||
Date | 06 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.860 | ||||||||
Article ID | CPE860 | ||||||||
Author Name(s) | Rob V. van Nieuwpoort1Jason Maassen2Gosia Wrzesinska3Rutger F. H Hofman4Ceriel J. H. Jacobs5Thilo Kielmann6Henri E. Bal7 | ||||||||
Author Email(s) | rob@cs.vu.nl1 | ||||||||
Affiliation(s) | aculty of Sciences, Department of Computer Science, Vrije Universiteit, De Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands 1 2 3 4 5 6 7 | ||||||||
Keyword(s) | Grid computing, Java, portability, performance, | ||||||||
Abstract |
|||||||||
In computational Grids, performance-hungry applications need to simultaneously tap the computational power of multiple, dynamically available sites. The crux of designing Grid programming environments stems exactly from the dynamic availability of compute cycles: Grid programming environments (a) need to be portable to run on as many sites as possible, (b) they need to be flexible to cope with different network protocols and dynamically changing groups of compute nodes, while (c) they need to provide efficient (local) communication that enables high-performance computing in the first place. Existing programming environments are either portable (Java), or flexible (Jini, Java Remote Method Invocation or (RMI)), or they are highly efficient (Message Passing Interface). No system combines all three properties that are necessary for Grid computing. In this paper, we present Ibis, a new programming environment that combines Java's run everywhere portability both with flexible treatment of dynamically available networks and processor pools, and with highly efficient, object-based communication. Ibis can transfer Java objects very efficiently by combining streaming object serialization with a zero-copy protocol. Using RMI as a simple test case, we show that Ibis outperforms existing RMI implementations, achieving up to nine times higher throughputs with trees of objects.Copyright ? 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Using MPI with C# and the Common Language Infrastructure | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 7-8 | ||||||||
Date | 06 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.861 | ||||||||
Article ID | CPE861 | ||||||||
Author Name(s) | Jeremiah Willcock1Andrew Lumsdaine2Arch Robison3 | ||||||||
Author Email(s) | lums@osl.iu.edu2 | ||||||||
Affiliation(s) | Indiana University Open Systems Laboratory, Bloomington, IN 47405, U.S.A. 1 2 KAI Software Laboratory, Intel Corporation, 1906 Fox Drive, Champaign, IL 61820, U.S.A. 3 | ||||||||
Keyword(s) | Common Language Infrastructure (CLI), .NET, C#, Message Passing Interface (MPI), | ||||||||
Abstract |
|||||||||
We describe two different libraries for using the Message Passing Interface (MPI) with the C# programming language and the Common Language Infrastructure (CLI). The first library provides C# bindings that closely match the original MPI library specification. The second library presents a fully object-oriented interface to MPI and exploits modern language features of C#. The interfaces described here use the P/Invoke feature of the CLI to dispatch to a native implementation of MPI, such as LAM/MPI or MPICH. Performance results using the Shared Source CLI demonstrate only a small performance overhead. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | inAspect: interfacing Java and VSIPL applications | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 7-8 | ||||||||
Date | 06 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.862 | ||||||||
Article ID | CPE862 | ||||||||
Author Name(s) | Torey Alford1Vijay P. Shah2Anthony Skjellum3Nicholas H. Younan4 | ||||||||
Author Email(s) | tony@mpi-softtech.com1 | ||||||||
Affiliation(s) | MPI Software Technology, Inc., 101 S. Lafayette St., Suite 33, Starkville, MS 39759, U.S.A. 1Department of Electrical and Computer Engineering, Mississippi State University, P.O. Box 9571, Mississippi State, MS 39762, U.S.A. 2 3 Department of Electrical and Computer Engineering, Mississippi State University, P.O. Box 9571, Mississippi State, MS 39762, U.S.A 4 | ||||||||
Keyword(s) | high-performance signal processing, embedded Java, fixed-point arithmetic, Java Native Interface, VSIPL, CORDIC algorithms, | ||||||||
Abstract |
|||||||||
In this paper, we discuss the origin, design, performance, and directions of the inAspect high-performance signal- and image-processing package for Java. The Vector Signal and Image Processing Library (VSIPL) community provides a standardized application programmer interface (API) for high-performance signal and image processing plus linear algebra with a C emphasis and object-based design framework. Java programmers need high-performance and/or portable APIs for this broad base of functionality as well. inAspect addresses PDAs, embedded Java boards, workstations, and servers, with emphasis on embedded systems at present. Efforts include supporting integer precisions and utilizing coordinate rotation digital computer (CORDIC) algorithms - both aimed at added relevance for limited-performance environments, such as present-day PDAs.Copyright ? 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Towards enabling peer-to-peer Grids | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 7-8 | ||||||||
Date | 06 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.863 | ||||||||
Article ID | CPE863 | ||||||||
Author Name(s) | Shrideep Pallickara1Geoffrey Fox2Xi Rao3 | ||||||||
Author Email(s) | spallick@indiana.edu1 | ||||||||
Affiliation(s) | Community Grid Computing Laboratory, Indiana University, Suite 224, 501 North Morton Street, IN 47404, U.S.A. 1 2 3 | ||||||||
Keyword(s) | event distribution systems, middleware, P2P systems, JXTA, Grid computing, | ||||||||
Abstract |
|||||||||
In this paper we propose a peer-to-peer (P2P) Grid comprising resources such as relatively static clients, high-end resources and a dynamic collection of multiple P2P subsystems. We investigate the architecture of the messaging and event service that will support such a hybrid environment. We designed a distributed publish-subscribe system NaradaBrokering for XML-specified messages. NaradaBrokering provides support for centralized, distributed and P2P (via JXTA) interactions. Here we investigate and present our strategy for the integration of JXTA into NaradaBrokering. The resultant system naturally scales with multiple Peer Groups linked by NaradaBrokering.Copyright ? 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Generic programming for high-performance scientific applications | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 7-8 | ||||||||
Date | 06 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.864 | ||||||||
Article ID | CPE864 | ||||||||
Author Name(s) | Lie-Quan Lee1Andrew Lumsdaine2 | ||||||||
Author Email(s) | lums@osl.iu.edu2 | ||||||||
Affiliation(s) | Open Systems Laboratory, Pervasive Technology Laboratories, Indiana University, Bloomington, IN 47405, U.S.A. 1 2 | ||||||||
Keyword(s) | C++, generic programming, high-performance computing, iterative solvers, Krylov subspace, message passing, templates, | ||||||||
Abstract |
|||||||||
We present case studies that apply generic programming to the development of high-performance parallel code for solving two archetypal partial differential equations (PDEs). We examine the overall structure of the example scientific codes and consider their generic implementation. With a generic approach it is a straightforward matter to reuse software components from different sources; implementations with components from the Iterative Template Library (ITL), the Matrix Template Library (MTL), Blitz++, A++/P++, and Fortran BLAS are presented. Our newly developed Generic Message Passing library is used for communication. We compare the generic implementations with equivalent implementations developed with alternative libraries and languages and discuss performance as well as software engineering issues. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Concurrent urban legends | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 9 | ||||||||
Date | 08 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.885 | ||||||||
Article ID | CPE885 | ||||||||
Author Name(s) | Peter A. Buhr1Ashif S. Harji2 | ||||||||
Author Email(s) | pabuhr@uwaterloo.ca1 | ||||||||
Affiliation(s) | University of Waterloo, 200 University Avenue West, Waterloo, Ontario, Canada N2L 3G1 1 2 | ||||||||
Keyword(s) | concurrent, parallel, coroutine, synchronization, mutual exclusion, Dekker, Peterson, inheritance anomaly, signalling, spurious wakeup, | ||||||||
Abstract |
|||||||||
This discussion addresses a number of urban legends about concurrency in an attempt to separate the myth from the fact. These legends are as follows: 1 concurrent = parallel; 2 coroutining = concurrency; 3 synchronization = mutual exclusion; 4 Dekker < Peterson; 5 concurrency = library; 6 inheritance anomaly = major concurrency problem; 7 signalling = hints; 8 spurious wakeup = efficiency. Identifying and understanding the fundamental concepts underlying concurrency is essential to the field. Equally important is not to confuse sequential and concurrent concepts. Finally, approaches based solely on efficiency are insufficient to justify a weak or difficult to use concurrent concept or construct.Copyright ? 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Pipelines on heterogeneous systems: models and tools | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 9 | ||||||||
Date | 08 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.886 | ||||||||
Article ID | CPE886 | ||||||||
Author Name(s) | F. Almeida1D. Gonzalez2M. Moreno3C. Rodriguez4 | ||||||||
Author Email(s) | falmeida@ull.es1 | ||||||||
Affiliation(s) | Dpto. Estad?stica, I. O. y Computaci?n, Edificio F?sicas/Matem?ticas, Universidad de La Laguna, La Laguna, Tenerife, Spain 1 2 3 4 | ||||||||
Keyword(s) | heterogeneous systems, pipelines, analytical models, | ||||||||
Abstract |
|||||||||
We study the performance of pipeline algorithms in heterogeneous networks. The concept of heterogeneity is not only restricted to the differences in computational power of the nodes, but also refers to the network capabilities. We develop a skeleton tool that allows us an efficient block-cyclic mapping of pipelines on heterogeneous systems. The tool supports pipelines with a number of stages much larger than the number of physical processors available. We derive an analytical formula that allows us to predict the performance of pipelines in heterogeneous systems. According to the analytical complexity formula, numerical strategies to solve the optimal mapping problem are proposed. The computational results prove the accuracy of the predictions and effectiveness of the approach.Copyright ? 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Distributed computing with Triana on the Grid | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 9 | ||||||||
Date | 08 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.901 | ||||||||
Article ID | CPE901 | ||||||||
Author Name(s) | Ian Taylor1Ian Wang2Matthew Shields3Shalil Majithia4 | ||||||||
Author Email(s) | i.j.taylor@cs.cardiff.ac.uk1 | ||||||||
Affiliation(s) | School of Computer Science, Cardiff University, P.O. Box 916, Cardiff CF24 3XF, U.K. 1School of Physics, Astronomy and Computer Science, Queens Buildings, Cardiff University, 5 The Parade, Cardiff CF24 3YB, U.K. 2 3 4 | ||||||||
Keyword(s) | Triana, distributed systems, Grid, workflow, peer-to-peer, | ||||||||
Abstract |
|||||||||
In this paper, we describe Triana, a distributed problem-solving environment that makes use of the Grid to enable a user to compose applications from a set of components, select resources on which the composed application can be distributed and then execute the application on those resources. We describe Triana's current pluggable architecture that can support many different modes of operation by the use of flexible writers for many popular Web service choreography languages. We further show, that the Triana architecture is middleware-independent through the use of the Grid Application Toolkit (GAT) API and demonstrate this through the use of a GAT binding to JXTA. We describe how other bindings being developed to Grid infrastructures, such as OGSA, can seamlessly be integrated within the current prototype by using the switching capability of the GAT. Finally, we outline an experiment we conducted using this prototype and discuss its current status.Copyright ? 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Editorial | ||||||||
Article Title | Special Issue: The High Performance Architectural Challenge: Mass Market versus Proprietary Components? This article is a U.S. Government work and is in the public domain in the U.S.A. | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 10 | ||||||||
Date | 08 25 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.890 | ||||||||
Article ID | CPE890 | ||||||||
Author Name(s) | David B. Nelson1 | ||||||||
Author Email(s) | a@a.com1 | ||||||||
Affiliation(s) | National Coordination Office for Information Technology Research and Development, Arlington, VA 22230, U.S.A. 1 | ||||||||
Keyword(s) | ., | ||||||||
Abstract |
|||||||||
No abstract | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | This article is a U.S. Government work and is in the public domain in the U.S.A. | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 10 | ||||||||
Date | 08 25 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.891 | ||||||||
Article ID | CPE891 | ||||||||
Author Name(s) | Darren J. Kerbyson1Adolfy Hoisie2Harvey J. Wasserman3 | ||||||||
Author Email(s) | djk@lanl.gov1 | ||||||||
Affiliation(s) | Performance and Architectures Laboratory (PAL), Modeling, Algorithms and Informatics Group, CCS-3, Los Alamos National Laboratory, Los Alamos, NM 87545, U.S.A. 1 2 3 | ||||||||
Keyword(s) | performance modeling, performance analysis, performance prediction, large-scale systems, application analysis, extreme-scale parallel systems, | ||||||||
Abstract |
|||||||||
This work gives a detailed analysis of the relative performance of the recently installed Earth Simulator and the next top four systems in the Top500 list using predictive performance models. The Earth Simulator uses vector processing nodes interconnected using a single-stage, cross-bar network, whereas the next top four systems are built using commodity based superscalar microprocessors and interconnection networks. The performance that can be achieved results from an interplay of system characteristics, application requirements and scalability behavior. Detailed performance models are used here to predict the performance of two codes representative of the ASCI workload, namely SAGE and Sweep3D. The performance models encapsulate fully the behavior of these codes and have been previously validated on many large-scale systems. One result of this analysis is to size systems, built from the same nodes and networks as those in the top five, that will have the same performance as the Earth Simulator. In particular, the largest ASCI machine, ASCI Q, is expected to achieve a similar performance to the Earth Simulator on the representative workload. Published in 2005 by John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | This article is a U.S. Government work and is in the public domain in the U.S.A. | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 10 | ||||||||
Date | 08 25 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.892 | ||||||||
Article ID | CPE892 | ||||||||
Author Name(s) | Jeffrey S. Vetter1Bronis R. de Supinski2Lynn Kissel3John May4Sheila Vaidya5 | ||||||||
Author Email(s) | vetter@computer.org1 | ||||||||
Affiliation(s) | Oak Ridge National Laboratory, Oak Ridge, TN 37922, U.S.A. 1Lawrence Livermore National Laboratory, Livermore, CA 94551, U.S.A. 2 3 4 5 | ||||||||
Keyword(s) | high-performance computing, performance evaluation, computer architecture, parallel and distributed computing, | ||||||||
Abstract |
|||||||||
Comparisons of high-performance computers based on their peak floating point performance are common but seldom useful when comparing performance on real workloads. Factors that influence sustained performance extend beyond a system's floating-point units, and real applications exercise machines in complex and diverse ways. Even when it is possible to compare systems based on their performance, other considerations affect which machine is best for a given organization. These include the cost, the facilities requirements (power, floorspace, etc.), the programming model, the existing code base, and so on. This paper describes some of the important measures for evaluating high-performance computers. We present data for many of these metrics based on our experience at Lawrence Livermore National Laboratory (LLNL), and we compare them with published information on the Earth Simulator. We argue that evaluating systems involves far more than comparing benchmarks and acquisition costs. We show that evaluating systems often involves complex choices among a variety of factors that influence the value of a supercomputer to an organization, and that the high-end computing community should view cost/performance comparisons of different architectures with skepticism. Published in 2005 by John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Architectural specification for massively parallel computers: an experience and measurement-based approach. | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 10 | ||||||||
Date | 08 25 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.893 | ||||||||
Article ID | CPE893 | ||||||||
Author Name(s) | Ron Brightwell1William Camp2Benjamin Cole3Erik DeBenedictis4Robert Leland5James Tomkins6Arthur B. Maccabe7 | ||||||||
Author Email(s) | rbbrigh@sandia.gov1 | ||||||||
Affiliation(s) | Sandia National Laboratories, Scalable Computer Systems, P.O. Box 5800, Albuquerque, NM 87185-1110, U.S.A. 1 2 3 4 5 6 7 | ||||||||
Keyword(s) | massively parallel computing, supercomputing, commodity processors, vector processors, Amdahl, shared memory, distributed memory, | ||||||||
Abstract |
|||||||||
In this paper, we describe the hardware and software architecture of the Red Storm system developed at Sandia National Laboratories. We discuss the evolution of this architecture and provide reasons for the different choices that have been made. We contrast our approach of leveraging high-volume, mass-market commodity processors to that taken for the Earth Simulator. We present a comparison of benchmarks and application performance that support our approach. We also project the performance of Red Storm and the Earth Simulator. This projection indicates that the Red Storm architecture is a much more cost-effective approach to massively parallel computing. Published in 2005 by John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Practical performance portability in the Parallel Ocean Program (POP).This article is a U.S. Government work and is in the public domain in the U.S.A. | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 10 | ||||||||
Date | 08 25 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.894 | ||||||||
Article ID | CPE894 | ||||||||
Author Name(s) | P. W. Jones1P. H. Worley2Y. Yoshida3J. B. White III4J. Levesque5 | ||||||||
Author Email(s) | pwjones@lanl.gov1 | ||||||||
Affiliation(s) | Theoretical Division, Los Alamos National Laboratory, T-3, MS B216, Los Alamos, NM 87545-1663, U.S.A. 1Computer Science and Mathematics Division, Oak Ridge National Laboratory, P.O. Box 2008, Oak Ridge, TN 37831-6367, U.S.A. 2Central Research Institute of Electric Power Industry, 1646 Abiko Abiko-shi Chiba, 270-1194, Japan 3 4 Cray Inc., 411 First Avenue South, Suite 600, Seattle, WA 98104-2860, U.S.A. 5 | ||||||||
Keyword(s) | ocean models, portability, performance, vector, scalar, | ||||||||
Abstract |
|||||||||
The design of the Parallel Ocean Program (POP) is described with an emphasis on portability. Performance of POP is presented on a wide variety of computational architectures, including vector architectures and commodity clusters. Analysis of POP performance across machines is used to characterize performance and identify improvements while maintaining portability. A new design of the POP model, including a cache blocking and land point elimination scheme, is described with some preliminary performance results. Published in 2005 by John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | HPCx: towards capability computing | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 10 | ||||||||
Date | 08 25 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.895 | ||||||||
Article ID | CPE895 | ||||||||
Author Name(s) | Mike Ashworth1Ian J. Bush2Martyn F. Guest3Andrew G. Sunderland4Stephen Booth5Joachim Hein6Lorna Smith7Kevin Stratford8Alessandro Curioni9 | ||||||||
Author Email(s) | m.ashworth@dl.ac.uk1 | ||||||||
Affiliation(s) | Computational Science and Engineering Department, CLRC Daresbury Laboratory, Daresbury, Warrington, Cheshire WA4 4AD, U.K. 1 2 3 4 Edinburgh Parallel Computing Centre, University of Edinburgh, JCMB, The King's Buildings, Mayfield Road, Edinburgh EH9 3JZ, U.K. 5 6 7 8 IBM Research, Zurich Research Laboratory, 8803 Riuschlikon, Switzerland 9 | ||||||||
Keyword(s) | high-performance computing, capability computing, scalable algorithms, computational science, computational engineering, | ||||||||
Abstract |
|||||||||
We introduce HPCx-the U.K.'s new National HPC Service-which aims to deliver a world-class service for capability computing to the U.K. scientific community. HPCx is targeting an environment that will both result in world-leading science and address the challenges involved in scaling existing codes to the capability levels required. Close working relationships with scientific consortia and user groups throughout the research process will be a central feature of the service. A significant number of key user applications have already been ported to the system. We present initial benchmark results from this process and discuss the optimization of the codes and the performance levels achieved on HPCx in comparison with other systems. We find a range of performance with some algorithms scaling far better than others. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Editorial | ||||||||
Article Title | Special Issue: High-Performance Computing in Geosciences | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 11 | ||||||||
Date | 09 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.896 | ||||||||
Article ID | CPE896 | ||||||||
Author Name(s) | |||||||||
Author Email(s) | |||||||||
Affiliation(s) | |||||||||
Keyword(s) | |||||||||
Abstract |
|||||||||
No abstract | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Full waveform seismic inversion using a distributed system of computers | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 11 | ||||||||
Date | 09 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.897 | ||||||||
Article ID | CPE897 | ||||||||
Author Name(s) | Mrinal K. Sen1Indrajit G. Roy2Carlos Torres-Verdin3 | ||||||||
Author Email(s) | mrinal@ig.utexas.edu1 | ||||||||
Affiliation(s) | Institute for Geophysics, John A. and Katherine G. Jackson School of Geosciences, The University of Texas at Austin, Austin, TX 78759, U.S.A. 1 2 Department of Petroleum and Geo-systems Engineering, The University of Texas at Austin, Austin, TX 78759, U.S.A. 3 | ||||||||
Keyword(s) | seismic waveform, inversion, adaptive regularization, distributed systems, | ||||||||
Abstract |
|||||||||
The aim of seismic waveform inversion is to estimate the elastic properties of the Earth's subsurface layers from recordings of seismic waveform data. This is usually accomplished by using constrained optimization often based on very simplistic assumptions. Full waveform inversion uses a more accurate wave propagation model but is extremely difficult to use for routine analysis and interpretation. This is because computational difficulties arise due to: (1) strong nonlinearity of the inverse problem; (2) extreme ill-posedness; and (3) large dimensions of data and model spaces. We show that some of these difficulties can be overcome by using: (1) an improved forward problem solver and efficient technique to generate sensitivity matrix; (2) an iteration adaptive regularized truncated Gauss-Newton technique; (3) an efficient technique for matrix-matrix and matrix-vector multiplication; and (4) a parallel programming implementation with a distributed system of processors. We use a message-passing interface in the parallel programming environment. We present inversion results for synthetic and field data, and a performance analysis of our parallel implementation. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | A simulation and data analysis system for large-scale, data-driven oil reservoir simulation studies | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 11 | ||||||||
Date | 09 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.898 | ||||||||
Article ID | CPE898 | ||||||||
Author Name(s) | Tahsin Kurc1Umit Catalyurek2Xi Zhang3Joel Saltz4Ryan Martino5Mary Wheeler6Małgorzata Peszyńska7Alan Sussman8Christian Hansen9Mrinal Sen10Roustam Seifoullaev11Paul Stoffa12Carlos Torres-Verdin13Manish Parashar14 | ||||||||
Author Email(s) | kurc@bmi.osu.edu1 | ||||||||
Affiliation(s) | Biomedical Informatics Department, The Ohio State University, 3184 Graves Hall, 333 West 10th Avenue, Columbus, OH 43210, U.S.A. 1 2 3 4 Center for Subsurface Modeling, Institute for Computational Engineering and Sciences, The University of Texas at Austin, 201 East 24th Street, ACE 5.324, Campus Mail C0200, Austin, TX 78712, U.S.A. 5 6 Department of Mathematics, Kidder Hall 368, Oregon State University, Corvallis, OR 97331-4605, U.S.A. 7Department of Computer Science, University of Maryland, A.V. Williams Building, College Park, MD 20742, U.S.A. 8 9 Institute for Geophysics, The University of Texas at Austin, 4412 Spicewood Springs Road, Building 600, Austin, TX 78758-8500, U.S.A. 10 11 12 Department of Petroleum and Geosystems Engineering, The University of Texas at Austin, 1 University Station, CO300, Austin, TX 78712-0228, U.S.A. 13Department of Electrical and Computer Engineering, Rutgers, State University of New Jersey, 94 Brett Road, Piscataway, NJ 08854-8058, U.S.A. 14 | ||||||||
Keyword(s) | oil reservoir simulation, Grid computing, data intensive computing, data management, | ||||||||
Abstract |
|||||||||
The main goal of oil reservoir management is to provide more efficient, cost-effective and environmentally safer production of oil from reservoirs. Numerical simulations can aid in the design and implementation of optimal production strategies. However, traditional simulation-based approaches to optimizing reservoir management are rapidly overwhelmed by data volume when large numbers of realizations are sought using detailed geologic descriptions. In this paper, we describe a software architecture to facilitate large-scale simulation studies, involving ensembles of long-running simulations and analysis of vast volumes of output data. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Enabling interactive and collaborative oil reservoir simulations on the Grid | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 11 | ||||||||
Date | 09 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.899 | ||||||||
Article ID | CPE899 | ||||||||
Author Name(s) | Manish Jack 4, Mary Wheeler Parashar1Rajeev Muralidhar2Wonsuck Lee3Dorian Arnold4Jack Dongarra5Mary Wheeler6 | ||||||||
Author Email(s) | arashar@caip.rutgers.edu1 | ||||||||
Affiliation(s) | The Applied Software Systems Laboratory, Rutgers University, Piscataway, NJ 08854, U.S.A. 1 2 Computing Science Research, Bell Laboratories, Murray Hill, NJ 07974, U.S.A. 3Computer Science Department, University of Wisconsin, Madison, WI 53706, U.S.A. 4Computer Science Department, University of Tennessee at Knoxville, Knoxville, TN 37996, U.S.A. 5Center for Subsurface Modeling, University of Texas at Austin, Austin, TX 78712, U.S.A. 6 | ||||||||
Keyword(s) | oil reservoir simulation, Grid-based computational collaboratory, interactive monitoring and steering, IPARS;, Discover, NetSolve, | ||||||||
Abstract |
|||||||||
Grid-enabled infrastructures and problem-solving environments can significantly increase the scale, cost-effectiveness and utility of scientific simulations, enabling highly accurate simulations that provide in-depth insight into complex phenomena. This paper presents a prototype of such an environment, i.e. an interactive and collaborative problem-solving environment for the formulation, development, deployment and management of oil reservoir and environmental flow simulations in computational Grid environments. The project builds on three independent research efforts: (1) the IPARS oil reservoir and environmental flow simulation framework; (2) the NetSolve Grid engine; and (3) the Discover Grid-based computational collaboratory. Its primary objective is to demonstrate the advantages of an integrated simulation infrastructure towards effectively supporting scientific investigation on the Grid, and to investigate the components and capabilities of such an infrastructure. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Large-scale density-driven flow simulations using parallel unstructured Grid adaptation and local multigrid methods | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 11 | ||||||||
Date | 09 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.900 | ||||||||
Article ID | CPE900 | ||||||||
Author Name(s) | Stefan Lang1Gabriel Wittum2 | ||||||||
Author Email(s) | stefan.lang@iwr.uni-heidelberg.de1 | ||||||||
Affiliation(s) | Technical Simulation, Interdisciplinary Center for Scientific Computing (IWR), University of Heidelberg, INF 368 II, 69120 Heidelberg, Germany 1 2 | ||||||||
Keyword(s) | density-driven flow, parallel computation, multigrid methods, local Grid adaption, dynamic load migration and balancing, | ||||||||
Abstract |
|||||||||
h h h | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Editorial | ||||||||
Article Title | Special Issue: Foundations of Middleware Technologies | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 12 | ||||||||
Date | 10 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.878 | ||||||||
Article ID | CPE878 | ||||||||
Author Name(s) | Priya Narasimhan1 | ||||||||
Author Email(s) | |||||||||
Affiliation(s) | Carnegie Mellon University 1 | ||||||||
Keyword(s) | ., | ||||||||
Abstract |
|||||||||
No abstract | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | null | ||||||||
Article Title | On the modelling of publish/subscribe communication systems | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 12 | ||||||||
Date | 10 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.879 | ||||||||
Article ID | CPE879 | ||||||||
Author Name(s) | R. Baldoni1R. Beraldi2S. Tucci Piergiovanni3A. Virgillito4 | ||||||||
Author Email(s) | baldoni@dis.uniroma1.it1 | ||||||||
Affiliation(s) | Dipartimento di Informatica e Sistemistica, Universit? di Roma La Sapienza, Via Salaria 113, 00198 Roma, Italy 1 2 3 4 | ||||||||
Keyword(s) | publish/subscribe, event-based middleware, | ||||||||
Abstract |
|||||||||
This paper presents a formal framework of a distributed computation based on a publish/subscribe system. The framework abstracts the system through two delays, namely the subscription/unsubscription delay and the diffusion delay. This abstraction allows one to model concurrent execution of publication and subscription operations without waiting for the stability of the system state and to define a Liveness property which gives the conditions for the presence of a notification event in the global history of the system. This formal framework allows us to analytically define a measure of the effectiveness of a publish/subscribe system, which reflects the percentage of notifications guaranteed by the system to subscribers. A simulation study confirms the validity of the analytical measurements. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Distributed filter processes | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 12 | ||||||||
Date | 10 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.880 | ||||||||
Article ID | CPE880 | ||||||||
Author Name(s) | Rushikesh K. Joshi1 | ||||||||
Author Email(s) | rkj@cse.iitb.ac.in1 | ||||||||
Affiliation(s) | Department of Computer Science and Engineering, Indian Institute of Technology Bombay, Mumbai 400 076, India 1 | ||||||||
Keyword(s) | distributed processes, dynamic evolution, filter processes, cooperative filters, transparency, | ||||||||
Abstract |
|||||||||
A programming model called distributed filter processes for transparent filtering in distributed systems is presented. Filter processes are modeled as extensions to distributed processes. Messages can be captured transparently by intermediate filter processes. The programming abstractions are presented with applications to the evolution of programs.Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Benchmarking message-oriented middleware: TIB/RV versus SonicMQ | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 12 | ||||||||
Date | 10 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.881 | ||||||||
Article ID | CPE881 | ||||||||
Author Name(s) | Piyush Maheshwari1Michael Pang2 | ||||||||
Author Email(s) | piyush@cse.unsw.edu.au1 | ||||||||
Affiliation(s) | School of Computer Science and Engineering, The University of New South Wales, Sydney, NSW 2052, Australia 1SAP Australia Pty. Ltd, 168 Walker Street, North Sydney, NSW 2061, Australia 2 | ||||||||
Keyword(s) | middleware, message-oriented middleware, benchmarking, performance evaluation, publish/subscribe messaging, point-to-point messaging, | ||||||||
Abstract |
|||||||||
Message-oriented middleware (MOM) has become a vital part of the complex application integration projects. MOM is used to pass data and workflow in the form of messages between different enterprise applications. The performance of integrated applications greatly depends on how effectively the MOM performs. This paper presents a benchmark comparison between two industry well-known MOMs-TIBCO Rendezvous (TIB/RV) and SonicMQ. Although the two MOMs are very similar in certain respects, their native implementation and architecture are very different. We provide an unbiased benchmark reference to the middleware selection process. The primary objective of our work is to evaluate and compare the MOMs by testing their effectiveness in the delivery of messages in publish/subscribe and point-to-point message domains, their program stability and the system resource utilization. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | MEAD: support for Real-Time Fault-Tolerant CORBA | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 12 | ||||||||
Date | 10 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.882 | ||||||||
Article ID | CPE882 | ||||||||
Author Name(s) | P. Narasimhan1T. A Dumitra2A. M. Paulos3S. M. Pertet4C. F. Reverte5J. G Slember6D. Srivastava7 | ||||||||
Author Email(s) | priya@cs.cmu.edu1 | ||||||||
Affiliation(s) | Electrical and Computer Engineering Department, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213-3890, U.S.A. 1 2 3 4 5 6 7 | ||||||||
Keyword(s) | real-time, fault tolerance, trade-offs, non-determinism, CORBA, predictability, recovery, | ||||||||
Abstract |
|||||||||
The OMG's Real-Time CORBA (RT-CORBA) and Fault-Tolerant CORBA (FT-CORBA) specifications make it possible for today's CORBA implementations to exhibit either real-time or fault tolerance in isolation. While real-time requires a priori knowledge of the system's temporal operation, fault tolerance necessarily deals with faults that occur unexpectedly, and with possibly unpredictable fault recovery times. The MEAD (Middleware for Embedded Adaptive Dependability) system attempts to identify and to reconcile the conflicts between real-time and fault tolerance, in a resource-aware manner, for distributed CORBA applications. MEAD supports transparent yet tunable fault tolerance in real-time, proactive dependability, resource-aware system adaptation to crash, communication and timing faults with bounded fault detection and fault recovery.Copyright ? 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Lightweight monitoring of MPI programs in real time | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 13 | ||||||||
Date | 11 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.889 | ||||||||
Article ID | CPE889 | ||||||||
Author Name(s) | German Florez1Zhen Liu2Susan M. Bridges3Anthony Skjellum4Rayford B. Vaughn5 | ||||||||
Author Email(s) | gf24@cse.msstate.edu1 | ||||||||
Affiliation(s) | Center for Computer Security Research and High Performance Computer Laboratory, Department of Computer Science and Engineering, Mississippi State University, MS 39762-9637, U.S.A. 1 2 3 4 5 | ||||||||
Keyword(s) | anomaly detection, message-passing interface, cluster monitoring, interposition library, loadable kernel module, hidden Markov model, artificial neural networks, | ||||||||
Abstract |
|||||||||
Current technologies allow efficient data collection by several sensors to determine an overall evaluation of the status of a cluster. However, no previous work of which we are aware analyzes the behavior of the parallel programs themselves in real time. In this paper, we perform a comparison of different artificial intelligence techniques that can be used to implement a lightweight monitoring and analysis system for parallel applications on a cluster of Linux workstations. We study the accuracy and performance of deterministic and stochastic algorithms when we observe the flow of both library-function and operating-system calls of parallel programs written with C and MPI. We demonstrate that monitoring of MPI programs can be achieved with high accuracy and in some cases with a false-positive rate near 0% in real time, and we show that the added computational load on each node is small. As an example, the monitoring of function calls using a hidden Markov model generates less than 5% overhead. The proposed system is able to automatically detect deviations of a process from its expected behavior in any node of the cluster, and thus it can be used as an anomaly detector, for performance monitoring to complement other systems or as a debugging tool. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | A static mapping heuristics to map parallel applications to heterogeneous computing systems | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 13 | ||||||||
Date | 11 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.902 | ||||||||
Article ID | CPE902 | ||||||||
Author Name(s) | Ranieri Baraglia1Renato Ferrini2Pierluigi Ritrovato3 | ||||||||
Author Email(s) | ranieri.baraglia@cnuce.cnr.it1 | ||||||||
Affiliation(s) | High Performance Computing Laboratory, Information Science and Technologies Institute (ISTI), Italian National Research Council (CNR), Via G. Moruzzi, 1, 56126 Pisa, Italy 1 2 Centro di Ricerca in Matematica Pura ed Applicata, Universitá degli Studi di Salerno, Fisciano (SA), Italy 3 | ||||||||
Keyword(s) | parallel processing, static mapping, task mapping, performance evaluation, | ||||||||
Abstract |
|||||||||
In order to minimize the execution time of a parallel application running on a heterogeneously distributed computing system, an appropriate mapping scheme is needed to allocate the application tasks to the processors. The general problem of mapping tasks to machines is a well-known NP-hard problem and several heuristics have been proposed to approximate its optimal solution. In this paper we propose a static graph-based mapping algorithm, called Heterogeneous Multi-phase Mapping (HMM), which permits suboptimal mapping of a parallel application onto a heterogeneous computing distributed system by using a local search technique together with a tabu search meta-heuristic. HMM allocates parallel tasks by exploiting the information embedded in the parallelism forms used to implement an application, and considering an affinity parameter, that identifies which machine in the heterogeneous computing system is most suitable to execute a task. We compare HMM with some leading techniques and with an exhaustive mapping algorithm. We also give an example of mapping of two real applications using HMM. Experimental results show that HMM performs well demonstrating the applicability of our approach. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | GridBLAST: a Globus-based high-throughput implementation of BLAST in a Grid computing framework | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 13 | ||||||||
Date | 11 00 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.906 | ||||||||
Article ID | CPE906 | ||||||||
Author Name(s) | Arun Krishnan1 | ||||||||
Author Email(s) | arun@bii.a-star.edu.sg1 | ||||||||
Affiliation(s) | BioInformatics Institute, 30 Biopolis Street,#07-01, Singapore 138671 1 | ||||||||
Keyword(s) | Grid application, high-throughput BLAST, | ||||||||
Abstract |
|||||||||
Improvements in the performance of processors and networks have made it feasible to treat collections of workstations, servers, clusters and supercomputers as integrated computing resources or Grids. However, the very heterogeneity that is the strength of computational and data Grids can also make application development for such an environment extremely difficult. Application development in a Grid computing environment faces significant challenges in the form of problem granularity, latency and bandwidth issues as well as job scheduling. Currently existing Grid technologies limit the development of Grid applications to certain classes, namely, embarrassingly parallel, hierarchical parallelism, work flow and database applications. Of all these classes, embarrassingly parallel applications are the easiest to develop in a Grid computing framework. The work presented here deals with creating a Grid-enabled, high-throughput, standalone version of a bioinformatics application, BLAST, using Globus as the Grid middleware. BLAST is a sequence alignment and search technique that is embarrassingly parallel in nature and thus amenable to adaptation to a Grid environment. A detailed methodology for creating the Grid-enabled application is presented, which can be used as a template for the development of similar applications. The application has been tested on a ‘mini-Grid’ testbed and the results presented here show that for large problem sizes, a distributed, Grid-enabled version can help in significantly reducing execution times. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Editorial | ||||||||
Article Title | Special Issue: Third IEEE International Workshop on High Performance Computational Biology (HiCOMB 2004) | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 14 | ||||||||
Date | 12 10 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.948 | ||||||||
Article ID | CPE948 | ||||||||
Author Name(s) | Dan Marinescu1 | ||||||||
Author Email(s) | a@a.com1 | ||||||||
Affiliation(s) | 1 | ||||||||
Keyword(s) | ., | ||||||||
Abstract |
|||||||||
No abstract | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Study of a highly accurate and fast protein-ligand docking method based on molecular dynamics | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 14 | ||||||||
Date | 12 10 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.949 | ||||||||
Article ID | CPE949 | ||||||||
Author Name(s) | C. L. Brooks III1M. Taufer2M. Crowley3D. J. Price4A. A. Chien5 | ||||||||
Author Email(s) | brooks@scripps.edu1 | ||||||||
Affiliation(s) | Department of Molecular Biology (TPC6), The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, U.S.A. 1 2 3 4 5 | ||||||||
Keyword(s) | force-field-based methods, docking accuracy, desktop Grid computing, | ||||||||
Abstract |
|||||||||
Few methods use molecular dynamics simulations in concert with atomically detailed force fields to perform protein-ligand docking calculations because they are considered too time demanding, despite their accuracy. In this paper we present a docking algorithm based on molecular dynamics which has a highly flexible computational granularity. We compare the accuracy and the time required with well-known, commonly used docking methods such as AutoDock, DOCK, FlexX, ICM, and GOLD. We show that our algorithm is accurate, fast and, because of its flexibility, applicable even to loosely coupled distributed systems such as desktop Grids for docking. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Parallel protein folding with STAPL | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 14 | ||||||||
Date | 12 10 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.950 | ||||||||
Article ID | CPE950 | ||||||||
Author Name(s) | Nancy M. Amato1Shawna Thomas2Gabriel Tanase3Lucia K. Dale4Jose M. Moreira5Lawrence Rauchwerger6 | ||||||||
Author Email(s) | amato@cs.tamu.edu1 | ||||||||
Affiliation(s) | Parasol Laboratory, Department of Computer Science, Texas A&M University, College Station, TX 77843-3112, U.S.A. 1 2 Parasol Laboratory, Department of Computer Science, Texas A&M University, College Station, TX 77843-3112, U.S.A 3University of the South, Sewanee, TN 37383-1000, U.S.A. 4IBM TJ Watson Research Center, Yorktown Heights, NY 10598-0218, U.S.A. 5 6 | ||||||||
Keyword(s) | protein folding, motion planning, parallel libraries, C++, | ||||||||
Abstract |
|||||||||
The protein-folding problem is a study of how a protein dynamically folds to its so-called native state - an energetically stable, three-dimensional conformation. Understanding this process is of great practical importance since some devastating diseases such as Alzheimer's and bovine spongiform encephalopathy (Mad Cow) are associated with the misfolding of proteins. We have developed a new computational technique for studying protein folding that is based on probabilistic roadmap methods for motion planning. Our technique yields an approximate map of a protein's potential energy landscape that contains thousands of feasible folding pathways. We have validated our method against known experimental results. Other simulation techniques, such as molecular dynamics or Monte Carlo methods, require many orders of magnitude more time to produce a single, partial trajectory. In this paper we report on our experiences parallelizing our method using STAPL (Standard Template Adaptive Parallel Library) that is being developed in the Parasol Lab at Texas A&M. An efficient parallel version will enable us to study larger proteins with increased accuracy. We demonstrate how STAPL enables portable efficiency across multiple platforms, ranging from small Linux clusters to massively parallel machines such as IBM's BlueGene/L, without user code modification.Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Dynamic programming for LR-PCR segmentation of bacterium genomes | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 14 | ||||||||
Date | 12 10 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.951 | ||||||||
Article ID | CPE951 | ||||||||
Author Name(s) | Rumen Andonov1Dominique Lavenier2Philippe Veber3Lucia K. Dale4Nicola Yanev5 | ||||||||
Author Email(s) | randonov@irisa.fr1 | ||||||||
Affiliation(s) | IRISA, Campis de Beaulieu, 35042 Rennes Cedex, France 1 2 3 University of the South, Sewanee, TN 37383-1000, U.S.A. 4University of Sofia, 5 J. Bouchier str., 1126 Sofia, Bulgaria 5 | ||||||||
Keyword(s) | dynamic programming, LR-PCR, optimization, genomics, | ||||||||
Abstract |
|||||||||
Bacterium genome plasticity can efficiently be studied by the long-range polymerase chain reaction (LR-PCR) technique: genomes of different strains are split into hundreds of short segments which, after LR-PCR amplification, are used to sketch profiles. The segments have to: (1) cover the entire genome; (2) overlap each other; and (3) be of nearly identical size. This paper addresses the problem of finding a list of segments satisfying these constraints as much as possible. Two algorithms based on dynamic programming approach are presented. They differ on the optimization criteria for measuring the quality of the covering. The first considers the maximal deviation of the segment lengths relatively to an ideal length. The second automatically finds a segment length that minimizes the maximal deviation. .Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Parallel RNA secondary structure prediction using stochastic context-free grammars | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 14 | ||||||||
Date | 12 10 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.952 | ||||||||
Article ID | CPE952 | ||||||||
Author Name(s) | Tong Liu1Bertil Schmidt2 | ||||||||
Author Email(s) | asbschmidt@ntu.edu.sg1 | ||||||||
Affiliation(s) | School of Computer Engineering, Nanyang Technological University, Singapore 639798 1 2 | ||||||||
Keyword(s) | RNA secondary structure, stochastic context-free grammars, parallel processing, | ||||||||
Abstract |
|||||||||
With the growing number of known RNA genes efficient and accurate computational analysis of RNA sequences is becoming increasingly important. Stochastic context-free grammars (SCFGs) are used as a popular tool to model RNA secondary structures. However, algorithms for aligning a RNA sequence to a SCFG are highly compute-intensive. This has so far limited applications of SCFGs to relatively small problem sizes. In this paper we present the design of a parallel RNA sequence-structure alignment algorithm. Its implementation on parallel systems leads to significant runtime savings. This makes it possible to compute sequence-structure alignments of even the largest RNAs such as small subunit ribosomal rRNAs and long subunit ribosomal rRNAs in reasonable time. .Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Peptide identification via constrained multi-objective optimization: Pareto-based genetic algorithms | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 14 | ||||||||
Date | 12 10 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.953 | ||||||||
Article ID | CPE953 | ||||||||
Author Name(s) | J. M. Malard1A. Heredia-Langner2W. R. Cannon3R. Mooney4D. J. Baxter5 | ||||||||
Author Email(s) | jm.malard@pnl.gov1 | ||||||||
Affiliation(s) | Pacific Northwest National Laboratory, Richland, WA, U.S.A. 1 2 3 4 5 | ||||||||
Keyword(s) | data-intensive computation, genetic algorithms, multiobjective optimization, peptide identification, tandem mass spectrometry, | ||||||||
Abstract |
|||||||||
Automatic peptide identification from collision-induced dissociation tandem mass spectrometry data using optimization techniques is made difficult by large plateaus in the fitness landscapes of scoring functions, by the fuzzy nature of constraints from noisy data and by the existence of diverse but equally justifiable probabilistic models of peak matching. Here, two different scoring functions are combined into a parallel multi-objective optimization framework. It is shown how multi-objective optimization can be used to empirically test for independence between distinct scoring functions. The loss of selection pressure during the evolution of a population of putative peptide sequences by a Pareto-driven genetic algorithm is addressed by alternating between two definitions of fitness according to a numerical threshold. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | RAxML-II: a program for sequential, parallel and distributed inference of large phylogenetic trees | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 14 | ||||||||
Date | 12 10 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.954 | ||||||||
Article ID | CPE954 | ||||||||
Author Name(s) | Alexandros Stamatakis1Thomas Ludwig2Harald Meier3 | ||||||||
Author Email(s) | stamatak@cs.tum.edu1 | ||||||||
Affiliation(s) | Technische Universität München, Lehrstuhl für Rechnertechnik und Rechnerorganisation/I10, Boltzmannstrasse 3, D-85748 Garching b. München, Germany 1Ruprecht-Karls-Universität Heidelberg, Institut für Informatik, Im Neuenheimer Feld 348, D-69120 Heidelberg, Germany 2 3 | ||||||||
Keyword(s) | phylogenetic trees, maximum likelihood, parallel and distributed computing, | ||||||||
Abstract |
|||||||||
Inference of phylogenetic trees comprising hundreds or even thousands of organisms based on the maximum likelihood method is computationally intensive. We present simple heuristics which yield accurate trees for synthetic as well as real data and significantly reduce execution time. Those heuristics have been implemented in a sequential, parallel, and distributed program called RAxML-II, which is freely available as open source code. We compare the performance of the sequential program with PHYML and MrBayes which-to the best of our knowledge-are currently the fastest and most accurate programs for phylogenetic tree inference based on statistical methods. Experiments are conducted using 50 synthetic 100 taxon alignments as well as nine real-world alignments comprising 101 up to 1000 sequences. RAxML-II outperforms MrBayes for real-world data both in terms of speed and final likelihood values. Furthermore, for real data RAxML-II requires less time (a factor of 2-8) than PHYML to reach PHYML's final likelihood values and yields better final trees due to its more exhaustive search strategy. For synthetic data MrBayes is slightly more accurate than RAxML-II and PHYML but significantly slower. The non-deterministic parallel program shows good speedup values and has been used to infer a 10 000-taxon tree comprising organisms from the domains Eukarya, Bacteria, and Archaea. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Loosely coordinated coscheduling in the context of other approaches for dynamic job scheduling: a survey | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 15 | ||||||||
Date | 12 25 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.887 | ||||||||
Article ID | CPE887 | ||||||||
Author Name(s) | A. C. Sodan1 | ||||||||
Author Email(s) | acsodan@cs.uwindsor.ca1 | ||||||||
Affiliation(s) | University of Windsor, Computer Science, 401 Sunset Avenue, Windsor, Ontario, Canada N9B 3P4 1 | ||||||||
Keyword(s) | job scheduling, coscheduling, clusters, resource allocation, dynamic adaptation, computational Grids, | ||||||||
Abstract |
|||||||||
Loosely coordinated (implicit/dynamic) coscheduling is a time-sharing approach that originates from network of workstations environments of mixed parallel/serial workloads and limitedsoftware support. It is meant to be an easy-to-implement and scalable approach. Considering that the percentage of clusters in parallel computing is increasing and easily portable software is needed, loosely coordinated coscheduling becomes an attractive approach for dedicated machines. Loose coordination offers attractive features as a dynamic approach. Static approaches for local job scheduling assign resources exclusively and non-preemptively. Such approaches still remain beyond the desirable resource utilization and average response times. Conversely, approaches for dynamic scheduling of jobs can preempt resources and/or adapt their allocation. They typically provide better resource utilization and response times. Existing dynamic approaches are full preemption with checkpointing, dynamic adaptation of node/CPU allocation, and time sharing via gang or loosely coordinated coscheduling. This survey presents and compares the different approaches, while particularly focusing on the less well-explored loosely coordinated time sharing. The discussion particularly focuses on the implementation problems, in terms of modification of standard operating systems, the runtime system and the communication libraries. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | Neuroscience instrumentation and distributed analysis of brain activity data: a case for eScience on global Grids | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 15 | ||||||||
Date | 12 25 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.888 | ||||||||
Article ID | CPE888 | ||||||||
Author Name(s) | Rajkumar Buyya1Susumu Date2Yuko Mizuno-Matsumoto3Srikumar Venugopal4David Abramson5 | ||||||||
Author Email(s) | raj@cs.mu.oz.au1 | ||||||||
Affiliation(s) | Grid Computing and Distributed Systems (GRIDS) Laboratory, Department of Computer Science and Software Engineering, The University of Melbourne, Australia 1Graduate School of Information Science and Technology, Department of Bioinformatics Engineering, Osaka University, Japan 2Department of Information Systems Engineering, Graduate School of Osaka University, Japan 3 4 School of Computer Science and Software Engineering, Monash University, Melbourne, Australia 5 | ||||||||
Keyword(s) | Grid computing, eScience, neuroscience, Gridbus middleware, Nimrod-G, virtual instrumentation, | ||||||||
Abstract |
|||||||||
distribution need | |||||||||
Publisher | John Wiley & Sons, Ltd. Chichester, UK | ||||||||
Category | Research Article | ||||||||
Article Title | http://nenya.ms.mff.cuni.cz/projects/corba/oopsla-workshop-03/ | ||||||||
Volume ID | 17 | ||||||||
Issue ID | 15 | ||||||||
Date | 12 25 2005 | ||||||||
DOI(URI) | http://dx.doi.org/10.1002/cpe.918 | ||||||||
Article ID | CPE918 | ||||||||
Author Name(s) | Paul Brebner1Emmanuel Cecchet2Julie Marguerite3Petr Tůma4Octavian Ciuhandu5Bruno Dufour6Lieven Eeckhout7Stéphane Frénot8Arvind S. Krishna9John Murphy10Clark Verbrugge11 | ||||||||
Author Email(s) | paul.brebner@csiro.au1 | ||||||||
Affiliation(s) | CSIRO ICT Centre, G.P.O. Box 664, Canberra, ACT 2601, Australia 1INRIA Rhône-Alpes/ObjectWeb, 655 avenue de l'Europe, 38330 Montbonnot, St. Martin, France 2INRIA Rhône-Alpes/ObjectWeb, 655 avenue de l'Europe, 38330 Montbonnot, St. Martin, France 3Department of Software Engineering, Charles University, Malostranské náměstí 25, Praha 1, Czech Republic 4Performance Engineering Laboratory, Dublin City University, Dublin 9, Ireland 5School of Computer Science, McGill University, Montréal, Québec, Canada H3A 2A7 6Department of Electronics and Information Systems (ELIS), Ghent University, St. Pietersnieuwstraat 41, B9000 Gent, Belgium 7INRIA Arès, INSA-CITI Bat. Léonard de Vinci, 69661 Villeurbanne Cedex, France 8Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN 37235, U.S.A. 9 10 11 | ||||||||
Keyword(s) | middleware benchmarking, middleware performance, middleware scalability, middleware evaluation, middleware benchmark design, benchmarking guidelines, benchmarking measurement, benchmarking metrics, OOPSLA 2003 workshop, | ||||||||
Abstract |
|||||||||
The report summarizes the results of the Workshop on Middleware Benchmarking held during OOPSLA 2003. The goal of the workshop was to help advance the current practice of gathering performance characteristics of middleware implementations through benchmarking. The participants of the workshop have focused on identifying requirements of and obstacles to middleware benchmarking and forming a position on the related issues. Selected requirements and obstacles are presented, together with guidelines to adhere to when benchmarking, open issues of current practice, and perspectives on further research. Copyright © 2005 John Wiley & Sons, Ltd. | |||||||||