Java Grande Special Issue of Concurrency: Practice and Experience
- C413: Locality optimization in JavaParty by means of static type
analysis
- Abstract: On clusters and DMPs, locality of objects and threads, and hence the
avoidance of network communication, are crucial for performance. We show
that an extension of known type inference mechanisms can be used to compute
placement decisions that improve locality. In addition to this general
contribution, the paper specifically addresses the problems caused by
the distributed Java environment. Since the JVM is assumed to be fixed, the
optimization is done as a source-to-source transformation.
- Michael Philippsen and Bernhard Haumacher, University of Karlsruhe, Germany
- phlipp@ira.uka.de and
hauma@ira.uka.de
- C413JGFSIphilip/static/static.pdf
- C414: More Efficient Serialization and RMI for Java
- Abstract: In current Java implementations, Remote Method Invocation (RMI) is
too slow, especially for high performance computing. RMI is designed for
wide-area and high-latency networks, is based on a slow object serialization,
and does not support high-performance communication networks. The paper
demonstrates that a much faster drop-in RMI and an efficient drop-in
serialization can be designed and implemented completely in Java without any
native code. Moreover, the re-designed RMI supports non-TCP/IP communication
networks, even with heterogeneous transport protocols. We demonstrate that for
high performance computing some of the official serialization's generality can
and should be traded for speed. As a by-product, a benchmark collection for RMI
is presented. On PCs connected through Ethernet, the better serialization and
the improved RMI save a median of 45% (maximum of 71%) of the runtime for some
sets of arguments. On our Myrinet-based ParaStation network (a cluster of DEC
Alphas) we save a median of 85% (maximum of 96%), compared to standard RMI,
standard serialization, and Fast Ethernet; a remote method invocation completes
in as little as 80 μs round-trip time, compared to about 1.5 ms. (The
generality-for-speed trade is sketched after this entry.)
- Michael Philippsen, Bernhard Haumacher, and Christian Nester
- Computer Science Department, University of Karlsruhe, Am Fasanengarten 5,
76128 Karlsruhe, Germany
- [phlipp,hauma,nester]@ira.uka.de
- http://wwwipd.ira.uka.de/JavaParty/
- C414JGFSIphilip/serialrmi/serialrmi.pdf
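- Illustrative sketch: the trade-off the abstract names can be seen even in plain JDK terms. The class below (a hypothetical Particle, not taken from the paper and not the authors' UKA serialization) implements java.io.Externalizable so that exactly four doubles are written per object instead of relying on the reflective default mechanism.
```java
import java.io.*;

// Minimal sketch: hand-written marshalling via Externalizable instead of the
// reflective default serialization. Class and field names are illustrative.
public class Particle implements Externalizable {
    private double x, y, z, mass;

    public Particle() { }    // public no-arg constructor required by Externalizable

    public Particle(double x, double y, double z, double mass) {
        this.x = x; this.y = y; this.z = z; this.mass = mass;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeDouble(x);  // write exactly the payload, nothing more
        out.writeDouble(y);
        out.writeDouble(z);
        out.writeDouble(mass);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        x = in.readDouble();
        y = in.readDouble();
        z = in.readDouble();
        mass = in.readDouble();
    }
}
```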
- C423: Jaguar: Enabling Efficient Communication and I/O in Java
- Abstract: Implementing efficient communication and I/O
mechanisms in Java requires both fast access to low-level system resources
(such as network and raw disk interfaces) and direct manipulation of memory
regions external to the Java heap (such as communication and I/O buffers). Java
native methods are too expensive to perform these operations and raise serious
protection concerns. We present Jaguar, a new mechanism which provides Java
applications with efficient access to system resources while retaining the
protection of the Java environment. This is accomplished through compile-time
translation of certain Java bytecodes to inlined machine code segments. We
demonstrate the use of Jaguar through a Java interface to the VIA fast
communications layer, which achieves nearly identical performance to that of C,
and Pre-Serialized Objects, a mechanism which greatly reduces the cost of Java
object serialization.
- Matt Welsh and David Culler
- University of California, Berkeley
- {mdw,culler}@cs.berkeley.edu
- C423JGFSIWelsh/c423.pdf
- C424: Performance Measurement of Dynamically Compiled Java
Executions
- Abstract: With the development of dynamic compilers for Java,
Java's performance promises to rival that of equivalent C/C++ binary
executions. This should ensure that Java will become the platform of choice for
ubiquitous Web-based supercomputing. Therefore, being able to build performance
tools for dynamically compiled Java executions will become increasingly
important. In this paper we discuss those aspects of dynamically compiled Java
executions that make performance measurement difficult: (1) some Java
application methods may be transformed from byte-code to native code at
run-time; and (2) even in native form, application code may interact with the
Java virtual machine. We describe Paradyn-J, an experimental version of the
Paradyn Parallel Performance Tool that addresses this environment by describing
performance data from dynamically compiled executions in terms of the multiple
execution forms (interpreted byte-code and directly executed native code) of a
method, the costs of the dynamic compilation, and the costs of residual
dependencies of the application on the virtual machine. We use performance data
from Paradyn-J to tune a Java application method, improving its interpreted
byte-code execution by 11% and its native-form execution by 10%. As a result of
tuning just one method, we improve the application's total execution time by 10%
when run under Sun's ExactVM (included in the Platform2 release of the JDK).
The results of our work are a guide to virtual machine designers as to what type
of performance data should be available through Java VM performance tool APIs.
- Tia Newhall and Barton P. Miller
- {newhall,bart}@cs.wisc.edu
- Computer Sciences Department, University of Wisconsin, Madison, WI 53706-1685
- C424JGFSINewhall/c424.pdf
- C426: The Gateway System: Uniform Web Based Access to Remote
Resources
- Abstract: Exploiting our experience developing the WebFlow system,
we designed the Gateway system to provide seamless and secure access to
computational resources at ASC MSRC. The Gateway follows our commodity
components strategy, and it is implemented as a modern three-tier system. Tier
1 is a high-level front end for visual programming, steering, run-time data
analysis and visualization that is built on top of the Web and OO commodity
standards. Distributed object-based, scalable, and reusable Web server and
Object broker middleware forms Tier 2. Back-end services comprise Tier 3. In
particular, access to high-performance computational resources is provided by
implementing the emerging standard metacomputing API.
- Tomasz Haupt, Erol Akarsu, Geoffrey Fox, Alexey Kalinichenko, Kang-Seok
Kim, Praveen Sheethalnath, Choon-Han Youn
- Northeast Parallel Architecture Center at Syracuse University
- haupt@npac.syr.edu
- C426JGFSIhaupt/JavaGrande99.pdf
- C427: Object Serialization for Marshalling Data in a Java
Interface to MPI
- Abstract: Several Java bindings to Message Passing Interface (MPI)
software have been developed recently. Message buffers have usually been
restricted to arrays with elements of primitive type. We discuss adoption of
the Java object serialization model for marshalling general communication data
in MPI-like APIs. This approach is compared with a Java transcription of the
standard MPI derived datatype mechanism. We describe an implementation of the
mpiJava interface to MPI that incorporates automatic object serialization.
Benchmark results confirm that current JDK implementations of serialization
are not fast enough for high performance messaging applications. Means of
solving this problem are discussed, and benchmarks for greatly improved schemes
are presented. (A marshalling sketch follows this entry.)
- Bryan Carpenter, Geoffrey Fox, Sung Hoon Ko and Sang Lim
- NPAC at Syracuse University, Syracuse, NY 13244
- {dbc,gcf,shko,slim}@npac.syr.edu
- C427JGFSIdbc/c427serialize.pdf
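- Illustrative sketch: the marshalling idea in JDK-only terms (this is not the mpiJava API; the helper class Marshal is hypothetical). An arbitrary Serializable object is flattened into a byte[] that a message-passing layer can ship as a plain byte buffer, and reconstructed on the receiving side.
```java
import java.io.*;

// Minimal sketch of object marshalling for message passing; not mpiJava code.
public final class Marshal {
    private Marshal() { }

    // Flatten an object graph into a byte buffer suitable for a send call.
    public static byte[] pack(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Rebuild the object graph from a received byte buffer.
    public static Object unpack(byte[] buf) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(buf))) {
            return in.readObject();
        }
    }
}
```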
- C428: Wide-Area Parallel Programming using the Remote Method
Invocation Model
- Abstract: Java's support for parallel and distributed
processing makes the language attractive for metacomputing applications, such
as parallel applications that run on geographically distributed (wide-area)
systems. To obtain actual experience with a Java-centric approach to
metacomputing, we have built and used a high-performance wide-area Java
system, called Manta. Manta implements the Java Remote Method Invocation (RMI)
model using different communication protocols (active messages and TCP/IP) for
different networks. The paper shows how wide-area parallel applications can be
expressed and optimized using Java RMI. It also presents performance results
of several applications on a wide-area system consisting of four Myrinet-based
clusters connected by ATM WANs. We finally discuss alternative programming
models, namely object replication, JavaSpaces, and MPI for Java. (The basic RMI
programming model is sketched after this entry.)
- Rob van Nieuwpoort, Jason Maassen, Henri E. Bal, Thilo Kielmann, Ronald
Veldema
- Department of Computer Science, Vrije Universiteit, Amsterdam, The
Netherlands.
- rob@cs.vu.nl, jason@cs.vu.nl, bal@cs.vu.nl, kielmann@cs.vu.nl,
rveldema@cs.vu.nl
- http://www.cs.vu.nl/albatross/
- C428JGFSIbal/c428.pdf
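- Illustrative sketch: the plain RMI programming model the paper builds on, using only the standard java.rmi API (the Worker interface and its implementation are illustrative names, not taken from the Manta applications).
```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

// A remote interface: every method can be invoked across the network and
// must declare RemoteException.
interface Worker extends Remote {
    double compute(double[] chunk) throws RemoteException;
}

// A server-side implementation; exporting it via UnicastRemoteObject makes
// its methods callable from other JVMs.
class WorkerImpl extends UnicastRemoteObject implements Worker {
    WorkerImpl() throws RemoteException { }

    public double compute(double[] chunk) throws RemoteException {
        double sum = 0.0;
        for (double v : chunk) sum += v;   // stand-in for real work
        return sum;
    }
}
```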
- C429: Performance Limitations of the Java Core Libraries
- Abstract: Unlike applets, traditional systems programs written in
Java place significant demands on the Java runtime and core libraries, and
their performance is often critically important. This paper describes our
experiences using Java to build such a systems program, namely, a
high-performance web crawler. We found that our runtime, which includes a
just-in-time compiler that compiles Java bytecodes to native machine code,
performed well. However, we encountered several performance problems with the
Java core libraries, including excessive synchronization, excessive allocation,
and other programming inefficiencies. The paper describes the most serious
pitfalls and how we programmed around them. In total, these workarounds more
than doubled the speed of our crawler. (One representative workaround pattern
is sketched after this entry.)
- Allan Heydon, Marc Najork
- Compaq Computer Corporation, Systems Research Center, 130 Lytton Avenue,
Palo Alto, CA 94301, USA
- {heydon,najork}@pa.dec.com
- C429JGFSIheydon/c429.pdf
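- Illustrative sketch of the kind of workaround the abstract alludes to (this is not the crawler's code): per-call allocation of synchronized StringBuffer objects can be replaced by one reused, unsynchronized builder per worker thread, removing both lock traffic and garbage.
```java
// Minimal sketch: reuse one pre-sized builder per thread instead of allocating
// a fresh, synchronized buffer for every string assembled. Names are illustrative.
public class UrlFormatter {
    // Create one UrlFormatter per worker thread; StringBuilder is unsynchronized.
    private final StringBuilder buf = new StringBuilder(1024);

    public String absoluteUrl(String host, String path) {
        buf.setLength(0);                    // reset; the backing char array is reused
        buf.append("http://").append(host);
        if (!path.startsWith("/")) buf.append('/');
        buf.append(path);
        return buf.toString();               // the only allocation left is the result
    }
}
```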
- C430: A Java/CORBA based Visual Program Composition Environment
for PSEs
- Abstract: A Problem Solving Environment (PSE) is a complete,
integrated computing environment for composing, compiling and running
applications in a specific problem area or domain. A Visual Program
Composition Environment (VPCE) is described, which serves as a user interface
for a PSE, and uses Java and CORBA to provide a framework of tools to enable
the construction of scientific applications from components. The VPCE consists
of a component repository, from which the user can select off-the-shelf or
in-house components, a graphical composition area on which components can be
combined, and various tools that facilitate the configuration of components,
the integration of legacy codes into components, and the design and building of
new components. The VPCE produces output using dataflow techniques in the form
of a task graph, annotated with a performance model plus constraints for each
component, expressed in XML. In addition, the VPCE supports a domain-specific
expert system based on JESS [9] to guide the user in component selection and to
perform integrity checking.
- Matthew S. Shields, Omer F. Rana, David W. Walker, Maozhen Li
- Department of Computer Science, Cardiff University, PO Box 916, Cardiff,
CF24 3XF, UK
- David Golby
- Department of Mathematical Modelling, British Aerospace Sowerby Research
Center, PO Box 5, Filton, Bristol, BS34 7QW, UK
- Omer F Rana
<O.F.Rana@cs.cf.ac.uk>
- C430JGFSIrana/c430shields-rana-walker-li-golby.pdf
- C431: Ajents: Towards an Environment for Parallel, Distributed and
Mobile Java Applications
- Abstract: The rapid proliferation of the World-Wide Web has been due
to the seamless access it provides to information that is distributed both
within organizations and around the world. In this paper, we describe the
design and implementation of a system, called Ajents, which provides the
software infrastructure necessary to support a similar level of seamless
access to organization-wide or world-wide heterogeneous computing resources.
Ajents introduces class libraries that are written entirely in Java and run on
any standards-compliant Java virtual machine. These class libraries implement
and combine several important features that are essential to supporting
distributed and parallel computing using Java. Such features include the
ability to easily create objects on remote hosts, to interact with those
objects through either synchronous or asynchronous remote method invocations,
and to freely migrate objects to heterogeneous hosts. Our experimental results
show that in our test environment we are able to achieve good speedup on a
sample parallel application; the overheads introduced by our implementation
do not adversely affect remote method invocation times; and (somewhat
surprisingly) the cost of migration does not greatly impact the execution time
of an example application.
- Matthew Izatt, Patrick Chan
- Department of Computer Science, York University, Toronto, Ontario, M3J 1P3
- {izatt,y-chan}@cs.yorku.ca
- Tim Brecht, Department of Computer Science, University of Waterloo, Waterloo,
Ontario, N2L 3G1
- brecht@cs.uwaterloo.ca
- C431JGFSIbrecht/C431.pdf
- C432: A Mobile Agent Based Push Methodology for Global Parallel
Computing
- Abstract: The 1990s are seeing the explosive growth of the Internet
and Web-based information sharing and dissemination systems. The Internet is
also showing the potential of forming a supercomputing resource out of
networked computers. Metacomputing on the Internet often works in a
machine-centric pull execution model. That is, a coordinator
machine maintains a pool of tasks and distributes the tasks to other
participants on demand. This paper proposes a novel mobile agent based
push methodology from the perspective of applications. In the
method, users declare their compute-bound jobs as autonomous agents. The
computational agents roam the Internet to find servers to run on. Since
the agents can be programmed to satisfy their goals even if they move and lose
contact with their creators, they can survive intermittent or unreliable
network connections. During their lifetime, the agents can also move themselves
autonomously from one machine to another for load balancing, enhancing data
locality, and tolerating faults. We present an agent-oriented programming and
resource brokerage infrastructure, TRAVELER, in support of wide-area parallel
applications. TRAVELER provides a mechanism for clients to wrap their parallel
applications as mobile agents. It allows clients to dispatch their
computational agents via a resource broker. The broker forms a parallel virtual
machine atop servers to execute the agents. TRAVELER relies on an integrated
distributed shared array runtime system to support agent communications on
clusters of servers. We demonstrated the feasibility of TRAVELER on
parallel sorting and LU factorization problems.
- Cheng-Zhong Xu and Brian Wims
- Department of Electrical and Computer Engineering, Wayne State University,
Detroit, MI 48202
- czxu@ece.eng.wayne.edu
- C432JGFSIxu/c432xu.pdf
- C433: A Tale of Two Directories: Implementing Distributed Shared
Objects in Java
- Abstract: A directory service keeps track of the location and status
of mobile objects in a distributed system. This paper describes our experience
implementing two distributed directory protocols as part of the Aleph
toolkit, a distributed shared object system implemented in Java. One protocol
is a conventional home-based protocol, in which a fixed node keeps track of the
object's location and status. The other is a novel arrow protocol, based on a
simple path-reversal algorithm. We were surprised to discover that the arrow
protocol outperformed the home protocol, sometimes substantially, across a
range of system sizes. This paper describes a series of experiments testing
whether the discrepancy is due to an artifact of the Java run-time system
(such as differences in thread management or object serialization costs), or
whether it is something inherent in the protocols themselves. In the end, we
use insights gained from these experimental results to design a new directory
protocol. (The path-reversal idea is sketched after this entry.)
- Maurice Herlihy
- Computer Science Department, Brown University, Providence, RI 02912
- herlihy@cs.brown.edu
- Michael P. Warres
- Sun Microsystems
- mpw@east.sun.com
- C433JGFSIherlihy/c433.pdf
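- Illustrative sketch: a single-machine, sequential simulation of the path-reversal idea behind the arrow protocol (the Aleph implementation is distributed and concurrent, and real arrows point along a spanning tree; here, for brevity, every traversed arrow is redirected straight at the requester).
```java
import java.util.Arrays;

// Minimal sketch of path reversal: each node stores one "arrow"; a request
// follows arrows to the current owner and redirects them toward the new owner.
public class ArrowDirectory {
    private final int[] arrow;   // arrow[v] == v means node v holds the object

    public ArrowDirectory(int nodes, int initialOwner) {
        arrow = new int[nodes];
        Arrays.fill(arrow, initialOwner);
    }

    // Node 'requester' acquires the object; returns the node it was fetched from.
    public int acquire(int requester) {
        int current = requester;
        while (arrow[current] != current) {
            int next = arrow[current];
            arrow[current] = requester;   // path reversal (simplified)
            current = next;
        }
        arrow[current] = requester;       // old owner now points at the requester
        arrow[requester] = requester;     // requester is the new owner
        return current;
    }

    public static void main(String[] args) {
        ArrowDirectory dir = new ArrowDirectory(4, 0);
        System.out.println("fetched from node " + dir.acquire(3));   // prints 0
        System.out.println("fetched from node " + dir.acquire(1));   // prints 3
    }
}
```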
- C434: A Benchmark Suite for High Performance Java
- Abstract: Increasing interest is being shown in the use of Java for
large scale or Grande applications. This new use of Java places specific demands
on the Java execution environments that could be tested and compared using a
standard benchmark suite. We describe the design and implementation of such a
suite, paying particular attention to Java-specific issues. Sample results are
presented for a number of implementations of the Java Virtual Machine (JVM).
- J. M. Bull, L. A. Smith, M. D. Westhead, D. S. Henty and R. A. Davey
- EPCC, James Clerk Maxwell Building, The King's Buildings, The University of
Edinburgh, Mayfield Road, Edinburgh EH9 3JZ, Scotland, U.K.
- epcc-javagrande@epcc.ed.ac.uk
- markb@epcc.ed.ac.uk
- C434JGFSIbull/c434cpe.pdf
- C435: Annotating Java Class Files with Virtual Registers for
Performance
- Abstract: The Java .class file is a compact encoding of programs for
a stack-based virtual machine. It is intended for use in a networked
environment, which requires machine independence and minimized consumption of
network bandwidth. However, as in all interpreted virtual machines, performance
does not match that of code generated for the target machine. We propose
verifiable, machine-independent annotations to the Java class file to bring the
quality of the code generated by a just-in-time compiler closer to
that of an optimizing compiler without a significant increase in code
generation time. This division of labor has expensive machine-independent
analysis performed off-line and inexpensive machine-dependent code generation
performed on the client. We call this phenomenon super-linear analysis
and linear exploitation. These annotations were designed with the
concurrency features of the Java language in mind. In this paper we report
results from our machine-independent, prioritized register assignment. We also
discuss other possible annotations.
- Joel Jones and Samuel Kamin
- Department of Computer Science, University of Illinois at Urbana-Champaign
- jjones@uiuc.edu
- C435JGFSIjones/c435cpe1999.pdf
- C436: Javia: A Java Interface to the Virtual Interface
Architecture
- Abstract: The Virtual Interface (VI) architecture has become the
industry standard for user-level network interfaces. This paper presents the
implementation and evaluation of Javia, a Java interface to the VI
architecture. Javia explores two points in the design space. The first approach
manages buffers in C and requires data copies between the Java heap and native
buffers. The second approach relies on a Java-level buffer abstraction that
eliminates the copies in the first approach. Javia achieves an effective
bandwidth of 80 Mbytes/sec for 8 Kbyte messages, which is within 1% of that
achieved by C programs. Performance evaluations of parallel matrix
multiplication and of the active messages communication protocol show that
Javia can serve as an efficient building block for Java cluster applications.
- Chi-Chao Chang and Thorsten von Eicken
- Department of Computer Science, Cornell University
- {chichao,tve}@cs.cornell.edu
- C436JGFSIchang/chang.pdf
- C437: The jCrunch Java Numerical Libraries
- Abstract: The jCrunch project was initiated to address the lack of
high-quality numerical libraries for Java. The jCrunch libraries are a set of
proprietary, commercial-grade Java numerical libraries. Our approach is very
similar to the F2J project at UTK. Rather than reinvent the wheel by
re-implementing all known numerical algorithms, we have instead decided to
leverage the existing body of Fortran numerical libraries by implementing an
automatic Fortran-to-Java translator.
- William N. Reynolds
- Least Squares Software LLC, PO Box 91405, Albuquerque, NM 87199
- bill@leastsquares.com
- http://www.leastsquares.com
- C437JGFSIreynolds/c437Crunch.pdf
- C438: JCArray, the jCrunch(TM) Java Array Classes
- Abstract: In connection with jCrunch(TM) Lapack, Least Squares
Software has developed Java array classes to encapsulate a Fortran-like data
array implementation. These classes are designed to provide 1-D and 2-D arrays
directly to native methods while presenting the Java programmer with a useful,
well-behaved, object-oriented API. The proposed representation has much in
common with other proposals, such as JAMA, JNL and NINJA. The principal
differences among these proposals are: degree of exposure to the internal data
representation; persistence; reliance on specific implementations of Blas,
Lapack, Linpack, etc.; and utility methods useful to array jockeys
familiar with APL, Python, etc. Design goals of the JCArray classes include
providing the same API to both pure Java and native methods, and supporting
special matrices such as tri-diagonal, banded, etc. (A Fortran-style array
wrapper is sketched after this entry.)
- David S. Dixon
- Least Squares Software, Albuquerque, NM
- bill@leastsquares.com
- http://www.leastsquares.com
- C438JGFSIdixon/c438Poster.pdf
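- Illustrative sketch: a Fortran-style (column-major) 2-D array wrapper backed by one flat double[], so the same storage can be handed to native BLAS/LAPACK routines while Java code goes through accessors. The class below is a hypothetical example, not the actual JCArray API.
```java
// Minimal sketch of a column-major 2-D array class; names are illustrative.
public class DoubleMatrix2D {
    private final double[] data;   // column-major layout, as Fortran expects
    private final int rows, cols;

    public DoubleMatrix2D(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.data = new double[rows * cols];
    }

    public double get(int i, int j)           { return data[i + j * rows]; }
    public void   set(int i, int j, double v) { data[i + j * rows] = v; }

    // Expose the flat backing store so a native method can work on it in place.
    public double[] elements() { return data; }
    public int rows() { return rows; }
    public int cols() { return cols; }
}
```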
- C439: Extending Java Virtual Machine with Integer-Reference
Conversion
- Abstract: The Java virtual machine (JVM) is an architecture-independent
code execution environment. It has recently been used not only for the Java
language but also for other languages such as Scheme and ML. On the JVM,
however, all values are statically typed as either immediate or reference, and
types are checked before the execution of a program to prove that invalid
memory access will never occur. This property sometimes makes the
implementation of other languages on the JVM inefficient. In particular, the
implementation of dynamically typed languages is very inefficient because all
possible values, including frequently used ones such as integers, must be
represented by instances of a class. In this paper, we introduce a new type
into the JVM, which is a supertype of reference types and a tagged integer
type. This allows a more efficient implementation of dynamically typed
languages on the JVM. It does not require any new instruction, maintains binary
compatibility of existing bytecode, and retains the safety of the original JVM.
We modified an existing Scheme system running on the JVM to exploit this
extension and obtained a speedup of a factor of 20 for simple integer
functions. Our extension imposes little performance penalty on existing JVM
code generated from Java; we observed essentially no penalty for the SPEC JVM
benchmarks. (The boxing overhead the extension removes is sketched after this
entry.)
- Yutaka Oiwa, Kenjiro Taura, Akinori Yonezawa
- University of Tokyo
- oiwa@is.s.u-tokyo.ac.jp
- C439JGFSIoiwa/journal.pdf
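- Illustrative sketch of the overhead the abstract describes (not taken from the authors' Scheme system): a dynamically typed front end compiled to the unmodified JVM must give every value the static type Object, so even simple integer arithmetic goes through wrapper objects and casts.
```java
// Minimal sketch: what a dynamically typed "(+ a b)" typically compiles to on a
// stock JVM, with boxing on the way in and a checked cast on the way out.
public class BoxedArithmetic {
    static Object add(Object a, Object b) {
        return Integer.valueOf(((Integer) a).intValue() + ((Integer) b).intValue());
    }

    public static void main(String[] args) {
        Object sum = Integer.valueOf(0);
        for (int i = 0; i < 1_000_000; i++) {
            sum = add(sum, Integer.valueOf(1));   // allocates a wrapper for most results
        }
        System.out.println(sum);
    }
}
```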
- C440: An Annotation-aware JVM Implementation
- Abstract: The Java Bytecode language lacks expressiveness for
traditional compiler optimizations, making this portable, secure software
distribution format inefficient as a program representation for high performance.
This inefficiency results from the underlying stack model, as well as the fact
that many bytecode operations intrinsically include sub-operations (e.g.,
iaload includes the address computation, array bounds checks and the actual
load of the array element). The stack model, with no operand registers and
access limited to the top of the stack, prevents the reuse of values and
bytecode reordering. In addition, the bytecodes have no mechanism to indicate
which sub-operations in the bytecode stream are redundant or subsumed by
previous ones. As a consequence, the Java Bytecode language inhibits the
expression of important compiler optimizations, including common sub-expression
elimination, register allocation and instruction scheduling. The bytecode
stream generated by the Java front-end is a significantly under-optimized
program representation. The most common solution to overcome this aspect of the
language is the use of a Just-in-Time (JIT) compiler to not only generate
native code, but to perform optimization as well. However, the latter is a
time-consuming operation in an already time-constrained translation process. In
this paper we present an alternative to an optimizing JIT compiler that makes
use of code annotations generated by the Java front-end. These annotations
carry information concerning compiler optimization. During the translation
process, an annotation-aware JVM system then uses this information to produce
high-performance native code without performing much of the necessary analyses
or transformations. We describe the implementation of our first prototype of an
annotation-aware JVM system consisting of a JIT compilation system. We also
discuss basic ideas on how to verify annotated class files. We conclude the
paper by showing performance results comparing our system with other Java
Virtual Machines (JVMs) running on the SPARC architecture.
- Ana Azevedo, Alex Nicolau, Joe Hummel
- University of California, Irvine; University of Illinois, Chicago
- {aazevedo,nicolau}@ics.uci.edu
- jhummel@eecs.uic.edu
- C440JGFSIazevedo/c440cpande99.pdf
- C441: Design of the Kan Distributed Object System
- Abstract: Distributed software problems are often addressed with
object-oriented solutions. Objects provide the benefits of encapsulation and
abstraction that have proven useful in managing the complexity of sequential
code. However, the management of distributed objects is typically by means of
complex APIs, such as CORBA or DCOM. The complexity of the APIs is itself a
hurdle to the writing of efficient, robust programs. An alternate approach is
to provide the programmer with a simple interface to an underlying object
management layer that provides efficient access to objects, reliability, and
sufficient power for common distributed programming tasks. We have implemented
such a system, called Kan. It has a clear, simple object model with powerful
semantics, embodying such concepts as atomic transactions, asynchronous method
calls, and multithreading. The model constructs help the programmer avoid
common concurrent programming errors, allowing clean expressions of concurrent
algorithms. We describe the implementation and investigate several runtime
optimizations that improve performance for some classes of applications.
These optimizations concentrate on reducing method invocation costs. Local
method invocations are optimized with a thread pool, thread inlining, and
pointer swizzling. Remote method invocations are optimized with object
management routines that adapt to access patterns.
- Jerry James and Ambuj Singh
- Department of Computer Science, University of California at Santa Barbara,
Santa Barbara, California
- jerry@cs.ucsb.edu
- C441JGFSIjames/c441cpe99.pdf
- C442: The Java Memory Model is Fatally Flawed
- Abstract: The Java memory model described in Chapter 17 of the Java
Language Specification gives constraints on how threads interact through
memory. This chapter is hard to interpret and poorly understood; it imposes
constraints that prohibit common compiler optimizations and are expensive to
implement on existing hardware. Most JVMs violate the constraints of the
existing Java memory model; conforming to the existing specification would
impose significant performance penalties. In addition, programming idioms used
by some programmers and used within Sun's Java Development Kit are not
guaranteed to be valid according to the existing Java memory model. Furthermore,
implementing Java on a shared-memory multiprocessor that implements a weak
memory model poses some implementation challenges not previously considered.
(A widely cited example of such an idiom is sketched after this entry.)
- William Pugh
- Dept. of Computer Science Univ. of Maryland, College Park
- pugh@cs.umd.edu
- C442JGFSIpugh/jmm2.pdf
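- Illustrative sketch: double-checked locking, a widely cited example of the kind of idiom the abstract says is not guaranteed to be valid (the class shown is hypothetical, not code quoted from the paper or the JDK). Without synchronization on the fast path, a thread may observe a non-null instance whose fields have not yet been written.
```java
// Minimal sketch of the (broken) double-checked locking idiom under the
// original Chapter 17 memory model. Names are illustrative.
public class ConnectionFactory {
    private static ConnectionFactory instance;   // note: not protected on the read path

    private int timeoutMillis = 30_000;

    public static ConnectionFactory getInstance() {
        if (instance == null) {                          // unsynchronized first check
            synchronized (ConnectionFactory.class) {
                if (instance == null) {
                    instance = new ConnectionFactory();  // may become visible before its fields do
                }
            }
        }
        return instance;
    }
}
```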
- C443: Javelin++: Scalability Issues in Global Computing
- Abstract: Javelin is a Java-based infrastructure for global
computing. This paper presents Javelin++, an extension of Javelin, intended to
support a much larger set of computational hosts. First, Javelin++'s switch
from Java applets to Java applications is explained. Then, two scheduling
schemes are presented: probabilistic work stealing and deterministic work
stealing. The distributed deterministic work stealing is integrated with a
distributed deterministic eager scheduler. An additional fault tolerance
mechanism is implemented for replacing hosts that have failed or retreated. A
Javelin++ API is sketched, then illustrated on a raytracing application.
Performance results for the two schedulers are reported, indicating that
Javelin++, with its broker network, scales better than the original Javelin.
(Work stealing is sketched after this entry.)
- Michael O. Neary, Sean P. Brydon, Paul Kmiec, Sami Rollins, Peter Cappello
- Department of Computer Science, University of California, Santa Barbara,
Santa Barbara, CA 93106
- {neary, brydon, virus, srollins,
cappello}@cs.ucsb.edu
- C443JGFSIneary/c443top.pdf
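- Illustrative sketch of probabilistic work stealing as named in the abstract (not Javelin++ code): each host pushes and pops work at one end of its own deque, and an idle host steals from the opposite end of a randomly chosen victim.
```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Random;

// Minimal sketch of a work-stealing host; class and method names are illustrative.
public class StealingHost {
    private final Deque<Runnable> tasks = new ArrayDeque<>();

    public synchronized void push(Runnable t) { tasks.addLast(t); }
    public synchronized Runnable popLocal()   { return tasks.pollLast(); }   // owner works LIFO
    public synchronized Runnable stealOne()   { return tasks.pollFirst(); }  // thieves take FIFO

    // Get work for 'self': try the local deque, then steal from random victims.
    // (A real scheduler would back off or terminate instead of spinning forever.)
    public static Runnable findWork(StealingHost self, StealingHost[] all, Random rng) {
        Runnable t = self.popLocal();
        while (t == null) {
            StealingHost victim = all[rng.nextInt(all.length)];
            if (victim != self) t = victim.stealOne();
        }
        return t;
    }
}
```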
- C444: Design, Implementation, and Evaluation of Optimizations in a
Java Just-In-Time Compiler
- Abstract: The Java language incurs a runtime overhead for exception
checks and object accesses, which are executed without an interior pointer in
order to ensure safety. It also requires type inclusion tests, dynamic class
loading, and dynamic method calls in order to ensure flexibility. A
Just-In-Time (JIT) compiler generates native code from Java byte
code at runtime. It must improve the run-time performance without compromising
the safety and flexibility of the Java language. We designed and implemented
effective optimizations for a JIT compiler, such as exception check
elimination, common subexpression elimination, simple type inclusion test,
method inlining, and devirtualization of dynamic method calls. We evaluate the
performance benefits of these optimizations based on various statistics
collected using SPECjvm98, its candidates, and two JavaSoft applications with
byte code sizes ranging from 23,000 to 280,000 bytes. Each optimization
contributes to an improvement in the performance of the programs.
(Devirtualization is sketched after this entry.)
- Kazuaki Ishizaki, Motohiro Kawahito, Toshiaki Yasue, Mikio Takeuchi,
Takeshi Ogasawara, Toshio Suganuma, Tamiya Onodera, Hideaki Komatsu, and Toshio
Nakatani
- IBM Tokyo Research Laboratory, 1623-14, Shimotsuruma, Yamato, Kanagawa,
242-8502, Japan
- ishizaki@trl.ibm.co.jp
- C444JGFSIishizaki/ishizaki.pdf
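- Illustrative sketch: a source-level picture of devirtualization with a guard (the JIT performs this on compiled code; the Java below only shows the shape of the transformation, and the classes are hypothetical). A predicted receiver class lets the dynamic call be replaced by a cheap type test guarding an inlined body, with the virtual call kept as the fallback.
```java
// Minimal sketch of guarded devirtualization expressed at the source level.
abstract class Shape {
    abstract double area();
}

final class Circle extends Shape {
    final double r;
    Circle(double r) { this.r = r; }
    double area() { return Math.PI * r * r; }
}

class Renderer {
    // Before: an ordinary dynamic method call.
    static double areaVirtual(Shape s) {
        return s.area();
    }

    // After (conceptually): test the predicted class, inline its method body,
    // and keep the virtual call as the slow path for other receivers.
    static double areaDevirtualized(Shape s) {
        if (s instanceof Circle) {
            Circle c = (Circle) s;
            return Math.PI * c.r * c.r;   // inlined Circle.area()
        }
        return s.area();
    }
}
```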
- C445: Complex numbers for Java
- Abstract: Efficient and elegant complex numbers are one of the
preconditions for the use of Java in scientific computing. This paper
introduces a preprocessor and its translation rules that map a new basic type
complex and its operations to pure Java. For this mapping, it is insufficient
to just replace one complex variable with two double variables. Compared to
code that uses Complex objects and method invocations to express arithmetic
operations, the new basic type increases readability and is also executed
faster. On average, the versions of our benchmark programs that use the basic
type outperform the class-based versions by a factor of 2 up to 21 (depending
on the JVM used). (A sketch contrasting the two styles follows this entry.)
- Michael Philippsen and Edwin Günthner
- University of Karlsruhe
- JavaParty@ira.uka.de
- http://wwwipd.ira.uka.de/JavaParty/
- C445JGFSIcomplexphilippsen/complexe.pdf
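- Illustrative sketch contrasting the two styles the abstract compares (the Complex class and the translated form below are illustrative; the paper's actual translation rules and class layout may differ). The class-based version allocates an object per intermediate result, while the translated version keeps real and imaginary parts in plain double variables.
```java
// Minimal sketch: class-based complex multiplication vs. the kind of
// two-double code a preprocessor for a basic type 'complex' could emit.
public class ComplexDemo {
    static final class Complex {
        final double re, im;
        Complex(double re, double im) { this.re = re; this.im = im; }
        Complex times(Complex o) {
            return new Complex(re * o.re - im * o.im, re * o.im + im * o.re);
        }
    }

    // Class-based style: z = a * b allocates a new Complex.
    static Complex multiplyBoxed(Complex a, Complex b) {
        return a.times(b);
    }

    // Translated style: each complex value is a pair of doubles; no allocation
    // is needed for intermediates (the array here is only to return the result).
    static double[] multiplyUnboxed(double aRe, double aIm, double bRe, double bIm) {
        double zRe = aRe * bRe - aIm * bIm;
        double zIm = aRe * bIm + aIm * bRe;
        return new double[] { zRe, zIm };
    }
}
```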