Discussion

We have explored the practicality of doing parallel computing in Java, and of providing Java interfaces to High Performance Computing software. For various reasons, the success of this exercise was not a foregone conclusion. Java sits on a virtual machine model that is significantly different from the hardware-oriented model which C or Fortran exploit directly. Java discourages or prevents direct access to some of the fundamental resources of the underlying hardware (most notably, its memory).

Our earliest experiments in this direction (including the work described in section 4, which predates the MPI work) involved working entirely within Java, building new software on top of the communication facilities of the standard API. The more recent work in sections 3.2 and 3.3 involved creating a Java interface to an existing HPC package. Which is the better strategy? In the long term Java may become a major implementation language for large software packages like MPI; it certainly has advantages in respect of portability that could simplify implementations dramatically. In the immediate term, however, recoding these packages does not appear so attractive, and Java wrappers to existing software look more sensible. On a cautionary note, our experience with MPI suggests that interfacing Java to non-trivial communication packages may be harder than it sounds. Nevertheless, we intend in the future to create a Java interface to an existing run-time library for data parallel computation.
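
To make the wrapper strategy concrete, the fragment below sketches its general shape: a Java class declares native methods, and a thin layer of C glue code (not shown) forwards each call to the corresponding MPI entry point. The class, method, and library names here are purely illustrative, not the actual binding developed in sections 3.2 and 3.3.

  // Sketch of the wrapper strategy. Each native method is backed by a
  // small piece of C glue code that calls the underlying MPI routine.
  // All names are illustrative, not the real binding.
  public class Comm {
      static {
          // Hypothetical shared library holding the C glue code.
          System.loadLibrary("mpijava");
      }

      public native int rank();   // forwards to MPI_Comm_rank
      public native int size();   // forwards to MPI_Comm_size

      public native void send(byte[] buf, int dest, int tag);  // MPI_Send
      public native void recv(byte[] buf, int src, int tag);   // MPI_Recv
  }

Much of the practical difficulty alluded to above lies in such glue layers, where, for example, the contents of a Java array must generally be copied or pinned before they can be handed to native code.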

So is Java, as it stands, a good language for High Performance Computing?

It still has to be demonstrated that Java can be compiled to code of efficiency comparable with C or Fortran. Many avenues towards a higher performance Java are being followed simultaneously. Besides Sun's Java chip effort, it has been reported at this workshop that IBM is developing an optimizing Java compiler which produces binary code directly, that Rice University and Rochester University are working on optimization and restructuring of the bytecode generated by javac, and that Indiana University is working on source restructuring to parallelize Java. Parallel interpretation of bytecode is also an emerging practice. For example, IBM's JVM, an implementation of the Java virtual machine on shared memory architectures, was released in spring 1996, and UIUC has recently started work aimed at parallel interpretation of Java bytecode on distributed memory systems.

Another promising approach under investigation [18] is to integrate interpretation and compilation techniques for the parallel execution of Java programs. In such a system, a partially ordered set of interpretive frames is generated by an II/CVM compiler. A frame is a description of some subtask, whose granularity may range from a single scalar assignment statement to a solver for a system of equations. Under supervision of the virtual machine (II/CVM), the actions specified in a frame may be performed in one of three ways: by direct interpretation, by compilation to native code, or by invoking an optimized routine from a runtime library.

With this approach, optimized binary code for well-formed computation subtasks resides in runtime libraries, supporting a high-level interpretive environment. Task parallelism arises among different frames executed simultaneously by the three mechanisms, while data parallelism arises within the execution of some of the runtime functions.
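
As a purely schematic illustration of the frame idea (the actual II/CVM design of [18] will differ in detail, and all names below are invented), one can picture the dispatch decision as follows:

  // Schematic illustration only; names are invented and the real
  // II/CVM dispatch of [18] will differ.
  interface Frame {
      String kind();   // identifies the subtask this frame describes
  }

  class FrameDispatcher {
      // Choose one of the three execution mechanisms for a frame.
      void execute(Frame f) {
          if (libraryRoutineExistsFor(f)) {
              runLibraryRoutine(f);  // optimized binary code from a runtime library
          } else if (worthCompiling(f)) {
              compileAndRun(f);      // generate and execute native code
          } else {
              interpret(f);          // fall back to direct interpretation
          }
      }

      boolean libraryRoutineExistsFor(Frame f) { return false; }  // stub
      boolean worthCompiling(Frame f)          { return false; }  // stub
      void runLibraryRoutine(Frame f) {}                          // stub
      void compileAndRun(Frame f)     {}                          // stub
      void interpret(Frame f)         {}                          // stub
  }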

Presuming these efforts satisfactorily address the performance issue, the second question concerns the expressiveness of the Java language. Our final interface to MPI is quite elegant, and provides much of the functionality of the standard C and Fortran bindings. But creating this interface was a more difficult process than one might hope, both in terms of arriving at a good specification and in terms of making the implementation work. In section 4 we noted that the lack of features like C++ templates (or any other form of parametric polymorphism) and user-defined operator overloading (available in many modern languages, from functional programming languages to Fortran) made it difficult to produce a completely satisfying interface to a data parallel library. The Java language as currently defined imposes various limits on the creativity of the programmer.
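
The point about operator overloading is easily illustrated. With a hypothetical element-wise array class (standing in for the data parallel arrays of section 4; the class and method names are ours, for illustration only), an expression that Fortran 90 writes as a = b + c * d must be rendered in Java as a chain of method calls:

  // Hypothetical element-wise array class, standing in for the data
  // parallel arrays of section 4.
  class FloatArray {
      float[] data;

      FloatArray(int n) { data = new float[n]; }

      // Element-wise addition: r[i] = this[i] + b[i].
      FloatArray plus(FloatArray b) {
          FloatArray r = new FloatArray(data.length);
          for (int i = 0; i < data.length; i++)
              r.data[i] = data[i] + b.data[i];
          return r;
      }

      // Element-wise multiplication: r[i] = this[i] * b[i].
      FloatArray times(FloatArray b) {
          FloatArray r = new FloatArray(data.length);
          for (int i = 0; i < data.length; i++)
              r.data[i] = data[i] * b.data[i];
          return r;
      }
  }

  // Usage, where b, c and d are FloatArrays:
  //   Fortran 90:  a = b + c * d
  //   Java:        FloatArray a = b.plus(c.times(d));

The absence of parametric polymorphism compounds the problem, since a separate class of this kind must be written for each element type.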

In many respects Java is undoubtedly a better language than Fortran. It is object-oriented to the core and highly dynamic, and there is every reason to suppose that such features will be as valuable in scientific computing as in any other programming discipline. But to displace established scientific programming languages Java will surely have to acquire some of the facilities taken for granted in those languages.



Geoffrey Fox, Northeast Parallel Architectures Center at Syracuse University, gcf@npac.syr.edu