A possible architecture is sketched in Figure 1.
Figure 1: Layers of an MPJ reference implementation
The bottom level, process creation and monitoring, incorporates the initial negotiation with the MPJ daemon and the low-level services this daemon provides, including clean termination and routing of output streams. The daemon invokes the MPJSlave class in a new JVM. MPJSlave is responsible for downloading the user's application and starting it. It may also directly invoke routines to initialize the message-passing layer. Overall, what this bottom layer provides to the next layer is a reliable group of processes with user code installed. It may also provide some mechanisms for global synchronization and for broadcasting simple information such as server port numbers. These mechanisms are presumably RMI-based, since we assume that the whole of the bottom layer is built on RMI.
The next layer manages low-level socket connections. It establishes all-to-all TCP socket connections between the hosts.
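As an illustration of how all-to-all connectivity might be established, the following sketch assumes that every process has already opened a server socket and that the bottom layer has distributed the full list of hosts and port numbers. The class name AllToAllConnector, the rule "connect to lower ranks, accept from higher ranks", and the convention of sending the rank as the first integer are assumptions made for this example, not part of the MPJ proposal.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical helper: builds one TCP connection per pair of processes. */
class AllToAllConnector {

    /**
     * Returns an array indexed by peer rank; the entry for myRank stays null.
     * Assumes all server sockets are bound before any process calls this.
     */
    static Socket[] connect(int myRank, ServerSocket server,
                            InetAddress[] hosts, int[] ports) throws IOException {
        int size = ports.length;
        Socket[] peers = new Socket[size];

        // Active side: connect to every lower rank and announce our own rank.
        for (int r = 0; r < myRank; r++) {
            Socket s = new Socket(hosts[r], ports[r]);
            s.setTcpNoDelay(true);
            DataOutputStream out = new DataOutputStream(s.getOutputStream());
            out.writeInt(myRank);
            out.flush();
            peers[r] = s;
        }

        // Passive side: accept one connection from every higher rank.
        // The first int on the stream tells us which peer connected.
        for (int i = myRank + 1; i < size; i++) {
            Socket s = server.accept();
            s.setTcpNoDelay(true);
            int r = new DataInputStream(s.getInputStream()).readInt();
            peers[r] = s;
        }
        return peers;
    }
}
```

Having each pair connect in one fixed direction yields exactly one socket per pair of processes. Because the operating system queues incoming connections on a bound server socket, the connecting side does not need its peer to have reached accept() yet.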
The idea of an "MPJ device" level is modelled on the abstract device interface of MPICH. A minimal API includes non-blocking standard-mode send and receive operations (analogous to MPI_ISEND and MPI_IRECV) and various wait operations (at least operations equivalent to MPI_WAITANY and MPI_TESTANY). All other point-to-point communication modes can be implemented correctly and with reasonable efficiency on top of this minimal set. Unlike the MPICH device level, this level does not incorporate direct support for groups, communicators and (necessarily) datatypes, but we do assume support for message contexts. Message buffers will probably be byte arrays. The device level is intended to be implemented on socket send and recv operations, using standard Java threads and synchronization methods to achieve its richer semantics.
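To make the shape of such a minimal API concrete, the sketch below declares the four operations and backs them with a single-process loopback implementation so that the example can run without sockets. All names here (Request, Device, LoopbackDevice) and all signatures are assumptions made for illustration. A real device would match on source rank as well, would treat truncation as an error, and would move data over the socket layer.

```java
import java.util.ArrayList;
import java.util.List;

/** Handle for a pending operation (hypothetical). */
class Request {
    volatile boolean done; // set by the device when the operation completes
}

/** Minimal device API sketch: byte-array buffers, tags and contexts only. */
interface Device {
    Request isend(byte[] buf, int dest, int tag, int context);
    Request irecv(byte[] buf, int src, int tag, int context);
    int waitany(Request[] reqs) throws InterruptedException; // index of a completed request
    int testany(Request[] reqs);                             // index, or -1 if none is complete
}

/** Loopback stand-in: every message is delivered within this one process. */
class LoopbackDevice implements Device {
    private static final class Entry {
        final byte[] buf; final int tag; final int context; final Request req;
        Entry(byte[] buf, int tag, int context, Request req) {
            this.buf = buf; this.tag = tag; this.context = context; this.req = req;
        }
    }

    private final List<Entry> unexpected = new ArrayList<>(); // sent, not yet received
    private final List<Entry> posted = new ArrayList<>();     // receives awaiting data

    public synchronized Request isend(byte[] buf, int dest, int tag, int context) {
        Request req = new Request();
        Entry recv = take(posted, tag, context);
        if (recv != null) {   // a matching receive is already posted: deliver now
            System.arraycopy(buf, 0, recv.buf, 0, Math.min(buf.length, recv.buf.length));
            recv.req.done = true;
        } else {              // otherwise buffer a copy until a receive arrives
            unexpected.add(new Entry(buf.clone(), tag, context, req));
        }
        req.done = true;      // eager send: the caller's buffer is free either way
        notifyAll();          // wake any thread blocked in waitany
        return req;
    }

    public synchronized Request irecv(byte[] buf, int src, int tag, int context) {
        Request req = new Request();
        Entry msg = take(unexpected, tag, context);
        if (msg != null) {
            System.arraycopy(msg.buf, 0, buf, 0, Math.min(msg.buf.length, buf.length));
            req.done = true;
        } else {
            posted.add(new Entry(buf, tag, context, req));
        }
        return req;
    }

    public synchronized int waitany(Request[] reqs) throws InterruptedException {
        int i;
        while ((i = testany(reqs)) < 0) wait(); // releases the lock while waiting
        return i;
    }

    public synchronized int testany(Request[] reqs) {
        for (int i = 0; i < reqs.length; i++) if (reqs[i].done) return i;
        return -1;
    }

    /** Removes and returns the first entry with the given tag and context, or null. */
    private static Entry take(List<Entry> list, int tag, int context) {
        for (int i = 0; i < list.size(); i++) {
            Entry e = list.get(i);
            if (e.tag == tag && e.context == context) return list.remove(i);
        }
        return null;
    }
}
```

A blocking standard-mode receive can then be written as an irecv followed by a waitany on a one-element array, which is the sense in which other communication modes can be layered on this minimal set.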
The next layer is base-level MPJ, which includes point-to-point communications, communicators, groups, datatypes and environmental management. On top of this are higher-level MPJ operations including the collective operations. We anticipate that much of this code can be implemented by fairly direct transcription of the src subdirectories in the MPICH release--the parts of the MPICH implementation above the abstract device level.
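As an example of a collective operation layered on point-to-point communication, the following sketch implements a broadcast over a binomial tree, a scheme commonly used for this operation. Whether an MPJ implementation would adopt this particular algorithm is an assumption. The PointToPoint interface, the Collectives class and the queue-based InMemoryComm are hypothetical stand-ins for the base-level MPJ layer, included only so that the example is self-contained.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical blocking point-to-point layer seen by the collectives. */
interface PointToPoint {
    int rank();
    int size();
    void send(byte[] buf, int dest) throws InterruptedException;
    void recv(byte[] buf, int src) throws InterruptedException;
}

class Collectives {
    /** Binomial-tree broadcast of buf from root to every rank. */
    static void bcast(PointToPoint comm, byte[] buf, int root) throws InterruptedException {
        int rank = comm.rank(), size = comm.size();
        int rel = (rank - root + size) % size; // rank relative to the root

        // Receive phase: a non-root rank receives once, from the rank that
        // differs from it in its lowest set bit.
        int mask = 1;
        while (mask < size) {
            if ((rel & mask) != 0) {
                comm.recv(buf, (rank - mask + size) % size);
                break;
            }
            mask <<= 1;
        }

        // Send phase: forward to the subtrees below this rank, largest first.
        mask >>= 1;
        while (mask > 0) {
            if (rel + mask < size) comm.send(buf, (rank + mask) % size);
            mask >>= 1;
        }
    }
}

/** In-memory stand-in: one queue per (source, destination) pair. */
class InMemoryComm implements PointToPoint {
    private final int rank, size;
    private final ConcurrentMap<Integer, BlockingQueue<byte[]>> channels; // shared by all ranks

    InMemoryComm(int rank, int size, ConcurrentMap<Integer, BlockingQueue<byte[]>> channels) {
        this.rank = rank; this.size = size; this.channels = channels;
    }

    public int rank() { return rank; }
    public int size() { return size; }

    private BlockingQueue<byte[]> channel(int src, int dest) {
        return channels.computeIfAbsent(src * size + dest, k -> new LinkedBlockingQueue<>());
    }

    public void send(byte[] buf, int dest) throws InterruptedException {
        channel(rank, dest).put(buf.clone());
    }

    public void recv(byte[] buf, int src) throws InterruptedException {
        byte[] msg = channel(src, rank).take(); // blocks until a message arrives
        System.arraycopy(msg, 0, buf, 0, Math.min(msg.length, buf.length));
    }
}
```

With this scheme each non-root process receives the data exactly once, and the number of communication rounds grows logarithmically with the number of processes rather than linearly.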