Next: Adding serialization to the Up: Object Serialization for Marshalling Previous: Related work

 

Datatypes in an MPI-like API for Java

The MPI standard is explicitly object-based. The C++ binding specified in the MPI-2 standard collects these objects into suitable class hierarchies and defines most of the library functions as class member functions. The Java API proposed in [4] follows this model, and lifts its class hierarchy directly from the C++ binding of MPI.

In our Java version a class MPJ with only static members acts as a module containing global services, such as initialization of the message-passing layer, and many global constants including a default communicator COMM_WORLD. The communicator class Comm is the single most important class in MPI. All communication functions are members of Comm or its subclasses. Another class that is relevant for the discussion below is the Datatype class. This describes the type of the elements in the message buffers passed to send, receive, and other communication functions. Various basic datatypes are predefined in the package. These mainly correspond to the primitive types of Java, shown in figure 1.

Figure 1: Basic datatypes in proposed Java binding 

The methods corresponding to standard send and receive operations of MPI are members of Comm with interfaces

        void send(Object buf, int offset, int count,
                  Datatype datatype, int dst, int tag)

        Status recv(Object buf, int offset, int count,
                    Datatype datatype, int src, int tag)

In both cases the actual argument corresponding to buf must be a Java array with element type compatible with the datatype argument. If the specified type corresponds to a primitive type, the buffer must be a one-dimensional array. Multidimensional arrays can be communicated directly if an object type is specified, because each constituent array of a Java multidimensional array is itself an object. Communication of object types implies some form of serialization and deserialization. This could be the built-in serialization provided in current Java environments, or (as we discuss at length in section 5) it could be some specialized serialization tuned for message-passing.
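The buffer rule just stated can be illustrated with a small check based on Java reflection. This is only a sketch: `BasicType` and `BufferCheck` are hypothetical names introduced here, covering three illustrative types, and they are not part of the proposed binding.

```java
// Hypothetical stand-in for a few basic datatypes; NOT the binding's Datatype class.
enum BasicType {
    INT(int.class), DOUBLE(double.class), OBJECT(Object.class);

    final Class<?> elementClass;

    BasicType(Class<?> elementClass) {
        this.elementClass = elementClass;
    }
}

class BufferCheck {
    // Returns true if buf could serve as a message buffer for the given type,
    // following the rule described in the text.
    static boolean isCompatible(Object buf, BasicType type) {
        if (buf == null || !buf.getClass().isArray()) {
            return false; // buffers must be Java arrays
        }
        Class<?> component = buf.getClass().getComponentType();
        if (type == BasicType.OBJECT) {
            // Any array of references qualifies, including int[][],
            // whose elements are int[] objects.
            return !component.isPrimitive();
        }
        // Primitive datatype: a one-dimensional array of exactly that type.
        return component == type.elementClass;
    }

    public static void main(String[] args) {
        System.out.println(isCompatible(new int[4], BasicType.INT));       // true
        System.out.println(isCompatible(new int[2][3], BasicType.INT));    // false
        System.out.println(isCompatible(new int[2][3], BasicType.OBJECT)); // true
        System.out.println(isCompatible(new double[4], BasicType.INT));    // false
    }
}
```

The second and third calls show why a two-dimensional array is rejected for a primitive datatype but accepted for an object type: its elements are arrays, and arrays are objects.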

Besides object types the draft Java binding proposal retains a model of MPI derived datatypes. In C or Fortran bindings of MPI, derived datatypes have two roles. One is to allow messages to contain mixed types. The other is to allow noncontiguous data to be transmitted. The first role involves using the MPI_TYPE_STRUCT derived datatype constructor, which allows one to describe the physical layout of, say, a C struct containing mixed types. This will not work in Java, because Java does not expose the low-level layout of its objects. In C or Fortran MPI_TYPE_STRUCT also allows one to incorporate displacements computed as differences between absolute addresses, so that parts of a single message can come from separately declared arrays and other variables. Again there is no very natural way to do this in Java. (But effects similar to these uses of MPI_TYPE_STRUCT can be achieved by using MPJ.OBJECT as the buffer type, and relying on object serialization.)
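To make the parenthetical remark concrete, the following sketch marshals a section of an object buffer whose elements mix field types, using the built-in `java.io` serialization. The class `Particle` and the helpers `marshal` and `unmarshal` are hypothetical illustrations; they only mimic what an implementation might do internally for an object buffer and are not taken from the proposal.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical record mixing field types, playing the role of a C struct.
class Particle implements Serializable {
    private static final long serialVersionUID = 1L;
    final int id;
    final double mass;
    final String label;

    Particle(int id, double mass, String label) {
        this.id = id;
        this.mass = mass;
        this.label = label;
    }
}

class ObjectBufferSketch {
    // Serializes count elements of buf, starting at offset, into a byte array.
    static byte[] marshal(Object[] buf, int offset, int count) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            for (int i = 0; i < count; i++) {
                out.writeObject(buf[offset + i]);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads count objects from data into buf, starting at offset.
    static void unmarshal(byte[] data, Object[] buf, int offset, int count) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            for (int i = 0; i < count; i++) {
                buf[offset + i] = in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object[] sendBuf = { new Particle(7, 1.5, "a"), new Particle(8, 2.5, "b") };
        Object[] recvBuf = new Object[2];
        unmarshal(marshal(sendBuf, 0, 2), recvBuf, 0, 2);
        Particle p = (Particle) recvBuf[1];
        System.out.println(p.id + " " + p.mass + " " + p.label); // 8 2.5 b
    }
}
```

The mixed fields travel together because the serialization stream records each field by name and type, so no knowledge of the physical object layout is needed.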

We conclude that in the Java binding the first role of derived datatypes should probably be abandoned: derived types can only include elements of a single basic type. This leaves description of noncontiguous buffers as the remaining role for derived datatypes. Every derived datatype constructable in the Java binding has a uniquely defined base type. This is one of the 9 basic types enumerated above. A derived datatype is an object that specifies two things: a base type and a sequence of integer displacements. (In contrast to the C and Fortran bindings the displacements can be interpreted in terms of subscripts in the buffer array argument, rather than as byte displacements.)
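The description of a derived datatype as a base type plus a sequence of subscript displacements can be modelled in a few lines. The class `DerivedTypeSketch` below is a hypothetical model, not the Datatype class of the draft binding; it assumes the displacements are added to the offset argument of the communication call, and it uses `java.lang.reflect.Array` so that one class works for any primitive base type.

```java
import java.lang.reflect.Array;
import java.util.Arrays;

// Hypothetical model of a derived datatype: a base type and subscript displacements.
class DerivedTypeSketch {
    private final Class<?> baseType;
    private final int[] displacements;

    DerivedTypeSketch(Class<?> baseType, int... displacements) {
        this.baseType = baseType;
        this.displacements = displacements.clone();
    }

    private void requireBase(Object buf) {
        if (buf.getClass().getComponentType() != baseType) {
            throw new IllegalArgumentException("buffer element type must be " + baseType);
        }
    }

    // Send side: copy the selected elements of buf into a contiguous array.
    Object gather(Object buf, int offset) {
        requireBase(buf);
        Object packed = Array.newInstance(baseType, displacements.length);
        for (int i = 0; i < displacements.length; i++) {
            Array.set(packed, i, Array.get(buf, offset + displacements[i]));
        }
        return packed;
    }

    // Receive side: store contiguous values at the selected positions of buf.
    void scatter(Object packed, Object buf, int offset) {
        requireBase(packed);
        requireBase(buf);
        for (int i = 0; i < displacements.length; i++) {
            Array.set(buf, offset + displacements[i], Array.get(packed, i));
        }
    }

    public static void main(String[] args) {
        DerivedTypeSketch type = new DerivedTypeSketch(int.class, 0, 2, 5);
        int[] buf = {10, 11, 12, 13, 14, 15, 16, 17};
        int[] packed = (int[]) type.gather(buf, 1);
        System.out.println(Arrays.toString(packed)); // [11, 13, 16]
    }
}
```

Because the displacements are subscripts rather than byte counts, the same displacement sequence selects the same logical elements whatever the size of the base type.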

An MPI derived datatype constructor such as MPI_TYPE_INDEXED, which allows an arbitrary indirection array, has a potentially useful role in Java. It allows one to send (or receive) messages containing values scattered irregularly through some one-dimensional array. The draft proposal incorporates versions of this and other type constructors from MPI, including MPI_TYPE_VECTOR for strided sections.
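As an illustration of what these two constructors describe, the sketch below expands vector-style and indexed-style arguments into explicit subscript displacements. The parameter meanings follow the MPI definitions of MPI_TYPE_VECTOR (count, block length, stride) and MPI_TYPE_INDEXED (block lengths, block displacements), with all quantities counted in elements; the class and method names are hypothetical and do not come from the draft proposal.

```java
import java.util.Arrays;

class DisplacementSketch {
    // count blocks of blockLength elements, with block starts stride elements apart.
    static int[] vector(int count, int blockLength, int stride) {
        int[] d = new int[count * blockLength];
        int k = 0;
        for (int b = 0; b < count; b++) {
            for (int j = 0; j < blockLength; j++) {
                d[k++] = b * stride + j;
            }
        }
        return d;
    }

    // Block b holds blockLengths[b] elements starting at subscript blockStarts[b].
    static int[] indexed(int[] blockLengths, int[] blockStarts) {
        if (blockLengths.length != blockStarts.length) {
            throw new IllegalArgumentException("one start is required per block");
        }
        int[] d = new int[Arrays.stream(blockLengths).sum()];
        int k = 0;
        for (int b = 0; b < blockLengths.length; b++) {
            for (int j = 0; j < blockLengths[b]; j++) {
                d[k++] = blockStarts[b] + j;
            }
        }
        return d;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(vector(3, 2, 4)));
        // [0, 1, 4, 5, 8, 9]
        System.out.println(Arrays.toString(indexed(new int[] {2, 1}, new int[] {5, 0})));
        // [5, 6, 0]

        // Column 1 of a 3x4 matrix stored row by row in a one-dimensional array:
        // one element per row, stride 4, starting at offset 1.
        double[] matrix = new double[12];
        for (int i = 0; i < matrix.length; i++) {
            matrix[i] = i;
        }
        int offset = 1;
        for (int d : vector(3, 1, 4)) {
            System.out.print(matrix[offset + d] + " "); // 1.0 5.0 9.0
        }
        System.out.println();
    }
}
```

The column example shows the typical use of a strided section: the data stays in place in the one-dimensional array, and only the displacement sequence says which elements belong to the message.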



Bryan Carpenter
Thu Nov 4 13:48:00 EST 1999