MPI Java Wrapper Implementation

by Yuh-Jye Chang, Bryan Carpenter, Geoffrey Fox

Northeast Parallel Architectures Center,
Syracuse University,
Syracuse, New York

Feb 2, 1996


  • 1. MPI Java wrapper introduction
  • 2. MPI Java wrapper design
  • 3. Java classes for MPI
  • 4. Class methods for Java MPI
  • 5. Java native method
  • 6. Java datatypes
  • 7. Problems due to strong typing and no pointer
  • 8. The polymorphism of Java Datatype class
  • 9. Other problems
  • 10. Conclusion
  • 11. Test example
  • 12. Execution result
  • 13. List of detailed Java wrapper for MPI

    1. MPI Java wrapper introduction

    This draft presents a Java language interface for MPI. There are some issues specific to Java that must be considered in the design of this interface and that go beyond the simple description of language bindings. In particular, in Java we must be concerned with the design of objects, their methods, and the features of Java native methods, rather than just the design of a language-specific functional interface to MPI. Fortunately, the original design of MPI was based around the notion of objects, so a natural set of classes is already part of MPI.

    2. MPI Java wrapper design

    The Java wrapper for MPI is designed according to the following criteria:

    • The Java wrapper for MPI consists of a small set of classes with a lightweight functional interface to MPI. The classes are based upon the fundamental MPI object types (e.g. communicator, group, etc.).
    • The Java wrapper language bindings provide a semantically correct interface to MPI.
    • There is a one-to-one mapping between MPI functions and their Java wrapper bindings.
    • To the greatest extent possible, the Java wrappers for MPI functions are methods of the MPI classes.

    3. Java classes for MPI

    All MPI classes, constants, and methods are declared within the scope of an MPI package. Thus, by importing the MPI package or by using the MPI prefix, we can reference the MPI Java wrapper. The classes of the MPI package correspond to the objects implicitly used by MPI. An abbreviated definition of the MPI package and its member classes is as follows:

    package MPI;
    public class MPI;
    public class Comm;
    public class Group;
    public class Datatype;
    public class Op;
    public class Status;
    public class Request;
    public class Errhandler;
    ...

    4. Class methods for Java MPI

    All methods of the MPI classes (except for constructors and destructors) are declared public native. This means that the Java program defines only the identifier and arguments of each method; the implementation is supplied by native code.

    Example 1. A simple use of the Java MPI wrapper:
    import MPI.*;
    public class Example1 {
      static public void main(String[] args) {
        MPI JMPI = new MPI(args);
        int myid = JMPI.COMM_WORLD.Rank();
        int numprocs = JMPI.COMM_WORLD.Size();
        System.out.println("Process "+myid+"/"+numprocs+
          " on "+JMPI.Get_processor_name());
        JMPI.finalize();  // conclude the use of MPI
      }
    }
    MPI JMPI = new MPI(args);
    This statement creates an instance of the MPI class called JMPI. The MPI class constructor transforms the String[] arguments into a C-style string array, calls MPI_Init, creates the communicator COMM_WORLD, and creates the default MPI reduction operators MIN, MAX, SUM, etc.

    int myid = JMPI.COMM_WORLD.Rank();
    int numprocs = JMPI.COMM_WORLD.Size();
    Here two native methods belonging to COMM_WORLD are called. COMM_WORLD is an instance of the Comm class that is created in the MPI() constructor. It provides all the MPI communication bindings that use the MPI_COMM_WORLD communicator. The Rank() and Size() methods each return an integer value.

    System.out.println("Process "+myid+"/"+numprocs+
      " on "+JMPI.Get_processor_name());
    These lines write myid, numprocs, and the processor name to standard output.

    The last step, which concludes the use of MPI, is to call the finalize() method.

    5. Java native method

    Java native methods are a convenient way to bring the power of C or C++ code into Java. While efficient native Java compilers are not yet fully implemented, native methods allow performance-critical operations to run at the speed of compiled C code, which makes Java more practical as a scientific and high performance language.

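    Example 2. How a Java native method works.
    • Java source for class JMPI :
      public class JMPI {
        public native int Init(String[] args);
        public native int Finalize();
        static {
          // Library name assumed from the JMPI.dll named in the text.
          System.loadLibrary("JMPI");
        }
      }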
    • JMPI.h : (created by javah from JMPI.class)
      /* DO NOT EDIT THIS FILE - it is machine generated */
      #include <native.h>
      /* Header for class JMPI */
      #ifndef _Included_JMPI
      #define _Included_JMPI
      typedef struct ClassJMPI {
          char PAD;	/* ANSI C requires structures to have a least one member */
      } ClassJMPI;
      #ifdef __cplusplus
      extern "C" {
      #endif
      struct Hjava_lang_String;
      extern long JMPI_Init(struct HJMPI *,HArrayOfString *);
      extern long JMPI_Finalize(struct HJMPI *);
      #ifdef __cplusplus
      }
      #endif
      #endif
    • JMPI.c : (created by javah -stub from JMPI.class)
      /* DO NOT EDIT THIS FILE - it is machine generated */
      #include <StubPreamble.h>
      /* Stubs for class JMPI */
      /* SYMBOL: "JMPI/Init([Ljava/lang/String;)I", Java_JMPI_Init_stub */
      stack_item *Java_JMPI_Init_stub(stack_item *_P_,struct execenv *_EE_) {
      	extern long JMPI_Init(void *,void *);
      	_P_[0].i = JMPI_Init(_P_[0].p,((_P_[1].p)));
      	return _P_ + 1;
      }
      /* SYMBOL: "JMPI/Finalize()I", Java_JMPI_Finalize_stub */
      stack_item *Java_JMPI_Finalize_stub(stack_item *_P_,struct execenv *_EE_) {
      	extern long JMPI_Finalize(void *);
      	_P_[0].i = JMPI_Finalize(_P_[0].p);
      	return _P_ + 1;
      }
    • JMPINative.c :
      #include "mpi.h"
      #include "JMPI.h"
      #include <stdlib.h>
      long JMPI_Init(struct HJMPI *this, HArrayOfString *args) {
        int i, result, len;
        char** sargs;
        HString **data = unhand(args)->body;
        len = obj_length(args);
        /* Convert the Java String[] into a C-style array of C strings. */
        sargs = (char**)calloc(len, sizeof(char*));
        for (i=0; i<len; i++) {
          sargs[i] = allocCString(data[i]);
        }
        result = MPI_Init(&len, &sargs);
        /* Release the C copies of the argument strings. */
        for (i=0; i<len; i++) free(sargs[i]);
        return result;
      }
      long JMPI_Finalize(struct HJMPI *this) {
        return MPI_Finalize();
      }
    The only files the user writes are the Java source for class JMPI and JMPINative.c. JMPI.h and JMPI.c are generated by javah from the compiled JMPI.class file. Compile JMPI.c and JMPINative.c into a shared library (on UNIX) or JMPI.dll (on Microsoft Windows) and you are done.
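
    Because the native library is loaded in a static initializer, a missing or misplaced library only shows up when the class is first used. The following is a minimal sketch of a guarded load, assuming a hypothetical helper class NativeLoader that is not part of the wrapper; it relies only on the standard System.loadLibrary call and the UnsatisfiedLinkError it throws when the library cannot be found.

```java
// Hypothetical helper, not part of the wrapper described in the text.
final class NativeLoader {
    private NativeLoader() {}

    /** Tries to load a native library; reports failure instead of throwing. */
    static boolean tryLoad(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            // The shared library (or DLL) is not on the library search path.
            System.err.println("Cannot load native library '" + libraryName
                    + "': " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // "JMPI" is the library name that matches the JMPI.dll named in the text.
        if (!tryLoad("JMPI")) {
            System.err.println("The native methods of JMPI are unavailable.");
        }
    }
}
```

    A check of this kind gives a readable diagnostic when the library path is wrong, rather than a failure at the first native call.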

    6. Java datatypes

    The following table lists the Java simple types and their corresponding C/C++ and MPI datatypes.

    Java datatype   C/C++ datatype          MPI datatype
    byte            signed char             MPI_CHAR
    char            signed short int        MPI_SHORT
    short           signed short int        MPI_SHORT
    boolean         signed long int         MPI_LONG
    int             signed long int         MPI_LONG
    long            signed long long int    MPI_LONG_LONG_INT
    float           float                   MPI_FLOAT
    double          double                  MPI_DOUBLE

    Because Java is platform independent, the size of each simple type is the same on all platforms. To accommodate systems that have 64-bit pointers, we use the Java long to store each MPI object handle or pointer reference.
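
    The fixed sizes referred to above can be checked from Java itself. The sketch below uses the SIZE constants of the standard wrapper classes; the class PrimitiveSizes and its bits method are hypothetical names introduced only for this illustration. The boolean type is left out because the Java library defines no SIZE constant for it.

```java
// Illustrative only: PrimitiveSizes is a hypothetical name.
final class PrimitiveSizes {
    private PrimitiveSizes() {}

    /** Returns the width in bits of a Java numeric simple type. */
    static int bits(String javaType) {
        switch (javaType) {
            case "byte":   return Byte.SIZE;
            case "char":   return Character.SIZE;
            case "short":  return Short.SIZE;
            case "int":    return Integer.SIZE;
            case "long":   return Long.SIZE;
            case "float":  return Float.SIZE;
            case "double": return Double.SIZE;
            default:
                throw new IllegalArgumentException("no fixed numeric size for: " + javaType);
        }
    }

    public static void main(String[] args) {
        String[] types = {"byte", "char", "short", "int", "long", "float", "double"};
        for (String t : types) {
            System.out.println(t + ": " + bits(t) + " bits");
        }
        // char is an unsigned 16-bit type, unlike the signed short it is mapped to.
        System.out.println("largest char value: " + (int) Character.MAX_VALUE);
    }
}
```

    A 64-bit long is therefore wide enough to hold a native pointer on both 32-bit and 64-bit systems. Note also that the Java char is unsigned, so mapping it to a signed 16-bit C type preserves the bits but not the numeric interpretation of values above 32767.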

    7. Problems due to strong typing and no pointer

    All MPI functions with choice arguments associate actual arguments of different datatypes with the same dummy argument. This is not allowed by Java. In C, the void* formal arguments avoid these problems.

    The following code fragment is therefore illegal when Send is declared with a single array type for its buffer argument; at least one of the two calls generates a compile-time error.

      float[] f = new float[10];
      double[] r = new double[10];
      MPI.COMM_WORLD.Send(f, ...);
      MPI.COMM_WORLD.Send(r, ...);
    Strictly, we would have to use either overloaded methods with different argument datatypes or methods with different identifiers.

    Overloading native methods causes a problem, because methods that share a name also share the name of the stub function generated by javah.
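
    The name clash can be illustrated with a small sketch. It assumes the naming scheme visible in the generated JMPI.c of Example 2, where the method signature appears only in the SYMBOL comment and the C function is named Java_<class>_<method>_stub. The treatment of package names (dots replaced by underscores) is our assumption, and the class StubNames is hypothetical.

```java
// Hypothetical illustration of the assumed stub naming scheme.
final class StubNames {
    private StubNames() {}

    /**
     * Builds the stub function name for a native method.
     * The signature parameter is deliberately ignored: under the assumed
     * scheme the argument types do not take part in the C function name.
     */
    static String stubName(String qualifiedClass, String method, String signature) {
        return "Java_" + qualifiedClass.replace('.', '_') + "_" + method + "_stub";
    }

    public static void main(String[] args) {
        // Two overloads of Send that differ only in the buffer type.
        System.out.println(stubName("MPI.Comm", "Send", "(float[], int, int)"));
        System.out.println(stubName("MPI.Comm", "Send", "(double[], int, int)"));
    }
}
```

    Both calls in main yield the same name, so the two overloads would map to a single C stub function.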

    Methods with different identifiers would be used as follows.

      MPI.COMM_WORLD.SendFloat(f, ...);
      MPI.COMM_WORLD.SendDouble(r, ...);
    However, there are many MPI communication functions, e.g. MPI_Send, MPI_Bsend, MPI_Ssend, MPI_Rsend, etc. With this approach we would need a separate native method for every combination of function and datatype, which we believe is wasteful. We therefore introduce a Java Datatype class that provides the polymorphism between the different Java datatypes.

    8. The polymorphism of Java Datatype class

    The Datatype class is listed in part below.
    package MPI ;
    public class Datatype {
      public Datatype() { handle = type = 0;}
      public Datatype(byte[] data) { SetupByte(data);}
      public Datatype(char[] data) { SetupChar(data);}
      // ... constructors for the other array types
      private native void SetupByte(byte[] data);
      private native void SetupChar(char[] data);
      // ... Setup methods for the other array types
      private long handle, type;
      private int size;
    }
    There is a constructor for each Java array datatype. Each constructor invokes a native method with a distinct identifier, which stores the memory address of the array in the 64-bit *handle* variable, the corresponding MPI_Datatype object (e.g. MPI_CHAR, MPI_SHORT) in the 64-bit *type* variable, and the buffer size in the *size* variable.

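    The original MPI call in C/C++ :

    MPI_Send(void* buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm);
    becomes much simpler in Java, because the buffer address, the element count, and the MPI datatype are all carried by the Datatype object, and the communicator is the object on which the method is called :

    MPI.Comm.Send(MPI.Datatype buf, int dest, int tag);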
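
    The idea can be sketched in pure Java, without native code. In the sketch below the overloaded constructors record what the native Setup methods would record (here the name of the MPI datatype from the table in section 6 and the element count), so that a single send method serves every array type. DatatypeSketch, Buffer, and send are hypothetical names; the real wrapper stores a native address and an MPI_Datatype handle instead of a name.

```java
// Pure-Java illustration; it performs no communication.
final class DatatypeSketch {
    private DatatypeSketch() {}

    /** Stands in for the Datatype class: one constructor per array type. */
    static final class Buffer {
        final Object data;     // the wrapped Java array
        final String mpiType;  // name of the matching MPI datatype
        final int count;       // number of elements

        Buffer(byte[] data)   { this(data, "MPI_CHAR", data.length); }
        Buffer(short[] data)  { this(data, "MPI_SHORT", data.length); }
        Buffer(int[] data)    { this(data, "MPI_LONG", data.length); }
        Buffer(float[] data)  { this(data, "MPI_FLOAT", data.length); }
        Buffer(double[] data) { this(data, "MPI_DOUBLE", data.length); }

        private Buffer(Object data, String mpiType, int count) {
            this.data = data;
            this.mpiType = mpiType;
            this.count = count;
        }
    }

    /** A single entry point for all datatypes; it only describes the call. */
    static String send(Buffer buf, int dest, int tag) {
        return "Send " + buf.count + " x " + buf.mpiType
                + " to rank " + dest + " (tag " + tag + ")";
    }

    public static void main(String[] args) {
        System.out.println(send(new Buffer(new float[10]), 1, 0));
        System.out.println(send(new Buffer(new double[4]), 2, 7));
    }
}
```

    The overloading is thus confined to ordinary Java constructors, and only one native method per communication function is needed.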

    9. Other problems

    • Creating MPI_Op and MPI_Request objects involves pointers to functions, which Java also does not allow. We think this problem could be resolved by passing a Java object to the wrapper and having the wrapper invoke its method automatically.
    • The current MPI implementation conflicts seriously with Java. When the number of processors is greater than 1, using mpirun to invoke the Java interpreter causes a core dump or even hangs the processes. We have already reported this problem to the authors of the MPI implementation and hope that they will solve it soon. In the meantime, Bryan Carpenter has modified part of the MPI source code, and the result works reasonably well (although it still core dumps when np > 3). Thanks to his patch, we can make further progress with this implementation.
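
    As an illustration of the first point, a function pointer can be replaced by an object that implements an interface. The sketch below is an assumption about how such a callback might look, not the wrapper's actual design: ReduceFunction plays the role of an MPI user function, which in C receives an input vector and an in/out vector, and reduceLocally merely folds the contributions in one process to show the call pattern.

```java
// Hypothetical callback design; no MPI calls are made.
final class ReduceCallbackSketch {
    private ReduceCallbackSketch() {}

    /** Java counterpart of a user-defined reduction function. */
    interface ReduceFunction {
        void apply(int[] in, int[] inout);
    }

    /** Example operator: element-wise maximum. */
    static final ReduceFunction ELEMENTWISE_MAX = (in, inout) -> {
        for (int i = 0; i < inout.length; i++) {
            inout[i] = Math.max(in[i], inout[i]);
        }
    };

    /** Combines one contribution per "process" by invoking the callback. */
    static int[] reduceLocally(int[][] contributions, ReduceFunction f) {
        int[] result = contributions[0].clone();
        for (int p = 1; p < contributions.length; p++) {
            f.apply(contributions[p], result);
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] perProcess = {{1, 5}, {4, 2}, {3, 3}};
        int[] max = reduceLocally(perProcess, ELEMENTWISE_MAX);
        System.out.println(max[0] + " " + max[1]); // prints "4 5"
    }
}
```

    In a real wrapper the native code would have to call back into the Java object from the C reduction routine, which is the part that remains to be designed.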

    10. Conclusion

    We believe that the Java MPI wrapper is a useful and important step towards establishing Java as a language for scientific and high performance computing. It demonstrates the versatility and flexibility of Java. We also believe that Java will play a very important role in the scientific and high performance computing world.

    11. Test example: life_mpi :
    import MPI.*;
    class life_mpi {
      static public void main(String[] args) {
        int id, np;
        Status stat;
        String host;
        MPI JMPI = new MPI(args);
        host = JMPI.Get_processor_name();
        np = JMPI.COMM_WORLD.Size();
        id = JMPI.COMM_WORLD.Rank();
        System.out.println("Process "+id+"/"+np+" on "+host);
        int i, j, y, base;
        int N = 11;
        int NITER = 3;
        // Define block
        int[] bsize = new int[1];
        if (N < np) np = N;
        int blockSize = N/np;
        int blockBase = blockSize*id+Math.min(id, N%np);
        if (id < N%np) blockSize++;
        // `block' has `blockSize + 2' columns.  This allows for ghost cells.
        byte block[][] = new byte[blockSize+2][N];
        byte board[][] = (id == 0) ? new byte[N][N] : null;
        byte buffer[]  = (id == 0) ? new byte[N] : null;
        // Initial pattern: a cross through the centre of the board.
        for(i = 0 ; i < blockSize ; i++) {
          int ib = i+1;
          for(y = 0 ; y < N ; y++) {
            int x = blockBase + i ;
            if(x == N / 2 || y == N / 2)
              block [ib] [y] = 1 ;
            else
              block [ib] [y] = 0 ;
          }
        }
        // Main update loop.
        int next = (id+1)%np;
        int prev = (id+np-1)%np;
        int neighbours [][] = new int [blockSize] [N] ;
        for(int iter = 0 ; iter < NITER ; iter++) {
          // Dump state of board to host
          if (id == 0) { // the host
            for (i=0; i < blockSize; i++)
              System.arraycopy(block[i+1], 0, board[i], 0, N);
            base = blockSize;
            for (j=1; j < np; j++) {
              JMPI.COMM_WORLD.Recv(new Datatype(bsize), j, 0);
              for (i=0; i < bsize[0]; i++) {
                JMPI.COMM_WORLD.Recv(new Datatype(buffer), j, 0);
                System.arraycopy(buffer, 0, board[base+i], 0, N);
              }
              // Advance past the rows just received from process j.
              base += bsize[0];
            }
            System.out.println("Dump Board Status:");
            for (i=0; i < N; i++) {
              for (y=0; y < N; y++)
                System.out.print(board[i][y]+" ");
              System.out.println();
            }
          } else { // the slave
            bsize[0] = blockSize;
            JMPI.COMM_WORLD.Send(new Datatype(bsize), 0, 0);
            for(i=0 ; i < blockSize ; i++)
              JMPI.COMM_WORLD.Send(new Datatype(block[i+1]), 0, 0);
          }
          // Shift this block's upper edge into next neighbour's lower ghost edge
          JMPI.COMM_WORLD.Send(new Datatype(block[blockSize]), next, 0);
          JMPI.COMM_WORLD.Recv(new Datatype(block[0]), prev, 0);
          // Shift this block's lower edge into prev neighbour's upper ghost edge
          JMPI.COMM_WORLD.Send(new Datatype(block[1]), prev, 0);
          JMPI.COMM_WORLD.Recv(new Datatype(block[blockSize+1]), next, 0);
          //  Calculate a block of neighbour sums.
          for(i = 0 ; i < blockSize ; i++) {
            int ib = i + 1 ;
            for(y = 0 ; y < N ; y++) {
              int y_n = (y+N-1) % N ;
              int y_p = (y+1) % N ;
              neighbours [i] [y] =
                block [ib - 1] [y_n] + block [ib - 1] [y] + block [ib - 1] [y_p] +
                block [ib]     [y_n] +                      block [ib]     [y_p] +
                block [ib + 1] [y_n] + block [ib + 1] [y] + block [ib + 1] [y_p] ;
            }
          }
          // Update block of board values.
          for(i = 0 ; i < blockSize ; i++) {
            int ib = i + 1 ;
            for(y = 0 ; y < N ; y++) {
              int neighbour = neighbours [i] [y] ;
              if(neighbour < 2 || neighbour > 3)
                block [ib] [y] = 0 ;
              if(neighbour == 3)
                block [ib] [y] = 1 ;
            }
          }
        }
        JMPI.finalize();  // conclude the use of MPI, as in Example 1
      }
    }
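
    The block decomposition at the start of life_mpi can be examined on its own. The sketch below repeats the same arithmetic for blockSize and blockBase in a hypothetical class BlockDecomposition, so that the distribution of rows can be checked without running MPI.

```java
// Same arithmetic as in life_mpi; the class name is hypothetical.
final class BlockDecomposition {
    private BlockDecomposition() {}

    /** Number of rows owned by process id when n rows are split over np processes. */
    static int blockSize(int n, int np, int id) {
        int size = n / np;
        if (id < n % np) {
            size++;  // the first n % np processes take one extra row
        }
        return size;
    }

    /** Index of the first row owned by process id. */
    static int blockBase(int n, int np, int id) {
        return (n / np) * id + Math.min(id, n % np);
    }

    public static void main(String[] args) {
        int n = 11, np = 3;
        for (int id = 0; id < np; id++) {
            System.out.println("process " + id + ": base " + blockBase(n, np, id)
                    + ", size " + blockSize(n, np, id));
        }
    }
}
```

    For N = 11 and np = 3 the processes own 4, 4, and 3 rows starting at rows 0, 4, and 8, so the blocks cover the board without gaps or overlap.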

    12. Execution result

    Dump Board Status:
    0 0 0 0 0 1 0 0 0 0 0
    0 0 0 0 0 1 0 0 0 0 0
    0 0 0 0 0 1 0 0 0 0 0
    0 0 0 0 0 1 0 0 0 0 0
    0 0 0 0 0 1 0 0 0 0 0
    1 1 1 1 1 1 1 1 1 1 1
    0 0 0 0 0 1 0 0 0 0 0
    0 0 0 0 0 1 0 0 0 0 0
    0 0 0 0 0 1 0 0 0 0 0
    0 0 0 0 0 1 0 0 0 0 0
    0 0 0 0 0 1 0 0 0 0 0
    Dump Board Status:
    0 0 0 0 1 1 1 0 0 0 0
    0 0 0 0 1 1 1 0 0 0 0
    0 0 0 0 1 1 1 0 0 0 0
    0 0 0 0 1 1 1 0 0 0 0
    1 1 1 1 0 0 0 1 1 1 1
    1 1 1 1 0 0 0 1 1 1 1
    1 1 1 1 0 0 0 1 1 1 1
    0 0 0 0 1 1 1 0 0 0 0
    0 0 0 0 1 1 1 0 0 0 0
    0 0 0 0 1 1 1 0 0 0 0
    0 0 0 0 1 1 1 0 0 0 0
    Dump Board Status:
    0 0 0 1 0 0 0 1 0 0 0
    0 0 0 1 0 0 0 1 0 0 0
    0 0 0 1 0 0 0 1 0 0 0
    1 1 1 0 0 0 0 0 1 1 1
    0 0 0 0 0 1 0 0 0 0 0
    0 0 0 0 1 0 1 0 0 0 0
    0 0 0 0 0 1 0 0 0 0 0
    1 1 1 0 0 0 0 0 1 1 1
    0 0 0 1 0 0 0 1 0 0 0
    0 0 0 1 0 0 0 1 0 0 0
    0 0 0 1 0 0 0 1 0 0 0
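
    The dumps above can be cross-checked against a serial version of the same update rule. The following sketch is our own reference implementation, not part of the wrapper: it assumes a board that wraps around in both directions, which is what the ghost rows and the modulo arithmetic in life_mpi provide, and the names SerialLife, initialBoard, step, and row are hypothetical.

```java
// Serial reference sketch; no MPI is involved.
final class SerialLife {
    private SerialLife() {}

    /** Same initial pattern as life_mpi: a cross through the board centre. */
    static byte[][] initialBoard(int n) {
        byte[][] board = new byte[n][n];
        for (int x = 0; x < n; x++) {
            for (int y = 0; y < n; y++) {
                if (x == n / 2 || y == n / 2) {
                    board[x][y] = 1;
                }
            }
        }
        return board;
    }

    /** Computes one generation on a board that wraps around at the edges. */
    static byte[][] step(byte[][] board) {
        int n = board.length;
        byte[][] next = new byte[n][n];
        for (int x = 0; x < n; x++) {
            for (int y = 0; y < n; y++) {
                int sum = 0;
                for (int dx = -1; dx <= 1; dx++) {
                    for (int dy = -1; dy <= 1; dy++) {
                        if (dx != 0 || dy != 0) {
                            sum += board[(x + dx + n) % n][(y + dy + n) % n];
                        }
                    }
                }
                if (sum == 3) {
                    next[x][y] = 1;            // birth or survival
                } else if (sum == 2) {
                    next[x][y] = board[x][y];  // unchanged
                }                              // otherwise the cell is 0
            }
        }
        return next;
    }

    /** Formats one row in the style of the board dump. */
    static String row(byte[][] board, int x) {
        StringBuilder sb = new StringBuilder();
        for (int y = 0; y < board[x].length; y++) {
            if (y > 0) sb.append(' ');
            sb.append(board[x][y]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[][] board = initialBoard(11);
        for (int iter = 0; iter < 3; iter++) {
            System.out.println("Dump Board Status:");
            for (int x = 0; x < board.length; x++) {
                System.out.println(row(board, x));
            }
            board = step(board);
        }
    }
}
```

    For example, after one step the first row is 0 0 0 0 1 1 1 0 0 0 0, which agrees with the first row of the second dump above.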

    13. List of detailed Java wrapper for MPI

    • Class MPI :
      package MPI ;
      public class MPI {
        private int MAX_PROCESSOR_NAME;
        public Comm COMM_WORLD;
        public Op MAX, MIN, SUM, PROD, LAND, BAND, LOR, BOR, LXOR, BXOR, MINLOC, MAXLOC;
        public MPI(String[] args);
        public void finalize() { Finalize();}
        private native void Init(String[] args);
        private native void Finalize();
        public native double Wtime();
        public native double Wtick();
        private native int Get_processor_name(byte[] buf);
        public String Get_processor_name();
        public native void Buffer_attach(byte[] buf);
        public native void Buffer_detach(byte[] buf);
        static { System.loadLibrary("MPI"); }
      }
    • Class Comm :
      package MPI ;
      public class Comm {
        public final static int NULL = 0;
        public final static int SELF = 1;
        public final static int WORLD = 2;
        public Comm() { handle = 0;}
        public Comm(int Type) { Setup(Type);}
        private native void Setup(int Type);
        public native void Barrier();
        public native int Size();
        public native int Rank();
        public native void Send(Datatype buf, int dest, int tag);
        public native void Bsend(Datatype buf, int dest, int tag);
        public native void Ssend(Datatype buf, int dest, int tag);
        public native void Rsend(Datatype buf, int dest, int tag);
        private native Status Recv(Datatype buf, int source, int tag, Status stat);
        public Status Recv(Datatype buf, int source, int tag) { return Recv(buf, source, tag, new Status());}
        public native void Bcast(Datatype buf, int root);
        public native void Gather(Datatype sbuf, Datatype rbuf, int root);
        public native void Scatter(Datatype sbuf, Datatype rbuf, int root);
        public native void Allgather(Datatype sbuf, Datatype rbuf);
        public native void Alltoall(Datatype sbuf, Datatype rbuf);
        public native void Reduce(Datatype sbuf, Datatype rbuf, Op op, int root);
        private long handle ;
      }
    • Class Datatype :
      package MPI ;
      public class Datatype {
        public Datatype() { handle = type = 0;}
        public Datatype(byte[] data) { SetupByte(data);}
        public Datatype(char[] data) { SetupChar(data);}
        public Datatype(short[] data) { SetupShort(data);}
        public Datatype(boolean[] data) { SetupBoolean(data);}
        public Datatype(int[] data) { SetupInt(data);}
        public Datatype(long[] data) { SetupLong(data);}
        public Datatype(float[] data) { SetupFloat(data);}
        public Datatype(double[] data) { SetupDouble(data);}
        public Datatype(String data) { SetupString(data);}
        private native void SetupByte(byte[] data);
        private native void SetupChar(char[] data);
        private native void SetupShort(short[] data);
        private native void SetupBoolean(boolean[] data);
        private native void SetupInt(int[] data);
        private native void SetupLong(long[] data);
        private native void SetupFloat(float[] data);
        private native void SetupDouble(double[] data);
        private native void SetupString(String data);
        private long handle, type;
        private int size;
      }
    • Class Op :
      package MPI ;
      public class Op {
        public final static int NULL = 0;
        public final static int MAX = 1;
        public final static int MIN = 2;
        public final static int SUM = 3;
        public final static int PROD = 4;
        public final static int LAND = 5;
        public final static int BAND = 6;
        public final static int LOR = 7;
        public final static int BOR = 8;
        public final static int LXOR = 9;
        public final static int BXOR = 10;
        public final static int MINLOC = 11;
        public final static int MAXLOC = 12;
        public Op() { handle = 0;}
        public Op(int Type) { Setup(Type);}
        private native void Setup(int Type);
        private long handle ;
      }
    • Class Status :
      package MPI ;
      public class Status {
        public int count;
        public int source;
        public int tag;
        public int error;
      }
    • Class Request :
      package MPI ;
      public class Request {
        private long handle;
      }