next up previous
Next: Summary Up: Programming examples in HPJava Previous: Choleski decomposition

Fox's algorithm for matrix multiplication

Recall that if A and B are square matrix of order n, then C=AB is also a square matrix of order n, and tex2html_wrap_inline743 is obtained by taking the dot product of the ith row of A and jth column of B. That is,

displaymath744

Fox's algorithm [3] for multiplication organize A, B and C into submatrix on a P by P process array. It take P steps, in each step, broadcast corresponding submatrix of A on each row of the processes, do local computation and then shift array B for the next step computation. Write this algorithm in HPJava, we still use Adlib.remap to broadcast submatrix, matmul is a subroutine for local matrix multiplication. Adlib.shift is used to shift array B, and Adlib.copy copies data back after shift, it can also be implemented as nested over and for loops.

  Group p=new Procs2(P,P);

  Range x=p.dim(0);
  Range y=p.dim(1);

  on(p) {
    float [[#,#, , ]] a = new float [[x,y,B,B]]; 
    float [[#,#, , ]] b = new float [[x,y,B,B]]; 
    //input a, b here;

    float [[#,#, , ]] c = new float [[x,y,B,B]]; 
    float [[#,#, , ]] temp = new float [[x,y,B,B]]; 
                  
    for (int k=0; k<p; k++) {
      over(Location i=x|:) {
        float [[,]] sub = new float [[B,B]];    

        //Broadcast submatrix in 'a' ...
        Adlib.remap(sub, a[[i, (i+k)%P, z, z]]);

        over(Location j=y|:) { 
          //Local matrix multiplication ...
          matmul(c[[i, j, z, z]], sub, b[[i, j, z, z]]);
        }
      }
          
      //Cyclic shift 'b' in 'y' dimension ...
      Adlib.shift(tmp, b, 1, 0, 0); // dst, src, shift, dim, mode;
      Adlib.copy(b, tmp);
    }
  }



Guansong Zhang
Mon Dec 15 14:11:59 EST 1997