
Structured Communication

All the examples discussed below have the following mapping directives.


         CHPF$ PROCESSORS(P,Q)
         CHPF$ DISTRIBUTE TEMPL(BLOCK,BLOCK)
         CHPF$ ALIGN A(I,J) WITH TEMPL(I,J)  
         CHPF$ ALIGN B(I,J) WITH TEMPL(I,J)
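To make the index arithmetic behind these directives concrete, the following is a minimal Python sketch of a one-dimensional BLOCK distribution. It is an illustration only, not the run-time library's code: the function names echo the primitives used in the examples below, but the explicit size arguments (n, nprocs), the 0-based processor numbering, and the ceiling-division block size are assumptions made for this sketch.

```python
def block_size(n, nprocs):
    """Elements per processor under BLOCK distribution (ceiling division)."""
    return -(-n // nprocs)

def global_to_proc(g, n, nprocs):
    """0-based owner of the 1-based global index g (assumed numbering)."""
    return (g - 1) // block_size(n, nprocs)

def global_to_local(g, n, nprocs):
    """1-based local index of the 1-based global index g on its owner."""
    return (g - 1) % block_size(n, nprocs) + 1

def set_bound(p, lo, hi, n, nprocs):
    """Local (lb, ub) on processor p for the global loop range lo:hi, stride 1.

    Hypothetical stand-in for set_BOUND; returns (1, 0) for an empty range.
    """
    b = block_size(n, nprocs)
    first, last = p * b + 1, min((p + 1) * b, n)   # global indices owned by p
    g_lo, g_hi = max(lo, first), min(hi, last)     # intersect with loop range
    if g_lo > g_hi:
        return 1, 0
    return g_lo - first + 1, g_hi - first + 1

if __name__ == "__main__":
    # 16 columns over 4 processors: column 3 lives on processor 0, column 8 on 1.
    print(global_to_proc(3, 16, 4), global_to_proc(8, 16, 4))
    print(global_to_local(8, 16, 4))
    print(set_bound(2, 1, 10, 16, 4))
```

With a two-dimensional (BLOCK,BLOCK) template, the same arithmetic is applied independently to each dimension of the P x Q grid.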

Example 1 (transfer) Consider the statement:


         FORALL(I=1:N) A(I,8)=B(I,3)

The first subscript of B is marked as no_communication because A and B are aligned in the first dimension and have identical indices. The second dimension is marked as transfer.


   1.  call set_BOUND(lb,ub,st,1,N,1) 
   2.  call set_DAD(B_DAD,.....)      
   3.  call transfer(B, B_DAD, TMP, src=global_to_proc(3),
       &             dest=global_to_proc(8))
   4.  DO I=lb,ub,st
   5.     A(I,global_to_local(8)) = TMP(I)
   6.  END DO

In the code above, the set_BOUND primitive (line 1) computes the local loop bounds for the computation, based on the iteration distribution (Section ). In line 2, the set_DAD primitive fills the Distributed Array Descriptor (DAD) associated with array B so that it can be passed to the transfer communication primitive at run time. The DAD carries enough information for the communication primitives to derive everything they need, including local bounds, distributions, and global shape. Note that transfer performs one-to-one send-receive communication on the logical grid. In this example, the column of grid processors that owns B(I,3) sends to the column of grid processors that owns A(I,8), as shown in Figure (a).
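The transfer pattern can be simulated in a few lines of Python. This is a sketch under stated assumptions, not the compiler's generated code: the sizes (N = M = 8 on a 2 x 4 grid, so blocks divide evenly), the 0-based processor coordinates, and the helper names owner, local, scatter, and gather are all hypothetical. Message passing is modeled as a plain copy between per-processor local blocks.

```python
N, M, P, Q = 8, 8, 2, 4          # assumed sizes; blocks divide evenly
RB, CB = N // P, M // Q          # rows and columns per processor

def owner(g, blk):
    """0-based owner of the 1-based global index g."""
    return (g - 1) // blk

def local(g, blk):
    """0-based local offset of the 1-based global index g."""
    return (g - 1) % blk

def scatter(G):
    """Split a global N x M array into per-processor local blocks."""
    return {(p, q): [row[q*CB:(q+1)*CB] for row in G[p*RB:(p+1)*RB]]
            for p in range(P) for q in range(Q)}

def gather(L):
    """Reassemble the global array from the local blocks."""
    return [sum((L[(p, q)][i] for q in range(Q)), [])
            for p in range(P) for i in range(RB)]

B = [[100*i + j for j in range(1, M+1)] for i in range(1, N+1)]
A_loc, B_loc = scatter([[0]*M for _ in range(N)]), scatter(B)

src_q, dst_q = owner(3, CB), owner(8, CB)   # grid columns owning B(:,3), A(:,8)
messages = 0
for p in range(P):
    # "send": the owner of B(:,3) in grid row p extracts its local column
    tmp = [row[local(3, CB)] for row in B_loc[(p, src_q)]]
    messages += 1
    # "receive": the owner of A(:,8) in the same grid row stores TMP
    for i in range(RB):
        A_loc[(p, dst_q)][i][local(8, CB)] = tmp[i]

A = gather(A_loc)
```

Exactly one message is exchanged per grid row, and only the processors in the two grid columns involved take part.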

Example 2 (multicast) Consider the statement:


         FORALL(I=1:N,J=1:M) A(I,J)=B(I,3)

The second subscript of B is marked as multicast and the first as no_communication.


   1.    call set_BOUND(lb,ub,st,1,N,1) 
   2.    call set_BOUND(lb1,ub1,st1,1,M,1) 
   3.    call set_DAD(B_DAD,.....)        
   4.    call multicast(B, B_DAD, TMP,
       &      source_proc=global_to_proc(3), dim=2) 
   5.    DO I=lb,ub,st
   6.      DO J=lb1,ub1,st1
   7.         A(I,J) = TMP(I)
   8.      END DO
   9.    END DO

Line 4 shows a broadcast along dimension 2 of the logical processor grid by the processors owning the elements B(I,3), where 1 <= I <= N (Figure (b)).
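The multicast along dimension 2 can be sketched the same way. The Python below is a simulation under assumed sizes (N = M = 8 on a 2 x 4 grid) with 0-based processor coordinates; it copies data between per-processor blocks rather than sending real messages, and none of its names come from the run-time library.

```python
N, M, P, Q = 8, 8, 2, 4          # assumed sizes; blocks divide evenly
RB, CB = N // P, M // Q          # rows and columns per processor
grid = [(p, q) for p in range(P) for q in range(Q)]

B = [[100*i + j for j in range(1, M+1)] for i in range(1, N+1)]
B_loc = {(p, q): [row[q*CB:(q+1)*CB] for row in B[p*RB:(p+1)*RB]]
         for (p, q) in grid}
A_loc = {(p, q): [[0]*CB for _ in range(RB)] for (p, q) in grid}

src_q = (3 - 1) // CB            # grid column that owns B(:,3)
src_j = (3 - 1) % CB             # local offset of column 3 on its owner

for p in range(P):
    # The root of grid row p holds the local piece of B(:,3).
    tmp = [row[src_j] for row in B_loc[(p, src_q)]]
    # Multicast along dimension 2: every processor in grid row p gets TMP
    # and then runs its local loop nest A(I,J) = TMP(I).
    for q in range(Q):
        for i in range(RB):
            for j in range(CB):
                A_loc[(p, q)][i][j] = tmp[i]

def A_global(i, j):
    """Read element (i, j), 0-based, of the distributed array A."""
    return A_loc[(i // RB, j // CB)][i % RB][j % CB]
```

Each grid row performs its own independent broadcast, so no data crosses between grid rows.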

Example 3 (multicast_shift) Consider the statement:


         FORALL(I=1:N,J=1:M) A(I,J)=B(3,J+s)

The first subscript of array B is marked as multicast and the second subscript is marked as temporary_shift. This communication can be implemented as two separate steps: a multicast along the first dimension of the logical grid followed by a temporary_shift along the second dimension. Alternatively, the two communication patterns can be composed into a single, more efficient primitive such as multicast_shift.


         call set_BOUND(lb,ub,st,1,N,1) ! compute local lb, ub, and st
         call set_BOUND(lb1,ub1,st1,1,M,1) ! compute local lb, ub, and st
         call multicast_shift(B, B_DAD,TMP, source=global_to_proc(3), 
       &      shift=s, multicast_dim=1, shift_dim=2)
         DO I=lb,ub,st
         DO J=lb1,ub1,st1
            A(I,J)=TMP(J)
         END DO
         END DO

Combining the two primitives eliminates the temporary storage that would otherwise hold the intermediate result, along with some of the intra-processor copying, message packing, and unpacking.
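The equivalence of the two strategies can be checked with a small Python simulation. This is only a sketch: the sizes (N = M = 8 on a 2 x 4 grid), the shift s = 1, and the restriction to columns with J+s <= M are assumptions made here, and the names tmp1, two_step, and composed are hypothetical. The two-step version materializes an intermediate copy (tmp1) on every processor, while the composed version reads each shifted element directly from its owner.

```python
N, M, P, Q, s = 8, 8, 2, 4, 1    # assumed sizes and shift amount
RB, CB = N // P, M // Q          # rows and columns per processor

B = [[100*i + j for j in range(1, M+1)] for i in range(1, N+1)]
B_loc = {(p, q): [row[q*CB:(q+1)*CB] for row in B[p*RB:(p+1)*RB]]
         for p in range(P) for q in range(Q)}
grid = list(B_loc)

src_p, src_i = (3 - 1) // RB, (3 - 1) % RB   # owner row and offset of B(3,:)

def cols(q):
    """Global 1-based columns J owned by grid column q with J+s in range."""
    return [J for J in range(q*CB + 1, (q+1)*CB + 1) if J + s <= M]

# Two steps: multicast row 3 along dimension 1, then shift along dimension 2.
tmp1 = {(p, q): list(B_loc[(src_p, q)][src_i]) for (p, q) in grid}
two_step = {(p, q): [tmp1[(p, (J+s-1)//CB)][(J+s-1) % CB] for J in cols(q)]
            for (p, q) in grid}

# Composed: each destination takes the shifted element straight from the
# owner of B(3, J+s); the intermediate tmp1 copy is never created.
composed = {(p, q): [B_loc[(src_p, (J+s-1)//CB)][src_i][(J+s-1) % CB]
                     for J in cols(q)]
            for (p, q) in grid}
```

Both dictionaries hold the same TMP contents on every processor, which is the property the composed primitive relies on.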





zbozkus@
Thu Jul 6 21:09:19 EDT 1995