
Forall Mask Insertion

Recent parallel computers such as the Intel Paragon and the Thinking Machines CM-5 have vector units on each processor. However, most vector units do not perform well when the loop body contains a branch instruction, and such loops may not be vectorized at all. A forall mask is compiled into exactly such a branch, so it can make the generated loop un-vectorizable. The Fortran 90D/HPF compiler therefore tries to place the mask inside only the loops over the indices it depends on, not inside the loops over all indices. For example, the Gaussian elimination code contains the following forall:


       forall (i = 1:N, j = k:NN, indx(i) .EQ. -1)
     &        a(i,j) = a(i,j) - fac(i)*row(j)

Here the mask depends only on the i index, not on j, so the test can be moved out of the j loop and the forall is transformed as follows:


   do i=..
      if(indx(i).eq.-1) then
         do j=..
            a(i,j)=....
         enddo
      endif
   enddo

The inner j loop no longer contains a branch, so it becomes a vectorizable loop.
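
To make the equivalence concrete, the following is a minimal self-checking sketch in free-form Fortran 90 rather than the fixed-form layout used above. It is not compiler output. The sizes n, nn and k, the contents of a, fac, row and indx, and the names a_ref and a_opt are all assumptions chosen for illustration. It applies the original masked forall to one copy of the array and the hoisted-mask loop nest to another, then compares the results.

```fortran
! Illustrative sketch only: sizes and array contents are made-up test values.
program mask_hoist_demo
  implicit none
  integer, parameter :: n = 4, nn = 6, k = 2
  real :: a_ref(n, nn), a_opt(n, nn), fac(n), row(nn)
  integer :: indx(n), i, j

  ! Fill the inputs with arbitrary but reproducible values.
  do j = 1, nn
     row(j) = real(j)
     do i = 1, n
        a_ref(i, j) = real(10*i + j)
     end do
  end do
  do i = 1, n
     fac(i) = 0.5 * real(i)
  end do
  indx = (/ -1, 3, -1, 7 /)   ! only rows 1 and 3 satisfy the mask
  a_opt = a_ref

  ! Original form: the mask is part of the forall header,
  ! so it applies to every (i, j) pair.
  forall (i = 1:n, j = k:nn, indx(i) .eq. -1)
     a_ref(i, j) = a_ref(i, j) - fac(i)*row(j)
  end forall

  ! Transformed form: the mask is tested once per i, outside the j loop.
  do i = 1, n
     if (indx(i) .eq. -1) then
        do j = k, nn          ! inner loop body has no branch
           a_opt(i, j) = a_opt(i, j) - fac(i)*row(j)
        end do
     end if
  end do

  ! Both versions should have updated exactly the same elements.
  if (all(a_ref == a_opt)) then
     print *, 'forall and hoisted-mask loops agree'
  else
     print *, 'MISMATCH between forall and hoisted-mask loops'
  end if
end program mask_hoist_demo
```

The hoisting is valid here because each a(i,j) is computed only from its own old value, fac(i) and row(j), so testing indx(i) once per row selects the same elements as testing it for every (i, j) pair.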


zbozkus@
Thu Jul 6 21:09:19 EDT 1995