Next: Optimizations Up: Communication Model Previous: Run-time Support System

Storage Management

Data-parallel scientific codes generally require tremendous amounts of memory. Besides speed-up, this is one of the reasons to run scientific codes on distributed-memory parallel machines. In addition to the user's data, an HPF compiler must create storage in several situations:

  1. When an array expression is passed to a subroutine, storage must be created to hold the value of the expression.
  2. When an array-valued function is used in an expression, storage must be created to hold the return value of the function.
  3. When a forall statement, if scalarized, would carry a dependency, storage must be created to hold the value of the right-hand side.
  4. When a transformational function is referenced within a forall, storage must be created to hold the result of the transformational function.
  5. The communications strategy requires creation of storage for nonlocal array references.
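Case 3 is perhaps the least obvious of these. A forall evaluates every right-hand side before any element is assigned, so a naive element-by-element loop can read values that an earlier iteration has already overwritten. The following is a minimal sketch in Python, used here only to model the semantics; the function names are illustrative and are not part of the compiler. It contrasts a naive scalarization of `forall (i=2:n) a(i) = a(i-1)` with one that first copies the right-hand side into a temporary.

```python
def scalarized_naive(a):
    """Run `forall (i=2:n) a(i) = a(i-1)` as a plain forward loop (incorrect)."""
    a = list(a)
    for i in range(1, len(a)):
        a[i] = a[i - 1]  # reads an element an earlier iteration already overwrote
    return a


def scalarized_with_temporary(a):
    """Evaluate the whole right-hand side into a temporary, then assign."""
    a = list(a)
    tmp = [a[i - 1] for i in range(1, len(a))]  # storage the compiler must create
    for i in range(1, len(a)):
        a[i] = tmp[i - 1]
    return a


if __name__ == "__main__":
    data = [10, 20, 30, 40]
    print(scalarized_naive(data))           # [10, 10, 10, 10]
    print(scalarized_with_temporary(data))  # [10, 10, 20, 30]
```

Only the second version matches the forall semantics; the temporary is the extra storage that case 3 refers to.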

The simplest approach to storage management is to allocate full-sized arrays on each processor for all of the above cases, but this strategy could waste a tremendous amount of memory. The compiler should use more sophisticated storage management techniques to reduce memory use. Fortran 90D/HPF uses two different storage allocation techniques: overlap areas and temporary arrays.

Overlap areas are expansions of local array sections to accommodate neighboring nonlocal elements. Overlaps are useful for regular computation because they allow the generation of clear and readable code. However, for certain computations storage may be wasted, because all array elements between the local section and the one accessed must also be part of the overlap. Storage is also wasted because overlaps are assigned to individual arrays and cannot be reused or released until those arrays have completed their lifetimes. Our compiler uses overlap storage for overlap_shift communication when the shift values are small.
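The idea can be sketched with a small single-process simulation. This is an illustration only: the block distribution, the zero fill at the array boundary, and the names `block_bounds`, `local_with_overlap`, and `shifted_add` are assumptions made for the example, not the compiler's run-time interface. Each simulated processor extends its local section by `shift` cells on the high side, fills them from its neighbor, and then computes b(i) = a(i) + a(i+shift) from local data alone.

```python
def block_bounds(n, nprocs, p):
    """Half-open global index range [lo, hi) owned by processor p (block distribution)."""
    size = -(-n // nprocs)  # ceiling division
    lo = min(p * size, n)
    return lo, min(lo + size, n)


def local_with_overlap(global_a, nprocs, p, width):
    """Local section of processor p, extended by `width` overlap cells on the high side.

    Overlap cells that fall past the end of the global array are filled with 0.0.
    """
    lo, hi = block_bounds(len(global_a), nprocs, p)
    local = list(global_a[lo:hi])
    # Stand-in for the communication step: every element between the local
    # section and the farthest one accessed is copied, not only the farthest.
    for g in range(hi, hi + width):
        local.append(global_a[g] if g < len(global_a) else 0.0)
    return local


def shifted_add(global_a, nprocs, shift):
    """Compute b(i) = a(i) + a(i+shift) one processor at a time, using only local data."""
    result = []
    for p in range(nprocs):
        lo, hi = block_bounds(len(global_a), nprocs, p)
        local = local_with_overlap(global_a, nprocs, p, shift)
        result.extend(local[i] + local[i + shift] for i in range(hi - lo))
    return result
```

In this model every local section grows by `shift` cells for as long as the array lives, which is consistent with reserving overlap areas for small shift values.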

Temporary arrays are the other form of storage. A temporary array is usually aligned and distributed in the same manner as one of the user variables; that is, the HPF program could have been written in such a way that none of these temporaries would be needed. The algorithm the compiler uses to determine the distribution of a temporary takes into account the statement in which the temporary is used. Temporaries are allocated before the statement in which they are used and deallocated immediately after that statement. For example, an array assignment like:


                 REAL  A(N), B(N,N), C(N,N), D(N)
                 A = SUM(B, DIM=1) + MATMUL(C,D)

would result in the following:


                 allocate (tmp$b)
                 allocate (tmp$r)
                 call sum(tmp$b, b, 1)
                 call matmul(tmp$r, c, d)
                 a = tmp$b + tmp$r
                 deallocate(tmp$b)
                 deallocate(tmp$r)
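The same lowering can be mirrored in a few lines of Python to make the lifetime of the two temporaries explicit. This is only a model, under the assumption that B and C are N x N and D has length N, so that both operands of the addition are vectors of length N; the helper names `column_sums`, `mat_vec`, and `lowered_assignment` are illustrative.

```python
def column_sums(b):
    """Model of SUM(B, DIM=1): sum over the first (row) index of a 2-D array."""
    return [sum(row[j] for row in b) for j in range(len(b[0]))]


def mat_vec(c, d):
    """Model of MATMUL(C, D) for a 2-D array C and a 1-D array D."""
    return [sum(cij * dj for cij, dj in zip(row, d)) for row in c]


def lowered_assignment(b, c, d):
    """Evaluate A = SUM(B, DIM=1) + MATMUL(C, D) through two explicit temporaries."""
    tmp_b = column_sums(b)                     # allocate tmp$b; call sum(tmp$b, b, 1)
    tmp_r = mat_vec(c, d)                      # allocate tmp$r; call matmul(tmp$r, c, d)
    a = [x + y for x, y in zip(tmp_b, tmp_r)]  # a = tmp$b + tmp$r
    del tmp_b, tmp_r                           # released right after the statement
    return a
```

Both temporaries exist only for the duration of the one statement that needs them.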

For this class of temporaries, the distribution of a temporary depends on how the temporary is used. If a temporary is used as an argument to an intrinsic, the compiler tries to determine its distribution from the other arguments of that intrinsic. Otherwise, it tries to assign a distribution based on the value assigned to the temporary. If neither approach yields a distribution, the temporary is replicated.
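The three rules can be written as a short decision procedure. This is a minimal sketch under stated assumptions: the `Temp` record, its field names, the use of plain strings such as "BLOCK" for distributions, and the choice of the first argument with a known distribution are all hypothetical, and the sketch falls through to the next rule whenever a rule finds nothing. None of this is taken from the compiler's actual data structures.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

REPLICATED = "replicated"


@dataclass
class Temp:
    """Hypothetical description of one compiler-generated temporary."""
    # Distributions of the other arguments, if the temporary is passed to an
    # intrinsic; None means it is not an intrinsic argument.
    intrinsic_arg_dists: Optional[Sequence[Optional[str]]] = None
    # Distribution of the value stored into the temporary, if one is known.
    assigned_value_dist: Optional[str] = None


def choose_distribution(temp):
    """Return the distribution for `temp` by applying the three rules in order."""
    # Rule 1: follow the other arguments of the intrinsic (first known one wins).
    if temp.intrinsic_arg_dists is not None:
        for dist in temp.intrinsic_arg_dists:
            if dist is not None:
                return dist
    # Rule 2: follow the value assigned to the temporary.
    if temp.assigned_value_dist is not None:
        return temp.assigned_value_dist
    # Rule 3: nothing to go on, so replicate.
    return REPLICATED
```

For example, `choose_distribution(Temp(intrinsic_arg_dists=[None, "BLOCK"]))` selects "BLOCK", while a temporary with no usable information is replicated.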

The above algorithm is very simple and is certainly not optimal. However, it is not clear what algorithm would perform better. Numerous factors, including array alignment, array distribution, array subsection usage, and argument usage need to be taken into account in determining temporary distribution. For example, consider the following case:


                 A(1:m:3) = SUM(B(1:n:2,:) + C(:,1:n:2), dim=2)

The section of A is passed directly to the SUM intrinsic to receive the result. A temporary is needed to compute the argument to SUM. The distribution of that temporary has two possibly conflicting goals: minimize communication in the expression, or minimize communication in the SUM computation and assignment to A.





zbozkus@
Thu Jul 6 21:09:19 EDT 1995