From haupt@nova.npac.syr.edu Mon May  2 22:24:54 1994
Date: Mon, 2 May 94 21:57:30 EDT
From: Tomasz Haupt <haupt@nova.npac.syr.edu>
To: paulc@nova.npac.syr.edu
Cc: haupt@nova.npac.syr.edu, gcf@nova.npac.syr.edu
Subject: HPF data mapping


Paul,
Probably this is too long. Take what you feel will best suit the paper you are
writing.

Tom

-------------------------------------------------------------------------

Data Mapping

HPF data alignment and distribution directives allow the programmer to
advise the compiler how to assign data objects (typically array
elements) to processors' memories. The model is a two-level mapping of
data objects to memory regions, referred to as "abstract processors":
arrays are first aligned relative to one another, and then this group
of arrays is distributed onto a user-defined, rectilinear arrangement
of abstract processors. The final mapping, from abstract to physical
processors, is not specified by HPF; it is language-processor
dependent.  The alignment itself is logically accomplished in two
steps. First, the index space spanned by an array that serves as an
align target defines a natural template of that array. Then, an
alignee is associated with this template. In addition, HPF allows
users to declare a template explicitly; this is particularly
convenient when aligning arrays of different sizes and/or different
shapes.  It is the template (either natural or explicit) that is
distributed onto abstract processors. This means that all array
elements aligned with a given element of the template are mapped to
the same processor. In this way, locality of data is enforced.  Arrays
and other data objects that are not explicitly distributed using the
compiler directives are mapped according to an implementation-dependent
default distribution. One possible choice of default distribution is
replication: each processor is given its own copy of the data.
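
The two-level model can be illustrated with a small Python model. This
is only a sketch, not HPF: the template extent, the processor count,
the arrays A and B, and the alignments A(i) with T(i) and B(i) with
T(2*i) are all assumptions chosen for illustration, and the block size
is assumed to be the template extent divided by the number of
processors, rounded up.

```python
import math

T_EXTENT = 16   # assumed extent of a hypothetical 1-D template T
NPROCS = 4      # assumed number of abstract processors


def block_owner(t, extent, nprocs):
    """1-based abstract processor owning template cell t under BLOCK."""
    block = math.ceil(extent / nprocs)   # assumed uniform block size
    return (t - 1) // block + 1


# Level 1: alignment maps an array index to a template cell.
def align_a(i):
    """Hypothetical alignment: A(i) is aligned with T(i)."""
    return i


def align_b(i):
    """Hypothetical alignment: B(i) is aligned with T(2*i)."""
    return 2 * i


# Level 2: the template, not the individual arrays, is distributed.
def owner_a(i):
    return block_owner(align_a(i), T_EXTENT, NPROCS)


def owner_b(i):
    return block_owner(align_b(i), T_EXTENT, NPROCS)


if __name__ == "__main__":
    for i in range(1, 9):
        print(f"B({i}) and A({2 * i}) live on processor {owner_b(i)}")
```

Because B(i) and A(2*i) are aligned with the same template cell, they
land on the same abstract processor whatever distribution is chosen
for the template.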

The data mapping is declared using the declarative directives
PROCESSORS, ALIGN, DISTRIBUTE, and, optionally, TEMPLATE. In addition,
arrays may be remapped at run time. To this end, an array must be
declared with the DYNAMIC directive; the actual remapping is
triggered by the executable directives REALIGN and REDISTRIBUTE.
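
To give a feeling for what a run-time remapping involves, the
following Python sketch (a model only, not HPF) lists the elements
that would change owner if a one-dimensional array were remapped from
a BLOCK to a CYCLIC distribution. The extent of 8, the 4 processors,
and the helper names are assumptions made for this illustration.

```python
import math


def block_owner(i, extent, nprocs):
    """1-based owner of element i under BLOCK (assumed block = ceil(extent/nprocs))."""
    block = math.ceil(extent / nprocs)
    return (i - 1) // block + 1


def cyclic_owner(i, nprocs):
    """1-based owner of element i under CYCLIC (round-robin, one element at a time)."""
    return (i - 1) % nprocs + 1


def moved_elements(extent, nprocs):
    """Indices whose owning processor differs between the two mappings."""
    return [i for i in range(1, extent + 1)
            if block_owner(i, extent, nprocs) != cyclic_owner(i, nprocs)]


if __name__ == "__main__":
    moved = moved_elements(8, 4)
    print(f"{len(moved)} of 8 elements change owner: {moved}")
```

In this small case most elements change owner, which suggests why
remapping is restricted to arrays explicitly declared DYNAMIC.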

It is important to note that the template is not a first-class
Fortran 90 object, in the sense that it cannot be passed to a
subprogram as an argument. As a consequence, a distributed array
passed to a subprogram is aligned either to the natural template of
the actual argument or to a user-defined template. In both cases this
may lead to an implicit remapping of the array at run time. To allow
more efficient implementations, in particular when the mapping of the
actual argument is known at compile time, HPF provides the INHERIT
directive, which specifies that a dummy argument should be aligned to
a copy of the template of the corresponding actual argument in the
same way the actual argument is aligned. In addition, the user may use
a special syntax of the ALIGN and DISTRIBUTE directives (with
asterisks preceding the align and/or distribute attributes) that
serves as an assertion about, rather than a declaration of, the
mapping of the dummy argument.

In HPF, arrays may be aligned with one another in many ways. The
repertoire includes shifts, strides, or any other linear function of a
subscript (i.e., n*i + m), transposition of indices, and collapse or
replication of array dimensions. Skewed or irregular alignments are,
however, not allowed. The template may be distributed in BLOCK,
CYCLIC, BLOCK(n), and CYCLIC(n) fashion. In addition, any dimension of
the template may be collapsed or replicated onto a processor grid
(note that this does not change the relative alignment of the arrays!).
The BLOCK distribution specifies that the template should be
distributed across the set of abstract processors by slicing it
uniformly into blocks of contiguous elements. The BLOCK(n)
distribution specifies that groups of n elements should be mapped to
successive abstract processors; there must be at least (array size)/n
abstract processors, rounded up, if the directive is to be satisfied.
The CYCLIC(n) distribution specifies that successive blocks of n array
elements are to be dealt out to successive abstract processors in
round-robin fashion. Finally, the CYCLIC distribution is equivalent to
the CYCLIC(1) distribution.
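
The distribution formats described above can be written down as simple
index formulas. The Python functions below are a sketch of those
formulas for one dimension, not HPF itself; the function names, the
1-based numbering of elements and processors, and the choice of
ceil(extent / nprocs) as the block size for plain BLOCK are
assumptions made for this illustration.

```python
import math


def block_owner(i, extent, nprocs, n=None):
    """Owner of element i under BLOCK (n is None) or BLOCK(n)."""
    if n is None:
        n = math.ceil(extent / nprocs)   # assumed uniform block size
    if n * nprocs < extent:
        # Fewer than ceil(extent / n) processors: directive cannot be satisfied.
        raise ValueError("not enough abstract processors for BLOCK(n)")
    return (i - 1) // n + 1


def cyclic_owner(i, nprocs, n=1):
    """Owner of element i under CYCLIC(n); CYCLIC is the case n = 1."""
    return ((i - 1) // n) % nprocs + 1


if __name__ == "__main__":
    extent, nprocs = 10, 4
    idx = range(1, extent + 1)
    print("BLOCK    ", [block_owner(i, extent, nprocs) for i in idx])
    print("BLOCK(4) ", [block_owner(i, extent, nprocs, 4) for i in idx])
    print("CYCLIC   ", [cyclic_owner(i, nprocs) for i in idx])
    print("CYCLIC(2)", [cyclic_owner(i, nprocs, 2) for i in idx])
```

With 10 elements and 4 processors, a request for BLOCK(2) raises an
error in this model, since it would need at least 5 abstract
processors.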