Given by Geoffrey C. Fox at Delivered Lectures of CPS615 Basic Simulation Track for Computational Science on 1 October 96. Foils prepared 27 December 1996
Summary of Material
This continues the discussion of HPF in the area of distribution and ALIGN statements. |
The discussion of ALIGN should be improved, as the audio makes dubious statements about "broadcasting" information. |
The distribution discussion includes a reasonable description of BLOCK and CYCLIC and of when you should use each. |
Geoffrey Fox |
NPAC |
Room 3-131 CST |
111 College Place |
Syracuse NY 13244-4100 |
Ranks of the alignee and the align-target may be different |
Examples:
|
... or other way round
|
while this only puts A on some parts of template... |
!HPF$ ALIGN A(:) WITH TEMPL(:,i) |
HPF allows for more general alignments such as:
|
!HPF$ TEMPLATE T(12,12) |
!HPF$ ALIGN A(:,J) WITH T(:,J+1) |
!HPF$ ALIGN B(I,J) WITH T(I+4,J+4) |
But nobody is clear if they are useful! |
Each align-dummy variable is considered to range over all valid index values for the corresponding dimension of the alignee. An align-subscript is evaluated for any specific combination of values of the align-dummy variables simply by evaluating each align-subscript as an expression. The resulting subscript values must be legitimate subscripts for the align-target. |
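The rule above can be checked mechanically. The following Python sketch enumerates every align-dummy value for ALIGN B(I,J) WITH T(I+4,J+4) against TEMPLATE T(12,12). The extent of B is not given in the foils, so the 8 by 8 and 9 by 9 shapes are assumptions chosen for illustration, and `alignment_is_valid` is a hypothetical helper, not part of HPF.

```python
from itertools import product

def alignment_is_valid(alignee_shape, template_shape, subscripts):
    """Return True if every align-subscript lands inside the template.

    alignee_shape  -- extents of the alignee (1-based Fortran indexing)
    template_shape -- extents of the align-target
    subscripts     -- one function per template dimension, each taking
                      the tuple of align-dummy values
    """
    ranges = [range(1, n + 1) for n in alignee_shape]
    for dummies in product(*ranges):
        for extent, f in zip(template_shape, subscripts):
            if not 1 <= f(dummies) <= extent:
                return False
    return True

# ALIGN B(I,J) WITH T(I+4,J+4), TEMPLATE T(12,12)
shift4 = [lambda d: d[0] + 4, lambda d: d[1] + 4]
ok_8x8 = alignment_is_valid((8, 8), (12, 12), shift4)  # assumed B(8,8)
ok_9x9 = alignment_is_valid((9, 9), (12, 12), shift4)  # assumed B(9,9)
print(ok_8x8, ok_9x9)
```

With the assumed B(8,8) the largest subscript is 8+4 = 12, which fits the template; with B(9,9) the subscript 13 falls outside T and the alignment would be illegal.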
These examples have non-unit stride as perhaps in "red-black" Iterative Solver algorithms: |
Syntax: |
!HPF$ DISTRIBUTE distributee (dist-format) |
[ONTO dist-target] |
Allowed forms of dist-format:
|
Examples:
|
!HPF$ PROCESSORS P(4)
|
!HPF$ TEMPLATE T(16) |
!HPF$ ALIGN A(:) WITH T(:) |
*HPF$ PROCESSORS SQUARE(2,2) |
*HPF$ TEMPLATE T(4,4) |
*HPF$ ALIGN A(:,:) WITH T(:,:) |
*HPF$ DISTRIBUTE T(BLOCK,CYCLIC) ONTO SQUARE |
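As a rough illustration of what DISTRIBUTE T(BLOCK,CYCLIC) ONTO SQUARE does to a 4 by 4 template on a 2 by 2 processor arrangement, the Python sketch below computes the owning processor of each template element. The helper names `block_owner` and `cyclic_owner` and the 0-based processor coordinates are conventions assumed here; the block size is taken to be ceiling(n/p), the usual HPF default.

```python
def block_owner(i, n, p):
    """0-based owner of 1-based index i when n elements are BLOCK-distributed over p processors."""
    block = -(-n // p)  # ceiling(n / p)
    return (i - 1) // block

def cyclic_owner(i, p):
    """0-based owner of 1-based index i under CYCLIC distribution over p processors."""
    return (i - 1) % p

# TEMPLATE T(4,4), PROCESSORS SQUARE(2,2), DISTRIBUTE T(BLOCK,CYCLIC) ONTO SQUARE
grid = [[(block_owner(i, 4, 2), cyclic_owner(j, 2)) for j in range(1, 5)]
        for i in range(1, 5)]
for row in grid:
    print(row)
```

Each printed pair is the (row, column) coordinate in SQUARE of the processor that holds the corresponding element: the first dimension is cut into two contiguous halves, while the second alternates between the two processor columns.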
We used BLOCK in the Laplace equation example, and it is the appropriate distribution for "local" or geometric type problems |
CYCLIC was called scattered in our early work (more precisely, it is a special case of scattered, which in general could be a random distribution of objects over processors); it is appropriate in cases where "load-balancing" is more important than locality
|
Matrix inversion set up on two processors after 0, 2 and 4 rows/columns have been eliminated |
Note that BLOCK decomposition leads to all the work being on one processor at the end, even if it starts off balanced |
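The load-balance effect described above can be sketched by counting, for each processor, how many rows are still active after k rows/columns have been eliminated. The matrix size of 8 and the function name `active_rows_per_processor` are assumptions for illustration only; the original figure is not reproduced here.

```python
def active_rows_per_processor(n, p, k, scheme):
    """Count rows k+1..n (still active after k eliminations) held by each of p processors."""
    block = -(-n // p)  # ceiling(n / p), block size for BLOCK
    counts = [0] * p
    for row in range(k + 1, n + 1):
        owner = (row - 1) // block if scheme == "BLOCK" else (row - 1) % p
        counts[owner] += 1
    return counts

n, p = 8, 2  # assumed sizes: 8 rows on two processors
block_load = [active_rows_per_processor(n, p, k, "BLOCK") for k in (0, 2, 4)]
cyclic_load = [active_rows_per_processor(n, p, k, "CYCLIC") for k in (0, 2, 4)]
print(block_load)   # [[4, 4], [2, 4], [0, 4]]
print(cyclic_load)  # [[4, 4], [3, 3], [2, 2]]
```

Under BLOCK the first processor runs out of rows once half the matrix has been eliminated, whereas CYCLIC keeps the remaining rows evenly shared.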
Here we show a 16 by 16 array of pixels distributed either as CYCLIC or as two-dimensional (BLOCK,BLOCK) with 8 by 8 blocks
CHPF$ PROCESSORS Q(4) |
CHPF$ TEMPLATE FRED(16,16) |
CHPF$ ALIGN A(:,:) WITH FRED(:,:) |
CHPF$ ALIGN B(I,J) WITH FRED(I+2,J+2) |
CHPF$ DISTRIBUTE FRED(BLOCK,*) |
One data mapping is often not appropriate for an entire program
|
ALLOCATABLE arrays can change size |
REALIGN and REDISTRIBUTE are executable versions of ALIGN and DISTRIBUTE, but they may only be applied to arrays that have been declared DYNAMIC |
Naturally, DYNAMIC arrays can be given their initial mapping by ALIGN or DISTRIBUTE statements |
This example illustrates remapping A from a one-dimensional to a two-dimensional decomposition, and changing B from alignment with columns to alignment with rows
|
!HPF$ PROCESSORS P(64) |
!HPF$ PROCESSORS Q(8,8) |
!HPF$ DYNAMIC :: A,B |
!HPF$ ALIGN B(:) WITH A(:,*) |
!HPF$ DISTRIBUTE A(*,BLOCK) ONTO P
|
!HPF$ REALIGN B(:) WITH A(*,:)
|
!HPF$ REDISTRIBUTE A(CYCLIC,CYCLIC) ONTO Q
|
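To get a feel for the cost of such a remapping, the Python sketch below counts how many elements of A change processor when the mapping goes from (*,BLOCK) on P(64) to (CYCLIC,CYCLIC) on Q(8,8). The foils do not give the size of A, so a 64 by 64 array is assumed. How the processors of Q correspond to those of P is not specified by the example either, so a column-major numbering of Q is assumed purely for illustration.

```python
N = 64  # assumed extent of A in each dimension (not stated in the foils)

def owner_before(i, j):
    # DISTRIBUTE A(*,BLOCK) ONTO P(64): columns are split into 64 contiguous blocks
    block = -(-N // 64)  # ceiling(N / 64) columns per processor
    return (j - 1) // block

def owner_after(i, j):
    # REDISTRIBUTE A(CYCLIC,CYCLIC) ONTO Q(8,8)
    qi, qj = (i - 1) % 8, (j - 1) % 8
    return qi + 8 * qj  # assumed column-major numbering of Q's processors

moved = sum(owner_before(i, j) != owner_after(i, j)
            for i in range(1, N + 1) for j in range(1, N + 1))
print(moved, "of", N * N, "elements change processor")
```

Under these assumptions almost every element moves, which is why REDISTRIBUTE is an executable statement with a real communication cost rather than a free declaration.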
!HPF$ PROCESSORS Q(64) |
!HPF$ ALIGN B(I) WITH A(I+N) |
!HPF$ DISTRIBUTE A(BLOCK(M)) |
!HPF$ DISTRIBUTE(BLOCK), DYNAMIC :: P
|
!HPF$ REDISTRIBUTE P(CYCLIC)
|
The scope of any mapping directive is a single (sub)program unit |
A template or distribution is not a first-class Fortran 90 object: |
It cannot be passed as a subprogram argument and this creates significant complication! |
There are three typical cases: |
Subroutine requires data to use a particular mapping determined by subroutine
|
Subroutine can use any mapping so actual argument should be passed and used with current mapping
|
Sometimes we need to remap due to array sections being passed |
Any remappings must be undone on return from subroutine |
DISTRIBUTE
|
ALIGN
|
INHERIT
|
(not a comprehensive discussion; just an example) |
PROCESSORS |
TEMPLATE |
ALIGN |
DISTRIBUTE |
INHERIT |
DYNAMIC |
REALIGN |
REDISTRIBUTE |
An operation on two or more data objects is likely to be carried out much faster if they all reside in the same processor
|
It may be possible to carry out many such operations concurrently if they can be performed on different processors
|
Parallel Statements
|
Parallel Constructs
|
Intrinsic functions and the HPF library |
Extrinsic functions |
This is as in CMFortran and Maspar MPFortran with example: |
Semantics of WHERE statement:
|
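The foil content for the WHERE semantics is not reproduced here. As a minimal sketch of the standard Fortran 90 behaviour, the mask is evaluated for all elements first, and the assignment is then carried out only for elements whose mask value is true. The Python model below uses plain lists, and `where_assign` is a hypothetical helper written for this illustration.

```python
def where_assign(mask, target, rhs):
    """Model of: WHERE (mask) target = rhs(target).

    The mask is fully evaluated before any element is assigned, and
    elements whose mask value is false are left untouched.
    """
    frozen = list(mask)  # mask evaluated once, up front
    return [rhs(x) if m else x for x, m in zip(target, frozen)]

a = [4.0, 0.0, -2.0, 0.5]
# WHERE (a /= 0.0) a = 1.0 / a
result = where_assign([x != 0.0 for x in a], a, lambda x: 1.0 / x)
print(result)  # [0.25, 0.0, -0.5, 2.0]
```

Because the right-hand side is only evaluated where the mask is true, the zero element is never divided into, which is the usual reason for writing such an update with WHERE.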