Full HTML for

Basic foilset Overview of Fortran 90 and HPF Fall 96

Given by Geoffrey C. Fox, Tom Haupt at Basic Simulation Track for Computational Science CPS615 on Fall Semester 96. Foils prepared 17 Sept 1996
Outside Index Summary of Material


A brief discussion of Fortran90 and Fortran77 and why Fortran90 has advantages and disadvantages
Overview of Key Features of Fortran90
See Metcalf and Reid, Fortran90 Explained, Oxford Scientific Publications
Overview of Key Features of HPF
  • Parallel Constructs
  • Data Mapping
  • Examples
The Future -- HPF2
See Chuck Koelbel from Rice University at
http://renoir.csc.ncsu.edu/MRA/HTML/Workshop2/Koelbel

Table of Contents for full HTML of Overview of Fortran 90 and HPF Fall 96

Denote Foils where Image Critical
Denote Foils where HTML is sufficient

1 CPS615 -- Base Course for the Simulation Track of Computational Science
Fall Semester 1996 --
Introduction to High Performance Fortran and Fortran 90

2 Abstract of HPF and Fortran90 Technology Discussion
3 HPF is an extension of Fortran 90
4 Why is Fortran90 Easier than Fortran77
5 Important Features of Fortran90
6 Introduction to Fortran90 Arrays - I
7 Introduction to Fortran90 Arrays - II
8 Fortran90 Arrays and Memory Allocation
9 More on Fortran90 Arrays and Subroutines
10 Typical Use of Array and Intrinsic Operations
11 Derived Type in Fortran90
12 Examples of POINTER's in Fortran90
13 MODULEs in Fortran90
14 MODULEs INTERFACES and Overloaded Operators in Fortran90
15 Outline of HPF Discussion
16 Information on HPF and HPF Forum (HPFF)
17 Possible Programming Models
18 Data Parallel Programming Model
19 Parallelism in HPF
20 Fortran77 is part of Fortran90
21 HPF Features
22 What gives high performance in HPF
23 Compiler directives used in HPF
24 What does an HPF Compiler do?
25 Syntax of HPF Directives
26 Data Mapping in HPF
27 Staged Data Mapping in HPF
28 Template in HPF
29 Abstract Processors in HPF
30 Example of Template and Processors
31 Align Directive in HPF
32 Examples of Align Directive
33 Changing Rank in Align Directive
34 Replication in Align Directive
35 General Alignments in HPF
36 Formal Definition of Align Directive
37 More obscure Complicated Examples of Align Directive
38 Distribution Directive in HPF
39 Basic Examples of Distribute Directive
40 Two Dimensional Example of Distribute Directive
41 The Two Basic Distributions in HPF
42 The Example of Matrix Inversion
43 Example of Graphics Rendering
44 Example of Distribute Directive with Complex Alignment
45 Dynamic Data Mapping
46 Advanced Mapping Directives -- ReDistribution and ReAlign
47 Advanced Mapping Directives -- Allocatable arrays and pointers
48 Subprograms in HPF
49 Passing Distributed Arrays as Subprogram Arguments in HPF
50 Mapping Options for Dummy (Subroutine) Arguments
51 Inherit Distribution Directive in HPF
52 Summary of Mapping Directives in HPF
53 Fundamental Parallelism Assumption in HPF
54 Parallel statements and Constructs in HPF
55 Parallelism in Fortran 90 array assignments
56 WHERE (masked array assignment) in HPF
57 WHERE...ELSEWHERE / IF...ELSE constructs in HPF
58 Intrinsic functions in HPF
59 HPF library functions
60 SUM, SUM_PREFIX and SUM_SCATTER defined
61 HPF Intrinsic EXAMPLE: SUM
62 FORALL Statement in HPF
63 Examples of FORALL statements in HPF
64 Semantics of the FORALL statement in HPF
65 Vector Indices in FORALL's
66 Multiple Statement FORALL's
67 HPF FORALL construct Pictorially
68 PURE Functions in HPF
69 Example of PURE Function from Chuck Koelbel
70 The INDEPENDENT Assertion in HPF
71 !HPF$ INDEPENDENT FORALL Pictorially
72 !HPF$ INDEPENDENT DO Pictorially
73 !HPF$ INDEPENDENT, NEW Variable
74 Extrinsics in HPF
75 High Performance Fortran HPF2 Changes
76 ON HOME for Computation Placement
77 Reductions in INDEPENDENT DO Loops
78 Spawning Tasks in HPF
79 New Data Mapping Features in HPF 2.0 - I
80 New Data Mapping Features in HPF 2.0 - II

Outside Index Summary of Material



HTML version of Basic Foils prepared 17 Sept 1996

Foil 1 CPS615 -- Base Course for the Simulation Track of Computational Science
Fall Semester 1996 --
Introduction to High Performance Fortran and Fortran 90

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Geoffrey Fox, Tom Haupt
NPAC
Room 3-131 CST
111 College Place
Syracuse NY 13244-4100

HTML version of Basic Foils prepared 17 Sept 1996

Foil 2 Abstract of HPF and Fortran90 Technology Discussion

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
A brief discussion of Fortran90 and Fortran77 and why Fortran90 has advantages and disadvantages
Overview of Key Features of Fortran90
See Metcalf and Reid, Fortran90 Explained, Oxford Scientific Publications
Overview of Key Features of HPF
  • Parallel Constructs
  • Data Mapping
  • Examples
The Future -- HPF2
See Chuck Koelbel from Rice University at
http://renoir.csc.ncsu.edu/MRA/HTML/Workshop2/Koelbel

HTML version of Basic Foils prepared 17 Sept 1996

Foil 3 HPF is an extension of Fortran 90

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
To express data parallelism and so hide machine dependent features of parallel programming from the user
We use Fortran90 as base language both because it is
  • more advanced than Fortran 77 with object oriented features and
  • The array syntax allows an elegant explictly parallel expression of some operations
Use of Fortran90 is a Problem because
  • It is a complex language which it is difficult to build compilers for
  • Not many people use Fortran90 and maybe they will all switch to Java before Fortran90 gets in common practice
  • So perhaps it is "too little too late" and we should focus on supporting the past (Fortran77) and the "correct" future ( (HP)Java) and not an irrelevant middle solution .....

HTML version of Basic Foils prepared 17 Sept 1996

Foil 4 Why is Fortran90 Easier than Fortran77

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
1 A(I)=B(I) is obviously parallel
Fortran90 Array Notation
  • A=B expresses this parallelism naturally in way compiler can easily detect in a deterministic way
  • Do 1 I=1,N
  • J=I
  • K=I
1 A(J)=B(K) is not so obviously parallel
Needs a difficult to define (in general case) algorithm (especially if IF statements, i.e. conditionals, defining I,J) to decide on existence and implementation of parallelism
Use of Fortran77 has "thrown away" natural parallelism at language level even though "run-time" restores as creates explicit values for variables such as J and K which are only known by analysis at compile time.

HTML version of Basic Foils prepared 17 Sept 1996

Foil 5 Important Features of Fortran90

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Arrays are very well supported with memory allocation and set of intrinsics, better passing to procedures etc.
  • This is key capability for HPF
Derived Types allow general object structure (without inheritance) in F90
  • Pointers
  • This area NOT well supported by HPF Compilers as of Summer 1996
Modules replace COMMON INCLUDE etc.
Procedures (functions,subroutines) allow better interfaces, recursion, optional parameters etc.
Better Syntax with free form, more loop control etc.

HTML version of Basic Foils prepared 17 Sept 1996

Foil 6 Introduction to Fortran90 Arrays - I

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Arrays are "true" objects in Fortran90 and are stored as values of elements plus a data descriptor!
There are operations on full arrays which have natural parallel implementations seen in HPF
New set of array intrinsic (built-in) functions for elements, reductions (process array into a single value), tranformational (reshaping)
  • Note these functions are NOT sufficient for real problems and must have FORALL to define new parallel Array functions
  • FORALL is in HPF but not F90 -- Expected in next round of standard (Fortran95)
  • FORALL( I=0:nx)
    • A(i) = (A(I+1)+A(I-1))/(I+1)
  • END FORALL

HTML version of Basic Foils prepared 17 Sept 1996

Foil 7 Introduction to Fortran90 Arrays - II

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Extract sections (subarrays) of arrays as u(lb:ub:step)
  • lb is Lower Bound
  • ub is Upper Bound
  • step (defaults to 1) is step
masked (conditional) Array operations using WHERE .... ELSEWHERE
Can still do Fortran77 array element operations (DO Loops) but of course this might not be interpretable for efficient parallelism by HPF compiler
Note Fortran90 designed for science and engineering with originally special concern for vector supercomputers but Cray supports F77 better than F90(!)

HTML version of Basic Foils prepared 17 Sept 1996

Foil 8 Fortran90 Arrays and Memory Allocation

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
ALLOCATABLE Arrays can be defined at runtime with variable sizing
  • REAL, ALLOCATABLE :: u(:,:) , f(:,:)
  • ALLOCATE ( u(0:nx,0:ny) , f(1:27,0:ny) )
One can define POINTER and TARGET attributes which can be used like REAL, DIMENSION etc.
  • => operator allows one to set a POINTER to "point to" a TARGET
Arguments of a subroutine need NOT define array dimensions in subroutine as these as passed by calling program in data descriptor
Local arrays are created on stack and bounds maybe non constant and evaluated at procedure entry

HTML version of Basic Foils prepared 17 Sept 1996

Foil 9 More on Fortran90 Arrays and Subroutines

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
One passes "assumed-shape" arrays from calling to callee routines using INTERFACE syntax
INTERFACE
  • SUBROUTINE residual (r,u,f)
    • REAL r(:,:) , u(:,:) , F(:,:)
  • END SUBROUTINE
END INTERFACE is called by
call residual (r,u,f) or
call residual ( r(0:nx:2, 0:ny:2) , u(0:nx:2, 0:ny:2) , f(0:nx:2, 0:ny:2) )
where latter example just processes every other element of arrays

HTML version of Basic Foils prepared 17 Sept 1996

Foil 10 Typical Use of Array and Intrinsic Operations

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
REAL u(0:nx,0:ny), A(100,100) , fact , avg
u= fact * (u -avg) Scales and translates all elements of u
avg = .25*( CSHIFT(u,1,1) + CSHIFT(u,-1,1) + CSHIFT(u,1,2) + CSHIFT(u,-1,2)
calculates of average of 4 array elements surrounding each point. Note third argument in CSHIFT is label for axis (1=x 2=y)
SQRT( A(1:100) ) calculates a new array containing 100 square roots
SUM(A) is a reduction operator sumimg all elements of array A as a scalar
SIZE(A,1) is an Array Query Intrinsic giving size of A in the first dimension and is particularly useful for "assumed-shape" arrays passed into subroutines

HTML version of Basic Foils prepared 17 Sept 1996

Foil 11 Derived Type in Fortran90

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
TYPE PERSON
  • CHARACTER(LEN=10) NAME
  • REAL AGE
  • INTEGER ID
END TYPE PERSON
TYPE(PERSON) YOU,ME
The Identification number of YOU would be accessed as YOU%ID as an ordinary integer
One can define global operators so that YOU+ME could be defined
One can use name of derived type as a constructor
YOU = PERSON ('Pamela Fox', 12, 3)

HTML version of Basic Foils prepared 17 Sept 1996

Foil 12 Examples of POINTER's in Fortran90

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
One can define a linked list as:
TYPE ENTRY
  • REAL VALUE
  • INTEGER INDEX
  • TYPE(ENTRY), POINTER :: NEXT
END TYPE ENTRY
ALLOCATE Creates dynamically elements in a linked list
CURRENT = ENTRY( NEW_VALUE, NEW_INDEX, FIRST)
FIRST => CURRENT
adds a new entry at start of linked list and renames it with POINTER FIRST

HTML version of Basic Foils prepared 17 Sept 1996

Foil 13 MODULEs in Fortran90

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
General Syntax is:
MODULE name
  • Specify it!
CONTAINS This is optional
  • module subprograms
END MODULE name
MODULE IllustratingCommonBlock
  • INTEGER DIMENSION(52) :: CARDS
END MODULE IllustratingCommonBlock
replaces COMMON construct and can be used as
USE IllustratingCommonBlock

HTML version of Basic Foils prepared 17 Sept 1996

Foil 14 MODULEs INTERFACES and Overloaded Operators in Fortran90

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
MODULE INTERVAL_ARITHMETIC
  • TYPE INTERVAL
    • REAL LOWER, UPPER
  • END TYPE INTERVAL
  • INTERFACE OPERATOR(+) define overloaded + operator
    • MODULE PROCEDURE ADD_INTERVALS
  • END INTERFACE
CONTAINS
  • FUNCTION ADD_INTERVALS(A,B)
    • TYPE(INTERVAL) ADD_INTERVALS, A, B
    • ADD_INTERVALS%LOWER = A%LOWER + B%LOWER
    • ADD_INTERVALS%UPPER = A%UPPER + B%UPPER
  • END FUNCTION ADD_INTERVALS(A,B)
END MODULE INTERVAL_ARITHMETIC

HTML version of Basic Foils prepared 17 Sept 1996

Foil 15 Outline of HPF Discussion

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
What is HPF, what we need it for, where it came from
How does HPF Get its Parallelism
Why it is called "High Performance"?
What are HPF compiler directives
Data mapping in HPF
  • TEMPLATE PROCESSORS DISTRIBUTE ALIGN
Parallel statements and constructs in HPF
  • Array statements, WHERE/ELSEWHERE, Intrinsics, FORALL, PURE, INDEPENDENT
Latest Discussions -- HPF-2
  • ON HOME, TASKING, Dynamic Data Mapping, Reductions in INDEPENDENT DO loops

HTML version of Basic Foils prepared 17 Sept 1996

Foil 16 Information on HPF and HPF Forum (HPFF)

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Rice has taken lead in HPF Forum which is a much faster mechanism of getting agreement than formal 10 year process which Fortran90 suffered
World Wide Web at rice and Vienna
  • http://www.crpc.rice.edu/HPFF/home.html
  • http://www.vcpc.univie.ac.at/HPFF/home.html
Mailing List is majordomo@cs.rice.edu and choose list (hpff, hpff-interpret, hpff-core) you wish to subscribe to
Anonymous FTP to titan.cs.rice.edu and look at
  • public/HPFF/README for latest list of files

HTML version of Basic Foils prepared 17 Sept 1996

Foil 17 Possible Programming Models

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Explicit Message Passing as in PVM or MPI
User breaks program into parts and the parts send messages between them to implement communication necessary for synchronization and integration of parts into solution of a single program
This matches hardware but is not particularly natural for problem and can be machine dependent
Object Oriented programming is like message passing but now objects and not programs communicate
  • Very good when objects are natural from the problem and represent functional parallelism
  • However in data parallel problems tackled with object oriented approach, one must break problem up into a number of objects that depends on number of processors and so reflects machine and not problem

HTML version of Basic Foils prepared 17 Sept 1996

Foil 18 Data Parallel Programming Model

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Data Parallelism is higher level than either message passing or object models (if objects used to break up data to respect computer)
It provides a Shared Memory Programming Model which can be executed on SIMD or MIMD computers, distributed or shared memory computers
Note it specifies problem not machine structure
It in principle provides the most attractive machine independent model for programmers as it reflects problem and not computer
Its disadvantage is that hard to build compilers especially for the most interesting new algorithms which are dynamic and irregular!

HTML version of Basic Foils prepared 17 Sept 1996

Foil 19 Parallelism in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Parallelism in HPF is expressed explicitly
  • Fortran 90 array expressions and assignments (including WHERE)
  • HPF Library and Array intrinsics
  • FORALL statement and construct
    • PURE labels procedures that can be used in FORALL as they have no "side-effects"
  • INDEPENDENT assertion on DO loops
Compiler may choose not to exploit information about parallelism
Compiler may detect parallelism in sequential code

HTML version of Basic Foils prepared 17 Sept 1996

Foil 20 Fortran77 is part of Fortran90

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
A=B or more interestingly
WHERE( B > 0. ) A = B
ELSEWHERE A=0.
END WHERE
can be written
DO I = n1,n2
DO J = m1,m2
IF(B(I,J) >0.) THEN A(I,J) = B(I,J)
ELSE A(i,J) = 0.
END IF
END DO
END DO
Now a good HPF compiler will recognize the DO loops can be parallelized and give the same answer for Fortran90 and Fortran77 forms but often the detection of parallelism is not clear
Note FORALL is guaranteed to be parallelizeable as by definition no side effects.

HTML version of Basic Foils prepared 17 Sept 1996

Foil 21 HPF Features

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
All of Fortran90
New instructions FORALL and INDEPENDENT enhancing DO loops
Data Alignment and Distribution Assertions
Miscellaneous Support Operations but
NO parallel Input/Output
Little Support for Irregular Computations
Little Support for any form of non mainstream data-parallelism
Extrinsics as supporting links with explicit message-passing

HTML version of Basic Foils prepared 17 Sept 1996

Foil 22 What gives high performance in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
There is tradeoff between parallelism and communication
Programmer defines the data mapping and compiler uses this to assign processing
Underlying assumptions are that:
An operation on two or more data object is likely to be carried out much faster if they all reside in the same processor,
And that it may be possible to carry out many such operations concurrently if they can be performed on different processors
This is embodied in "owner computes" rule -- namely that in for instance
  • A(i,j)= .....
  • One brings everything on right hand side to process "owning" A(i,j) and performs computation in this processor
Owner computes algorithm is usually good and often best

HTML version of Basic Foils prepared 17 Sept 1996

Foil 23 Compiler directives used in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
The directives are structured comments that suggest implementation strategies or assert facts about a program to the compiler
They may affect the efficiency of the computation performed, but do not change the value computed by the program
As in Fortran 90 statements, there are both:
  • declarative directives
  • executable directives

HTML version of Basic Foils prepared 17 Sept 1996

Foil 24 What does an HPF Compiler do?

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
It must generate Fortran77(90) + Message Passing code or possibly in one pass map HPF code onto parallel machine code
Traditional dataflow and dependency analysis is especially critical in Fortran77 parts of code
It must use data mapping assertions to decide what is stored where and so organize computation
Code must be transformed to respect this owner-computes model
It must typically use "Loosely Synchronous" model with communicate-compute phases and then compiler generates all the communication needed
  • Due to latency issues compiler must minimize communication needed and maximize size of packets sent
We need an excellent run-time library which the compiler invokes with parallel Intrinsics etc.

HTML version of Basic Foils prepared 17 Sept 1996

Foil 25 Syntax of HPF Directives

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
HPF directives are consistent with Fortran 90 syntax except for the special prefix for directive:
  • !HPF$ (only version allowed with free format)
  • CHPF$
  • *HPF$
Two forms of the directives are allowed
  • Specification statements, such as
  • !HPF$ DISTRIBUTE MYTEMPLATE(BLOCK) ONTO P
  • Equivalent Attributed form, such as,
  • !HPF$ DISTRIBUTE (BLOCK) ONTO P :: MYTEMPLATE

HTML version of Basic Foils prepared 17 Sept 1996

Foil 26 Data Mapping in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Data Mapping in HPF is all you need to do to get parallelism as long as you use the explicit array type syntax such as A=B+C
The Owner Computes rule implies that specifying location of variables specifies (optimally or not) parallel execution!
The new HPF-2 ON HOME directive is exception to this rule as specifies where a particular statement is to be executed
(RE)DISTRIBUTE tells you where data is to be placed
(RE)ALIGN tells you how different data structures are to be placed relative to each other

HTML version of Basic Foils prepared 17 Sept 1996

Foil 27 Staged Data Mapping in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index

HTML version of Basic Foils prepared 17 Sept 1996

Foil 28 Template in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
A Template is an abstract space of indexed positions (an "array of nothings")
In CMFortran terminology, Template is set of Virtual Processors -- one per data point
  • Template typically specifies precisely the full natural data parallelism that is natural for problem and HPF maps this problem parallelism onto particular machine
A template is declared by the TEMPLATE directive that specifies:
  • name of the template
  • the rank (i.e., number of dimensions)
  • the extent in each dimension
Examples:
  • CHPF$ TEMPLATE T(1000)
  • !HPF$ TEMPLATE FRED(N, 2*N)
  • *HPF$ TEMPLATE, DIMENSION(5,100,50) :: MINE, YOURS

HTML version of Basic Foils prepared 17 Sept 1996

Foil 29 Abstract Processors in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Abstract processors always form a rectilinear grid in 1 or more dimensions
They are abstract coarse grain collections of data-points
  • Remember efficiency says we must "block" communication into large chunks -- processors give us a general target for this
The processor arrangement is defined by the PROCESSORS directive that specifies:
  • name of the processor arrangement
  • the rank (i.e., number of dimensions)
  • the extend in each dimension
Examples:
  • !HPF$ PROCESSORS P(N)
  • *HPF$ PROCESSORS BIZARRO(1972:1997,-20:17)
  • CHPF$ PROCESSORS SCALARPROC (by default sequential)

HTML version of Basic Foils prepared 17 Sept 1996

Foil 30 Example of Template and Processors

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
!HPF$ PROCESSORS P(4)
!HPF$ TEMPLATE X(40)
!HPF$ ALIGN WITH X :: A, B, C
!HPF$ DISTRIBUTE X(BLOCK)
  • ...
  • C = A + B
  • ...

HTML version of Basic Foils prepared 17 Sept 1996

Foil 31 Align Directive in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Syntax of Align:
!HPF ALIGN alignee WITH align-target
  • where -- note [..] implies optional component
  • alignee: alignee [(align-source-list)]
  • align target: align-target[(align-subscript-list)]
Alternatively
*HPF ALIGN (align-source-list) WITH align-target :: alignee

HTML version of Basic Foils prepared 17 Sept 1996

Foil 32 Examples of Align Directive

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Note a colon(:) in directive denotes all values of array index
Examples of array indices:
  • CHPF$ ALIGN A(i) WITH B(i)
  • *HPF$ ALIGN (i,j) WITH TEMPL(i,j) :: A, B
Use of : examples:
  • !HPF$ ALIGN A(:) WITH B(:)
  • CHPF$ align (:,:) WITH TEMPL(:,:) :: A, B

HTML version of Basic Foils prepared 17 Sept 1996

Foil 33 Changing Rank in Align Directive

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Ranks of the alignee and the align-target may be different
Examples:
  • !HPF$ ALIGN A(:,j) WITH B(:)
  • CHPF$ ALIGN A(:,*) WITH B(:)

HTML version of Basic Foils prepared 17 Sept 1996

Foil 34 Replication in Align Directive

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
... or other way round
  • !HPF$ ALIGN A(:) WITH TEMPL(:,*)
while this only puts A on some parts of template...
!HPF$ ALIGN A(:) WITH TEMPL(:,i)

HTML version of Basic Foils prepared 17 Sept 1996

Foil 35 General Alignments in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
HPF allows for more general alignments such as:
  • REAL, DIMENSION(5,8) :: A,B
!HPF$ TEMPLATE T(12,12)
!HPF$ ALIGN A(:,J) WITH T(:,J+1)
!HPF$ ALIGN B(I,J) WITH T(I+4,J+4)
Useful for simple numerical shifts as in example but not useful
in general case of arbitary
index values allowed by
ALIGN syntax

HTML version of Basic Foils prepared 17 Sept 1996

Foil 36 Formal Definition of Align Directive

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Each align-dummy variable is considered to range over all valid index values for the corresponding dimension of the alignee. An align-subscript is evaluated for any specific combination of values for the align-dummy variables simply by evaluating each align-subscript as a expression. Their resulting subscript values must be legitimate subscripts for the align-target

HTML version of Basic Foils prepared 17 Sept 1996

Foil 37 More obscure Complicated Examples of Align Directive

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
These examples have non-unit stride as perhaps in "red-black" Iterative Solver algorithms:

HTML version of Basic Foils prepared 17 Sept 1996

Foil 38 Distribution Directive in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Syntax:
!HPF$ DISTRIBUTE distributee (dist-format)
[ONTO dist-target]
Allowed forms of dist-format:
  • * -- Implies no distribution in this index
  • BLOCK -- Critical to minimize communication
  • CYCLIC -- Critical for load balancing
  • BLOCK(int-expr) -- Not Obviously useful!
  • CYCLIC(int-expr) -- Very useful
Examples:
  • CHPF$ DISTRIBUTE TEMP(BLOCK,CYCLIC)
  • !HPF$ DISTRIBUTE FRED(BLOCK(10)) ONTO P
  • *HPF$ DISTRIBUTE (BLOCK,*) :: MYTEMPLATE

HTML version of Basic Foils prepared 17 Sept 1996

Foil 39 Basic Examples of Distribute Directive

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
!HPF$ PROCESSORS P(4)
  • REAL, DIMENSION(16) :: A
!HPF$ TEMPLATE T(16)
!HPF$ ALIGN A(:) WITH T(:)

HTML version of Basic Foils prepared 17 Sept 1996

Foil 40 Two Dimensional Example of Distribute Directive

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
*HPF PROCESSORS SQUARE(2,2)
*HPF TEMPLATE T(4,4)
*HPF ALIGN A(:,:) WITH T(:,:)
*HPF DISTRIBUTE T(BLOCK,CYCLIC)ONTO SQUARE

HTML version of Basic Foils prepared 17 Sept 1996

Foil 41 The Two Basic Distributions in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
We used BLOCK in the Laplace equation example and so this is appropriate distribution for "local" or geometric type problems
CYCLIC is called scattered in our early work (or is a special case of scattered which is perhaps random distribution of objects on processors) is appropriate in cases where "load-balancing" is more important than locality
  • Simplest examples are matrix inversion and graphics rendering problems
  • In solving equations (we will do later) Ax=b , there is no "nearest neighbor" structure between rows and columns, but rather one eliminates rows and columns and cyclic distribution ensures work remains balanced
  • In calculating pixels, work depends on complexity of picture at that pixel and so best to distribute pixels cyclically (or randomly) to processors.

HTML version of Basic Foils prepared 17 Sept 1996

Foil 42 The Example of Matrix Inversion

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Matrix Inversion set up on two processors after
0 2 and 4 rows/columns eliminated
Note BLOCK decomposition leads to all work being on one processor at end even if starts off balanced

HTML version of Basic Foils prepared 17 Sept 1996

Foil 43 Example of Graphics Rendering

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Here we show a 16 by 16 array of pixels with either CYCLIC or 8 by 8 two dimensional BLOCK,BLOCK

HTML version of Basic Foils prepared 17 Sept 1996

Foil 44 Example of Distribute Directive with Complex Alignment

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
CHPF$ PROCESSORS Q(4)
CHPF$ TEMPLATE FRED(16,16)
CHPF$ ALIGN A(:,:) WITH FRED(:,:)
CHPF$ ALIGN B(I,J) WITH FRED(I+2,J+2)
CHPF$ DISTRIBUTE FRED(BLOCK,*)

HTML version of Basic Foils prepared 17 Sept 1996

Foil 45 Dynamic Data Mapping

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
One data mapping is often not appropriate for an entire program
  • Often one has phases in which different distributions are needed in different phases
  • e.g. in 2D FFT, one typically finds FFT of F(I,J) by first distributing so for each J all I (x values) are in same processor and then transform so that for each I all J are in same processor
  • This ensures no communication in FFT phases which is important as typically in distributed one dimensional FFT there is substantial overhead
ALLOCATABLE arrays can change size
REALIGN and REDISTRIBUTE are executable DISTRIBUTE and ALIGN commands but are only to be used if one declares arrays on which they act DYNAMIC
Naturally DYNAMIC arrays can be initialized by ALIGN or DISTRIBUTE statements

HTML version of Basic Foils prepared 17 Sept 1996

Foil 46 Advanced Mapping Directives -- ReDistribution and ReAlign

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
This example illustrates remapping from one to two dimensional decomposition for A and changing B from alignment with columns to alignment with rows
  • REAL, DIMENSION(64,64) :: A
  • REAL, DIMENSION(64) :: B
!HPF$ PROCESSORS P(64)
!HPF$ PROCESSORS Q(8,8)
!HPF$ DYNAMIC :: A,B
!HPF$ ALIGN B(:) WITH A(:,*)
!HPF$ DISTRIBUTE A(*,BLOCK)ONTO P
  • ...
!HPF$ REALIGN B(:) WITH A(*,:)
  • ...
!HPF$ REDISTRIBUTE A(CYCLIC,CYCLIC) ONTO Q
  • ...

HTML version of Basic Foils prepared 17 Sept 1996

Foil 47 Advanced Mapping Directives -- Allocatable arrays and pointers

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
!HPF$ PROCESSORS Q(64)
!HPF$ ALIGN B(I) WITH A(I+N)
!HPF$ DISTRIBUTE A(BLOCK(M))
!HPF$ DISTRIBUTE(BLOCK), DYNAMIC :: P
  • ...
  • ALLOCATE(A(128))
  • ALLOCATE(B(64))
  • ALLOCATE(P(1024))
  • ...
!HPF$ REDISTRIBUTE P(CYCLIC)
  • ...
  • RETURN
  • END

HTML version of Basic Foils prepared 17 Sept 1996

Foil 48 Subprograms in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Scope of any mapping directives is a single (sub)program unit
A template or distribution is not a first-class Fortran 90 object:
It cannot be passed as a subprogram argument and this creates significant complication!
HPF Compiler will typically pass an extra argument which is effectively an array-descriptor telling subroutine about distribution of passed arrays
One can use array query intrinsics to find out what is going on but of course compiler does this implicitly

HTML version of Basic Foils prepared 17 Sept 1996

Foil 49 Passing Distributed Arrays as Subprogram Arguments in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
There are three typical cases:
Subroutine requires data to use a particular mapping determined by subroutine
  • Arguments must be remapped
Subroutine can use any mapping so actual argument should be passed and used with current mapping
  • Here we have two cases depending on whether programmer knows or not (and tells subroutine) what incoming distribution is
Sometimes we need to remap due to array sections being passed
Any remappings must be undone on return from subroutine

HTML version of Basic Foils prepared 17 Sept 1996

Foil 50 Mapping Options for Dummy (Subroutine) Arguments

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
DISTRIBUTE
  • use * instead of dist-format or ONTO clause indicates that incoming distribution is acceptable i.e. leave data in place
  • * before dist-format or ONTO clause indicates that data should stay in place and asserts that distribution is what you claim
ALIGN
  • * instead of or before target has similar meanings to DISTRIBUTE
INHERIT
  • A new attribute allowing references back to the original full array and used when sections of array are passed

HTML version of Basic Foils prepared 17 Sept 1996

Foil 51 Inherit Distribution Directive in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
(not a comprehensive discussion; just an example)

HTML version of Basic Foils prepared 17 Sept 1996

Foil 52 Summary of Mapping Directives in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
PROCESSORS
TEMPLATE
ALIGN
DISTRIBUTE
INHERIT
DYNAMIC
REALIGN
REDISTRIBUTE

HTML version of Basic Foils prepared 17 Sept 1996

Foil 53 Fundamental Parallelism Assumption in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
An operation on two or more data object is likely to be carried out much faster if they all reside in the same processor
  • i.e. minimize communication
it may be possible to carry out many such operations concurrently if they can be performed on different processors
  • data parallelism

HTML version of Basic Foils prepared 17 Sept 1996

Foil 54 Parallel statements and Constructs in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Parallel Statements
  • Fortran 90 array assignments
  • masked array assignments (WHERE)
  • FORALL statement
Parallel Constructs
  • WHERE and WHERE...ELSEWHERE construct
  • FORALL construct
  • INDEPENDENT DO
Intrinsic functions and the HPF library
Extrinsic functions

HTML version of Basic Foils prepared 17 Sept 1996

Foil 55 Parallelism in Fortran 90 array assignments

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
This is as in CMFortran and Maspar MPFortran with example:

HTML version of Basic Foils prepared 17 Sept 1996

Foil 56 WHERE (masked array assignment) in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
This is as in CMFortran and Maspar MPFortran with example:
  • WHERE (A .GT. 0) A = A - 100
Semantics of WHERE statement:
    • 1. evaluate mask (in parallel) and store as a temporary T1
    • 2. for each i that T1(i)=.TRUE. compute T2(i)=A(i) - 100
    • 3. for each i that T1(i)=.TRUE. assign A(i)=T2(1)

HTML version of Basic Foils prepared 17 Sept 1996

Foil 57 WHERE...ELSEWHERE / IF...ELSE constructs in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
There is a fundamental difference in semantics between IF...ELSE and WHERE...ELSEWHERE constructs

HTML version of Basic Foils prepared 17 Sept 1996

Foil 58 Intrinsic functions in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
elemental
  • examples:
    • A = SIN(X)
    • FORALL (i=1:100:2) A(i) = EXP(A(i))
transformational and inquiry functions
  • Fortran 90
    • SUM, PRODUCT, ANY, DOTPROD, EOSHIFT, MAXVAL, ...
  • HPF
    • system inquiry functions:
    • NUMBER_OF_PROCESSORS,
    • PROCESSORS_SHAPE
    • extensions of MAXLOC and MINLOC, ILEN
  • HPF library

HTML version of Basic Foils prepared 17 Sept 1996

Foil 59 HPF library functions

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
new array reduction functions
  • IALL, IANY, IPARITY and PARITY
  • (IAND, IOR, IEOR, and .NEQV.)
array combining scatter functions
  • XXX_SCATTER
array prefix and suffix functions
  • XXX_PREFIX, XXX_SUFFIX
array sorting functions
  • GRADE_DOWN, GRADE_UP
bit manipulation functions
  • LEADZ, POPCNT, POPPAR
mapping inquiry subroutines
  • HPF_ALIGNMENT, HPF_TEMPLATE, HPF_DISTRIBUTION

HTML version of Basic Foils prepared 17 Sept 1996

Foil 60 SUM, SUM_PREFIX and SUM_SCATTER defined

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
X=SUM(A) sums all elements of A and places result in scalar X
Y = SUM_PREFIX(A) sets array Y of same size as A so that Y(i) has the sum of all A(j) for 1 <= j <= i
Y = SUM_SCATTER(A,B, IND) sets array element Y(i) as the sum of array element B(i) plus those elements of A(j) where IND(j) = I
  • X = SUM_SCATTER( flux, X, INDEX) is equivalent to
  • FORALL (i=1:N)
    • X(INDEX(i)) = X(INDEX(i)) + flux(i)
  • END FORALL assuming INDEX(i) just permutes numbers 1 to N and has no repeated values

HTML version of Basic Foils prepared 17 Sept 1996

Foil 61 HPF Intrinsic EXAMPLE: SUM

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index

HTML version of Basic Foils prepared 17 Sept 1996

Foil 62 FORALL Statement in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
A very important extension to Fortran 90 and defines one class of parallel DO loop
FORALL will be a language feature of Fortran95
It relaxes the restriction that operands of the rhs expressions must be conformable with the lhs array
It may be masked with a scalar logical expression (extension of WHERE construct)
A FORALL statement may call user-defined (PURE) functions on the elements of an array, simulating Fortran 90 elemental function invocation (albeit with a different syntax)
FORALL( index-spec-list [,mask-expr] ) forall assignment
  • where forall-assignment is conventional single Fortran90 statement

HTML version of Basic Foils prepared 17 Sept 1996

Foil 63 Examples of FORALL statements in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
FORALL (i=1:100,k=1:100) a(i,k) = b(i,k) A = B
FORALL (i=2:100:2) a(i) = a(i-1) A(2:100:2) = A(1:99:2)
FORALL (i=1:100) a(i) = i A = [1..100]
FORALL (i=1:100, j=1:100) a(i, j) = i+j
FORALL (i=1,100) a(i,i) = b(i)
FORALL (i=1,100,j=1:100) a(i,j) = b(j,i)
FORALL (i=1,100) a(i, 1:100) = b(1:100, i)
FORALL (i=1:100, j=1:100, y(i,j).NE.0) x(i,j) = REAL(i+j)/y(i,j)
FORALL (i=1,100) a(i,ix(i)) = x(i)
FORALL (i=1,9) x(i) = SUM(x(1:10:i))
FORALL (i= 1,100) a(i) = myfunction(a(i+1))

HTML version of Basic Foils prepared 17 Sept 1996

Foil 64 Semantics of the FORALL statement in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Similar to Fortran 90 array assignments and WHERE
Consider example:

HTML version of Basic Foils prepared 17 Sept 1996

Foil 65 Vector Indices in FORALL's

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Consider FORALL( i=1:n ) a(ix(i)) = a(i)
is allowed in HPF but will only give sensible reproducible results if ix(i) is a true permutation of 1...n
  • but if ix(i) has repeated values, then the result is undefined
Of course we always use "old" value" of a(i) on rhs so that if ix(i)= i+1 and a(0) defined, then result is
  • a(i)/new = a(i-1)/old
  • and not result of recursion
  • a(1)/new =a(0)/old
  • a(2)/new = a(1)/new just calculated = a(0)/old

HTML version of Basic Foils prepared 17 Sept 1996

Foil 66 Multiple Statement FORALL's

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
FORALL( index-spec-list [,mask-expr] )
forall-body
END FORALL
where forall-body can be a list of forall-assignment statements, FORALL or WHERE statements
So Multi-Statement FORALL's support nesting of FORALL's but
is in general Shorthand for a sequence of single statement FORALL's with by definition each statement completed before next one begins
The multi-statement FORALL is likely to be more efficient than several single statement ones as latter have synchronization overhead on each statement

HTML version of Basic Foils prepared 17 Sept 1996

Foil 67 HPF FORALL construct Pictorially

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index

HTML version of Basic Foils prepared 17 Sept 1996

Foil 68 PURE Functions in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
PURE functions have no side effects
  • ALL Intrinsics are PURE
DO loops can call any functions and parallelism unclear as function call can destroy parallelism
  • DO I=1,1000
    • A(I) = FUNC(A(I-1),X)
  • END DO
If FUNC alters A(I-1) or in fact A(any index except I), then this loop cannot be easily parallelized
FORALL statements can only call PURE functions and these must NOT define any global (e.g. any element of A in example) or dummy (A(i-1) or X) variable
  • PURE functions can only INHERIT distribution and alignment statements

HTML version of Basic Foils prepared 17 Sept 1996

Foil 69 Example of PURE Function from Chuck Koelbel

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
FORALL( i=1:n, j=1:m )
  • k(i,j) = mandelbrot ( CMPLX((i-1)*1.0/(n-1), (j-1)*1.0/(m-1)), 1000)
END FORALL
This can call the PURE function mandelbrot which is essentially a generalized intrinsic
PURE INTEGER FUNCTION mandelbrot (x,itol)
  • COMPLEX, INTENT(IN) :: x
  • INTEGER, INTENT(IN) :: itol
  • COMPLEX xtmp
  • INTEGER k
    • k=0
    • xtmp = -x
    • DO WHILE( ABS(xtmp) < 2. .AND. k < itol )
    • xtmp = xtmp*xtmp - x
    • k = k + 1
    • END DO
    • mandelbrot = k
END FUNCTION mandelbrot

HTML version of Basic Foils prepared 17 Sept 1996

Foil 70 The INDEPENDENT Assertion in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
!HPF$ INDEPENDENT [ ,NEW (variable-list) ]
INDEPENDENT asserts that no iteration affects any other in any way
It implements the "embarassingly parallel" problem class we discussed under structure of problems
Note rest of HPF tackles mainly the synchronous problem class with some loosely synchronous capability
  • HPF2 has "tasking" for metaproblem class and some extensions for further irregular loosely synchronous problems
NEW variables are defined to have fresh instantiations for each iteration as is typically needed for embarassingly parallel problems where in fact essentially all variables in a loop would be NEW
Note INDEPENDENT can be applied to FORALL and asserts that no index point assigns to any location that another iteration index value uses
  • This reduces copying needed in FORALL by COMPILER
HPF2 (see later) has extra feature of allowing REDUCTION (accumulated) variables in INDEPENDENT DO loops

HTML version of Basic Foils prepared 17 Sept 1996

Foil 71 !HPF$ INDEPENDENT FORALL Pictorially

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index

HTML version of Basic Foils prepared 17 Sept 1996

Foil 72 !HPF$ INDEPENDENT DO Pictorially

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index

HTML version of Basic Foils prepared 17 Sept 1996

Foil 73 !HPF$ INDEPENDENT, NEW Variable

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
This is an exception from the conventional HPF picture of a global name space with either distributed or replicated variables

HTML version of Basic Foils prepared 17 Sept 1996

Foil 74 Extrinsics in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
An extrinsic function is a function written in a language other than HPF including most naturally any node programming language (e.g. Fortran77) targeted to a single processor SPMD) with message passing such as MPI
HPF defines (Fortran90) interface and invocation characteristics
  • This defines what is input and output via INTENT and these rules must be obeyed by callee (non HPF side)
  • If variables are implicitly replicated, the callee must make them consistent before reurn i.e. callee must respect HPF model of parallelism
  • HPF will execute any remapping commands and hands callee the "right" part of any distributed arrays
  • All processors are synchronized before call to EXTRINSIC function
Allows one to get efficient parallel code where HPF language or compiler inadequate
  • Analogous to calling assembly language from Fortran, Native classes from Java, C from PERL etc.

HTML version of Basic Foils prepared 17 Sept 1996

Foil 75 High Performance Fortran HPF2 Changes

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
The original HPF 1.0 omitted some key capabilities which were known to be important but syntax and functionality was unclear in 1993
  • Further experience has shown that HPF compilers have proven to be difficult to write!
The HPF Forum met in 1995-96 and has approved a set of Extensions and Simplifications of HPF
The concept of a base HPF 2.0 and Approved Extensions has been agreed
Note approved extensions (which presumably vendors need not implement) include critical capability for dynamic irregular problems
DYNAMIC REALIGN and REDISTRIBUTE are no longer in base language and are just "appproved extensions"
Suprisingly no parallel I/O capabilities were approved!

HTML version of Basic Foils prepared 17 Sept 1996

Foil 76 ON HOME for Computation Placement

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
!HPF$ INDEPENDENT
DO i = 1 , n
  • !HPF$ ON HOME( ix(i) )
  • x(i) = y(ix(i)) - y(iy(i))
END DO
This modifies the owner computes rule by specifying that computation will not be performed on processor owning left hand side
ON HOME is an approved extension

HTML version of Basic Foils prepared 17 Sept 1996

Foil 77 Reductions in INDEPENDENT DO Loops

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Many applications of INDEPENDENT DO loops do require reductions as they are typically calculating independently quantities but storing results as parts of various averages
e.g. in High Enegry Physics Data Analysis, each measured event can be computed via an INDEPENDENT DO but one wishes to find a particular observable (histogram, scatterplot) which is averaged over each event
Financial modelling is similar
x = 0
!HPF$ INDEPENDENT, NEW(xinc), REDUCTION(x)
do i = 1 , N
  • call sub(i, xinc)
  • x = x + xinc
END DO
xinc is a separate new variable each iteration but result is accumulated into global x

HTML version of Basic Foils prepared 17 Sept 1996

Foil 78 Spawning Tasks in HPF

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Task Parallelism is sort of supported in HPF but not clear to me that this is a great idea as better to keep sophisticated task parallelism outside HPF which is really only designed to support data parallelism
!HPF$ TASKING
  • !HPF$ ON HOME(p(1:8))
  • CALL foo(x,y)
  • !HPF$ ON HOME(p(9:16))
  • CALL bar(z)
!HPF$ END
This extends SPMD model with foo running on eight and bar on another processors
Note foo and bar are expected to contain data parallel statements which distribute execution using conventional HPF over 8 processors

HTML version of Basic Foils prepared 17 Sept 1996

Foil 79 New Data Mapping Features in HPF 2.0 - I

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
!HPF DISTRIBUTE x( BLOCK(SHADOW = 1 ) )
  • is designed to help compiler by specifying that it should set up a guard ring each side of BLOCK in each processor. In example Guard ring has extent 1
  • If x has dimension 100 and 4 processors, we allocate storage for
  • x(1..26) in Processor 1
  • x(24..51) in Processor 2
  • x(49..76) in Processor 3
  • x(74..100) in Processor 4
!HPF DISTRIBUTE x( BLOCK( /26,24,24,26/ ) )
  • is designed to specify general BLOCK distribution with in above example
  • x(1..26) in Processor 1
  • x(26..50) in Processor 2
  • x(51..76) in Processor 3
  • x(77..100) in Processor 4
!HPF DISTRIBUTE x( INDIRECT(map_array) )
  • is designed to allow an arbitary user array to specify location of x(i)
  • the processor number in map_array(i) would be typically be calculated by your favorite load balancing routine!

HTML version of Basic Foils prepared 17 Sept 1996

Foil 80 New Data Mapping Features in HPF 2.0 - II

From New CPS615HPF and Fortran90 Discussion Sept 17 96 Basic Simulation Track for Computational Science CPS615 -- Fall Semester 96. *
Full HTML Index
Distribution is now allowed to Processor Subsets with typical Syntax:
  • !HPF$ PROCESSORS procs(1:np)
  • !HPF$ PROCESSORS b(BLOCK) ONTO procs(1:np/2-1)
Distribution is allowed for Derived Types but can only be done at ONE level
  • TYPE bunch_of_meshes
    • REAL A1(100,100,100), A2(100,100,100)
    • !HPF$ DISTRIBUTE (BLOCK,CYCLIC,*) :: A1,A2
  • END TYPE
  • Another Example shows distribution outside TYPE definition
  • TYPE tree with each node having one hundred children
    • TYPE(tree) , POINTER, DIMENSION(100) :: children
    • !HPF$ DYNAMIC children
    • REAL value
  • END TYPE
  • TYPE(tree) actualtree
  • ALLOCATE ( actualtree%children) allocates POINTERs to children
  • !HPF$ REDISTRIBUTE t%children(BLOCK) distributes (for first time) these pointers

© on Tue Oct 7 1997