Given by Geoffrey C. Fox, Tom Haupt at Basic Simulation Track for Computational Science CPS615 on Fall Semester 96. Foils prepared 17 Sept 1996
Summary of Material
A brief discussion of Fortran90 and Fortran77 and why Fortran90 has advantages and disadvantages |
Overview of Key Features of Fortran90 |
See Metcalf and Reid, Fortran90 Explained, Oxford Scientific Publications |
Overview of Key Features of HPF |
The Future -- HPF2 |
See Chuck Koelbel from Rice University at |
http://renoir.csc.ncsu.edu/MRA/HTML/Workshop2/Koelbel |
Geoffrey Fox, Tom Haupt |
NPAC |
Room 3-131 CST |
111 College Place |
Syracuse NY 13244-4100 |
To express data parallelism and so hide machine dependent features of parallel programming from the user |
We use Fortran90 as the base language both because its array notation naturally expresses data parallelism and because it is a modern improvement of Fortran77, the dominant language of scientific computing |
Use of Fortran90 is a Problem because it is a large and complex language for which (in 1996) mature compilers are scarce |
DO 1 I = 1,N
1     A(I) = B(I)
is obviously parallel |
Fortran90 Array Notation |
DO 1 I = 1,N
1     A(J) = B(K)
(with J and K computed from I) is not so obviously parallel |
Deciding on the existence and implementation of parallelism needs an algorithm that is difficult to define in the general case (especially if IF statements, i.e. conditionals, are involved in defining J and K) |
Use of Fortran77 has "thrown away" the natural parallelism at the language level, even though the run time restores it by creating explicit values for variables such as J and K, which at compile time are only known by analysis. |
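A minimal sketch of the contrast (assuming conformable REAL arrays A and B): in Fortran90 the parallelism is explicit in the array syntax, while in the equivalent Fortran77 loop the compiler must rediscover it.
      REAL A(100), B(100)
! Fortran90: one array assignment, explicitly parallel
      A = B
! Fortran77: the same operation as an indexed loop
      DO 1 I = 1, 100
1        A(I) = B(I)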
Arrays are very well supported with memory allocation and set of intrinsics, better passing to procedures etc. |
Derived Types allow general object structure (without inheritance) in F90 |
Modules replace COMMON INCLUDE etc. |
Procedures (functions,subroutines) allow better interfaces, recursion, optional parameters etc. |
Better Syntax with free form, more loop control etc. |
Arrays are "true" objects in Fortran90 and are stored as values of elements plus a data descriptor! |
There are operations on full arrays which have natural parallel implementations seen in HPF |
New set of array intrinsic (built-in) functions for elements, reductions (process an array into a single value), transformational (reshaping) |
Extract sections (subarrays) of arrays as u(lb:ub:step) |
masked (conditional) Array operations using WHERE .... ELSEWHERE |
Can still do Fortran77 array element operations (DO loops), but of course these might not be interpretable for efficient parallelism by the HPF compiler |
Note Fortran90 was designed for science and engineering, originally with special concern for vector supercomputers, yet Cray supports F77 better than F90(!) |
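A sketch of the masked array operations just mentioned (u and v assumed to be conformable REAL arrays):
      WHERE ( u /= 0. )
         v = 1./u          ! executed only where the mask is true
      ELSEWHERE
         v = 0.            ! executed on the remaining elements
      END WHERE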
ALLOCATABLE Arrays can be defined at runtime with variable sizing |
One can define POINTER and TARGET attributes which can be used like REAL, DIMENSION etc. |
Arguments of a subroutine need NOT define array dimensions in the subroutine, as these are passed by the calling program in the data descriptor |
Local arrays are created on the stack; bounds may be non-constant and are evaluated at procedure entry |
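A minimal sketch of run-time sizing (the array name work and the sizes n, m are illustrative):
      REAL, ALLOCATABLE :: work(:,:)
      INTEGER :: n, m
      n = 64; m = 128              ! sizes known only at run time
      ALLOCATE( work(n,m) )        ! storage created here
      work = 0.                    ! full-array operations apply as usual
      DEALLOCATE( work )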
One passes "assumed-shape" arrays from the calling routine to the called routine using the INTERFACE syntax |
INTERFACE
   SUBROUTINE residual( r, u, f )
      REAL :: r(:,:), u(:,:), f(:,:)     ! assumed-shape dummy arguments
   END SUBROUTINE residual
END INTERFACE
is called by |
call residual (r,u,f) or |
call residual ( r(0:nx:2, 0:ny:2) , u(0:nx:2, 0:ny:2) , f(0:nx:2, 0:ny:2) ) |
where the latter example just processes every other element of the arrays |
REAL u(0:nx,0:ny), A(100,100) , fact , avg |
u = fact * (u - avg) scales and translates all elements of u |
u = .25*( CSHIFT(u,1,1) + CSHIFT(u,-1,1) + CSHIFT(u,1,2) + CSHIFT(u,-1,2) ) |
calculates the average of the 4 array elements surrounding each point (the result is an array, so it is assigned to the array u). Note the third argument of CSHIFT is the label for the axis (1=x, 2=y) |
SQRT( A(1:100,1) ) calculates a new array containing 100 square roots |
SUM(A) is a reduction operator summing all elements of array A into a scalar |
SIZE(A,1) is an Array Query Intrinsic giving size of A in the first dimension and is particularly useful for "assumed-shape" arrays passed into subroutines |
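For example, a sketch of SIZE at work inside a routine receiving an assumed-shape array (the routine name is illustrative):
      SUBROUTINE report_shape( u )
         REAL :: u(:,:)                     ! assumed-shape dummy argument
         PRINT *, SIZE(u,1), SIZE(u,2)      ! extents come from the data descriptor
      END SUBROUTINE report_shape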
TYPE PERSON
   CHARACTER(LEN=10) :: NAME
   INTEGER :: AGE
   INTEGER :: ID          ! component order matches the constructor example below
END TYPE PERSON |
TYPE(PERSON) YOU,ME |
The Identification number of YOU would be accessed as YOU%ID as an ordinary integer |
One can define global operators so that YOU+ME could be defined |
One can use name of derived type as a constructor |
YOU = PERSON ('Pamela Fox', 12, 3) |
One can define a linked list as: |
TYPE ENTRY
   REAL :: VALUE
   INTEGER :: INDEX
   TYPE(ENTRY), POINTER :: NEXT      ! points to the next entry in the list
END TYPE ENTRY |
ALLOCATE dynamically creates elements of a linked list |
ALLOCATE( CURRENT ) |
CURRENT = ENTRY( NEW_VALUE, NEW_INDEX, FIRST ) |
FIRST => CURRENT |
adds a new entry at the start of the linked list and makes the POINTER FIRST point to it |
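A sketch of traversing such a list (assuming the NEXT component above, with the final NEXT nullified):
      CURRENT => FIRST
      DO WHILE ( ASSOCIATED(CURRENT) )
         PRINT *, CURRENT%VALUE, CURRENT%INDEX
         CURRENT => CURRENT%NEXT       ! step to the next entry
      END DO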
General Syntax is: |
MODULE name
   ! specification statements (types, variables, interfaces)
CONTAINS          ! this part is optional
   ! definitions of module procedures
END MODULE name |
MODULE IllustratingCommonBlock
   ! variables declared here are shared by every routine that USEs the module
END MODULE IllustratingCommonBlock |
replaces COMMON construct and can be used as |
USE IllustratingCommonBlock |
MODULE INTERVAL_ARITHMETIC
   TYPE INTERVAL
      REAL :: LOWER, UPPER
   END TYPE INTERVAL
   INTERFACE OPERATOR(+)
      MODULE PROCEDURE ADD_INTERVALS
   END INTERFACE
CONTAINS
   ! FUNCTION ADD_INTERVALS(A,B) returns the interval sum of A and B
END MODULE INTERVAL_ARITHMETIC |
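Usage is then (a sketch; the variable names are illustrative):
      USE INTERVAL_ARITHMETIC
      TYPE(INTERVAL) :: A, B, C
      C = A + B          ! resolved to ADD_INTERVALS via OPERATOR(+)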
What is HPF, what we need it for, where it came from |
How does HPF Get its Parallelism |
Why is it called "High Performance"? |
What are HPF compiler directives |
Data mapping in HPF |
Parallel statements and constructs in HPF |
Latest Discussions -- HPF-2 |
Rice has taken the lead in the HPF Forum, which is a much faster mechanism for reaching agreement than the formal ten-year process which Fortran90 suffered |
World Wide Web pages at Rice and Vienna |
Mailing List is majordomo@cs.rice.edu and choose list (hpff, hpff-interpret, hpff-core) you wish to subscribe to |
Anonymous FTP to titan.cs.rice.edu and look at |
Explicit Message Passing as in PVM or MPI |
The user breaks the program into parts, and the parts send messages between themselves to implement the communication necessary for synchronization and for integration of the parts into the solution of a single program |
This matches the hardware but is not particularly natural for the problem and can be machine dependent |
Object Oriented programming is like message passing but now objects and not programs communicate |
Data Parallelism is higher level than either message passing or object models (where objects are used to break up data to respect the computer) |
It provides a Shared Memory Programming Model which can be executed on SIMD or MIMD computers, distributed or shared memory computers |
Note it specifies problem not machine structure |
It in principle provides the most attractive machine independent model for programmers as it reflects problem and not computer |
Its disadvantage is that it is hard to build compilers, especially for the most interesting new algorithms, which are dynamic and irregular! |
Parallelism in HPF is expressed explicitly |
Compiler may choose not to exploit information about parallelism |
Compiler may detect parallelism in sequential code |
A=B or more interestingly |
WHERE( B > 0. ) |
   A = B |
ELSEWHERE |
   A = 0. |
END WHERE |
can be written |
DO I = n1,n2 |
   DO J = m1,m2 |
      IF( B(I,J) > 0. ) THEN |
         A(I,J) = B(I,J) |
      ELSE |
         A(I,J) = 0. |
      END IF |
   END DO |
END DO |
Now a good HPF compiler will recognize that the DO loops can be parallelized and will give the same answer for the Fortran90 and Fortran77 forms, but often the detection of parallelism is not so clear |
Note FORALL is guaranteed to be parallelizable as, by definition, it has no side effects. |
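The same masked computation can be written as a pair of masked FORALLs, a form guaranteed parallelizable (a sketch using the loop bounds above):
      FORALL ( i = n1:n2, j = m1:m2, B(i,j) > 0. )  A(i,j) = B(i,j)
      FORALL ( i = n1:n2, j = m1:m2, B(i,j) <= 0. ) A(i,j) = 0.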
All of Fortran90 |
New instructions FORALL and INDEPENDENT enhancing DO loops |
Data Alignment and Distribution Assertions |
Miscellaneous Support Operations but |
NO parallel Input/Output |
Little Support for Irregular Computations |
Little Support for any form of non mainstream data-parallelism |
Extrinsics, supporting links with explicit message-passing |
There is tradeoff between parallelism and communication |
Programmer defines the data mapping and compiler uses this to assign processing |
Underlying assumptions are that: |
An operation on two or more data objects is likely to be carried out much faster if they all reside in the same processor, |
And that it may be possible to carry out many such operations concurrently if they can be performed on different processors |
This is embodied in the "owner computes" rule -- namely that, for instance, in A(i) = B(i) + C(i) the calculation is performed by the processor where the left-hand side A(i) resides |
Owner computes algorithm is usually good and often best |
The directives are structured comments that suggest implementation strategies or assert facts about a program to the compiler |
They may affect the efficiency of the computation performed, but do not change the value computed by the program |
As with Fortran 90 statements, there are both declarative forms (e.g. DISTRIBUTE) and executable forms (e.g. REDISTRIBUTE) |
It must generate Fortran77(90) plus message-passing code, or possibly map HPF code onto parallel machine code in one pass |
Traditional dataflow and dependency analysis is especially critical in Fortran77 parts of code |
It must use data mapping assertions to decide what is stored where and so organize computation |
Code must be transformed to respect this owner-computes model |
It must typically use the "Loosely Synchronous" model with communicate-compute phases, with the compiler generating all the communication needed |
We need an excellent run-time library which the compiler invokes with parallel Intrinsics etc. |
HPF directives are consistent with Fortran 90 syntax except for the special prefix for a directive: !HPF$ (or CHPF$ / *HPF$ in fixed source form) |
Two forms of the directives are allowed: the statement form, as in !HPF$ DISTRIBUTE A(BLOCK), and the attributed form, as in !HPF$ DISTRIBUTE(BLOCK), DYNAMIC :: A |
Data Mapping in HPF is all you need to do to get parallelism as long as you use the explicit array type syntax such as A=B+C |
The Owner Computes rule implies that specifying location of variables specifies (optimally or not) parallel execution! |
The new HPF-2 ON HOME directive is an exception to this rule as it specifies where a particular statement is to be executed |
(RE)DISTRIBUTE tells you where data is to be placed |
(RE)ALIGN tells you how different data structures are to be placed relative to each other |
A Template is an abstract space of indexed positions (an "array of nothings") |
In CMFortran terminology, a Template is a set of Virtual Processors -- one per data point |
A template is declared by the TEMPLATE directive, which specifies its name, its rank (number of dimensions) and its extent in each dimension |
Examples: |
!HPF$ TEMPLATE T(100) |
!HPF$ TEMPLATE FRED(16,16) |
Abstract processors always form a rectilinear grid in 1 or more dimensions |
They are abstract coarse grain collections of data-points |
The processor arrangement is defined by the PROCESSORS directive, which specifies its name, its rank and its extent in each dimension |
Examples: |
!HPF$ PROCESSORS P(4) |
!HPF$ PROCESSORS SQUARE(2,2) |
!HPF$ PROCESSORS P(4) |
!HPF$ TEMPLATE X(40) |
!HPF$ ALIGN WITH X :: A, B, C |
!HPF$ DISTRIBUTE X(BLOCK) |
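With this mapping, BLOCK gives each abstract processor one contiguous quarter of the template; a worked illustration:
! X(1:10)  live on P(1),  X(11:20) on P(2)
! X(21:30) live on P(3),  X(31:40) on P(4)
! A, B and C follow X because of the ALIGN directive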
Syntax of Align: |
!HPF$ ALIGN alignee WITH align-target |
Alternatively |
*HPF$ ALIGN (align-source-list) WITH align-target :: alignee |
Note a colon(:) in directive denotes all values of array index |
Examples of array indices: align-dummy variables and expressions in them, as in B(I,J) or T(I+4,J+4) below |
Use of : examples: A(:) denotes all of the one-dimensional array A, and TEMPL(:,i) denotes column i of a two-dimensional template |
Ranks of the alignee and the align-target may be different |
Examples: |
!HPF$ ALIGN A(:) WITH TEMPL(:,*) |
replicates the one-dimensional array A across all columns of the two-dimensional template TEMPL ... |
... or the other way round: a * in the alignee collapses a dimension, so that its index does not affect the mapping |
while this only puts A on some parts of template... |
!HPF$ ALIGN A(:) WITH TEMPL(:,i) |
HPF allows for more general alignments such as: |
!HPF$ TEMPLATE T(12,12) |
!HPF$ ALIGN A(:,J) WITH T(:,J+1) |
!HPF$ ALIGN B(I,J) WITH T(I+4,J+4) |
Useful for simple numerical shifts as in the example, but not useful in the general case of arbitrary index values allowed by the ALIGN syntax |
Each align-dummy variable is considered to range over all valid index values for the corresponding dimension of the alignee. An align-subscript is evaluated for any specific combination of values of the align-dummy variables simply by evaluating each align-subscript as an expression. The resulting subscript values must be legitimate subscripts for the align-target |
These examples have non-unit stride as perhaps in "red-black" Iterative Solver algorithms: |
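For instance (an illustrative alignment, assuming a one-dimensional array A and template T):
!HPF$ ALIGN A(I) WITH T(2*I-1)    ! A occupies the odd ("red") positions of T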
Syntax: |
!HPF$ DISTRIBUTE distributee (dist-format) |
[ONTO dist-target] |
Allowed forms of dist-format: BLOCK, BLOCK(n), CYCLIC, CYCLIC(n), and * (meaning that dimension is not distributed) |
Examples: |
!HPF$ PROCESSORS P(4) |
!HPF$ TEMPLATE T(16) |
!HPF$ ALIGN A(:) WITH T(:) |
!HPF$ DISTRIBUTE T(BLOCK) ONTO P |
*HPF$ PROCESSORS SQUARE(2,2) |
*HPF$ TEMPLATE T(4,4) |
*HPF$ ALIGN A(:,:) WITH T(:,:) |
*HPF$ DISTRIBUTE T(BLOCK,CYCLIC) ONTO SQUARE |
We used BLOCK in the Laplace equation example, and so this is the appropriate distribution for "local" or geometric type problems |
CYCLIC was called scattered in our early work (or rather is a special case of scattered, which is perhaps a random distribution of objects onto processors); it is appropriate in cases where "load-balancing" is more important than locality |
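A worked contrast for a one-dimensional array x(8) on 2 processors (sizes chosen purely for illustration):
! BLOCK :  P(1) holds x(1:4)                P(2) holds x(5:8)
! CYCLIC:  P(1) holds x(1),x(3),x(5),x(7)   P(2) holds x(2),x(4),x(6),x(8)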
Matrix Inversion set up on two processors after 0, 2 and 4 rows/columns have been eliminated |
Note the BLOCK decomposition leads to all work being on one processor at the end, even if it starts off balanced |
Here we show a 16 by 16 array of pixels with either CYCLIC or an 8 by 8 two-dimensional (BLOCK,BLOCK) distribution |
CHPF$ PROCESSORS Q(4) |
CHPF$ TEMPLATE FRED(16,16) |
CHPF$ ALIGN A(:,:) WITH FRED(:,:) |
CHPF$ ALIGN B(I,J) WITH FRED(I+2,J+2) |
CHPF$ DISTRIBUTE FRED(BLOCK,*) |
One data mapping is often not appropriate for an entire program |
ALLOCATABLE arrays can change size |
REALIGN and REDISTRIBUTE are executable versions of the ALIGN and DISTRIBUTE commands, but are only to be used if one declares the arrays on which they act DYNAMIC |
Naturally DYNAMIC arrays can be initialized by ALIGN or DISTRIBUTE statements |
This example illustrates remapping from a one- to a two-dimensional decomposition for A, and changing B from alignment with columns to alignment with rows |
!HPF$ PROCESSORS P(64) |
!HPF$ PROCESSORS Q(8,8) |
!HPF$ DYNAMIC :: A,B |
!HPF$ ALIGN B(:) WITH A(:,*) |
!HPF$ DISTRIBUTE A(*,BLOCK) ONTO P |
!HPF$ REALIGN B(:) WITH A(*,:) |
!HPF$ REDISTRIBUTE A(CYCLIC,CYCLIC) ONTO Q |
!HPF$ PROCESSORS Q(64) |
!HPF$ ALIGN B(I) WITH A(I+N) |
!HPF$ DISTRIBUTE A(BLOCK(M)) |
!HPF$ DISTRIBUTE(BLOCK), DYNAMIC :: P |
!HPF$ REDISTRIBUTE P(CYCLIC) |
Scope of any mapping directives is a single (sub)program unit |
A template or distribution is not a first-class Fortran 90 object: |
It cannot be passed as a subprogram argument and this creates significant complication! |
HPF Compiler will typically pass an extra argument which is effectively an array-descriptor telling subroutine about distribution of passed arrays |
One can use the array query intrinsics to find out what is going on, but of course the compiler does this implicitly |
There are three typical cases: |
Subroutine requires data to use a particular mapping determined by the subroutine |
Subroutine can use any mapping, so the actual argument should be passed and used with its current mapping |
Sometimes we need to remap due to array sections being passed |
Any remappings must be undone on return from subroutine |
DISTRIBUTE -- prescribes a particular distribution for the dummy argument |
ALIGN -- prescribes the mapping of the dummy argument relative to another object |
INHERIT -- the dummy argument simply keeps (inherits) the mapping of the actual argument |
(not a comprehensive discussion; just an example) |
PROCESSORS |
TEMPLATE |
ALIGN |
DISTRIBUTE |
INHERIT |
DYNAMIC |
REALIGN |
REDISTRIBUTE |
An operation on two or more data objects is likely to be carried out much faster if they all reside in the same processor |
and it may be possible to carry out many such operations concurrently if they can be performed on different processors |
Parallel Statements |
Parallel Constructs |
Intrinsic functions and the HPF library |
Extrinsic functions |
This is as in CMFortran and Maspar MPFortran, with example: |
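For instance (an illustrative full-array statement of the kind meant here, cf. A=B+C earlier):
      A = B + C      ! all elements computed independently, hence in parallel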
Semantics of WHERE statement: the mask is evaluated first, and each array assignment is then carried out only for the elements where the mask is true |
There is a fundamental difference in semantics between IF...ELSE and WHERE...ELSEWHERE constructs |
elemental |
transformational and inquiry functions |
new array reduction functions |
array combining scatter functions |
array prefix and suffix functions |
array sorting functions |
bit manipulation functions |
mapping inquiry subroutines |
X=SUM(A) sums all elements of A and places result in scalar X |
Y = SUM_PREFIX(A) sets array Y of same size as A so that Y(i) has the sum of all A(j) for 1 <= j <= i |
Y = SUM_SCATTER(A, B, IND) sets array element Y(i) to the sum of B(i) plus those elements A(j) for which IND(j) = i |
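A worked instance with small illustrative arrays:
! A = (/1,2,3,4/), IND = (/1,2,1,2/), B = (/10,20/)
! Y(1) = B(1) + A(1) + A(3) = 10 + 1 + 3 = 14
! Y(2) = B(2) + A(2) + A(4) = 20 + 2 + 4 = 26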
A very important extension to Fortran 90 and defines one class of parallel DO loop |
FORALL will be a language feature of Fortran95 |
It relaxes the restriction that operands of the rhs expressions must be conformable with the lhs array |
It may be masked with a scalar logical expression (extension of WHERE construct) |
A FORALL statement may call user-defined (PURE) functions on the elements of an array, simulating Fortran 90 elemental function invocation (albeit with a different syntax) |
FORALL( index-spec-list [,mask-expr] ) forall-assignment |
FORALL (i=1:100, k=1:100) a(i,k) = b(i,k) is equivalent to A = B |
FORALL (i=2:100:2) a(i) = a(i-1) is equivalent to A(2:100:2) = A(1:99:2) |
FORALL (i=1:100) a(i) = i is equivalent to A = (/ (i, i=1,100) /) |
FORALL (i=1:100, j=1:100) a(i, j) = i+j |
FORALL (i=1:100) a(i,i) = b(i) |
FORALL (i=1:100, j=1:100) a(i,j) = b(j,i) |
FORALL (i=1:100) a(i, 1:100) = b(1:100, i) |
FORALL (i=1:100, j=1:100, y(i,j).NE.0) x(i,j) = REAL(i+j)/y(i,j) |
FORALL (i=1:100) a(i,ix(i)) = x(i) |
FORALL (i=1:9) x(i) = SUM(x(1:10:i)) |
FORALL (i=1:100) a(i) = myfunction(a(i+1)) |
Similar to Fortran 90 array assignments and WHERE |
Consider the example FORALL( i=1:n ) a(ix(i)) = a(i) |
This is allowed in HPF but will only give sensible reproducible results if ix(i) is a true permutation of 1...n |
Of course we always use the "old" values of a(i) on the rhs, so that if for instance ix(i) = i+1 (and a(0) is defined), the result is a uniform shift of the old values rather than a sequential recurrence |
FORALL( index-spec-list [,mask-expr] ) |
forall-body |
END FORALL |
where forall-body can be a list of forall-assignment statements, FORALL or WHERE statements |
So multi-statement FORALLs support nesting of FORALLs, but a multi-statement FORALL is in general shorthand for a sequence of single-statement FORALLs, with by definition each statement completed before the next one begins |
The multi-statement FORALL is likely to be more efficient than several single-statement ones, as the latter have a synchronization overhead on each statement |
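A sketch of a multi-statement FORALL (arrays a and b illustrative; the first statement completes over all (i,j) before the second begins):
      FORALL ( i=2:n-1, j=2:n-1 )
         a(i,j) = 0.25*( a(i-1,j) + a(i+1,j) + a(i,j-1) + a(i,j+1) )
         b(i,j) = a(i,j)          ! sees the a values updated by the previous statement
      END FORALL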
PURE functions have no side effects |
DO loops can call any function, and the parallelism is then unclear, as a function call can destroy parallelism |
If FUNC alters A(I-1) or in fact A(any index except I), then this loop cannot be easily parallelized |
FORALL statements can only call PURE functions, and these must NOT define any global (e.g. any element of A in the example) or dummy (A(i-1) or X) variable |
FORALL( i=1:n, j=1:m ) |
   k(i,j) = mandelbrot( CMPLX(x(i),y(j)), itol )   ! illustrative body: arrays k, x, y assumed |
END FORALL |
This can call the PURE function mandelbrot which is essentially a generalized intrinsic |
PURE INTEGER FUNCTION mandelbrot (x,itol) |
   COMPLEX, INTENT(IN) :: x |
   INTEGER, INTENT(IN) :: itol |
   ! body: iterate z = z*z + x up to itol times, with no side effects |
END FUNCTION mandelbrot |
!HPF$ INDEPENDENT [ ,NEW (variable-list) ] |
INDEPENDENT asserts that no iteration affects any other in any way |
It implements the "embarrassingly parallel" problem class we discussed under the structure of problems |
Note the rest of HPF tackles mainly the synchronous problem class, with some loosely synchronous capability |
NEW variables are defined to have fresh instantiations for each iteration, as is typically needed for embarrassingly parallel problems where in fact essentially all variables in a loop would be NEW |
Note INDEPENDENT can be applied to FORALL, where it asserts that no index point assigns to any location that another index value uses (see the sketch below) |
HPF2 (see later) has extra feature of allowing REDUCTION (accumulated) variables in INDEPENDENT DO loops |
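A sketch of the INDEPENDENT assertion in use (the programmer promises ix contains no repeated values):
!HPF$ INDEPENDENT
      DO i = 1, n
         a(ix(i)) = b(i)      ! no two iterations touch the same element
      END DO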
This is an exception from the conventional HPF picture of a global name space with either distributed or replicated variables |
An extrinsic function is a function written in a language other than HPF, most naturally any node programming language (e.g. Fortran77 targeted to a single processor, SPMD style) with message passing such as MPI |
HPF defines the (Fortran90) interface and invocation characteristics |
Allows one to get efficient parallel code where the HPF language or compiler is inadequate |
The original HPF 1.0 omitted some key capabilities which were known to be important, but whose syntax and functionality were unclear in 1993 |
The HPF Forum met in 1995-96 and has approved a set of Extensions and Simplifications of HPF |
The concept of a base HPF 2.0 and Approved Extensions has been agreed |
Note approved extensions (which presumably vendors need not implement) include critical capability for dynamic irregular problems |
DYNAMIC, REALIGN and REDISTRIBUTE are no longer in the base language and are just "approved extensions" |
Surprisingly, no parallel I/O capabilities were approved! |
!HPF$ INDEPENDENT |
      DO i = 1 , n |
!HPF$    ON HOME( b(i) )            ! illustrative: run iteration i where b(i) lives |
         a(ix(i)) = b(i) |
      END DO |
This modifies the owner computes rule by specifying where the computation is performed, which need not be the processor owning the left-hand side |
ON HOME is an approved extension |
Many applications of INDEPENDENT DO loops do require reductions, as they typically calculate quantities independently but store the results as parts of various averages |
e.g. in High Energy Physics data analysis, each measured event can be processed via an INDEPENDENT DO, but one wishes to form particular observables (histogram, scatterplot) which are averaged over all events |
Financial modelling is similar |
x = 0 |
!HPF$ INDEPENDENT, NEW(xinc), REDUCTION(x) |
      do i = 1 , N |
         xinc = ...                  ! illustrative: some per-iteration contribution |
         x = x + xinc |
      END DO |
xinc is a separate new variable in each iteration, but the result is accumulated into the global x |
Task Parallelism is sort of supported in HPF, but it is not clear to me that this is a great idea, as it is better to keep sophisticated task parallelism outside HPF, which is really only designed to support data parallelism |
!HPF$ TASKING |
!HPF$    ON( P(1:8) ) |
         CALL foo            ! illustrative calls; foo and bar are data parallel routines |
!HPF$    ON( P(9:16) ) |
         CALL bar |
!HPF$ END |
This extends the SPMD model, with foo running on eight processors and bar on another set |
Note foo and bar are expected to contain data parallel statements which distribute execution using conventional HPF over 8 processors |
!HPF$ DISTRIBUTE x( BLOCK(SHADOW = 1) ) |
gives each block an overlap ("shadow" or ghost) region of width 1 |
!HPF$ DISTRIBUTE x( BLOCK( /26,24,24,26/ ) ) |
gives contiguous blocks of the explicitly specified, unequal sizes |
!HPF$ DISTRIBUTE x( INDIRECT(map_array) ) |
maps element i to the processor given by map_array(i) |
Distribution is now allowed onto Processor Subsets, with typical syntax !HPF$ DISTRIBUTE x(BLOCK) ONTO P(2:5) |
Distribution is allowed for Derived Types but can only be done at ONE level |