Full HTML for

Basic foilset DoD HPF Training -- 6. HPF2

Given by Chuck Koelbel (Rice University) at DoD Training and Others, 1995-98. Foils prepared August 7, 1998.

Sixth Presentation in Chuck Koelbel's HPF Tutorial
Covers what's new in HPF 2 and what has changed from HPF 1

Table of Contents for full HTML of DoD HPF Training -- 6. HPF2

1 "High Performance Fortran in Practice" tutorial 6. HPF 2.0 preview

2 HPF 2 Background
3 HPF 1.x Features
HPF 2.0 Features
(Language Lawyer View)
HPF 2.0 Features (Technical View)
HPF 2.0 Deletions and Simplifications

4 Methods of Avoiding DYNAMIC Distributions
5 Sequence Association for Dummies
6 Methods of Avoiding Sequence Association
7 Methods of Rewriting Subroutine Interfaces
8 New Data Mapping Features
9 Examples of Extended Distributions
10 Example of Distribution to Subsets
11 Rules for Mapping Pointers
12 Implementation of HPF 2
DISTRIBUTE Patterns

13 Implementation of HPF 2
DISTRIBUTE Patterns (cont.)

14 New Parallel Execution Features
15 Example of Loop Reductions
16 Example of Computation Placement
17 Example of Locality Assertion
18 Example of Task Parallelism
19 GRADE_UP versus SORT_UP
20 Implementation of HPF 2 Parallel Features
21 Implementation of HPF 2 Parallel Features (cont.)
22 External Interfaces
23 Example of Asynchronous I/O
24 Conclusions
For More Information

HTML version of Basic Foils prepared August 7 98

Foil 1 "High Performance Fortran in Practice" tutorial 6. HPF 2.0 preview
Presented at Supercomputing '95, San Diego, December 4, 1995
Presented at University of Tennessee (short form), Knoxville, March 21, 1996
Revised and presented at High Performance Computing and Networking Europe, April 18, 1996
Presented at Metacenter Regional Alliances, Cornell, May 6, 1996
Presented at Summer of HPF Workshop, Vienna, July 1, 1996
Revised, expanded, and presented at Institute for Mathematics & its Applications, Minneapolis, September 11-13, 1996
Presented at Corps of Engineers Waterways Experiments Station, Vicksburg, MS, October 30-November 1, 1996
Presented at Supercomputing '96, Pittsburgh, PA, November 17, 1996
Presented at NAVO, Stennis Space Center, MS, Feb 13, 1997
Presented at HPF Users Group (short version), Santa Fe, NM, February 23, 1997
Presented at ASC, Wright-Patterson Air Force Base, OH, March 5, 1997
Parts presented at SC'97, November 17, 1997
Parts presented (slideshow mode) at SC '97, November 15-21, 1997
Presented at DOD HPC Users Group, June 1, 1998
Outline

From DoD HPF Training -- 6. HPF2 DoD Training and Others -- 1995-98.
1. Introduction to Data-Parallelism
2. Fortran 90/95 Features
3. HPF Parallel Features
4. HPF Data Mapping Features
5. Parallel Programming in HPF
6. HPF Version 2.0 (contents of this presentation)

Foil 2 HPF 2 Background

New HPFF meetings were held to develop extensions to HPF
  • Meetings began January 1995
  • Preliminary presentations at Supercomputing '95
  • Complete draft at Supercomputing '96
  • Finalized draft January 31, 1997
Areas for extensions
  • Control parallelism (hpff-task@cs.rice.edu)
  • Data distribution (hpff-distribute@cs.rice.edu)
  • External interfaces (hpff-external@cs.rice.edu)
Input is still welcome!
  • Send mail to majordomo@cs.rice.edu to get on a list
  • Send mail to hpff-interpret@cs.rice.edu to comment

Foil 3 HPF 1.x Features
HPF 2.0 Features
(Language Lawyer View)
HPF 2.0 Features (Technical View)
HPF 2.0 Deletions and Simplifications

DYNAMIC, REALIGN, and REDISTRIBUTE
  • Now "just" approved extensions
  • Removed due to complexity of implementation
Mapping in the presence of sequence association
  • Now forbidden in all cases
  • Removed due to complexity, and lack of user demand
Subroutine interfaces
  • Explicit interface required if mapping changes, or INHERIT used
  • Descriptive ("star") syntax retained, primarily for error checking
  • Changed to match spirit of Fortran 95, and simplify user model

Foil 4 Methods of Avoiding DYNAMIC Distributions

REDISTRIBUTE/REALIGN had two main uses:
  • Changing access patterns dynamically
  • Picking static distributions based on data size
Changing access patterns
  • Declare multiple arrays and copy data between them
Picking a distribution based on input
  • Clone sections of the program
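
A minimal sketch of the "multiple arrays" workaround, assuming an n-by-n array whose access pattern changes from row sweeps to column sweeps. The names a_rows, a_cols, and n are illustrative and do not come from the foils. Two statically distributed arrays stand in for one REDISTRIBUTE, and an ordinary array assignment does the remapping. A Fortran compiler without HPF support treats the !HPF$ lines as comments and runs the program serially.

```fortran
program avoid_dynamic
  implicit none
  integer, parameter :: n = 8          ! illustrative problem size
  real :: a_rows(n,n), a_cols(n,n)     ! two copies, one per access pattern
  integer :: i, j
  ! a_rows: each processor owns whole rows
!HPF$ DISTRIBUTE a_rows(BLOCK,*)
  ! a_cols: each processor owns whole columns
!HPF$ DISTRIBUTE a_cols(*,BLOCK)

  ! Phase 1: row-oriented work on the row-distributed copy
  do i = 1, n
     do j = 1, n
        a_rows(i,j) = real(i) + 0.1*real(j)
     end do
  end do

  ! Remapping step: an array assignment takes the place of REDISTRIBUTE
  a_cols = a_rows

  ! Phase 2: column-oriented work on the column-distributed copy
  do j = 1, n
     a_cols(:,j) = a_cols(:,j) - a_cols(1,j)
  end do

  print *, 'a_cols(n,n) =', a_cols(n,n)
end program avoid_dynamic
```

The cost of this approach is the extra storage for the second array; the copy itself moves roughly the same data a REDISTRIBUTE would have moved.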

Foil 5 Sequence Association for Dummies

Don't Use It!

Foil 6 Methods of Avoiding Sequence Association

Just say "No!"
  • EQUIVALENCE was never a good idea for new HPF codes
  • Included for ease of porting F77 codes, with debatable benefits
For new codes:
  • Always declare arrays to be their natural rank
  • Use ALLOCATABLE to make arrays their natural size
  • Use MODULE for global arrays, or pass as explicit arguments
For porting codes:
  • Top-down conversion of subroutines
  • If subroutine really needs EQUIVALENCE, it may be better as an EXTRINSIC

Foil 7 Methods of Rewriting Subroutine Interfaces

Always use an explicit interface
  • Create a MODULE with INTERFACE blocks
  • Create an INCLUDE file with INTERFACE blocks
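
A minimal sketch of the MODULE approach, assuming a single external subroutine with one mapped dummy argument. The names solver_interfaces and smooth are hypothetical, chosen only for illustration. The interface block repeats the dummy's mapping directive so that every caller that USEs the module sees it.

```fortran
module solver_interfaces
  implicit none
  interface
     ! Explicit interface for the external subroutine defined below
     subroutine smooth(u, n)
       integer, intent(in) :: n
       real, intent(inout) :: u(n)
!HPF$ DISTRIBUTE u(BLOCK)
     end subroutine smooth
  end interface
end module solver_interfaces

! Three-point average of the interior points of u
subroutine smooth(u, n)
  implicit none
  integer, intent(in) :: n
  real, intent(inout) :: u(n)
  real :: old(n)                 ! automatic array holding the old values
!HPF$ DISTRIBUTE u(BLOCK)
  old = u
  u(2:n-1) = (old(1:n-2) + old(2:n-1) + old(3:n)) / 3.0
end subroutine smooth

program use_interface
  use solver_interfaces          ! brings the explicit interface into scope
  implicit none
  integer, parameter :: n = 6
  real :: u(n)
!HPF$ DISTRIBUTE u(BLOCK)
  u = (/ 0.0, 3.0, 0.0, 3.0, 0.0, 3.0 /)
  call smooth(u, n)
  print *, u
end program use_interface
```

The INCLUDE variant would place the same interface block in a separate file and INCLUDE it in each caller; the MODULE form has the advantage that the compiler checks it once.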

Foil 8 New Data Mapping Features

Extended Distribution Patterns
  !HPF$ SHADOW x( 1, 0 )
  !HPF$ DISTRIBUTE y( GEN_BLOCK( (/ 12,10,10,12 /) ) )
  !HPF$ DISTRIBUTE z( INDIRECT(map_array) )
Distribution to processor subsets
  !HPF$ PROCESSORS procs(1:np)
  !HPF$ DISTRIBUTE b(BLOCK) ONTO procs(1:np/2-1)
Distribution of derived type components
  TYPE set_of_meshes
    REAL p(100,100), q(100,100), r(100,100)
    !HPF$ DISTRIBUTE (BLOCK,*) :: p, q, r
  END TYPE
  TYPE(set_of_meshes) multi_block(32)
  !!! Do not try to DISTRIBUTE array multi_block !!!
Rules for matching distributions (pointers, dummy parameters)

Foil 9 Examples of Extended Distributions

!HPF$ DISTRIBUTE x(BLOCK(SHADOW=1),*)
!HPF$ DISTRIBUTE y(GEN_BLOCK((/4,2,2,4/)),*)
!HPF$ DISTRIBUTE z(*,INDIRECT(map))

Foil 10 Example of Distribution to Subsets

!HPF$ PROCESSORS p(6)
!HPF$ DISTRIBUTE a(*,BLOCK) ONTO p
!HPF$ DISTRIBUTE b(*,BLOCK) ONTO p(1:3)
!HPF$ DISTRIBUTE c(*,BLOCK) ONTO p(4:6)

Foil 11 Rules for Mapping Pointers

In the core language of HPF 2
  • Pointers cannot be mapped
  • Targets of pointers cannot be mapped (i.e. variables with the TARGET attribute)
In the HPF 2 approved extensions
  • Pointers can be mapped
    • ALIGN and DISTRIBUTE take effect after ALLOCATE
    • INHERIT can be used to declare a pointer to "anything"
  • Targets can be mapped
  • In a pointer assignment, the target's mapping must be a specialization of the pointer's
    • That is, the pointer has to be higher in the diagram than the thing it points at
    • If the target is an array section, the pointer must have the INHERIT attribute

Foil 12 Implementation of HPF 2
DISTRIBUTE Patterns

Conceptually, the process remains the same
  • Allocate memory, adjust indexing and loops, handle nonlocal data
  • New patterns require more elaborate methods to achieve this
SHADOW
  • Add extra space to allocation
  • Use that space for buffering of nonlocal data and adjusting indices
  • Ignore that space for adjusting loop bounds
  • Simplifies addressing, may avoid copying
GEN_BLOCK
  • Keep table of block bounds on each processor
  • Search table to find home of nonlocal elements, adjust indices and loops
  • Allows some load balancing with locality
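
The GEN_BLOCK table lookup can be sketched as follows. This is an illustration under stated assumptions, not a description of any particular compiler: the table layout (an array of upper bounds), the binary search, and the names upper, owner, and local_index are all hypothetical. The block sizes are the (/4,2,2,4/) from Foil 9.

```fortran
program gen_block_owner
  implicit none
  integer, parameter :: np = 4
  integer, parameter :: sizes(np) = (/ 4, 2, 2, 4 /)  ! block sizes from Foil 9
  integer :: upper(np)   ! upper(p) = last global index owned by processor p
  integer :: p, i

  ! Build the table of block bounds as a running sum of the block sizes
  upper(1) = sizes(1)
  do p = 2, np
     upper(p) = upper(p-1) + sizes(p)
  end do

  ! Report the home processor and local index of every global index
  do i = 1, upper(np)
     p = owner(i)
     print '(a,i3,a,i2,a,i2)', 'global', i, ' -> processor', p, &
          ', local index', local_index(i, p)
  end do

contains

  ! Binary search for the first processor whose upper bound is >= i
  integer function owner(i)
    integer, intent(in) :: i
    integer :: lo, hi, mid
    lo = 1
    hi = np
    do while (lo < hi)
       mid = (lo + hi) / 2
       if (upper(mid) >= i) then
          hi = mid
       else
          lo = mid + 1
       end if
    end do
    owner = lo
  end function owner

  ! Adjust a global index to the owner's local numbering
  integer function local_index(i, p)
    integer, intent(in) :: i, p
    if (p == 1) then
       local_index = i
    else
       local_index = i - upper(p-1)
    end if
  end function local_index

end program gen_block_owner
```

With plain BLOCK the owner is a single division; the table search is the extra price GEN_BLOCK pays for uneven block sizes.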

Foil 13 Implementation of HPF 2
DISTRIBUTE Patterns (cont.)

INDIRECT
  • Keep a copy of the map array distributed by BLOCK
  • Keep a list of all elements on the local processor for adjusting loop bounds
  • Inspector/executor strategy for locating nonlocal elements
    • Inspector: Gather all information needed from map array
    • Executor: Use this information to perform the computation
    • Only do the inspector once, if possible (i.e. when distributions and array access patterns stay exactly the same)
  • Allows arbitrary load balancing and communication reduction through partitioning
    • But finding the right partition is NP-complete
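
A serial sketch of the inspector/executor idea for one processor, assuming the loop x(i) = y(ix(i)) over INDIRECT-mapped arrays. The map and ix values, the processor number me, and the list names mine and fetch are made up for illustration, and real communication is represented only by a comment.

```fortran
program inspector_executor
  implicit none
  integer, parameter :: n = 8
  integer, parameter :: me = 2     ! the processor being simulated
  ! map(i) = processor that owns element i (illustrative values)
  integer, parameter :: map(n) = (/ 1, 2, 2, 1, 3, 2, 3, 1 /)
  ! ix(i) = element of y read when computing x(i) (illustrative values)
  integer, parameter :: ix(n) = (/ 2, 5, 3, 1, 6, 8, 7, 4 /)
  integer :: mine(n), fetch(n)     ! local iterations, nonlocal y elements
  integer :: nmine, nfetch, i, k, step
  real :: x(n), y(n)

  y = (/ (10.0*real(i), i = 1, n) /)
  x = 0.0

  ! Inspector: one pass over the map array gathers everything needed
  nmine = 0
  nfetch = 0
  do i = 1, n
     if (map(i) == me) then
        nmine = nmine + 1
        mine(nmine) = i              ! iteration i runs on this processor
        if (map(ix(i)) /= me) then
           nfetch = nfetch + 1
           fetch(nfetch) = ix(i)     ! y(ix(i)) lives on another processor
        end if
     end if
  end do

  ! Executor: reuse the lists on every step, with no further inspection
  do step = 1, 3
     ! (a real implementation would receive y(fetch(1:nfetch)) here)
     do k = 1, nmine
        i = mine(k)
        x(i) = y(ix(i)) + real(step)
     end do
  end do

  print *, 'local iterations: ', mine(1:nmine)
  print *, 'nonlocal y needed:', fetch(1:nfetch)
end program inspector_executor
```

The lists stay valid only while map and ix are unchanged, which is the condition the foil gives for running the inspector once.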

Foil 14 New Parallel Execution Features

Loop Reductions
  !HPF$ INDEPENDENT, NEW(xinc), REDUCTION(x)
  DO i = 1, n
    CALL sub(i, xinc)
    x = x + xinc
  END DO
Computation Placement
  !HPF$ INDEPENDENT
  DO i = 1, n
    !HPF$ ON HOME( ix(i) )
    x(i) = y(ix(i)) - y(iy(i))
  END DO
Task Parallelism
  !HPF$ TASK_REGION
    !HPF$ ON HOME(p(1:8))
    CALL foo(x,y)
    !HPF$ ON HOME(p(9:16))
    CALL bar(z)
  !HPF$ END TASK_REGION

Foil 15 Example of Loop Reductions

!HPF$ INDEPENDENT, NEW(xinc), REDUCTION(x)
DO i = 1, n
  CALL sub(i, xinc)
  x = x + xinc
END DO

Foil 16 Example of Computation Placement

!HPF$ INDEPENDENT
DO i = 1, 12
  !HPF$ ON HOME( ix(i) )
  x(i) = y(ix(i))-y(iy(i))
END DO

Foil 17 Example of Locality Assertion

!HPF$ ALIGN (i) WITH x(i) :: ix, y, iy
!HPF$ DISTRIBUTE x(BLOCK)
!HPF$ INDEPENDENT
DO i = 1, n
  !HPF$ ON HOME( ix(i) ), RESIDENT(y(iy(i)))
  x(i) = y(ix(i)) - y(iy(i))
END DO

Foil 18 Example of Task Parallelism

!HPF$ PROCESSORS p(8)
!HPF$ DISTRIBUTE a1(block,*) ONTO p(1:4)
!HPF$ DISTRIBUTE a2(*,block) ONTO p(5:8)
!HPF$ TEMPLATE, DIMENSION(4), DISTRIBUTE(BLOCK) ONTO p(1:4) :: td1
!HPF$ ALIGN WITH td1(*) :: done1
!HPF$ TASK_REGION
  done1 = .false.
  DO WHILE (.true.)
!HPF$ ON HOME(p(1:4)) BEGIN, RESIDENT
    READ (unit = iu,end=100) a1
    CALL rowffts(a1)
    GOTO 101
100 done1 = .true.
101 CONTINUE
!HPF$ END ON
    IF (done1) EXIT
    a2 = a1
!HPF$ ON HOME(p(5:8)) BEGIN, RESIDENT
    CALL colffts(a2)
    WRITE(unit = ou) a2
!HPF$ END ON
  ENDDO
!HPF$ END TASK_REGION

Foil 19 GRADE_UP versus SORT_UP

Sometimes you want to keep things together (GRADE_UP)
Sometimes you don't (SORT_UP)
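
The foil gives only the one-line contrast. The sketch below assumes the usual reading: GRADE_UP returns a permutation that can reorder companion arrays along with the keys, while SORT_UP returns just the sorted values. Standard Fortran compilers do not provide either routine, so grade_up_sketch is a hypothetical stand-in (a stable insertion sort on an index vector), and the data are made up.

```fortran
program grade_vs_sort
  implicit none
  integer, parameter :: n = 5
  integer :: keys(n), perm(n)
  character(len=5) :: names(n)

  keys  = (/ 30, 10, 50, 20, 40 /)
  names = (/ 'carol', 'alice', 'erin ', 'bob  ', 'dave ' /)

  ! GRADE_UP style: obtain the permutation, then apply it where needed
  perm = grade_up_sketch(keys)

  ! SORT_UP style result: the keys alone, in ascending order
  print *, 'sorted keys only:   ', keys(perm)
  ! The same permutation keeps each name together with its key
  print *, 'names in key order: ', names(perm)

contains

  ! Hypothetical stand-in for GRADE_UP: returns p such that a(p) is ascending
  function grade_up_sketch(a) result(p)
    integer, intent(in) :: a(:)
    integer :: p(size(a))
    integer :: i, j, t
    p = (/ (i, i = 1, size(a)) /)
    do i = 2, size(a)
       t = p(i)
       j = i - 1
       do while (j >= 1)
          if (a(p(j)) <= a(t)) exit   ! stable: equal keys keep their order
          p(j+1) = p(j)
          j = j - 1
       end do
       p(j+1) = t
    end do
  end function grade_up_sketch

end program grade_vs_sort
```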

Foil 20 Implementation of HPF 2 Parallel Features

All new features represent useful information to the compiler
  • They do not change the meaning of the program if properly used
REDUCTION
  • Choose an efficient order of evaluation for the reduction tree
  • For scalar reductions, keep a local sum on each node and have a global combining phase at the end
    • Alternate implementation: critical region
  • For vector reductions, use the same algorithm as the XXX_SCATTER library functions (e.g. SUM_SCATTER)
ON HOME
  • Base the loop partitioning on the HOME expression
  • Invert subscripting function to derive loop bounds
  • Does not affect where communication/synchronization can be placed, but may change what must be communicated
  • Warning: You can outsmart the compiler this way, to your detriment
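
The scalar REDUCTION strategy (a local sum on each node, then a global combining phase) can be simulated serially as below. The processor count np, the BLOCK partition of the iterations, and the name partial are assumptions for illustration; a real implementation would run the local phase in parallel and combine with a message-passing reduction.

```fortran
program reduction_sketch
  implicit none
  integer, parameter :: n = 100    ! loop trip count (illustrative)
  integer, parameter :: np = 4     ! simulated processor count (illustrative)
  real :: xinc(n), partial(np), x
  integer :: p, i, lo, hi, blk

  xinc = (/ (real(i), i = 1, n) /)
  blk = (n + np - 1) / np          ! BLOCK size: ceiling of n/np

  ! Local phase: each "processor" p accumulates its own block of iterations
  do p = 1, np
     lo = (p-1)*blk + 1
     hi = min(p*blk, n)
     partial(p) = 0.0
     do i = lo, hi
        partial(p) = partial(p) + xinc(i)
     end do
  end do

  ! Global combining phase: merge the np partial sums into x
  x = 0.0
  do p = 1, np
     x = x + partial(p)
  end do

  print *, 'x =', x, '  closed form =', real(n*(n+1)/2)
end program reduction_sketch
```

Because the partial sums change the order of the additions, floating-point results can differ slightly from the sequential loop in general; that freedom is what the REDUCTION clause grants the compiler.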

Foil 21 Implementation of HPF 2 Parallel Features (cont.)

RESIDENT
  • Do not generate communication for the RESIDENT expressions (and look for logically equivalent expressions as well)
  • If no expression is given, then no communication for any variable is needed
    • Allows INDEPENDENT with CALL
TASK_REGION
  • Shared memory: Use barrier synchronization for tasks entering each ON block; also synchronize on shared data access outside ON blocks
  • Distributed memory: Processor groups communicate only internally within an ON block (use ordinary communication), communicate with other tasks at ON boundaries (use group communication)

Foil 22 External Interfaces

Calling HPF from other languages
  • How to ensure HPF is called consistently?
Calling other languages from HPF
  • How to avoid unnecessary synchronization?
Parallel Input/Output
  • What do you mean by "parallel I/O"?
  • Asynchronous I/O
Tools
  • Can we define a symbol table format for debuggers, etc.?

Foil 23 Example of Asynchronous I/O

i0 = 0; i1 = 1
READ (file0, ID=id0, END=100) a(i0,1:1048576)
DO
  WAIT (ID=id0, END=100)
  ! start next read into other row
  itmp = i0; i0 = i1; i1 = itmp
  READ (file0, ID=id0, END=100) a(i0,1:1048576)
  ! overlap I/O with useful work
  CALL PROCESSING( a(i1,1:1048576) )
END DO
100 CONTINUE

Foil 24 Conclusions
For More Information

World Wide Web
These slides
Mailing Lists:
  • Write to majordomo@cs.rice.edu
  • In message body: subscribe <list-name> <your-id>
  • Lists: hpff, hpff-interpret, hpff-core
Anonymous FTP:
  • Connect to titan.cs.rice.edu
  • Full draft in public/HPFF/draft
  • See public/HPFF/README for latest file list

© Northeast Parallel Architectures Center, Syracuse University, npac@npac.syr.edu

Page produced by wwwfoil on Sun Aug 9 1998