Given by Chuck Koelbel (Rice University) at DoD Training and other venues, 1995-98. Foils prepared August 7, 1998
Summary of Material
First in a series by Chuck Koelbel on HPF |
HPF and its performance |
Types of Parallel Computers |
Types of Applications |
Data Parallelism Message Passing |
Why use Compilers |
Charles Koelbel |
Supported in part by
|
As always
|
The High Performance Fortran Forum |
The D System Group (Rice University) |
The Fortran 90D Group (Syracuse University) |
Ken Kennedy (Rice) |
David Loveman (DEC) |
Piyush Mehrotra (ICASE) |
Rob Schreiber (HP Labs) |
Guy Steele (Sun) |
Mary Zosel (Livermore) |
Defined by the High Performance Fortran Forum (HPFF) as a portable language for data-parallel computation |
History:
|
Influences:
|
All of Fortran 90 |
FORALL and INDEPENDENT |
Data Alignment and Distribution |
Miscellaneous Support Operations |
But:
|
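A minimal sketch of the features just listed, in HPF syntax; the array names, sizes, and computation are illustrative choices, not taken from the foils:

    PROGRAM hpf_features
      REAL :: a(1000), b(1000)
      INTEGER :: i
!HPF$ DISTRIBUTE a(BLOCK)        ! spread A over the processors in equal blocks
!HPF$ ALIGN b(k) WITH a(k)       ! keep B(k) on the same processor as A(k)

      a = 0.0
      b = 1.0                    ! Fortran 90 whole-array assignment

!     FORALL: elementwise parallel assignment
      FORALL (i = 2:999) a(i) = 0.5 * (b(i-1) + b(i+1))

!     INDEPENDENT: assert that the loop iterations do not interfere
!HPF$ INDEPENDENT
      DO i = 1, 1000
        b(i) = 2.0 * a(i)
      END DO

      PRINT *, SUM(a)            ! intrinsic reduction
    END PROGRAM hpf_features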
Performance is highly dependent on the compiler and on the nature of the code
|
Commercial compilers are now competitive with MPI for regular problems
|
Research continues on irregular problems and task parallelism
|
A full application for ocean modeling
|
Well-known compact application benchmarks from NASA
|
World Wide Web |
These slides |
Mailing Lists:
|
Anonymous FTP:
|
1. Introduction to Data-Parallelism |
2. Fortran 90/95 Features |
3. HPF Parallel Features |
4. HPF Data Mapping Features |
5. Parallel Programming in HPF |
6. HPF Version 2.0 |
Parallel computers allow several CPUs to contribute to a computation simultaneously. |
For our purposes, a parallel computer has three types of parts:
|
Key points:
|
Every processor has a memory others can't access. |
Advantages:
|
Disadvantages:
|
All processors access the same memory. |
Advantages:
|
Disadvantages:
|
Combining the advantages of shared and distributed memory
Lots of hierarchical designs are appearing.
|
A parallel algorithm is a collection of tasks and a partial ordering between them. |
Design goals:
|
Sources of parallelism:
|
Data-parallel algorithms exploit the parallelism inherent in many large data structures.
|
Analysis:
|
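For example, a single relaxation sweep over a grid can be written as one whole-array operation, exposing every interior point as an independent update. The grid and boundary condition below are illustrative, not one of the applications named in these foils:

    PROGRAM jacobi_sweep
      ! One data-parallel relaxation sweep: each interior point becomes the
      ! average of its four neighbours; all 100x100 updates are independent.
      REAL :: u(0:101, 0:101), unew(0:101, 0:101)

      u = 0.0
      u(0, :) = 1.0                          ! fixed value along one boundary

      unew = u
      unew(1:100, 1:100) = 0.25 * ( u(0:99, 1:100) + u(2:101, 1:100)  &
                                  + u(1:100, 0:99) + u(1:100, 2:101) )
      u = unew

      PRINT *, MAXVAL(u(1:100, 1:100))
    END PROGRAM jacobi_sweep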
Functional parallelism exploits the parallelism between the parts of many systems.
|
Analysis:
|
A parallel language provides an executable notation for implementing a parallel algorithm. |
Design criteria:
|
Usually a language reflects a particular type of parallelism. |
Data-parallel languages provide an abstract, machine-independent model of parallelism.
|
Advantages:
|
Disadvantages:
|
Abstractions like data parallelism split the work between the programmer and the compiler. |
Programmer's task: Solve the problem in this model.
|
Compiler's task: Map conceptual (massive) parallelism to physical (finite) machine.
|
Program is based on relatively coarse-grain tasks |
Separate address space and a processor number for each task |
Data shared by explicit messages
|
Examples: MPI, PVM, Occam |
Advantages:
|
Disadvantages:
|
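For comparison, in the message-passing model every data exchange must be spelled out by hand. A minimal MPI sketch in Fortran (the buffer size and message tag are arbitrary illustrative choices):

    PROGRAM mp_sketch
      ! Process 0 sends an array to process 1; the transfer is explicit.
      INCLUDE 'mpif.h'
      INTEGER :: ierr, rank, status(MPI_STATUS_SIZE)
      REAL :: buf(100)

      CALL MPI_INIT(ierr)
      CALL MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

      IF (rank == 0) THEN
        buf = 1.0
        CALL MPI_SEND(buf, 100, MPI_REAL, 1, 99, MPI_COMM_WORLD, ierr)
      ELSE IF (rank == 1) THEN
        CALL MPI_RECV(buf, 100, MPI_REAL, 0, 99, MPI_COMM_WORLD, status, ierr)
      END IF

      CALL MPI_FINALIZE(ierr)
    END PROGRAM mp_sketch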
Analysis
|
Computation Partitioning
|
Communication Introduction
|
Code Generation
|
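As a rough picture of what these phases produce for a BLOCK-distributed loop, the sketch below simulates the generated SPMD code sequentially; the block sizes and the names my_lo, my_hi, and nproc are assumptions for illustration, not actual compiler output:

    PROGRAM spmd_sketch
      ! Sequential simulation of compiler-generated SPMD code for
      !     !HPF$ DISTRIBUTE x(BLOCK)
      !     FORALL (i = 2:n) x(i) = x(i) + y(i-1)
      ! The outer loop plays the role of the processors.
      INTEGER, PARAMETER :: n = 16, nproc = 4
      REAL :: x(n), y(n)
      INTEGER :: p, i, my_lo, my_hi

      x = 1.0
      y = 2.0

      DO p = 0, nproc - 1
        my_lo = p * (n / nproc) + 1          ! computation partitioning:
        my_hi = my_lo + (n / nproc) - 1      !   owner-computes on a BLOCK piece
        DO i = MAX(my_lo, 2), my_hi
          x(i) = x(i) + y(i-1)               ! at i = my_lo the reference y(i-1)
        END DO                               !   lives on the left neighbour, so
      END DO                                 !   a message would be inserted here

      PRINT *, SUM(x)
    END PROGRAM spmd_sketch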
Help analysis with assertions
|
Distribute array dimensions that exhibit parallelism
|
Consider communication patterns
|
Don't hide what you are doing |
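A small HPF sketch of the first two tips above; the arrays, the permutation, and the distribution are illustrative assumptions:

    PROGRAM hpf_tips
      ! Assert what analysis cannot prove, and distribute the dimension
      ! that carries the parallelism.
      INTEGER, PARAMETER :: n = 1000
      REAL :: a(n), b(n)
      INTEGER :: perm(n), i
!HPF$ DISTRIBUTE a(BLOCK)            ! the parallel dimension is distributed
!HPF$ ALIGN b(k) WITH a(k)           ! no communication between A and B

      b = 1.0
      perm = (/ (MOD(7*i, n) + 1, i = 1, n) /)   ! a permutation: no index repeats

!     The indirect subscript hides the fact that no element of A is written
!     twice; the INDEPENDENT assertion supplies that fact to the compiler.
!HPF$ INDEPENDENT
      DO i = 1, n
        a(perm(i)) = 2.0 * b(i)
      END DO

      PRINT *, SUM(a)
    END PROGRAM hpf_tips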