Full HTML for

Basic foilset Part B:Overview of Programming Paradigms and Relation to Applications

Given by Geoffrey C. Fox at CRPC/MCNC Workshop on April 10-13 1995. Foils prepared April 7,1995
Outside Index Summary of Material


This module describes many current approaches including different languages which support message passing, data parallelism and task parallelism. We describe the status of various approaches and what software is appropriate for what problems and what machines
We describe High Performance Fortran and what features are needed for what applications as well as
Special needs of coarse grain task parallelism

Table of Contents for full HTML of Part B:Overview of Programming Paradigms and Relation to Applications

Denote Foils where Image Critical
Denote Foils where Image has important information
Denote Foils where HTML is sufficient

1 What software is suitable for what problems?
2 What Applications have we learnt from ?
3 Comparison of 3 different Programming Models
4 What software systems are appropriate for what problem architectures -- I?
5 What software systems are appropriate for what problem architectures -- II?
6 Candidate Software Paradigms for each problem architecture
7 Problem v. Machine Architecture
8 Software Built on Top of FORTRAN, C ...
9 Evaluation of High Performance Fortran What applications need what features of HPF and its extensions ?
10 What Issues should High Performance Fortran (HPF) Address!
11 Goal of High Performance Fortran
12 Any Complete Programming Environment Must Handle
13 HIGH PERFORMANCE FORTRAN COMPILERS
14 What type of compiler is HPF ?
15 The High Performance Fortran Library
16 HPF Intrinsic Library
17 High Performance Fortran Library -- I
18 High Performance Fortran Library -- II
19 Fortran 90 Local Routine Intrinsics
20 Imprecise Mapping of Problem Classes into Runtime and Language Terms
21 General Applicability of HPF, HPF++, HPC++
22 Importance of HPF, HPC++ to Users
23 What about other languages ?
24 What applications does HPF support? If not - what extensions are needed?
25 5 Categories of Problems
26 HPF+: Extensions to HPF -- Use name HPF+ so don't predjudice "official" HPF2
27 Original Classification used in Planning CRPC (Maryland Rice Syracuse) HPF extensions
28 What can current HPF Language Surely do?
29 HPF can also do the synchronous
30 Current HPF can also do the Embarassingly Parallel
31 Difficult but (almost) possible for HPF
32 HPF can express Region Growing in Image Processing
33 HPF can also express irregular domains seen near critical points of physical systems
34 Swendsen-Wang clusters (boundaries shown in black) for 3 state Potts model at Tc
35 Significant improvement in HPF needed but seems possible for Particle in the Cell
36 Significant improvement in HPF needed but seems possible for .. (contd)
37 Some very hard Loosely Synchronous Problems -- HPF Expression uncertain
38 Large N-Body Calculations (Quinn, Salmon, Warren)
39 10,000 Body Barnes-Hut Tree
40 8M bodies - 10 Mpc diameter Final state with ~700 resolved "galaxies" (Warren, Quinn, Zurek)
41 The Largest "Galaxy" Halo with 137,000 particles taken from 8.8 Million particle simulation of Warren,Fullagar,Quinn and Zurek
42 Speed Up on nCUBE of Parallel Barnes Hut Algorithm
43 Final Summary of Problem and Software Architectures
44 What determines when Parallelism is Clear ?
45 The map of Problem ---> Computer is performed in two or more statges
46 The Mapping of Space of Problem Architectures onto Space of Machine Architectures
47 We can divide problems and machines into interconnected modules - and we do this at different granularities - I
48 We can divide problems and machines into interconnected modules - and we do this at different granularities - II
49 Different Grain Sizes for MetaProblems and Interpreter
50 Software Integration -- Support of Coarse Grain Tasks (Metaproblems) using AVS
51 AVS as System Integration Tool
52 A distributed parallel computing environment using AVS module network
53 Case Studies in Integrating AVS into HPDC Applications -- Stock Option Pricing
54 System Integration and Data Flows for financial modeling on a mix of Workstations, CM5 and Maspar
55 Option Price Modeling Screen Dump
56 Electromagnetic Simulation using AVS Screen Dump
57 Case Studies in Integrating AVS into HPDC Applications -- Electromagnetic Simulation
58 Physical Problem and Domains studied in Electromagnetic Simulation
59 Mapping of Electromagnetic Simulation onto MetaComputer
60 Setup for Computational Electromagnetic AVS Simulation
61 Data Assimilation -- NASA Grand Challenge Kalman Filters to combine weather models and data
62 AVS Distributed Computing Setup for Data Assimilation
63 Summary of Outstanding Issues in Programming Paradigms
64 Integrating Role of ANDF and possible HPANDF
65 Some Issues in Programming Paradigms
66 Some Different Approaches for Software Coordination
67 Questions in Comparison of AVS and PVM
68 Software Integration Questions?

Outside Index Summary of Material



HTML version of Basic Foils prepared April 7,1995

Foil 1 What software is suitable for what problems?

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index

HTML version of Basic Foils prepared April 7,1995

Foil 2 What Applications have we learnt from ?

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
See Generic and Specific Application Glossaries
As nearly all large scale parallel machines are distributed memory, experience is largely
  • Fortran, C plus message passing
  • CM Fortran
  • CM Fortran generalizes to High Performance Fortran (HPF) or HPC++ (remarks independent of language details)
Explicit Message Passing is only available way on "all" MIMD machines for generating good performance although this is at cost of significant user effort
  • user must be given programming models that are portable and scalable - user must "protect investment"
HPF and Massage Passing both require substantial rewriting of code
If existing or modestly modified Fortran 77, C, C++ codes would run "well" on parallel machine, then users would be happy
  • No evidence yet that this feasible on "large" global address space machines
Many users want to use workstation clusters as well as dedicated parallel machines
  • Mesage Passing allows this

HTML version of Basic Foils prepared April 7,1995

Foil 3 Comparison of 3 different Programming Models

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index

HTML version of Basic Foils prepared April 7,1995

Foil 4 What software systems are appropriate for what problem architectures -- I?

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
This classification is summarized in Parallel Computing Works Survey
Synchronous:Regular in Space and Time
  • Tightly Coupled Synchronous
  • e.g. Seismic modeling, "Find proton mass", Matrix algebra
High Performance Fortran (HPF), CMFortran, Crystal, C*, APL are natural and can be automatically parallelized on any machine.
  • Data or Geometric Parallelism
Asynchronous:Irregular in Space and Time
  • Loosely Coupled Asynchronous
  • Object oriented languages: Strand, PCN, Linda, CC++
  • Process or Functional Parallelism
  • Need more examples!
  • "No special structure" to be exploited
  • Use statistical load balancing and decomposition methods

HTML version of Basic Foils prepared April 7,1995

Foil 5 What software systems are appropriate for what problem architectures -- II?

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Loosely Synchronous: Irregular in Space, Regular in Time
  • Tightly coupled, Dynamic, Loosely Synchronous
  • e.g. Finite elements
  • Particle dynamics
Fortran + Message Passing "works but is not portable
Extended HPF (HPF+) designed to do most of these examples
Data or geometric parallelism again but dynamic data structures which are not arrays
  • Need new runtime libraries (CHAOS and PCRC)
  • Need new data structures
  • Load-balancing "solved in principle" if we can find appropriate data structures and global operations

HTML version of Basic Foils prepared April 7,1995

Foil 6 Candidate Software Paradigms for each problem architecture

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Synchronous: Parallel Fortran such as High Performance Fortran, CC++,Fortran-M, pC++, Crystal, APL..
Loosely Synchronous: Extensions of the above. (HPF+, HPC++)
Asynchronous: Web Technology, PCN, Linda, C++ object-oriented approaches....
  • Event Driven Simulation Packages where appropriate (written in say CC++)
Metaproblems: AVS, MOVIE, PCN, Linda, ADA, Fortran-M, CC++, .... controlling modules written in synchronous or loosely synchronous approach
Embarrassingly Parallel: Several approaches work?
  • AVS, MOVIE, PCN, Linda, Network Express, ISIS, PVM .......
Message Passing works in ALL cases!

HTML version of Basic Foils prepared April 7,1995

Foil 7 Problem v. Machine Architecture

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index

HTML version of Basic Foils prepared April 7,1995

Foil 8 Software Built on Top of FORTRAN, C ...

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
CC++, MOVIE (Furmanski at Syracuse), Linda, Fortran-M, PCN (Chandy)...
  • Elegant way of building tasks written in sequential F77, C, or parallel High Performance Fortran.
  • Suitable for problems which are loosely coupled collections of sequential or synchronous and loosely synchronous parallel modules (metaproblems)
Parallel Fortran, C*, Fortran 90, ...
  • Extend regime of Fortran, C from sequential into "tightly coupled" parallel synchronous and loosely synchronous parallel modules problems
These approaches tackle different problem architectures
  • Complementary not alternatives

HTML version of Basic Foils prepared April 7,1995

Foil 9 Evaluation of High Performance Fortran What applications need what features of HPF and its extensions ?

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
See NPAC's High Performance Fortran Applications Resource

HTML version of Basic Foils prepared April 7,1995

Foil 10 What Issues should High Performance Fortran (HPF) Address!

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Should not solve ALL Problems
Goal: Express in a high level portable scalable fashion those aspects of problems one would like to use Fortran for
  • Need not solve all issues Roughly express all data parallelism and needed task parallelism
  • Integrate other programming paradigms to address remaining issues (see PCRC - Parallel Compiler Runtime Consortium)
Lessons from study of
  • CMFORTRAN experience
  • Some initial HPF experience
  • Fortran + Message Passing Experience
Particle Dynamics and Parallel Differential Equation Solving have been studied in detail (for HPF). Other fields less completely understood.

HTML version of Basic Foils prepared April 7,1995

Foil 11 Goal of High Performance Fortran

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Express in a high level portable scalable fashion those aspects of problems one would like to use Fortran for
  • Need not solve all issues Roughly express all data parallelism and needed task parallelism
  • Integrate other programming paradigms to address remaining issues (probably C++ can be better used for "difficult" cases. This implies integrated multilanguage support.)
Lessons from study of
  • CMFORTRAN experience
  • Some initial HPF experience
  • Fortran + Message Passing Experience
Particle Dynamics and Parallel Differential Equation Solving have been studied in detail (for HPF).
  • Image Processing quite well understood due to ARPA iWARP activity (CMU and elsewhere)
  • Other fields less completely understood.

HTML version of Basic Foils prepared April 7,1995

Foil 12 Any Complete Programming Environment Must Handle

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Data Parallelism
  • Completely handled in extended HPF?
Task Parallelism
  • Partially handled in (extended) HPF? e.g.
    • Embarrassingly Parallel Task Parallelism OK
    • Event Driven Simulations not possible
Metaproblems (task parallelism where each component data parallel)
  • Should we handle directly by; say, adding Fortran-M Syntax to HPF
  • or embed HPF in HPC++, Fortran-M (viewed as "outside" HPF), AVS ....
  • Metaproblems are discussed in Parallel Computing Works

HTML version of Basic Foils prepared April 7,1995

Foil 13 HIGH PERFORMANCE FORTRAN COMPILERS

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
See HPF Forum or National Software Exchange
List or NPAC List for educational resources
HPF is designed so that parallelism is essentially explicit
This is embodied in 2 classes of operations:
  • INTRINSICS - 74 library routines
    • these would be coded in most efficient fashion possible for target machine - whether global address or message passing, eg., ALL, ANY, CSHIFT, SUM, ...
    • represent a set of primitives higher level than message passing which implement functionality needed in most (any) parallel processing systems
  • PARALLEL STATEMENTS - FORALL / array assignments, INDEPENDENT
    • here compile time analysis leads to efficient parallel implementation
    • HPF will again invoke a set of runtime library routines
    • need most efficient, not most convenient routines

HTML version of Basic Foils prepared April 7,1995

Foil 14 What type of compiler is HPF ?

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
HPF makes much greater use of runtime optimizations than the traditional sequential or parallel compiler
Fortran 77: Do 1 I=1,N 1 A(I) = FUNC(B(I), B(I +/- 1), C(I), C(I +/- 1), .....)
Traditionally arrange access to these elements in COMPILER
HPF: A=FUNC(B,C)
Invoke ALREADY OPTIMIZED parallel implementation of FUNC In many cases, HPF compiler need not know anything about parallelization except interfaces, data layout etc.

HTML version of Basic Foils prepared April 7,1995

Foil 15 The High Performance Fortran Library

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Much of HPF's Parallel "Power" is buried in its library
Personal note: Fortran 90D motivated by similarities between Caltech message passing "collective" communication and F90 Intrinsics
18 Non-elemental (true array) Fortran 90 Intrinsics e.g. CSHIFT, SUM
7 new HPF Intrinsics e.g. Alignment, Template, Distribution inquiries
4 new HPF Reduction (combine) Functions
11 new HPF Combine-Scatter Functions
22 new HPF Parallel Prefix Functions
2 new HPF Parallel Sorting Functions
3 other new HPF Parallel Functions
7 HPF Local Intrinsics
74 library routines (==> ~1000 routines when written in clumsy F77)

HTML version of Basic Foils prepared April 7,1995

Foil 16 HPF Intrinsic Library

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Non-Elemental Fortran 90 Intrinsics
  • 1. ALL(MASK,DIM)
  • 2. ANY(MASK,DIM)
  • 3. COUNT(MASK,DIM)
  • 4. CSHIFT(ARRAY,SHIFT,DIM)
  • 5. DOT_PRODUCT(VECTOR_A,VECTOR_B)
  • 6. EOSHIFT(ARRAY,SHIFT,BOUNDARY,DIM)
  • 7. MATMULMATRIX_A, MATRIX_B
  • 8. MAXLOC(ARRAY,MASK)
  • 9. MAXVAL(ARRAY,DIM,MASK)
  • 10. MINLOC(ARRAY,MASK)
  • 11. MINVAL(ARRAY,DIM,MASK)
  • 12. PACK(ARRAY,MASK,VECTOR)
  • 13. PRODUCT(ARRAY,DIM,MASK)
  • 14. RESHAPE(SOURCE,SHAPE, PAD,ORDER)
  • 15. SPREAD(SOURCE,DIM,NCOPIES)
  • 16. SUM(ARRAY,DIM,MASK)
  • 17. TRANSPOSE(MATRIX)
  • 18. UNPACK(VECTOR,MASK,FIELD)
HPF Intrinsics
  • 1. NUMBER_OF_PROCESSORS(DIM)
  • 2. PROCESSORS_SHAPE()
  • 3. Extensions of MAXLOC and MINLOC
  • 4. Integer length ILEN
  • 5. Alignment Inquiry Intrinsic Subroutine HPF_ALIGNMENT
  • 6. Template Inquiry Intrinsic Subroutine HPF_TEMPLATE
  • 7. Distribution Inquiry Intrinsic Subroutine HPF_DISTRIBUTION

HTML version of Basic Foils prepared April 7,1995

Foil 17 High Performance Fortran Library -- I

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
New Reduction Functions
  • 1. IALL
  • 2. IANY
  • 4. PARITY
    • 3. IPARITY
Combining- Scatter Functions
  • 2. COUNT_SCATTER
  • 3. PRODUCT_SCATTTER
  • 1. SUM_SCATTER
  • 4. MAXVAL_SCATTER
  • 5. MINVAL_SCATTER
  • 6. IALL_SCATTER
  • 7. IANY_SCATTER
  • 8. IPARITY_SCATTER
  • 9. ALL_SCATTER
  • 10. ANY_SCATTER
  • 11. PARITY_SCATTER
Sorting Functions
  • 1. GRADE_UP
  • 2. GRADE_ DOWN

HTML version of Basic Foils prepared April 7,1995

Foil 18 High Performance Fortran Library -- II

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Parallel Prefix Functions
  • 1. SUM_PREFIX
  • 2. COUNT_PREFIX
  • 3. PRODUCT_PREFIX
  • 4. MAXVAL_PREFIX
  • 5. MINVAL_PREFIX
  • 6. IALL_PREFIX
  • 7. IANY_PREFIX
  • 8. IPARITY_PREFIX
  • 9. ALL_PREFIX
  • 10. ANY_PREFIX
  • 11. PARITY_PREFIX
  • 12. SUM_SUFFIX
  • 13. COUNT_SUFFIX
  • 14. PRODUCT_SUFFIX
  • 15. MAXVAL_SUFFIX
  • 16. MINVAL_SUFFIX
  • 17. IALL_SUFFIX
  • 18. IANY_SUFFIX
  • 19. IPARITY_SUFFIX
  • 20. ALL_SUFFIX
  • 21. ANY_SUFFIX
  • 22. PARITY_SUFFIX
Other Functions
  • 1. POPCNT
  • 2. POPPAR
  • 3. LEADZ

HTML version of Basic Foils prepared April 7,1995

Foil 19 Fortran 90 Local Routine Intrinsics

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
1. GLOBAL_ALIGNMENT
2. GLOBAL_DISTRIBUTION
3. GLOBAL_TEMPLATE
4. ABSTRACT_TO_PHYSICAL
5. PHYSICAL_TO_ABSTRACT
6. LOCAL_TO_GLOBAL
7. GLOBAL_TO_LOCAL

HTML version of Basic Foils prepared April 7,1995

Foil 20 Imprecise Mapping of Problem Classes into Runtime and Language Terms

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
See Parallel Computing works General Discussion
STATIC
  • Synchronous and Embarassingly Parallel Problems - current HPF
ADAPTIVE
  • Loosely Synchronous but not Synchronous - future capabilities of High Performance Fortran (HPF+) but can be supported well in message passing
ASYNCHRONOUS
  • Asynchronous
INTEGRATION
  • Metaproblems
  • AVS works well but also can be integrated into languages such as HPC++, Fortran-M

HTML version of Basic Foils prepared April 7,1995

Foil 21 General Applicability of HPF, HPF++, HPC++

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index

HTML version of Basic Foils prepared April 7,1995

Foil 22 Importance of HPF, HPC++ to Users

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Standardized syntax lowers risk in application development
  • Can "guarantee" that code will run on future parallel machines with reasonable performance i.e. HPF, HPC++ are scalable standards
HPF, HPC++ benchmarks will enable easier comparison between different machines
  • Future procurements should (could) require HPF and HPC++?
Allows vendors to concentrate their scarce software resources
  • "All" parallel machines should offer HPCC Technologies
    • HPF
    • HPC++
    • MPI (Message Passing)
    • OSF etc.

HTML version of Basic Foils prepared April 7,1995

Foil 23 What about other languages ?

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Parallelism is a feature of problems
It can be expressed in different languages with trade-offs in performance, flexibility and convenience
Fortran 77 Þ HPF
Fortran 90 Þ HPF
C++ Þ HPC++
  • Limitations (focus) of Fortran allow higher performance implementations
HPC++ more naturally handles complex data structures e.g. data parallel collections which are not the staple array data structure of Fortran
Must have common runtime support
  • Explored by PCRC - Parallel Compiler Runtime Consortium

HTML version of Basic Foils prepared April 7,1995

Foil 24 What applications does HPF support? If not - what extensions are needed?

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
The following foils are expanded in HPF Applications Resource at NPAC
Somewhat related issue ?
  • What problems are suitable for HPF and make good use of MIMD architectures ?
  • i.e. run well on MIMD but poorly on SIMD
Classification of Problems (Fox, 1988)

HTML version of Basic Foils prepared April 7,1995

Foil 25 5 Categories of Problems

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
See overview discussion in Parallel Computing Works
Synchronous: Data Parallel Tightly coupled and software needs to exploit features of problem structure to get good performance. Comparatively easy as different data elements are essentially identical.
Loosely Synchronous:
  • As above but data elements are not identical. Still parallelizes due to macroscopic time synchronization.
Asynchronous:
  • Functional (or data) parallelism that is irregular in space and time. Often loosely coupled and so need not worry about optimal decompositions to minimize communication. Hard to parallelize (massively) unless ....
Embarrassingly parallel:
  • Essentially independent execution of disconnected components. (can involve reductions)
Metaproblems
  • Asynchronous collection of (loosely) synchronous components where these programs themselves can be parallelized

HTML version of Basic Foils prepared April 7,1995

Foil 26 HPF+: Extensions to HPF -- Use name HPF+ so don't predjudice "official" HPF2

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
We know the following areas MUST be addressed for a really useful HPF compiler
We roughly know how to do all of this and why
Support for irregular data distributions
  • unstructured, block-structured, adaptive...
  • Ab inito (static) partitioners
  • Dynamic including incremental partitioners
Enhancements to Forall and similar constructs
Sparse and tree data structures
Interface/Coupling with other languages (e.g., C++)
Parallel I/O
Heterogeneous systems
Some sort of support of or interface with task or perhaps control parallelism

HTML version of Basic Foils prepared April 7,1995

Foil 27 Original Classification used in Planning CRPC (Maryland Rice Syracuse) HPF extensions

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index

HTML version of Basic Foils prepared April 7,1995

Foil 28 What can current HPF Language Surely do?

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Synchronous Problems where
Data Structure is an array manipulated
  • by intrinsic ( shift, multiply, add, sum .... )
  • and forall
Regular grid partial differential equation solvers
  • e.g. finite difference
"Crystalline" (regular) particle dynamics or Monte Carlo
  • QCD
  • Spin Systems
These above two are local or nearest neighbor - use CSHIFT
Full matrix Algorithms (may need MIMD for good performance)
FFT
Low Level Image Processing
  • SIMD architecture fine

HTML version of Basic Foils prepared April 7,1995

Foil 29 HPF can also do the synchronous

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
O(N2) Particle Dynamics
  • Basic Algorithm: The particles are rotated in a clockwise fashion
  • Represented in HPF using forall or CSHIFT
  • Optimization required for combining several shifts together to achieve efficiency in forall implementation
  • Difficult to optimize the CSHIFT Implementation

HTML version of Basic Foils prepared April 7,1995

Foil 30 Current HPF can also do the Embarassingly Parallel

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
In Embarrassingly Parallel Problems, the basic entities can be very complicated but they can be calculated independently.
  • Use DO INDEPENDENT command in HPF
Search databases for given text (can be SIMD)
Analyze separately 10**7 events recorded from Synchrotron (SSC) collisions. ( definitely MIMD "farm" )
Calculate matrix elements for (chemistry) Hamiltonian (may need MIMD)
Programs such as MOPAC, GAUSSIAN
  • "Calculate matrix elements"
  • "Multiply/Solve/Find eigenvalues of matrix"
  • Similar issues in Computational Electromagnetics
Quantum Monte Carlo
  • Independent "walkers"
  • Parallel Histogram (database) storage (reduction again)

HTML version of Basic Foils prepared April 7,1995

Foil 31 Difficult but (almost) possible for HPF

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Molecular Dynamics
  • Cutoff (bonded) forces (Fij = 0, |ri-rj|>d) and long range (O(N2) non- bonded) forces
Basic steps in the algorithm:
  • 1) calculate the non-bonded interaction list. This is done by N2 algorithm. The efficient algorithms based on spatial decomposition are not easily represented in HPF.
  • 2) Iterate several times
    • Calculate bonded interactions
    • Calculate non-bonded interactions
  • Both of the above steps can be represented in HPF
  • Extensions needed:
    • Specification of Partitioner
    • (and potentially incremental partitioners)
    • Extensions of Forall for performing reduction
Unstructured Finite Element Meshes
  • Data structure involves array of pointers
  • Use Forall extended for performing reductions
  • Interface with static (and incremental) partitioners for optimized performance
Performance of SIMD or Autonomous SIMD (Maspar) versus MIMD unclear for these irregular applications

HTML version of Basic Foils prepared April 7,1995

Foil 32 HPF can express Region Growing in Image Processing

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Basic Steps:
  • 1) Split phase: regular and can be done in HPF
  • 2) Merge phase: can be performed in HPF. However efficiency depends on the image (random tie breaking to improve the efficiency)
    • Extensions needed for a Reduce in forall.
    • Runtime support for Dynamic Partitioning and Movement of nodes to improve efficiency
Optimization not trivial - for instance on the CM-5
  • CMFortran version of algorithm 10 times slower than "completely equivalent" Fortran+message passing
  • Not certain why - we will use commercial HPF compilers to investigate
  • SIMD implementation reasonable

HTML version of Basic Foils prepared April 7,1995

Foil 33 HPF can also express irregular domains seen near critical points of physical systems

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Cluster algorithms for Monte Carlo spin systems
This is a very similar problem to region finding in image processing
Dominant problem is component labelling - can be done using "scan" -- intrinsic library support
Data structure is an array with a
difficult (irregular) parallel
Algorithm hidden in a library call.

HTML version of Basic Foils prepared April 7,1995

Foil 34 Swendsen-Wang clusters (boundaries shown in black) for 3 state Potts model at Tc

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index

HTML version of Basic Foils prepared April 7,1995

Foil 35 Significant improvement in HPF needed but seems possible for Particle in the Cell

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Multiple phase problems such as
  • Particle in cell (PIC)
  • (Un)structured multigrid
Needs to express "alignment" between phases (different levels of multigrid, particles and grid for PIC) and then optimize distribution and redistribution between phases.
Irregular problems of this type are only suitable for MIMD.
Very irregular cases with mismatches between phases perform poorly even on MIMD.

HTML version of Basic Foils prepared April 7,1995

Foil 36 Significant improvement in HPF needed but seems possible for .. (contd)

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Particle in the Cell -- Basic steps of algorithm
  • 1) Assignment Phase: For each particle assign the charge and current density to grid point:
    • Can be represented in HPF.
  • Distribution of Grid and Particle have to be specified.
    • Grid is partitioned using regular decomposition
    • Particles using Coordinate Bisection
  • 2) Solve Maxwell equations on the grid.
    • Regular problem can be specified in HPF
  • 3) Find Force on each particle using interpolation.
    • (Here the geometry has to be built in the algorithm!)
    • Can be specified in HPF using Extensions to Forall for reduction
  • 4) Move the particle
Irregularly Coupled Regular Meshes (ICRM) -- Basic Algorithm
  • 1) Iterate over each of the meshes independently on a subset of processors.
    • HPF extensions for specifying subset of processors.
  • 2) Interactions between meshes
    • Block structured runtime support

HTML version of Basic Foils prepared April 7,1995

Foil 37 Some very hard Loosely Synchronous Problems -- HPF Expression uncertain

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Direct methods for sparse matrix solvers
  • Hard to parallelize
O(N) and O(NlogN) fast multipole methods for particle dynamics
  • Data structure is an adaptive tree aligned with underlying spatial topology - hard to parallelize but performance is wonderful
  • Definitely need MIMD to get good performance
Original message passing code (Salmon thesis) won "Intel Delta Application Prize"
  • Impossible in HPF or extension
Current version (Warren and Salmon) looks possible with extended PARTI
Interesting Parallel C++ work

HTML version of Basic Foils prepared April 7,1995

Foil 38 Large N-Body Calculations (Quinn, Salmon, Warren)

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Clustering algorithm (Appel, Barnes-Hut, Greengard)
Can replace M body force calculation by one using center of mass of cluster
  • Naive calculation has complexity 0.5 N2 t2 particle
  • Clustering has complexity (20-50) N log N t2 particle and is better when N >1000
Can do O(N=10,000) particles with O(N2) algorithm
Three orders of magnitude larger problem possible with clustering algorithm

HTML version of Basic Foils prepared April 7,1995

Foil 39 10,000 Body Barnes-Hut Tree

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index

HTML version of Basic Foils prepared April 7,1995

Foil 40 8M bodies - 10 Mpc diameter Final state with ~700 resolved "galaxies" (Warren, Quinn, Zurek)

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index

HTML version of Basic Foils prepared April 7,1995

Foil 41 The Largest "Galaxy" Halo with 137,000 particles taken from 8.8 Million particle simulation of Warren,Fullagar,Quinn and Zurek

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index

HTML version of Basic Foils prepared April 7,1995

Foil 42 Speed Up on nCUBE of Parallel Barnes Hut Algorithm

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index

HTML version of Basic Foils prepared April 7,1995

Foil 43 Final Summary of Problem and Software Architectures

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index

HTML version of Basic Foils prepared April 7,1995

Foil 44 What determines when Parallelism is Clear ?

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Key idea behind data-parallel languages - and perhaps all good languages
The language expresses problem and not the machine architecture
Need different approach to express functional parallelism
Use"object-oriented" feature when problems has natural objects
Do not use "object-oriented" features when objects are artifacts of target machine
Need data and functional (object) parallel paradigms in many problems - especially multidisciplinary

HTML version of Basic Foils prepared April 7,1995

Foil 45 The map of Problem ---> Computer is performed in two or more statges

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Architecture of "virtual problem" determines nature of language

HTML version of Basic Foils prepared April 7,1995

Foil 46 The Mapping of Space of Problem Architectures onto Space of Machine Architectures

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index

HTML version of Basic Foils prepared April 7,1995

Foil 47 We can divide problems and machines into interconnected modules - and we do this at different granularities - I

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
1)High Performance Low Latency Links
  • For machine this corresponds to classic MPP such as Paragon or Cray T3D
  • For problem, this for example we take QCD
    • Components are Gluon fields
    • Links are all local
Software model in this case is High Performance Fortran or
Fortran plus Message Passing

HTML version of Basic Foils prepared April 7,1995

Foil 48 We can divide problems and machines into interconnected modules - and we do this at different granularities - II

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
2) Flexible, "medium" performance links with high functionality. This case has examples:
For Machine: Network example is Hardware Bus
For Problem: Components are different programs in a complicated multidisciplinary design environment
For Software: Model is "Software bus" or "Distributed Computing"
  • (MACH communication)
Another software example is AVS, MOVIE (Syracuse) ..............

HTML version of Basic Foils prepared April 7,1995

Foil 49 Different Grain Sizes for MetaProblems and Interpreter

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Can divide a project into
  • "highly optimized" v. "flexible"
  • at different levels
HPF Compiler
  • Component is full HPF Program
  • Glue full programs together on bus
HPF Interpreter
  • Component is single MATLAB/APL/HPF array statement
  • Glue data parallel coarse grain array statement and "function" calls together

HTML version of Basic Foils prepared April 7,1995

Foil 50 Software Integration -- Support of Coarse Grain Tasks (Metaproblems) using AVS

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index

HTML version of Basic Foils prepared April 7,1995

Foil 51 AVS as System Integration Tool

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Basic Features
  • dataflow model for distributed computing
  • AVS program = network of AVS modules, controlled by the AVS kernel via the RPC mechanism
  • AVS module = C/Fortran program, conforming to AVS interface specification
  • strong visualization and 3D graphics support
  • visual editing tools for AVS networks
Highlights
  • easy to use and to port existing applications to the AVS framework
  • growing popularity in research community
  • growing volume of public domain modules, maintained by the International AVS Center at the NCSC
Flaws
  • poor interactivity (best for complex, static visualization, but not for simulation or animation)
  • lack of organization principle to bookkeep dataflow modules (not an object-oriented system)
  • lack of support for High Performance Computing
  • not an Open Systems software (2 competitive commercial products: AVS, Explorer (SGI); 2 public domain systems: apE and Khoros)

HTML version of Basic Foils prepared April 7,1995

Foil 52 A distributed parallel computing environment using AVS module network

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index

HTML version of Basic Foils prepared April 7,1995

Foil 53 Case Studies in Integrating AVS into HPDC Applications -- Stock Option Pricing

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Stock Option Price Modeling
Data Parallel Pricing Models on CM5 and DECmpp-12000 (Maspar MP-1)
An Interactive Visualization Environment on Heterogeneous Computing System
The Distributed System Configuration and Integration
The Graphical User Interface
Discussion

HTML version of Basic Foils prepared April 7,1995

Foil 54 System Integration and Data Flows for financial modeling on a mix of Workstations, CM5 and Maspar

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index

HTML version of Basic Foils prepared April 7,1995

Foil 55 Option Price Modeling Screen Dump

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index

HTML version of Basic Foils prepared April 7,1995

Foil 56 Electromagnetic Simulation using AVS Screen Dump

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index

HTML version of Basic Foils prepared April 7,1995

Foil 57 Case Studies in Integrating AVS into HPDC Applications -- Electromagnetic Simulation

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Electromagnetic Scattering Simulation
The Simulation Model and Physical Parameters
Major Computational Issues
  • Task Decomposition
  • Selection of Simulation Parameters
  • GUI Design
  • System Integration and Concurrent Control
  • Interstage Communications
  • Performance Requirement Analysis
A General Performance Model of the Remote Visualization Environment
EMS Subtasks and System Implementation
The GUI and Inter-module communication
Discussion

HTML version of Basic Foils prepared April 7,1995

Foil 58 Physical Problem and Domains studied in Electromagnetic Simulation

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index

HTML version of Basic Foils prepared April 7,1995

Foil 59 Mapping of Electromagnetic Simulation onto MetaComputer

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index

HTML version of Basic Foils prepared April 7,1995

Foil 60 Setup for Computational Electromagnetic AVS Simulation

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Basic equations calculate matrix elements and then apply linear solvers

HTML version of Basic Foils prepared April 7,1995

Foil 61 Data Assimilation -- NASA Grand Challenge Kalman Filters to combine weather models and data

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index

HTML version of Basic Foils prepared April 7,1995

Foil 62 AVS Distributed Computing Setup for Data Assimilation

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index

HTML version of Basic Foils prepared April 7,1995

Foil 63 Summary of Outstanding Issues in Programming Paradigms

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index

HTML version of Basic Foils prepared April 7,1995

Foil 64 Integrating Role of ANDF and possible HPANDF

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Discussion of HPF is "general discussion of data parallelism"
HPC++ uses data parallelism from HPF
Parallelism from problem not language !

HTML version of Basic Foils prepared April 7,1995

Foil 65 Some Issues in Programming Paradigms

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Distinguish
  • Metaproblem Expression
    • AVS certainly
    • PVM maybe
  • Metacomputer Expression
    • MPI certainly
    • PVM maybe
Data parallel
  • Metaproblem - High Performance Fortran
  • Metacomputer - Fortran + (C)MPI

HTML version of Basic Foils prepared April 7,1995

Foil 66 Some Different Approaches for Software Coordination

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
AVS - Coarse grain dataflow and commercial support and many image processing modules
Linda - Shared database accessed by messaging and commercial support
Speedes - Event driven simulation i.e. temporal synchronization
PCN, C++, Fortran-M, ...
  • - More flexible general approaches
  • - Harder to use and standardize ?

HTML version of Basic Foils prepared April 7,1995

Foil 67 Questions in Comparison of AVS and PVM

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
What is Relation of AVS
  • (Coarse grain message driven data flow with visual interface
and PVM ?
  • (more general but lower level message passing)
What is Relation of PVM and
  • (PC)MPI Current "standard" parallel computing message passing interface
  • (DC)MPI Future "standard" distributed computing message passing interface
  • Multimedia Standards for National Information Infrastructure
???????

HTML version of Basic Foils prepared April 7,1995

Foil 68 Software Integration Questions?

From Programming Paradigms B CRPC/MCNC Workshop -- April 10-13 1995. *
Full HTML Index
Should we build Integrated Systems
Fortran-M ==> HPF-M
or keep subsystems with distinct functionally
HPF + AVS
or
HPF + PVM
data task

Northeast Parallel Architectures Center, Syracuse University, npac@npac.syr.edu

If you have any comments about this server, send e-mail to webmaster@npac.syr.edu.

Page produced by wwwfoil on Sun Feb 22 1998