
2. Memory Hierarchy-Latency-Bandwidth-Geometry

Let us first discuss the ``fundamental'' issues of memory hierarchy, latency, bandwidth, and geometry. I believe these are clearest for the PIM architecture (one of the three architectures on our list) applied to classic ``geometric physical simulations.'' Most identified Petaflops applications are of this class, although they are more dynamic and irregular than today's problems of this kind. Here, machine and problem structure are well matched, and conventional geometric decompositions should be effective. For all architectures, it is important to study data movement in classic algorithms (conjugate gradient, multigrid, FFT, etc.) and to find efficient primitives. Although irregular dynamic problems will be technically harder to implement, I expect that study of the simpler regular problems will reveal the essential issues.

Good software for Petaflops machines will be built around efficient data movement primitives implemented in the native runtime. HPF provides an example of a set of primitives, but these are unlikely to be sufficient, except possibly for the PIM, because both HPF and MPI express just one level of memory hierarchy. As discussed at the PetaSoft meeting, research is needed on how to express the memory hierarchy and on the useful collective and point-to-point data movement and computation primitives (see Figure 1). This research should address geometric problems, problems such as convolutions (FFT) with ``long-range'' structured data movement, and the presumably rather different parallel database and Web-server-style applications.
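As a small illustration of what a data movement primitive for a geometric decomposition might look like, the following is a minimal sketch, not taken from HPF, MPI, or any Petaflops runtime. It simulates a one-dimensional block decomposition in a single process and uses a nearest-neighbor ``halo exchange'' as the only communication step. The function names (`split`, `halo_exchange`, `smooth_decomposed`) and the three-point averaging stencil are assumptions chosen for illustration.

```python
def split(grid, parts):
    """Block-decompose a 1D grid into `parts` contiguous, non-empty subdomains.

    Assumes parts <= len(grid).
    """
    n = len(grid)
    bounds = [n * p // parts for p in range(parts + 1)]
    return [list(grid[bounds[p]:bounds[p + 1]]) for p in range(parts)]


def halo_exchange(blocks, left_bc, right_bc):
    """Hypothetical data movement primitive: pad each block with one ghost
    cell per side, copied from the neighboring block or from the boundary."""
    padded = []
    for p, block in enumerate(blocks):
        left = blocks[p - 1][-1] if p > 0 else left_bc
        right = blocks[p + 1][0] if p < len(blocks) - 1 else right_bc
        padded.append([left] + block + [right])
    return padded


def smooth_local(padded):
    """Three-point average over the interior of one padded block (local work only)."""
    return [(padded[i - 1] + padded[i] + padded[i + 1]) / 3.0
            for i in range(1, len(padded) - 1)]


def smooth_decomposed(grid, parts, left_bc=0.0, right_bc=0.0):
    """One smoothing sweep: a single halo exchange, then purely local computation."""
    padded_blocks = halo_exchange(split(grid, parts), left_bc, right_bc)
    result = []
    for padded in padded_blocks:
        result.extend(smooth_local(padded))
    return result


def smooth_serial(grid, left_bc=0.0, right_bc=0.0):
    """Reference sweep on the undecomposed grid."""
    return smooth_local([left_bc] + list(grid) + [right_bc])


grid = [float(i * i % 7) for i in range(10)]
assert smooth_decomposed(grid, 3) == smooth_serial(grid)
```

In this sketch all communication is confined to `halo_exchange`, so the words moved per sweep (two per internal block boundary) are separated from the local arithmetic. It models only one level of decomposition; expressing several levels of memory hierarchy is exactly the open question raised above.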

 


Figure 1:

One area that may have promise is the extension of classic ``load balancing'' and data decomposition to memory management on Petaflops architectures. Many of the current powerful load-balancing algorithms can naturally deal with hierarchy in both the problem and the computer, from either a computational graph (with different levels of contraction) or a physical (different resolution) point of view. One may be able to base adaptive memory movement on such algorithms.
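To make the ``levels of contraction'' idea concrete, here is a minimal sketch of repeated graph contraction using a simple greedy edge matching. It is an illustrative stand-in, not one of the specific load-balancing algorithms referred to above; the names `contract` and `hierarchy` and the matching rule are assumptions.

```python
def contract(weights, edges):
    """One level of contraction.

    weights: dict mapping vertex -> work estimate
    edges:   iterable of (u, v) pairs
    Greedily merges each unmatched vertex with an unmatched neighbor and
    returns (coarse_weights, coarse_edges, coarse_of), where coarse_of maps
    each fine vertex to its coarse vertex.
    """
    coarse_of = {}
    next_id = 0
    for u, v in sorted(edges):
        if u not in coarse_of and v not in coarse_of:
            coarse_of[u] = coarse_of[v] = next_id
            next_id += 1
    for u in sorted(weights):          # unmatched vertices survive unchanged
        if u not in coarse_of:
            coarse_of[u] = next_id
            next_id += 1

    coarse_weights = {}
    for u, w in weights.items():       # work is conserved under contraction
        c = coarse_of[u]
        coarse_weights[c] = coarse_weights.get(c, 0) + w

    coarse_edges = {
        tuple(sorted((coarse_of[u], coarse_of[v])))
        for u, v in edges
        if coarse_of[u] != coarse_of[v]
    }
    return coarse_weights, coarse_edges, coarse_of


def hierarchy(weights, edges, levels):
    """Return [(weights, edges), ...] from the finest graph to the coarsest."""
    out = [(dict(weights), set(edges))]
    for _ in range(levels):
        w, e, _ = contract(*out[-1])
        out.append((w, e))
    return out


# Usage: an 8-vertex chain with unit work per vertex.
w = {i: 1 for i in range(8)}
e = [(i, i + 1) for i in range(7)]
for level, (lw, le) in enumerate(hierarchy(w, e, 3)):
    print(level, len(lw), sum(lw.values()))
```

Each coarse vertex could, under this assumption, be assigned to one unit at a given level of the machine (for example a node, then a memory bank within it), so that the same contraction hierarchy drives both work distribution and data placement.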

The PetaSoft meeting exposed the need for a layered software model that presents a coherent virtual machine at each level, but allows the user or system to ``escape'' into a lower, more complex layer when needed for either performance or functionality.
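The layering idea can be sketched with two toy classes. This is only an illustration under stated assumptions: `FlatArray`, `BlockedStore`, and the `escape` method are hypothetical names, and fixed-size blocks stand in for whatever units a real memory hierarchy would expose.

```python
class BlockedStore:
    """Lower layer: data held as explicit fixed-size blocks."""

    def __init__(self, data, block_size):
        self.block_size = block_size
        self.blocks = [list(data[i:i + block_size])
                       for i in range(0, len(data), block_size)]

    def read(self, block, offset):
        return self.blocks[block][offset]

    def block_sums(self):
        """Block-aware operation that the flat view does not offer."""
        return [sum(b) for b in self.blocks]


class FlatArray:
    """Upper layer: a flat virtual array that hides the blocking."""

    def __init__(self, data, block_size=4):
        self._store = BlockedStore(data, block_size)
        self._n = len(data)

    def __len__(self):
        return self._n

    def __getitem__(self, i):
        if not 0 <= i < self._n:
            raise IndexError(i)
        # Translate the flat index into (block, offset) for the lower layer.
        return self._store.read(*divmod(i, self._store.block_size))

    def total(self):
        return sum(self[i] for i in range(self._n))

    def escape(self):
        """Hand back the lower layer for callers that need explicit control."""
        return self._store


a = FlatArray(list(range(10)), block_size=4)
assert a[5] == 5 and a.total() == 45
assert a.escape().block_sums() == [6, 22, 17]
```

Most code would stay at the `FlatArray` level; only a performance-critical routine would call `escape()` and work with the blocks directly.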



Geoffrey Fox, Northeast Parallel Architectures Center at Syracuse University, gcf@npac.syr.edu