
3. Usability and Engineering in the Petaflops Programming Environment

We are perhaps pessimistic, but we see no breakthroughs in usability. We expect data parallelism and message passing to remain the dominant forms of parallelism. We suggest that the "data parallel" approach could evolve with a different emphasis. One can view HPF as the language of identical operations on array elements, a limited concept because many parallel applications cannot be expressed well in this fashion. Rather, we view HPF as a high-level language that, through intrinsic functions, gives access to a library of carefully tuned parallel algorithms. We believe this second view of HPF is the most promising and generalizable. Thus, our suggested usability model relies on greater use of interoperable parallel libraries. We assume that interpreted and/or graphical interfaces, such as APL, Matlab, or even Visual Basic/VJ++, are nearer to the desired implementation than current HPF. In a layered view of the PetaSoft environment, we have already mentioned the machine view, with levels of the memory hierarchy supported by data movement software that allows "escaping" to lower levels. There is also a user's layered view, described below (see Figure 2).
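The two views of HPF contrasted above can be illustrated with a minimal sketch. It is written in Python rather than HPF, and every name in it (`elementwise`, `tree_sum`, `INTRINSICS`, `intrinsic`) is hypothetical; the pairwise summation merely stands in for a carefully tuned parallel routine that a real library would supply.

```python
def elementwise(f, xs):
    """Data parallel view: the same operation applied independently to each element."""
    return [f(x) for x in xs]


def tree_sum(xs):
    """Stand-in for a tuned library reduction: pairwise (tree) summation.

    A real library would run each round of pairwise additions in parallel;
    here the rounds are executed sequentially for illustration only.
    """
    vals = list(xs)
    if not vals:
        return 0
    while len(vals) > 1:
        paired = [vals[i] + vals[i + 1] for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2:  # carry the unpaired last element into the next round
            paired.append(vals[-1])
        vals = paired
    return vals[0]


# Hypothetical table of "intrinsics": names the user calls, backed by library code.
INTRINSICS = {"SUM": tree_sum, "MAXVAL": max}


def intrinsic(name, data):
    """Library view: the user names an algorithm and the library supplies it."""
    return INTRINSICS[name](data)


squares = elementwise(lambda x: x * x, range(1, 9))
total = intrinsic("SUM", squares)
print(squares, total)
```

In this sketch the user never sees how `SUM` is implemented, so the entry in the table could be replaced by a different algorithm without changing user code. That substitutability is the property the library view depends on.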

 


Figure 2: The user's layered view of the programming environment, with levels a) to d) as listed below.

a) Fully visual or scripted (interpreted) environment exhibiting domain specific functionality

This is optimized for user interface and only offers coarse grain access to capabilities in the fashion of AVS.

b) Partially scripted level

Offering portable, flexible programming at some performance cost.

c) Traditional compiled level

Offering a high-level language with few machine dependent features, and getting high performance - traditionally within about a factor of two of the peak performance possible on the particular algorithm.

d) Traditional machine specific level

Rarely used by application programmers or even by those building (high-level) tools. Clearly, this level allows the user to obtain peak performance at the cost of a very inconvenient programming environment.

Examples of a) in different domains are Matlab (linear algebra, signal processing, etc.), JavaScript (document display), AVS (coarse grain dataflow), the UNIX shell, and the growing number of Web interfaces. Level b) is illustrated by Java in Applet mode and Perl. Level c) is C, C++, Fortran, or compiled Java on sequential machines, and such languages plus message passing or HPF on parallel machines.

Looking at Petaflops machines, we see a critical problem: there is no natural correspondence between the hierarchy of machine levels (the virtual machine) and the hierarchy of problem levels (the virtual problem). The machine architecture affects programming levels b), c) and d). The traditional compiled level c) is the hardest, with a key goal of designing a high-level language with minimal machine specific features: the analogues of HPF in the data parallel paradigm or of MPI calls in the message passing paradigm. I believe it is unknown what performance degradation (the hoped-for worst case being a factor of two) will be obtained on Petaflops architectures, and with what user "directives" to optimize parallelism, memory hierarchy, or both. In fact, whereas it is almost certain that we can construct the machine specific level d), it is not so obvious that the traditional high-level approach, c), will deliver effective performance. The "fundamental" research program outlined earlier is largely aimed at understanding possible approaches to levels c) and d).

The highest levels, a) and b), are less sensitive to the Petaflops architectures for several reasons. A simple observation is that software at these high levels will often run on a client machine, which is by definition of "conventional" architecture (of course, some "Petaflops" architectures are natural extrapolations of "conventional" architectures). The machines supporting levels a) and b) would be responsible for visualization, with the user coding customized Java applets to analyze and display results computed on a Petaflops machine. We believe such a model is reasonable in general, and that implementation of levels a) and b) will be needed independent of the Petaflops initiative. However, there are two interesting points that link these high levels to Petaflops systems. First, the possible difficulty of designing and implementing a user friendly, powerful compiled level c) suggests that it may be particularly important to develop levels a) and b) into a relatively complete high-performance environment. This environment would access optimized libraries written by experts using relatively machine specific software at levels c) and d). This growing reliance on runtime libraries seems to me one of the few promising approaches to future HPCC software. Note that levels d) and c) are not "assemblers" and "compilers," but rather "machine specific" and "largely architecture independent."
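As a rough illustration of a high-level environment reaching expert-written libraries, the following Python sketch keeps one portable routine and one machine-tuned stand-in behind a single call. All names (`dot_portable`, `dot_blocked`, `BACKENDS`, the machine label `"hypothetical-petaflops"`) are assumptions made for this example, and the blocking merely imitates the kind of memory-hierarchy tuning an expert might apply.

```python
def dot_portable(a, b):
    """Largely architecture independent version: a plain dot product."""
    return sum(x * y for x, y in zip(a, b))


def dot_blocked(a, b, block=4):
    """Stand-in for a machine specific version.

    The data is processed in fixed-size blocks, imitating tuning for one
    level of a memory hierarchy. The block size is an arbitrary assumption.
    """
    total = 0
    for start in range(0, len(a), block):
        stop = start + block
        total += sum(x * y for x, y in zip(a[start:stop], b[start:stop]))
    return total


# Hypothetical registry mapping a machine label to the expert-written routine.
BACKENDS = {
    "generic": dot_portable,
    "hypothetical-petaflops": dot_blocked,
}


def dot(a, b, machine="generic"):
    """High-level entry point: the user names the operation, not the machine code."""
    impl = BACKENDS.get(machine, dot_portable)  # fall back to the portable version
    return impl(a, b)


a = list(range(1, 11))
b = list(range(10, 0, -1))
print(dot(a, b), dot(a, b, machine="hypothetical-petaflops"))
```

Both calls return the same value; only the routine chosen behind the interface differs, which is what lets the upper levels stay unchanged as the machine changes.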

There is a second, perhaps very pessimistic, reason to link the design and implementation of levels a) and b) with the Petaflops initiative. Doing this programming environment "right" requires, we think, a break with the past. However, there will be a natural tendency to "evolve" existing approaches in the commercial mainstream. Thus, we see the Petaflops initiative as an opportunity to build a "new generation" programming environment, or HPCC-NG, of the type described earlier.

The success of HPCC-NG will largely depend on good engineering, from adequate funding to careful design. Our personal prejudice would be to build HPCC-NG in terms of Web technologies: linked Web servers and clients with excellent support for Java as both a primary programming language and a wrapper for other languages. Although much research and experimentation is needed to fill in the details of a Web technology-based HPCC-NG, enough is understood to scope out and start such an initiative. The overall framework would be a coarse grain software integration environment (called WebFlow by me in the past), of which the principles are clear. We have started a community initiative to understand the harder and, in our opinion, more profound issues associated with the use of Java as the basic language for science and engineering simulations. However, as massive (Petaflops!) parallelism issues in Java will (as far as we can see now) be quite similar to those in C++ and Fortran, we do not think that one needs to understand or even believe in Java as a computational science language to initiate design and implementation of HPCC-NG.
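The idea of coarse grain software integration can be sketched as a small dataflow graph whose nodes are whole modules rather than individual operations. This is not WebFlow's interface, which is not described here; the function `run_dataflow` and the module names (`load`, `simulate`, `analyze`, `display`) are hypothetical, and Python is used purely for brevity.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_dataflow(modules, wires, sources):
    """Run coarse grain modules in dependency order.

    modules: module name -> callable
    wires:   module name -> list of upstream module names feeding it
    sources: module name -> external input, for modules with no upstream
    """
    results = {}
    for name in TopologicalSorter(wires).static_order():
        upstream = wires.get(name, [])
        if upstream:
            args = [results[u] for u in upstream]
        else:
            args = [sources[name]]
        results[name] = modules[name](*args)
    return results


# Hypothetical modules; in practice each could wrap a remote or compiled code.
modules = {
    "load": lambda n: list(range(n)),
    "simulate": lambda xs: [x * x for x in xs],
    "analyze": lambda xs: sum(xs) / len(xs),
    "display": lambda xs, mean: f"{len(xs)} points, mean {mean:.1f}",
}
wires = {
    "simulate": ["load"],
    "analyze": ["simulate"],
    "display": ["simulate", "analyze"],
}

results = run_dataflow(modules, wires, sources={"load": 5})
print(results["display"])
```

Each module here is a black box connected only by its inputs and outputs, so a module could in principle be reimplemented in another language or moved to another machine without touching the wiring.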



Geoffrey Fox, Northeast Parallel Architectures Center at Syracuse University, gcf@npac.syr.edu