Next: Machine and Problem Up: No Title Previous: Problem Architectures

Some Software and Machine Issues

Naturally parallel implementations work ``best'' if the machine architecture is ``similar'' to that of the problem. This is summarized in Table 5 where to be precise, success requires that the machine architecture ``contains'' (is a superset of) the problem architecture. Thus, both SIMD and MIMD machines express synchronous problems, but SIMD machines are typically unsuitable for loosely synchronous problems.

Table 5: What is the ``Correct'' Machine Architecture for each Problem Class?

Software systems need to be designed so that they can express problems well, and be targeted to relevant machines. Software should not be designed for a particular machine model---it expresses problem and not machine characteristics.

Table 6: Candidate Software Paradigms for Each Problem Architectures

We have described those issues at length in [Fox:90p;91g;94a], and here we will just present a simple table (Table 6) mapping the five problem architectures into possible software environments. This is presented in a different fashion for HPF and HPC++ in Figure 9 and Table 7, which also points out the distinct runtime support needed for each problem class. One always has a tradeoff between performance and flexibility. Systems listed under ``asynchronous'' in Table 6 can typically also be used for synchronous and loosely synchronous problems. As shown in Figure 10, the ``asynchronous'' software used on loosely synchronous problems will probably provide greater flexibility, but lower performance than software systems explicitly designed for this problem class.

Table 7: Imprecise Mapping of Problem Classes into Runtime and Language Terms

Figure 9: General Applicability of HPF, HPF+, HPC++ Classified by Problem Architecture and type of Runtime Support needed

Figure 10: Mapping of Asynchronous, Loosely Synchronous, and Synchronous Levels or Components of Machine, Software and Problem. Each is pictured hierarchically with the asynchronous level at the top and synchronous components at lowest level. Any one of the components may be absent.

Loosely synchronous problems are in some sense the hardest as they have difficult irregularities which must be expressed with high efficiency by the underlying compiler and runtime systems. We, and others, have discussed this at length, both in general [Choudhary:92d;92e], [Fox:90p], [Goil:94a;95a],

, and in case of High Performance Fortran [Bogucz:94a], [Chapman:94b], [Cheng:94e], [Choudhary:92g;94c], [Fox:94g], [Hawick:95a;95c], [HPF:94a], [HPFapp:95a], [Joubert:95a], [Muller:95a], [Robinson:95a], [Sturler:95a].

Note that Figure 9 refers to ``HPF+''---this is some extension, called officially HPF2 (and later 3 perhaps) of HPF [HPF:93a], [HPFF:95a] to fill gaps in the original language. The current HPF1 handles most synchronous and embarrassingly applications, but requires extension to handle the adaptive irregular data structures typical of loosely synchronous problems.

We now quantify these remarks with three case studies, which will link the material of Sections 2,3, and 4.

Next: Machine and Problem Up: No Title Previous: Problem Architectures

Geoffrey Fox, Northeast Parallel Architectures Center at Syracuse University, gcf@npac.syr.edu