Next: Fortran With Message Passing Up: No Title Previous: Parallel Computing

Parallel Languages

Using a parallel machine requires rewriting code written in standard sequential languages. We would like this rewrite to be as simple as possible, without sacrificing too much in performance. Parallelizing large codes involves substantial effort, and in many cases rewriting code more than once would be impractical. A good parallel language therefore needs to be portable and maintainable; that is, the code should run on effectively all current and future machines (at least those we can anticipate today). This means that the language should be scalable, so that it can work effectively on machines with anywhere from one to millions of processors. Portability also means that programs can be run in parallel over different machines across a network (distributed computing).

There are some completely new languages specifically designed to deal with parallelism, for example occam; however, none is so compelling that it warrants adoption in preference to adapting existing languages such as Fortran, C, C++, Ada, Lisp, Prolog, etc. This is because users have experience with existing languages, good sequential compilers exist and can be incorporated into parallel compilers, and migrating existing code to parallel machines is much easier. In any case, to be generally usable, especially for scientific computing, any new language would need to implement the standard features and libraries of C and Fortran. [7, 8, 9]

The purpose of software, and in particular computer languages, is to map a problem onto a machine. [10, 4] An application in computational science starts out with some physical problem, goes via theory to a model of the physical process, then to an algorithm or numerical method for solving or simulating the model. This is then expressed in a high-level language targeted at a virtual computer, and finally implemented by a compiler and systems software on a real computer. Some information is lost in each of these steps in translating the problem to the machine. The goal of good software should be to make this translation as simple as possible, and to minimize the loss of information.

A drawback of current software is that it is often designed around the machine architecture, rather than the problem architecture. Each class of problem architectures requires different general constructs from the software. It is possible for compilers to construct an approximate computational graph from a dependency analysis of sequential code (such as Fortran 77), and extract parallelism in this way; however, this is not usually very effective. In many cases the parallelism inherent in the problem will be obscured by the use of a sequential language or even a sequential algorithm. A particular application can be parallelized efficiently if, and only if, the details of the problem architecture are known. Users know the structure of their problems much better than compilers do, and can create their algorithms and programs accordingly. If the data structures are explicit, as in Fortran 90, then the parallelism becomes much clearer.
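As a minimal illustration of this point, consider an elementwise vector sum written first in Fortran 77 loop style and then with Fortran 90 array syntax. The loop forces the compiler to prove that the iterations are independent, while the array assignment states this independence directly:

```fortran
! Fortran 77 style: the loop prescribes an iteration order, so a
! compiler must analyze dependencies to discover that the iterations
! are in fact independent.
      do 10 i = 1, n
         a(i) = b(i) + c(i)
 10   continue

! Fortran 90 style: the array assignment says directly that every
! element may be computed independently, exposing the parallelism.
      a = b + c
```

Both fragments compute the same result; only the second makes the problem's parallel structure explicit to the compiler.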

Currently there are two language paradigms for distributed memory parallel computers: message passing and data parallel languages. Both of these have been implemented as extensions to Fortran and C. Here we will concentrate on Fortran.
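The contrast between the two paradigms can be sketched for the same vector sum, distributed across processors. In the data parallel style the program is written in terms of global arrays; in the message passing style each processor works on its local slice and exchanges data explicitly. The send and recv calls below are schematic placeholders, not routines from any particular library:

```fortran
! Data parallel style: one global array statement; the compiler and
! runtime distribute the work across the processors.
      a = b + c

! Message passing style: each processor owns nlocal elements and
! exchanges boundary values with its neighbors by explicit calls
! (send/recv here stand in for a real message passing interface).
      a(1:nlocal) = b(1:nlocal) + c(1:nlocal)
      call send(a(nlocal), right_neighbor)
      call recv(a(0),      left_neighbor)
```

The data parallel version keeps the sequential program structure, while the message passing version makes the distribution of data and communication fully explicit, a trade-off taken up in the following sections.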




Geoffrey Fox, Northeast Parallel Architectures Center at Syracuse University, gcf@npac.syr.edu