The Fortran 90D/HPF compiler is organized around several major units: parsing the language, partitioning data and computation, detecting communication and generating code.
The compiler transforms the data distribution specifications found in the Fortran 90D/HPF source (decomposition, distribute, and align directives) into predefined mathematical distribution functions that determine how data is partitioned across the distributed memory system. We developed an algorithm that compiles align directives so as to minimize communication. The compiler first maps the data onto abstract processor grids, then maps each processor grid efficiently onto the underlying hardware topology, so that the generated code depends less on that topology.
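To make the notion of a distribution function concrete, the following is a minimal sketch of the two textbook cases, block and cyclic, for a one-dimensional array. The function names, the 0-based indexing, and the ceiling-based block size are assumptions made for illustration; they are not taken from the compiler itself.

```python
def block_owner(i, n, p):
    """Map global index i of an n-element array to (processor, local index)
    under a block distribution over p processors (0-based, assumed)."""
    b = -(-n // p)  # block size = ceil(n / p)
    return i // b, i % b


def cyclic_owner(i, n, p):
    """Map global index i to (processor, local index) under a cyclic
    distribution over p processors (0-based, assumed)."""
    return i % p, i // p


def block_global(proc, local, n, p):
    """Inverse of block_owner: recover the global index."""
    b = -(-n // p)
    return proc * b + local


def cyclic_global(proc, local, n, p):
    """Inverse of cyclic_owner: recover the global index."""
    return local * p + proc


if __name__ == "__main__":
    n, p = 10, 4
    for i in range(n):
        print(i, block_owner(i, n, p), cyclic_owner(i, n, p))
```

Each pair of functions is a bijection between global indices and (processor, local index) pairs, which is what lets generated node code translate between the global name space of the source program and the local storage on each node.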
The compiler must recognize the presence of communication
patterns in the computations in order to generate appropriate
communication calls. Specifically, this involves a number of tests
on the relationships among subscripts of various arrays in a statement.
We designed an algorithm to detect communications and to generate
appropriate collective communication calls to execute array assignment
and statements on distributed memory machines.
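The kind of subscript test described above can be sketched as follows. This is only an illustration under stated assumptions: both arrays are assumed to be identically aligned and distributed along the dimension being examined, subscripts are restricted to a constant or an affine function of one loop index, and the `Subscript` type and the pattern names ("none", "shift", "broadcast", "transfer", "general") are hypothetical rather than the compiler's own vocabulary.

```python
from typing import NamedTuple, Optional, Tuple


class Subscript(NamedTuple):
    """Affine subscript coef*var + offset; var is None for a constant,
    in which case offset holds the constant value (hypothetical type)."""
    var: Optional[str]
    coef: int = 1
    offset: int = 0


def classify(lhs: Subscript, rhs: Subscript) -> Tuple[str, int]:
    """Compare the left- and right-hand-side subscripts of one dimension
    and return (pattern, shift distance)."""
    if lhs.var is None and rhs.var is None:
        # A(c1) = B(c2): local if the constants match, else point-to-point.
        return ("none", 0) if lhs.offset == rhs.offset else ("transfer", 0)
    if rhs.var is None:
        # A(i) = B(c): every owner of A needs the same element of B.
        return ("broadcast", 0)
    if lhs.var == rhs.var and lhs.coef == rhs.coef:
        # A(i) = B(i + d): aligned when d == 0, otherwise a regular shift.
        d = rhs.offset - lhs.offset
        return ("none", 0) if d == 0 else ("shift", d)
    # Anything else is treated as an unstructured pattern in this sketch.
    return ("general", 0)


if __name__ == "__main__":
    i = Subscript("i")
    print(classify(i, Subscript("i", 1, 2)))
    print(classify(i, Subscript(None, offset=5)))
```

A structured result such as "shift" or "broadcast" can be served by a single collective call, whereas the "general" case would need a more expensive, less structured exchange.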
The Fortran 90D/HPF compiler relies on a powerful runtime support system. The compiler replaces some of the explicit parallelism with calls to the parallel runtime system. The runtime support system consists of functions which can be called from the node programs of a distributed memory machine. We developed an easy-to-use interface to the runtime system.
Our compiler performs several types of communication and
computation optimization to maximize the performance of the
generated code.
The communication optimizations fall into eight classes:
Communication Hierarchy, Vectorized Communication,
Message Aggregation, Evaluating Expression,
Communication Parallelization, Communications Union,
Eliminate Unnecessary Communications, and
Reuse of scheduling information. In addition, we developed
several computation optimizations for the sequentialization of
statements: Dependency, Loop Interchange, and
Mask Insertion.
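As an illustration of why vectorizing communication matters, the sketch below counts the messages needed for a shifted assignment of the form A(i) = B(i+d) on block-distributed arrays, first with one message per off-processor element and then with all elements for the same processor pair combined into one message. This follows the usual meaning of message vectorization and is an assumption about the optimization named above, not a description of the compiler's implementation; the function name and the 0-based, non-negative-shift setup are hypothetical.

```python
from collections import defaultdict


def shift_messages(n, p, d):
    """Count messages for A(i) = B(i + d), 0 <= i < n - d, with both arrays
    block-distributed over p processors (d >= 0 assumed).

    Returns (elementwise count, vectorized count, elements per processor pair).
    """
    b = -(-n // p)  # block size = ceil(n / p)
    per_pair = defaultdict(list)
    for i in range(n - d):
        dst = i // b          # owner of A(i), which performs the assignment
        src = (i + d) // b    # owner of B(i + d)
        if src != dst:
            per_pair[(src, dst)].append(i + d)
    elementwise = sum(len(v) for v in per_pair.values())  # one message each
    vectorized = len(per_pair)                            # one per pair
    return elementwise, vectorized, dict(per_pair)


if __name__ == "__main__":
    elementwise, vectorized, detail = shift_messages(16, 4, 2)
    print(elementwise, vectorized)
    print(detail)
```

On machines where message startup cost dominates, sending one combined message per processor pair instead of one per element is the main source of the saving.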
Some of these optimizations are validated with an example.
Empirical measurements show that, for real-world application programs, the performance of the code produced by the Fortran 90D/HPF compiler is comparable to that of the corresponding hand-written codes on the Intel iPSC/860 and Paragon.
We have published the absolute execution times of our benchmarks as an indication of our confidence in the performance of the code generated by the compiler. We believe that our Fortran 90D/HPF compiler greatly improves programmer productivity. Fortran 90D/HPF programs are shorter, easier to write, and easier to debug than programs written in Fortran 77 with message passing. We have also found that Fortran 90D/HPF makes it much easier to tune nontrivial programs.