
Scalability

Distributed-memory computers are characterized by their scalable architectures: they can be expanded to achieve a proportional performance increase without changing the basic architecture. To take full advantage of scalable hardware, the software must also scale so that it can exploit the added computing capability. This section presents benchmark results illustrating that the Fortran 90D/HPF compiler generates scalable code for scalable distributed-memory machines.
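Scalability of this kind is usually quantified with speed-up and parallel efficiency. The following Python sketch is illustrative only: the function names and the timing values are hypothetical and are not measurements from the benchmarks in this section.

```python
def speedup(t_serial, t_parallel):
    """Speed-up S(p) = T(1) / T(p)."""
    if t_parallel <= 0:
        raise ValueError("parallel time must be positive")
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, num_procs):
    """Parallel efficiency E(p) = S(p) / p; values above 1.0 are super-linear."""
    return speedup(t_serial, t_parallel) / num_procs


# Hypothetical wall-clock times in seconds, keyed by processor count.
# These are made-up numbers for illustration, not benchmark results.
timings = {1: 80.0, 2: 40.0, 4: 20.0, 8: 12.5}

for p in sorted(timings):
    s = speedup(timings[1], timings[p])
    e = efficiency(timings[1], timings[p], p)
    print(f"p={p:2d}  speedup={s:5.2f}  efficiency={e:4.2f}")
```

A code scales well when the efficiency stays close to 1.0 as processors are added; an efficiency above 1.0 corresponds to super-linear speed-up.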

All of these benchmarks were run on a 15-processor Intel Paragon. The processors run at 50 MHz, and each node has 32 MBytes of physical memory. The programs were compiled with the Fortran 90D/HPF compiler with all optimizations turned on, including the i860 vectorizer, which exploits single-node parallelism.

The shallow water benchmark (shallow) is a small program (300 lines) that abstracts a 2-dimensional flow system. The data is distributed in block fashion in one dimension, (*, block). The generated code consists of many computations of order N^2, with communication consisting mostly of overlap-shifts of order N. The accompanying figure shows the performance of shallow. The super-linear speed-up on the large data set shows how the Fortran 90D/HPF compiler can make large problems more tractable simply through efficient use of the larger core memory available on a multi-processor system.
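The (*, block) distribution and the overlap-shift communication pattern can be pictured with a small single-process simulation. This is a minimal Python sketch, not the code the compiler generates: the helper names `block_bounds` and `overlap_shift` are hypothetical, processors are simulated as list slices, and a one-column overlap area is assumed.

```python
def block_bounds(n, num_procs, rank):
    """Half-open column range [lo, hi) owned by `rank` when n columns
    are distributed in blocks over num_procs processors."""
    block = -(-n // num_procs)          # ceiling division gives the block size
    lo = min(rank * block, n)
    hi = min(lo + block, n)
    return lo, hi


def overlap_shift(grid, num_procs):
    """For each simulated processor, return its owned columns plus one
    ghost (overlap) column copied from each neighbour that exists."""
    n_cols = len(grid[0])
    local_views = []
    for rank in range(num_procs):
        lo, hi = block_bounds(n_cols, num_procs, rank)
        ghost_lo = max(lo - 1, 0)        # boundary column of the left neighbour
        ghost_hi = min(hi + 1, n_cols)   # boundary column of the right neighbour
        local_views.append([row[ghost_lo:ghost_hi] for row in grid])
    return local_views


# An 8 x 8 grid whose entries encode their own (row, column) position.
grid = [[r * 8 + c for c in range(8)] for r in range(8)]
views = overlap_shift(grid, 4)
print(block_bounds(8, 4, 1))   # columns owned by processor 1
print(views[1][0])             # its first row, including ghost columns
```

Assuming an N x N grid, each overlap-shift moves one column of N elements per neighbour, while each processor updates a block of roughly N*N/P elements, so the ratio of communication to computation shrinks as N grows.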

The partial differential equation benchmark (pde1) is a small program (360 lines) from the Genesis Parallel Benchmark Suite that implements a 3-D Poisson solver using red-black relaxation for five iterations. The accompanying figure shows the performance of pde1. The data is distributed in block fashion in one dimension, (*,*,block). The benchmark exhibits good scalability. The communication consists mostly of overlap-shifts arising from the stencil computations of pde1.
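To show why a stencil computation like this needs only neighbouring values, here is a minimal serial sketch of red-black relaxation in Python. It is not the pde1 source: it assumes a 7-point stencil with a Gauss-Seidel style update, fixed boundary values, and a hypothetical test problem; the names `red_black_sweep` and `make_grid` are illustrative.

```python
def make_grid(n, value=0.0):
    """Return an n x n x n nested list filled with `value`."""
    return [[[value for _ in range(n)] for _ in range(n)] for _ in range(n)]


def red_black_sweep(u, f, h):
    """One red-black sweep for the 7-point discretisation of
    laplacian(u) = f on a cubic grid; boundary values of u stay fixed."""
    n = len(u)
    for colour in (0, 1):                     # 0 = red points, 1 = black points
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                for k in range(1, n - 1):
                    if (i + j + k) % 2 != colour:
                        continue
                    neighbours = (u[i - 1][j][k] + u[i + 1][j][k] +
                                  u[i][j - 1][k] + u[i][j + 1][k] +
                                  u[i][j][k - 1] + u[i][j][k + 1])
                    u[i][j][k] = (neighbours - h * h * f[i][j][k]) / 6.0
    return u


# Hypothetical test problem: f = 0 with boundary values of 1.0,
# whose exact solution is 1.0 everywhere.
n = 6
u = make_grid(n)
f = make_grid(n)
for i in range(n):
    for j in range(n):
        for k in range(n):
            if 0 in (i, j, k) or n - 1 in (i, j, k):
                u[i][j][k] = 1.0

for _ in range(5):                            # five iterations, as in pde1
    red_black_sweep(u, f, 1.0 / (n - 1))
print(u[2][2][2])                             # interior value moving towards 1.0
```

Every point of one colour depends only on points of the other colour, so all points of a colour can be updated in parallel. Under a block distribution, only the planes adjacent to a block boundary have to be fetched from a neighbouring processor, which is the overlap-shift pattern described above.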

The hydflo benchmark is a small hydrodynamics program (2000 lines). The accompanying figure shows the performance of hydflo. The data is distributed in block fashion in one dimension, (*,*,block). The benchmark exhibits good scalability. The communication consists mostly of copy-section and collective-communication operations.

As the data show, benchmark programs written in Fortran 90D/HPF achieve reasonable efficiency when the problem is large enough, and the figures show reasonably good scalability as the number of processors increases.



zbozkus@
Thu Jul 6 21:09:19 EDT 1995