This section presents the parallel implementation of ParaMoM.
We discuss the CM-5 implementation in detail and briefly describe
the Intel and IBM implementations. The section follows the order of
the diagram in Figure 4.4.
In Section , the parallel implementation of the setup code is described
in detail, and the CMMD global synchronous broadcast I/O mode is introduced
to perform parallel I/O during setup.
In Section , a parallel precomputation algorithm is presented together
with its pseudocode. In Section , a data-parallel algorithm for filling
the moment matrix is given. The matrix fill is implemented in Fortran 77
with the CMMD message-passing library on the CM-5, in Fortran 77 with the
NX message-passing library or PVM on the Intel, and in Fortran 77 with PVM
on the IBM SP-1. In
Section , we discuss a flexible algorithm that computes the excitation
vectors either sequentially or in parallel, depending on the problem.
In Section , a data-parallel implementation of the Gaussian elimination
linear system solver is presented. In Section , the parallel implementation
of the RCS code is discussed and pseudocode for the parallel algorithm is
given. Finally, in Section , the parallel out-of-core fill algorithm is
presented.