Parallel
architectures have been studied, from the user's perspective, over a broad
range of problem classes [31][32][33].
In this section, we review only parallel implementations of the integral
equation approach to EM scattering problems.
The application of parallel processing architectures to electromagnetic
scattering has received attention since late in the last decade.
JPL/Caltech led the research on applying the Hypercube architecture; their
work is reported in
[34][35][36][37][38][39][40][41][42][43].
The first code they implemented was the Numerical
Electromagnetics Code (NEC-2), developed at Lawrence Livermore National
Laboratory. The NEC-2 code, which uses the wire-grid modeling approach, is
an extremely versatile, general-purpose, user-oriented
code. It can treat complex wire configurations that model either
surfaces or multi-wire antennas in the frequency domain. The code was
implemented on a JPL/Caltech Mark III Hypercube (Hypercube architecture is
discussed in chapter 4). The Mark III Hypercube's configuration consists
of 32
processing nodes [39];
each node has a pair of Motorola
68020 processors: one is the main application processor and the other
is the communication processor. A Motorola 68881 floating-point
coprocessor is added for floating-point operations, and a new
floating-point accelerator uses the Weitek chip set; each node has four
megabytes (Mbytes) of main
memory with 128 kilobytes (Kbytes) of cache memory.
The Mark III Hypercube delivers about
1 to 14 megaflops (Mflops)
per node in computation, 2.0 Mbytes per second per channel in
synchronous communication, and 0.5 Mbyte per second per channel in
asynchronous communication. The 32-node Mark III Hypercube can run
in-core cases consisting of up to 2400 equations. At the end of the
last decade and the
beginning of this decade, JPL/Caltech upgraded the Mark III Hypercube to
a 128-node machine, which can run in-core cases consisting of 4800
unknowns [38][40][41][42]. Beginning this decade,
JPL/Caltech started to implement the PATCH code described in [15]
on the 128-node Mark
III Hypercube [36].
The PATCH code is a method-of-moments code that implements a discretization
of the electric field
integral equation (EFIE) for conducting objects with arbitrarily shaped
surfaces.
An object is modeled by a set of flat triangular patches and Rao's basis
functions [14]. For small problems, the implemented parallel algorithm
has been termed ``trivial
parallelization'' because each processor executes identical code for
varying excitations. In the parallelization of large
problems, row decomposition is used and direct LU factorization is
employed. The PATCH code has also been implemented on the Intel iPSC
Hypercube and
Touchstone Delta systems. The Hypercube system has been upgraded
to 64 processors,
and the Delta system is a two-dimensional mesh of 512
processors; each processor has 16 Mbytes of RAM attached.
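The ``trivial parallelization'' described above can be illustrated with a
minimal sketch in which every (simulated) node executes identical solver
code on the full system, each for its own excitation vector. The solver,
matrix, sizes, and node count below are illustrative assumptions, not
details of the PATCH code:

```python
# Sketch of ``trivial parallelization'': each node runs identical code on
# the same system matrix, but for a different excitation (right-hand side).
# All sizes and the solver are illustrative, not the PATCH implementation.

N = 4                          # hypothetical small MoM system size
P = 3                          # hypothetical number of nodes = excitations

# One shared, diagonally dominant impedance-like matrix (stable without
# pivoting).
A = [[4.0 if i == j else 1.0 for j in range(N)] for i in range(N)]

def solve(A, b):
    """Plain Gaussian elimination with back substitution (no pivoting)."""
    a = [row[:] for row in A]
    x = b[:]
    n = len(b)
    for k in range(n):
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
            x[i] -= m * x[k]
    for i in reversed(range(n)):
        x[i] = (x[i] - sum(a[i][j] * x[j] for j in range(i + 1, n))) / a[i][i]
    return x

# Each node executes the same solve for a different excitation vector;
# no inter-node communication is needed.
excitations = [[1.0 if i == p else 0.0 for i in range(N)] for p in range(P)]
currents = [solve(A, b) for b in excitations]   # node p computes currents[p]
```

Because the nodes share no data, this scheme scales with the number of
independent excitations rather than with the problem size.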
During the first year's operation of the Intel
Touchstone Delta system, Cwik [37] reported solving for
scattering from a conducting sphere
using the EFIE formulation out-of-core on the 512-node Delta machine.
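The row decomposition with direct LU factorization mentioned above can be
sketched as follows. This is a minimal serial simulation of the idea; the
cyclic row mapping, node count, and matrix below are illustrative
assumptions, not the actual PATCH data layout:

```python
# Minimal sketch of row-decomposed LU factorization (no pivoting), in
# which a dense method-of-moments matrix is distributed by rows across
# P processors.  The cyclic map and sizes are illustrative assumptions.

P = 4          # hypothetical number of processing nodes
N = 8          # hypothetical system size (the MoM matrix is N x N, dense)

# A small diagonally dominant test matrix, so elimination is stable
# without pivoting.
A = [[8.0 if i == j else 1.0 / (1.0 + i + j) for j in range(N)]
     for i in range(N)]
A0 = [row[:] for row in A]          # keep a copy for verification

owner = [i % P for i in range(N)]   # cyclic map: row i lives on node i % P

# In-place LU: afterwards A holds U on and above the diagonal and the
# multipliers of L (unit diagonal) below it.
for k in range(N):
    pivot_row = A[k]                # owner[k] would broadcast this row
    for i in range(k + 1, N):
        # Only the node owner[i] performs this update, so the updates for
        # different rows i proceed in parallel.
        m = A[i][k] / pivot_row[k]
        A[i][k] = m
        for j in range(k + 1, N):
            A[i][j] -= m * pivot_row[j]
```

A cyclic (rather than contiguous) row mapping keeps the load balanced as
the elimination front shrinks, since every node retains active rows until
late in the factorization.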
Another parallel implementation of the integral equation method that we
note is a body-of-revolution code using the EFIE, implemented by Gedney
[44] on Hypercube architectures. The parallel MoM algorithm in Gedney's
code is based on the work presented by Gedney and Mittra [45], which was
derived from the original work of Glisson and Wilton [46]. Column-scattered
decomposition is applied to map the data onto the hypercube. Although this
mapping scheme eliminates a significant amount of redundant integration
computation, it requires a reshuffling of the matrix before the LU
factorization is performed. They found that their parallel algorithm
scales poorly on a coarse-grained Hypercube, because additional
communication is required to reshuffle the matrix elements when the number
of processors becomes large. Gedney and Mittra have implemented their code
not only on the coarse-grained Mark III Hypercube but also on a
fine-grained (MIMD) Hypercube, the nCUBE, which may employ up to 8K
processors. In addition, they have implemented their code on a
fine-grained SIMD Hypercube architecture, Thinking Machines' CM-2, which
has 64K bit-serial processors.
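The scalability problem caused by the reshuffle can be made concrete with
a small sketch. The matrix size, node counts, blocked target layout, and
cost measure below are illustrative assumptions, not details of Gedney and
Mittra's implementation:

```python
# Sketch of a column-scattered (cyclic) mapping onto hypercube nodes and
# of the reshuffle it forces before LU factorization.  All sizes, the
# blocked target layout, and the cost measure are illustrative.

N = 16                                  # hypothetical number of columns
for P in (2, 4, 8):                     # hypothetical hypercube sizes
    # Scattered mapping used during matrix fill: column j -> node j % P.
    cyclic = [j % P for j in range(N)]
    # Blocked mapping assumed here to be required by the factorization.
    block = [j // (N // P) for j in range(N)]
    # Every column whose owner differs between the two layouts must be
    # communicated during the reshuffle.
    moved = sum(1 for j in range(N) if cyclic[j] != block[j])
    print(P, moved, "columns reshuffled")
```

Under these assumptions the number of reshuffled columns grows with the
node count (8, 12, and 14 of the 16 columns for 2, 4, and 8 nodes), which
mirrors the reported loss of scalability at large processor counts.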