Next: The Hierarchical Data
Up: Parallel Direct Methods for
Previous: The Node-tearing Implementation
Implementations of a block-diagonal-bordered sparse LU solver and a
similar Choleski solver have been developed in the C programming
language for the Thinking Machines CM-5 multi-computer using message
passing and a host-node paradigm. A version of the software is
available that runs on a single processor on the CM-5 to provide
empirical speed-up data to quantify multi-processor performance.
Empirical performance data has been gathered for a range of numbers of
processors and real power systems sparse network matrices. Results
based on empirical data collected in benchmarking trials are presented
in the next section. Our block-diagonal-bordered sparse solvers
have the following distinct sections where blocks are defined in
section 4:
- LU factorization
- factor the mutually independent diagonal blocks and associated portions of the border ---
,
, and
for (
)
- update the last diagonal block using the data in the borders ---
- factor the last diagonal block ---

- forward reduction
- calculate the y vector partition corresponding to the mutually independent
diagonal blocks ---
for (
)
- update the b vector partition corresponding to the last diagonal block ---
- calculate the y vector partition corresponding to the last diagonal block ---

- backward substitution
- calculate the x vector partition corresponding to the last diagonal block ---
- calculate the x vector partition corresponding to the mutually independent diagonal blocks ---
for (
)
The Choleski factorization algorithm is similar to LU factorization,
with the block-diagonal-bordered Choleski algorithm having the same
distinct sections as described above with the exception of
and
being substituted for
and
respectively.
The parallel implementation presented in this section has been
developed as an instrumented proof-of-concept to examine the
efficiency of each section of the code described above. The host
processor is used to gather and tabulate statistics on the
multi-processor calculations. Statistics are gathered in a manner
that do not impact the total empirical measures of performance for
factorization, forward reduction, or backward substitution.
Next: The Hierarchical Data
Up: Parallel Direct Methods for
Previous: The Node-tearing Implementation
David P. Koester
Sun Oct 22 15:31:10 EDT 1995