NPAC Technical Report SCCS-745
Parallel Block-Diagonal-Bordered Sparse Linear Solvers for Power Systems Applications
David Koester
Submitted October 31, 1995
Abstract
This thesis presents research into parallel linear solvers for
block-diagonal-bordered sparse matrices. The block-diagonal-bordered
form identifies parallelism that can be exploited for both direct and
iterative linear solvers. We have developed efficient parallel
block-diagonal-bordered sparse direct methods based on both LU
factorization and Choleski factorization algorithms, and we have also
developed a parallel block-diagonal-bordered sparse iterative
method based on the Gauss-Seidel method. Parallel factorization
algorithms for block-diagonal-bordered-form matrices require a
specialized ordering step coupled with an explicit load-balancing step,
both to generate this matrix form and to distribute the computational
workload of an irregular matrix uniformly across a distributed-memory
multi-processor. Matrix orderings are performed using a diakoptic
technique based on node-tearing nodal analysis.
Parallel Gauss-Seidel algorithms for block-diagonal-bordered-form
matrices require a two-part matrix ordering technique: the matrix is
first partitioned into block-diagonal-bordered form, again using the
node-tearing diakoptic technique, and the data in the last diagonal
block is then multi-colored using graph-coloring techniques. The
ordered matrices exhibit extensive parallelism while maintaining the
strict precedence relationships of the Gauss-Seidel algorithm.
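The multi-coloring step above can be sketched with a simple greedy coloring; the five-node graph below is an illustrative assumption, standing in for the network graph of the last diagonal block after node-tearing:

```python
# Greedy multi-coloring of a small adjacency graph.  The graph here is
# an illustrative assumption; in the report the graph comes from the
# power-system network remaining in the last diagonal block.

def greedy_color(adj):
    """Assign each node the smallest color unused by its neighbors."""
    color = {}
    for v in sorted(adj):
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

# A 5-node cycle (odd cycle), which this greedy order colors with 3 colors.
adj = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
color = greedy_color(adj)

# No two adjacent nodes share a color, so Gauss-Seidel updates of
# same-colored unknowns carry no data dependence on one another and
# can run in parallel, proceeding color by color.
assert all(color[u] != color[v] for u in adj for v in adj[u])
print(sorted(color.items()))  # → [(0, 0), (1, 1), (2, 0), (3, 1), (4, 2)]
```

Processing the colors in a fixed sequence preserves the precedence relationships of a sequential Gauss-Seidel sweep while exposing parallelism within each color class.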
Empirical performance measurements for real power system networks are
presented for implementations of a parallel block-diagonal-bordered LU
algorithm, a similar Choleski algorithm, and a parallel
block-diagonal-bordered Gauss-Seidel algorithm, all run on a
distributed-memory Thinking Machines CM-5 multi-processor. We have compared the
performance of the direct and iterative parallel implementations on
the CM-5, and show that significant algorithmic speedup may be
possible for the Gauss-Seidel algorithm versus Choleski factorization
for positive definite matrices. We have developed a simple technique
that uses empirical data to predict the performance of these
algorithms on future architectures. We apply these techniques to
develop algorithm performance predictions for future Scalable Parallel
Processing (SPP) architectures.
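The block-diagonal-bordered form underlying these solvers can be made concrete with a small sketch; the 5-by-5 matrix, block partition, and sweep count below are illustrative assumptions, not data from the report:

```python
# Gauss-Seidel sweeps on a small block-diagonal-bordered (BDB) matrix.
# The matrix values and block partition are illustrative assumptions.

def gauss_seidel(A, b, sweeps=50):
    """Plain sequential Gauss-Seidel sweeps; returns the iterate x."""
    n = len(b)
    x = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
    return x

# BDB example: two independent 2x2 diagonal blocks {0,1} and {2,3},
# coupled only through the border variable {4} (the last diagonal
# block).  The zero blocks between the diagonal blocks are what expose
# the parallelism: rows 0-1 and rows 2-3 read none of each other's
# unknowns, so a parallel solver can update those blocks concurrently
# before handling the border.
A = [
    [4.0, 1.0, 0.0, 0.0, 1.0],
    [1.0, 4.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 4.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 4.0, 1.0],
    [1.0, 1.0, 1.0, 1.0, 6.0],
]
b = [7.0, 7.0, 7.0, 7.0, 10.0]

x = gauss_seidel(A, b)
residual = max(abs(sum(A[i][j] * x[j] for j in range(5)) - b[i])
               for i in range(5))
print(residual < 1e-8)  # strictly diagonally dominant, so sweeps converge
```

The same zero structure is what the direct solvers exploit: the diagonal blocks can be factored independently, with only the border rows and columns requiring communication.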