next up previous
Next: Low-Latency Communications Up: Block-Diagonal-Bordered Power System Previous: Block-Diagonal-Bordered Direct Linear

Block-Diagonal-Bordered Iterative Linear Solvers

Even though Gauss-Seidel algorithms for dense matrices are inherently sequential, it is possible to identify sparse matrix partitions without data dependencies so that calculations can proceed in parallel while maintaining the strict precedence rules of the Gauss-Seidel technique [35,36]. All data parallelism in our Gauss-Seidel algorithm is derived from the actual interconnection relationships among elements in the matrix. We employed two distinct ordering techniques in a preprocessing phase to identify the available parallelism within the matrix structure:

  1. partitioning the matrix into block-diagonal-bordered form,
  2. multi-coloring the last diagonal matrix block.
The same diakoptic node-tearing-based network partitioning used to order matrices into block-diagonal-bordered form for direct linear methods has been used to identify the available parallelism within the irregular sparse power system matrices for our parallel Gauss-Seidel implementation.
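The parallelism exposed by the block-diagonal-bordered ordering can be illustrated with a minimal sketch. In the hypothetical 5x5 system below (not one of the paper's test matrices), rows in the two diagonal blocks couple only to their own block and to the border, so the sweeps over distinct blocks share no data dependencies and could execute on different processors; only the border rows must wait for all block sweeps to finish.

```python
# A hypothetical 5x5 block-diagonal-bordered system: two 2x2 diagonal
# blocks (rows 0-1 and 2-3) coupled only through the last row/column.
A = [
    [4.0, 1.0, 0.0, 0.0, 1.0],
    [1.0, 4.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 4.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 4.0, 1.0],
    [1.0, 1.0, 1.0, 1.0, 8.0],
]
b = [6.0, 6.0, 6.0, 6.0, 12.0]
blocks = [[0, 1], [2, 3]]   # independent diagonal blocks
border = [4]                # the last diagonal block couples the partitions

def sweep_rows(rows, x):
    """One Gauss-Seidel update of the given rows, in order."""
    for i in rows:
        s = sum(A[i][j] * x[j] for j in range(len(x)) if j != i)
        x[i] = (b[i] - s) / A[i][i]

x = [0.0] * 5
for _ in range(50):
    # Rows in distinct diagonal blocks share no data dependencies,
    # so these sweeps could run concurrently on separate processors.
    for blk in blocks:
        sweep_rows(blk, x)
    # The border rows are updated after all block sweeps complete.
    sweep_rows(border, x)
```

For this diagonally dominant example the iteration converges to x = [1, 1, 1, 1, 1]; the point of the sketch is only the dependency structure, not the numerics.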

Node-tearing-based partitioning identifies the basic network structure that provides parallelism for the majority of calculations within a Gauss-Seidel iteration. Meanwhile, without additional ordering, the last diagonal block would be purely sequential, limiting the potential speedup of the algorithm in accordance with Amdahl's law. The last diagonal block represents the interconnection structure within the equations that couple the partitions found in the block-diagonal-bordered matrix. Graph multi-coloring has been used to order this matrix partition and subsequently identify those rows that can be solved in parallel.
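A simple greedy heuristic suffices to illustrate graph multi-coloring; the paper does not specify its coloring algorithm, so the routine below is an illustrative stand-in with a made-up coupling graph. Vertices are rows of the last diagonal block and edges are its nonzero off-diagonal couplings; rows assigned the same color share no edge, so an entire color class can be updated in parallel within a Gauss-Seidel sweep.

```python
def greedy_color(adjacency):
    """Assign each vertex the smallest color unused by its neighbors."""
    color = {}
    for v in sorted(adjacency):
        used = {color[u] for u in adjacency[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

# Hypothetical coupling graph for a 5-row last diagonal block.
adjacency = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1, 4], 4: [3]}
colors = greedy_color(adjacency)
```

Rows with equal color values can then be solved concurrently, with a synchronization point between color classes to preserve the Gauss-Seidel precedence rules.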

We implemented explicit load balancing as part of each of the aforementioned ordering steps to maximize efficiency when the parallel Gauss-Seidel algorithm is applied to real power system load-flow matrices. An attempt was made to place equal amounts of processing in each partition and in each matrix color. The metric employed when load-balancing the partitions is the number of floating-point multiply/add operations, not simply the number of rows per partition. Load balancing for the parallel Gauss-Seidel algorithm is sufficiently effective that relative speedups greater than 20 have been observed in empirical performance measurements for iterative solvers on a 32-processor Thinking Machines CM-5.
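Balancing partitions by multiply/add counts rather than row counts can be sketched with a standard longest-processing-time greedy assignment; the operation counts below are invented for illustration, and the paper does not state which balancing heuristic it used.

```python
import heapq

# Hypothetical per-partition multiply/add counts from the ordering
# phase (illustrative numbers, not measurements from the paper).
op_counts = {"P1": 870, "P2": 640, "P3": 910, "P4": 450,
             "P5": 300, "P6": 720, "P7": 510, "P8": 660}

def balance(partitions, n_procs):
    """Longest-processing-time greedy: give the costliest remaining
    partition to the currently least-loaded processor."""
    heap = [(0, p) for p in range(n_procs)]   # (load, processor id)
    heapq.heapify(heap)
    assignment = {p: [] for p in range(n_procs)}
    for name, ops in sorted(partitions.items(), key=lambda kv: -kv[1]):
        load, proc = heapq.heappop(heap)
        assignment[proc].append(name)
        heapq.heappush(heap, (load + ops, proc))
    return assignment

assignment = balance(op_counts, 4)
```

With these counts the four processor loads end up within a few percent of each other, whereas balancing by row count alone could leave one processor with the densest partitions.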






David P. Koester
Sun Oct 22 17:27:14 EDT 1995