As with ordering matrices into block-diagonal-bordered form for direct
methods, we are looking for an ordering technique that produces a
permutation matrix that transforms the irregular power
systems matrix into block-diagonal-bordered form while balancing the
workload in the diagonal blocks and also limiting the number of
coupling equations in the last diagonal block. Minimizing the size of
the last diagonal block in a parallel block-diagonal-bordered sparse
matrix minimizes the amount of communications; however, we illustrate
in section 7.2 with empirical data that there are
trade-offs with minimizing the number of equations in the last
diagonal block and balancing the workload in the block-diagonal
portion of the matrix. If the size of the last block of the matrix
can be adequately constrained, the amount of communications can be
drastically reduced. When determining the optimal ordering for a
sparse matrix, an ordering that yields a better load-balance in the
highly parallel portion of the calculations may be traded for the size
of the last diagonal block and the subsequent additional
communications.
The method we have chosen to order the sparse power systems networks
into block-diagonal-bordered form matrices for parallel Gauss-Seidel
is a specialized form of diakoptics, referred to as node-tearing
[12,26,49]. We have performed an analysis similar
to that described above to order power systems matrices into
block-diagonal-bordered form for direct methods, and have found that
node-tearing nodal analysis has features that make it superior to
recursive spectral bisection and nested dissection --- techniques that
attempt to load-balance on the number of rows/columns in a partition,
rather than the actual workload in the irregular matrix. The
node-tearing form of diakoptics analysis attempts to extract the
natural structure in the matrix or graph, and generally produces many
irregularly sized blocks, while minimizing the size of the lower
border and last diagonal block. Additional detail on the node-tearing
form of diakoptics is presented in appendix . Load
balancing techniques must be used in a separate step after the
node-tearing ordering step to distribute the processing load uniformly
onto a multi-processor.
When ordering power systems networks into block-diagonal-bordered form
for the parallel Gauss-Seidel method, there is no concern for
additional ordering of the diagonal blocks to minimize fillin, because
iterative methods do not generate these additional non-zero values.
However, additional ordering is required for the last diagonal block,
because without ordering, the calculations in this portion of the
matrix would be purely sequential, limiting the potential speedup of
the algorithm in accordance to Amdahl's law. The last diagonal block
represents the interconnection structure within the equations that
couple the partitions found in the node-tearing-based ordering step.
In other words, the variables in the last-diagonal block are the
interconnections within the equations that tie the entire matrix
together. Graph multi-coloring has been used for ordering this
portion of the matrix --- all nodes of the same color share no
interconnections, consequently, the values of
in these rows can be calculated in any order without violating the
strict precedence rules in the Gauss-Seidel method. As a result, rows
within a color can be solved in parallel, and barriers to synchronize
parallel operations are only required between graph colors.
The multi-coloring algorithm we selected for this work is based on the
saturation degree ordering algorithm. We also require load balancing,
a feature not commonly implemented within graph multi-coloring. As
part of our implementation we added a feature that equalizes the
number of rows per color to provide load-balancing for communications.
The graph multi-coloring technique is discussed in greater detail in
appendix .