Efficient parallel sparse matrix solvers require that disparities
in processor workloads be minimized in order to reduce load-imbalance
overhead and, consequently, to maximize processor utilization. The
metric for load balancing must be based on the actual workload
required of the processors in a distributed-memory multicomputer. The
metric employed when load balancing the partitions for parallel
Gauss-Seidel is the number of floating-point multiply/add operations,
not simply the number of rows per partition. Determining the number
of floating-point operations in each partition for parallel
Gauss-Seidel is simple compared to the pseudo-factorization required
for parallel direct methods.
methods. Pseudo-solving for an iteration examines the number of
operations when calculating in the matrix partitions
and the number of operations when calculating the sparse matrix vector
products in preparation to solve for
in the last
diagonal block. It simply requires that the number of off-diagonal
non-zero elements in all rows within a partition be summed. While
this step is simple and can be performed with significantly less
effort than pseudo-factorizations for parallel direct methods, it is
an essential input to the load-balancing step for the block-diagonal
portion of the matrix.
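
The operation count can be gathered in a single pass over the matrix.
The following C sketch illustrates one way to do so for a matrix
stored in compressed sparse row (CSR) form; the function name,
parameters, and partition layout are hypothetical, chosen only for
this example. Each off-diagonal non-zero contributes one multiply/add,
whether it falls inside a partition's diagonal block or in the border
coupling that partition to the last diagonal block, so a single
accumulation per row suffices.

    #include <stdio.h>

    /* Tally the Gauss-Seidel workload metric for each partition: one
     * floating-point multiply/add per off-diagonal non-zero. All names
     * here are illustrative, not taken from the original code. */
    static void count_partition_ops(int n,              /* matrix order           */
                                    const int *row_ptr, /* CSR row pointers (n+1) */
                                    const int *col_idx, /* CSR column indices     */
                                    const int *part,    /* partition id per row   */
                                    int nparts,         /* number of partitions   */
                                    long *ops)          /* out: ops per partition */
    {
        for (int p = 0; p < nparts; p++)
            ops[p] = 0;

        for (int i = 0; i < n; i++)
            for (int k = row_ptr[i]; k < row_ptr[i + 1]; k++)
                if (col_idx[k] != i)        /* every off-diagonal non-zero */
                    ops[part[i]]++;         /* costs one multiply/add      */
    }

    int main(void)
    {
        /* Small 4x4 example: rows 0-1 in partition 0, rows 2-3 in
         * partition 1. Each row stores its diagonal plus a few
         * off-diagonal non-zeros. */
        int row_ptr[] = {0, 2, 5, 7, 10};
        int col_idx[] = {0, 1,  0, 1, 3,  2, 3,  1, 2, 3};
        int part[]    = {0, 0, 1, 1};
        long ops[2];

        count_partition_ops(4, row_ptr, col_idx, part, 2, ops);
        for (int p = 0; p < 2; p++)
            printf("partition %d: %ld multiply/adds\n", p, ops[p]);
        return 0;
    }

The resulting per-partition totals are exactly the inputs the
load-balancing step consumes for the block-diagonal portion of the
matrix.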