
Load Balancing

The load-balancing step of the preprocessing phase can be performed with the same simple pigeon-hole-type algorithm described earlier for direct methods. The load-balancing algorithm uses a metric based on the number of floating-point multiply/add operations in a partition, not simply the number of rows per partition. Because calculations in the diagonal blocks and the border are independent, a processor can begin updating the last diagonal block as soon as it has finished its calculations in the diagonal blocks. The parallel calculations in the last diagonal block are load balanced separately, by dividing the rows within each color evenly among processors to minimize communications overhead. These metrics do not account for indexing overhead, which can be substantial when working with very sparse matrices stored in implicit form.
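The pigeon-hole assignment described above can be sketched as a greedy scheme that repeatedly places the heaviest remaining partition on the least-loaded processor. This is a minimal illustration, not the author's implementation; the multiply/add counts and the two-processor configuration below are hypothetical.

```python
# Sketch of pigeon-hole load balancing: assign matrix partitions to
# processors using a metric of floating-point multiply/add operations
# per partition, rather than the number of rows per partition.
import heapq

def balance_partitions(workloads, num_processors):
    """Assign each partition (indexed by its position in `workloads`)
    to the processor with the smallest accumulated multiply/add count."""
    # Min-heap of (accumulated_work, processor_id) pairs.
    heap = [(0, p) for p in range(num_processors)]
    heapq.heapify(heap)
    assignment = {}
    # Placing the largest partitions first tightens the final balance.
    for part in sorted(range(len(workloads)), key=lambda i: -workloads[i]):
        work, proc = heapq.heappop(heap)
        assignment[part] = proc
        heapq.heappush(heap, (work + workloads[part], proc))
    return assignment

# Hypothetical multiply/add counts for six diagonal blocks, two processors.
ops = [900, 450, 300, 250, 200, 150]
print(balance_partitions(ops, 2))
```

The same greedy scheme could in principle be reused for the last diagonal block by treating the rows within each color as the units of work, keeping each color's rows spread evenly across processors.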

We have found load balancing for the parallel block-diagonal-bordered Gauss-Seidel algorithm easier to perform than for direct methods, because the workload per partition varies less for iterative methods than for direct methods. Workloads within a partition have a higher computational complexity for direct methods than for iterative methods, so the same partitioned matrix may yield significantly different distributions of data to processors for the two methods.



David P. Koester
Sun Oct 22 17:27:14 EDT 1995