
Pseudo Factorization

As stated above, the metric for performing load balancing or for comparing the performance of ordering techniques must be based on the actual workload required by the processors in a distributed-memory multi-computer. Consequently, more information is required than just the locations of fill-in, which is all that previous work obtained when it used symbolic factorization to identify fill-in for static data structures [9,12,24].

To meet the two-fold requirement of identifying the location of fill-in and determining the number of calculations in each independent block, we propose that a pseudo factorization step be included in the preprocessing phase. Pseudo factorization replicates the numerical factorization process without actually performing the calculations. Counters tally the number of calculations needed to factor each independent data block and the number of calculations needed to update the last block using data from the borders.
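The idea can be illustrated with a minimal sketch. It is not the implementation used in this work: it assumes a general LU-style elimination on a single sparsity pattern with a non-zero diagonal, and it keeps one counter rather than separate counters for the independent blocks and the last block. The names `pseudo_factor` and `arrow` are hypothetical.

```python
def pseudo_factor(n, pattern):
    """Simulate LU factorization on a sparsity pattern only.

    pattern: iterable of (row, col) positions of non-zeros, 0-based.
    Returns (fill_in, ops): the set of positions that become non-zero
    during elimination, and the number of divisions plus multiply-subtract
    updates that a numerical factorization would perform.
    """
    # Assumption: the diagonal is non-zero, so no pivoting is simulated.
    nz = set(pattern) | {(i, i) for i in range(n)}
    fill_in = set()
    ops = 0
    for k in range(n):
        rows = [i for i in range(k + 1, n) if (i, k) in nz]
        cols = [j for j in range(k + 1, n) if (k, j) in nz]
        for i in rows:
            ops += 1  # division: l_ik = a_ik / a_kk
            for j in cols:
                ops += 1  # update: a_ij -= l_ik * a_kj
                if (i, j) not in nz:
                    nz.add((i, j))        # record the new non-zero
                    fill_in.add((i, j))   # and remember it as fill-in
    return fill_in, ops


def arrow(n, corner):
    """Arrow pattern: a dense row and column through index `corner`."""
    return {(corner, j) for j in range(n)} | {(i, corner) for i in range(n)}


n = 6
fill_first, ops_first = pseudo_factor(n, arrow(n, 0))      # dense row/column first
fill_last, ops_last = pseudo_factor(n, arrow(n, n - 1))    # dense row/column last
print(len(fill_first), ops_first)
print(len(fill_last), ops_last)
```

Both test patterns have six equations and the same number of non-zeros. The sketch counts 70 operations and 20 fill-in positions when the dense row and column are eliminated first, against 10 operations and no fill-in when they are eliminated last.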

There is no way to avoid the computational expense of this preprocessing step, because the computational workload in factorization is not correlated with the number of equations in an independent block. The number of calculations when factoring an independent sub-block is a function of the number and location of non-zero matrix elements in that block, not necessarily the number of equations in the block. Consequently, the workload during the numerical factorization step may differ substantially from what the number of equations assigned to each processor would suggest. Efficient sparse matrix solvers require that any disparities in processor workloads be minimized in order to minimize load imbalance overhead and, consequently, to maximize processor utilization.
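The effect of workload disparity on utilization can be made concrete with a small sketch. It assumes a simple model that is not taken from this work: every processor must wait for the most heavily loaded one, so utilization is total work divided by the number of processors times the maximum workload. The function name `utilization` and the operation counts are hypothetical, chosen only for illustration.

```python
def utilization(work):
    """Fraction of processor time spent on useful work, assuming all
    processors wait until the most heavily loaded processor finishes."""
    return sum(work) / (len(work) * max(work))


# Hypothetical operation counts per processor (illustrative values only).
# Equal numbers of equations per processor, unequal numbers of calculations:
balanced_by_equations = [70, 10, 10, 10]
# The same total work redistributed so that calculations are equal:
balanced_by_operations = [25, 25, 25, 25]

print(utilization(balanced_by_equations))
print(utilization(balanced_by_operations))
```

Under this model, one heavily loaded processor leaves the other three idle for most of the factorization, while the distribution balanced on operation counts keeps all four busy.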



David P. Koester
Sun Oct 22 16:27:33 EDT 1995