We compare the performance of parallel Choleski solvers with parallel iterative Gauss-Seidel solvers by determining the number of Gauss-Seidel iterations that can be performed in the time of a direct solution, for a given number of (re)uses of the factorization. Families of curves plotting the number of iterations versus the number of dishonest (re)uses are presented in figure 7.23 for one through ten reuses and one through 32 processors for the power systems networks BCSPWR09 and BCSPWR10. For the other power systems network matrices examined in this research, performance is similar to these graphs. The shape of the curves shows that the largest number of iterations possible for a constant-time solution occurs for a single use of the factored matrix. As the factorization is (re)used, the cost to factor the matrix is amortized over the additional (re)uses. For large numbers of factorization (re)uses, the curve becomes asymptotic to

    T_ts / T_it

where:

    T_ts is the time for a single (parallel) Choleski triangular solution, and
    T_it is the time for a single (parallel) Gauss-Seidel iteration.
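The amortized break-even count can be sketched numerically. In this illustrative model, T_f (factorization time), T_ts (triangular-solution time), and T_it (time per Gauss-Seidel iteration) are hypothetical placeholder timings, not measurements from the study:

```python
def break_even_iterations(t_factor, t_solve, t_iter, reuses):
    """Gauss-Seidel iterations per solution that take the same time as a
    direct solution whose factorization cost is amortized over `reuses`
    dishonest (re)uses of the Choleski factors."""
    return (t_factor / reuses + t_solve) / t_iter

# Hypothetical timings in arbitrary units (illustration only).
t_f, t_ts, t_it = 20.0, 5.0, 1.0
curve = [break_even_iterations(t_f, t_ts, t_it, n) for n in range(1, 11)]
# curve[0] == 25.0 for a single use; the values fall monotonically
# toward the asymptote t_ts / t_it == 5.0 as the factorization is reused.
```

The single-use value is largest because the full factorization cost is charged to one solution; each additional reuse shaves off a share of that cost until only the triangular-solution time remains.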
Figure 7.23: Gauss-Seidel Iterations as a Function of Dishonest Reuses of the Choleski Matrix
The graph for the BCSPWR09 operations matrix in figure 7.23
illustrates that on a single processor, 12 Gauss-Seidel iterations
take as much time as a single factorization and triangular solution.
Meanwhile, with 10 dishonest (re)uses, only four iterations per
solution would equal the direct-solution time. However, when 32
processors are utilized, 54 Gauss-Seidel iterations could be performed
in the same time as a single direct solution, and 24 iterations per
solution would match the time for 10 dishonest (re)uses. The graph for the BCSPWR10 operations
matrix in figure 7.23 illustrates even greater numbers of
iterations --- nearly 120 Gauss-Seidel iterations could be performed
in the same time as a single direct solution for 32 processors, and 55
iterations per solution for 10 dishonest (re)uses. These
comparisons are for a convergence check every four iterations.
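The single-processor BCSPWR09 figures fit the amortization model above. Back-solving the two reported break-even counts (12 iterations for a single use, roughly 4 for ten reuses) for the underlying timings gives a quick sanity check; the derived timings below are illustrative reconstructions, not measurements from the study:

```python
def break_even(t_factor, t_solve, reuses, t_iter=1.0):
    # Iterations per solution matching the amortized direct-solution time.
    return (t_factor / reuses + t_solve) / t_iter

# With the iteration time normalized to 1, the two reported points
# (12 iterations at 1 use, 4 at 10 uses) fix the remaining timings:
#   t_factor + t_solve = 12   and   t_factor / 10 + t_solve = 4
t_factor = (12.0 - 4.0) / (1.0 - 0.1)   # about 8.9 iteration-times
t_solve = 12.0 - t_factor               # about 3.1 iteration-times
assert round(break_even(t_factor, t_solve, 1)) == 12
assert round(break_even(t_factor, t_solve, 10)) == 4
```

Under this reconstruction, the asymptote for unlimited reuses would be about 3 iterations per solution on a single processor.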
Previous discussion of Gauss-Seidel convergence in
section 7.2.3 concluded that after twelve
iterations, total error falls below the stated error bound. Only eight
iterations are required for six decimal place accuracy with data sets
generated for actual sparse power systems networks. Given that each
successive iterative solution has a good starting point, parallel
Gauss-Seidel is likely to yield significant algorithmic speedups for
diagonally dominant or positive definite sparse matrices. For these
two classes of matrices, such speedups could be as high as a factor of
ten for large data sets.
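The rapid convergence described above can be illustrated with a minimal Gauss-Seidel sweep on a small, diagonally dominant system; the 3x3 matrix below is a toy stand-in for the sparse power systems matrices, not data from the study:

```python
def gauss_seidel(A, b, x, sweeps):
    """In-place Gauss-Seidel sweeps for A x = b. A is a dense list of
    rows here; the study's solvers exploit sparsity instead."""
    n = len(A)
    for _ in range(sweeps):
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
    return x

# Small diagonally dominant example; exact solution is [1, 2, 3].
A = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
x = gauss_seidel(A, b, [0.0, 0.0, 0.0], sweeps=12)
residual = max(abs(sum(A[i][j] * x[j] for j in range(3)) - b[i])
               for i in range(3))
```

For diagonally dominant systems like this one, each sweep contracts the error by a fixed factor, so a handful of sweeps already gives six-decimal accuracy, consistent with the iteration counts quoted above.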