Figure 7.15 presents graphs of relative speedup, calculated from empirical performance data, for the two parallel iterative solver implementations. Each graph has a family of speedup curves for the five power systems networks examined in this research, and each curve plots relative speedup for 2, 4, 8, 16, and 32 processors. This analysis of parallel block-diagonal-bordered Gauss-Seidel speedup uses the best sequential algorithm to collect the sequential execution times. Research showed a significant difference in performance between a general sequential Gauss-Seidel algorithm and the first version of the parallel block-diagonal-bordered sparse Gauss-Seidel solver. Modifications to the first parallel Gauss-Seidel algorithm yielded an algorithm whose performance is competitive with the best sequential Gauss-Seidel algorithm.
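Relative speedup, as used here, is the ratio of the best sequential execution time to the parallel execution time on a given number of processors. A minimal Python sketch of that calculation follows. The function name `relative_speedup` and all timing values are hypothetical placeholders chosen for illustration; they are not measurements from this research.

```python
def relative_speedup(t_sequential, t_parallel):
    """Map each processor count p to t_sequential / t_parallel[p].

    t_sequential: execution time of the best sequential algorithm.
    t_parallel:   dict of {processor count: parallel execution time}.
    """
    if t_sequential <= 0:
        raise ValueError("sequential time must be positive")
    return {p: t_sequential / t for p, t in sorted(t_parallel.items())}


# Hypothetical timings in seconds (illustrative only, not measured data).
t_seq = 8.0
t_par = {2: 4.4, 4: 2.5, 8: 1.6, 16: 1.0, 32: 0.8}

speedups = relative_speedup(t_seq, t_par)
for p, s in speedups.items():
    print(f"{p:2d} processors: relative speedup {s:.2f}")
```

Using the best sequential time as the numerator, rather than the parallel code run on one processor, keeps the ratio from being inflated by overhead that exists only in the parallel implementation.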
These graphs show that relative speedup on 32 processors can be as much as 17 for the double precision parallel Gauss-Seidel implementation and as much as 21 for the complex variate implementation. The best speedups were obtained with the BCSPWR10 power systems network, and nearly as good relative speedups were obtained for the BCSPWR09 and NiMo-OPS networks. The load-imbalance overhead described in the previous section clearly affects the relative speedup for the EPRI6K and NiMo-PLANS networks. Performance for all networks is similar for 2 through 16 processors; however, at 32 processors the rate of increase in relative speedup changes sharply for these two planning matrices.
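One way to read the peak figures quoted above is as parallel efficiency, meaning relative speedup divided by the number of processors. That metric is introduced here only as an illustration and is an assumption of this sketch, not part of the analysis. The Python below uses only the two approximate peak speedups stated in the text, and the name `parallel_efficiency` is hypothetical.

```python
def parallel_efficiency(speedup, processors):
    """Return the fraction of ideal (linear) speedup that was achieved."""
    if processors <= 0:
        raise ValueError("processor count must be positive")
    return speedup / processors


# Approximate peak relative speedups on 32 processors, as quoted in the text.
peak_speedups = {"double precision": 17.0, "complex variate": 21.0}

efficiencies = {
    name: parallel_efficiency(s, 32) for name, s in peak_speedups.items()
}
for name, e in efficiencies.items():
    print(f"{name}: efficiency {e:.2f}")
```

Under this reading, the best cases reach roughly half to two thirds of linear speedup on 32 processors.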
Figure 7.15: Relative Speedup --- Parallel Gauss-Seidel