
Algorithm Performance on Future SPP Architectures

We design and implement algorithms on existing hardware; however, for industrial applications such as power systems network analysis, it is equally important to predict algorithm performance on future architectures. Such predictions help determine whether it is more cost-effective to port critical software to parallel architectures now, or simply to wait for the speedup that faster single-processor computers will bring.

This analysis is a good case in point: performance for the parallel block-diagonal-bordered sparse solvers developed here is rather good on the Thinking Machines CM-5 for a moderate number of processors (2--32). For Choleski solver applications, the parallel block-diagonal-bordered Gauss-Seidel algorithm yields good speedups and offers substantial algorithmic speedup when compared with parallel block-diagonal-bordered direct solvers. However, in this chapter we show that the superb computation-to-communication ratio available on the CM-5 using low-latency active messages will probably not be equaled in future architectures where processor performance increases significantly. Performance of our parallel Gauss-Seidel algorithm is latency dependent, because it sends a large number of small messages. Performance of our parallel direct algorithm, by contrast, is bandwidth dependent, because it sends a limited number of moderate-size messages.

We show in this chapter that while the bandwidth-dependent parallel sparse block-diagonal-bordered direct solvers may port to future architectures with equal or better performance, the latency-dependent parallel sparse block-diagonal-bordered Gauss-Seidel solvers may not. While future architectures will have greater bandwidth than the Thinking Machines CM-5, they will not have a comparable reduction in communications latency. Any algorithmic performance gains possible with the parallel Gauss-Seidel algorithm would not be realized on future architectures that do not have the computation-to-communication ratio available on the CM-5.
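
The distinction between latency-dependent and bandwidth-dependent communication can be illustrated with a simple linear message-cost model, in which each message costs a fixed latency plus its size divided by the bandwidth. The sketch below is only an illustration of that reasoning, not the overhead-based estimation technique developed in this chapter. The function names and all numeric values (message counts, message sizes, latency, bandwidth) are hypothetical assumptions chosen for the example and are not CM-5 measurements.

```python
# Minimal sketch of a linear message-cost model: time = latency + size / bandwidth.
# All names and numbers here are hypothetical, chosen only for illustration.

def message_time(n_bytes, latency_s, bandwidth_bytes_per_s):
    """Time to send one message under the linear cost model."""
    return latency_s + n_bytes / bandwidth_bytes_per_s


def total_comm_time(n_messages, bytes_per_message, latency_s, bandwidth_bytes_per_s):
    """Total time to send n_messages equal-size messages one after another."""
    return n_messages * message_time(bytes_per_message, latency_s, bandwidth_bytes_per_s)


def latency_fraction(n_messages, bytes_per_message, latency_s, bandwidth_bytes_per_s):
    """Fraction of the total communication time spent on per-message latency."""
    total = total_comm_time(n_messages, bytes_per_message, latency_s, bandwidth_bytes_per_s)
    return n_messages * latency_s / total


# Hypothetical baseline machine: 10 microsecond latency, 10 MB/s bandwidth.
LATENCY = 10e-6
BANDWIDTH = 10e6

# Hypothetical workloads that move the same number of bytes in total.
many_small = (10_000, 16)      # many small messages (iterative-solver-like pattern)
few_moderate = (100, 1_600)    # few moderate-size messages (direct-solver-like pattern)

frac_small = latency_fraction(*many_small, LATENCY, BANDWIDTH)
frac_moderate = latency_fraction(*few_moderate, LATENCY, BANDWIDTH)

# Hypothetical future machine: 10x the bandwidth, but the same latency.
FUTURE_BANDWIDTH = 10 * BANDWIDTH

gain_small = (total_comm_time(*many_small, LATENCY, BANDWIDTH)
              / total_comm_time(*many_small, LATENCY, FUTURE_BANDWIDTH))
gain_moderate = (total_comm_time(*few_moderate, LATENCY, BANDWIDTH)
                 / total_comm_time(*few_moderate, LATENCY, FUTURE_BANDWIDTH))

print(f"latency share, many small messages:   {frac_small:.2f}")
print(f"latency share, few moderate messages: {frac_moderate:.2f}")
print(f"communication gain from 10x bandwidth, many small:   {gain_small:.2f}x")
print(f"communication gain from 10x bandwidth, few moderate: {gain_moderate:.2f}x")
```

Under these assumed values, the many-small-messages pattern spends most of its communication time on latency and gains little from added bandwidth, while the few-moderate-messages pattern gains substantially. This is the qualitative behavior the argument above relies on; the actual magnitudes depend on the real message counts and machine parameters.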

We open this chapter by discussing future computing architectures and the requirements of the power utility industry. We then introduce overhead-based performance estimates that we developed to predict algorithm performance on future high-performance computing architectures. We apply these estimation techniques to both the sparse parallel block-diagonal-bordered direct solvers and the iterative solvers developed in this research. Because of the poor predicted performance of the parallel iterative solver on future SPP architectures, we include comments on improving the latency performance of SPP communications. We close by reiterating the significant conclusions for porting our parallel linear solvers to future SPP architectures.








David P. Koester
Sun Oct 22 17:27:14 EDT 1995