
Performance Predictions for Direct Solvers

 

We implemented two versions of the parallel block-diagonal-bordered sparse direct solver on the Thinking Machines CM-5. The notable difference between the two implementations is the communications paradigm used when updating the last diagonal block in the matrix: one version uses low-latency, active message-based communications, and the other uses buffered communications. Active message-based communications on the CM-5 have a latency of 1.6 microseconds to send four words, while the buffered communications version of the algorithm uses the traditional CMMD communications library, which has a latency of 86 microseconds and a communications cost of 0.12 microseconds per word [6]. Both versions of the algorithm use active message s-copy-based buffered communications for factoring the last diagonal block. S-copy communications have a latency of 23 microseconds and a communications cost of 0.12 microseconds per word [6]. The CM-5 has a multi-tiered communications network with a bandwidth of 40 megabytes per second at the lowest layer [6].
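The quoted latencies and per-word costs can be compared with a small sketch. It is not the performance model used in this work; it assumes a simple linear cost (startup latency plus per-word cost) for the buffered paradigms, assumes that an n-word transfer by active messages needs one four-word message per four words with no overlap between messages, and takes all quoted times as microseconds. The function and constant names are hypothetical.

```python
import math

# Figures quoted in the text [6], taken here as microseconds.
ACTIVE_MSG_LATENCY_US = 1.6   # one active message carrying up to four words
ACTIVE_MSG_WORDS = 4
CMMD_LATENCY_US = 86.0        # traditional CMMD buffered communications
SCOPY_LATENCY_US = 23.0       # active message s-copy buffered communications
PER_WORD_US = 0.12            # per-word cost for both buffered paradigms


def buffered_cost_us(words, latency_us, per_word_us=PER_WORD_US):
    """Assumed linear model: startup latency plus a per-word cost."""
    return latency_us + words * per_word_us


def active_message_cost_us(words):
    """Assumed model: one 1.6 us message per four words, sent back to back."""
    return math.ceil(words / ACTIVE_MSG_WORDS) * ACTIVE_MSG_LATENCY_US


if __name__ == "__main__":
    # Compare the three paradigms for a few illustrative message sizes.
    for n in (4, 64, 304, 308, 1024):
        print(
            f"{n:5d} words: "
            f"active={active_message_cost_us(n):8.2f} us  "
            f"s-copy={buffered_cost_us(n, SCOPY_LATENCY_US):8.2f} us  "
            f"CMMD={buffered_cost_us(n, CMMD_LATENCY_US):8.2f} us"
        )
```

Under these assumptions, active messages are cheaper than CMMD buffered communications for transfers of up to roughly 300 words, and the buffered paradigms become cheaper for larger transfers because their per-word cost is lower than the effective 0.4 microseconds per word of back-to-back active messages.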





David P. Koester
Sun Oct 22 17:27:14 EDT 1995