This chapter is divided into three sections. In the first section, we
describe methods to improve the performance of the sequential portions of
the linear solver implementations. In the other two sections, we
describe in detail the implementations of the parallel sparse matrix
solvers. Pseudo-code descriptions of the parallel algorithms are
presented in appendix .
The parallel implementation presented in this chapter was developed as an instrumented proof of concept, so that the efficiency of each section of the algorithm can be examined. The host processor gathers and tabulates statistics on the multi-processor calculations. These statistics are gathered in a manner that does not affect the empirical timing measurements for factorization, forward reduction, or backward substitution in the direct solvers, nor the overall performance measurements for the parallel Gauss-Seidel solver.
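
One way to keep statistics gathering out of the timed regions can be sketched as follows. This is a minimal, single-process illustration and not the implementation described in this chapter: the class `PhaseTimer` and the dense stand-in routines `forward_reduction` and `backward_substitution` are hypothetical names introduced only for this example. The idea shown is that only two clock reads bracket each phase, while all bookkeeping and tabulation happen after the stop timestamp has been taken.

```python
import time


class PhaseTimer:
    """Hypothetical helper: records raw per-phase times, tabulates later."""

    def __init__(self):
        self.samples = []  # list of (phase name, elapsed seconds)

    def run(self, name, func, *args):
        start = time.perf_counter()
        result = func(*args)
        elapsed = time.perf_counter() - start
        # Bookkeeping happens after the stop timestamp, so it is not
        # included in the measured time of the phase.
        self.samples.append((name, elapsed))
        return result

    def tabulate(self):
        # Deferred aggregation: sum the elapsed time of each phase.
        totals = {}
        for name, elapsed in self.samples:
            totals[name] = totals.get(name, 0.0) + elapsed
        return totals


def forward_reduction(L, b):
    # Solve L y = b for a unit lower-triangular L (dense stand-in).
    y = list(b)
    for i in range(len(b)):
        for j in range(i):
            y[i] -= L[i][j] * y[j]
    return y


def backward_substitution(U, y):
    # Solve U x = y for an upper-triangular U (dense stand-in).
    x = list(y)
    for i in reversed(range(len(y))):
        for j in range(i + 1, len(y)):
            x[i] -= U[i][j] * x[j]
        x[i] /= U[i][i]
    return x


# Assumed example data: factors of A = [[4, 2], [2, 4]], with b = [8, 10].
L = [[1.0, 0.0], [0.5, 1.0]]
U = [[4.0, 2.0], [0.0, 3.0]]
b = [8.0, 10.0]

timer = PhaseTimer()
y = timer.run("forward reduction", forward_reduction, L, b)
x = timer.run("backward substitution", backward_substitution, U, y)
totals = timer.tabulate()  # tabulation runs only after all phases finish
```

In a multi-processor setting, the deferred `tabulate` step would correspond to the host collecting the raw samples once the timed phases have completed. That mapping is an assumption of this sketch rather than a detail taken from the text.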