Significant research effort has been expended to examine parallel matrix
solvers for both dense and sparse matrices. Numerous papers document
parallel dense matrix solvers [3,27,28], and these articles illustrate
that good efficiency is achievable when solving dense matrices on
multi-processor computers. The time complexity of dense matrix LU
factorization is $O(n^3)$, and the calculations are sufficiently numerous
and regular for good parallel algorithm performance. Some implementations
are better than others [27,28]; nevertheless, performance is deterministic,
because the sequence of operations in dense factorization is fixed by the
matrix dimension and does not depend on the matrix values.
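To make this regularity concrete, the sketch below (an illustrative example, not taken from the cited implementations; the routine name and sample matrix are assumptions) performs in-place LU factorization without pivoting in C. Its three nested loops execute roughly $\frac{2}{3}n^3$ floating-point operations over a dense trailing submatrix whose shape is known in advance; it is this predictable loop nest that parallel dense solvers partition among processors.

```c
#include <stdio.h>

#define N 4

/* In-place LU factorization without pivoting (Doolittle form): the unit
 * lower triangle of L is stored below the diagonal, U on and above it.
 * The triple loop performs roughly (2/3)*N^3 floating-point operations,
 * and every iteration updates a predictable dense trailing submatrix;
 * this regularity is what parallel dense solvers exploit. */
void lu_factor(double a[N][N])
{
    for (int k = 0; k < N; k++) {
        for (int i = k + 1; i < N; i++) {
            a[i][k] /= a[k][k];               /* multiplier: column of L */
            for (int j = k + 1; j < N; j++)
                a[i][j] -= a[i][k] * a[k][j]; /* rank-1 trailing update */
        }
    }
}

int main(void)
{
    double a[N][N] = {
        {4, 3, 2, 1},
        {3, 4, 3, 2},
        {2, 3, 4, 3},
        {1, 2, 3, 4},
    };
    lu_factor(a);
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++)
            printf("%8.4f ", a[i][j]);
        printf("\n");
    }
    return 0;
}
```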
The bulk of recent research into parallel direct sparse matrix techniques has centered on symmetric positive definite matrices and implementations of Choleski factorization, and a significant number of papers on this topic have been published recently [9,10,11,14]. These papers thoroughly examine many aspects of parallel direct sparse matrix solver implementation, including symbolic factorization and appropriate data structures. Techniques to improve interprocessor communication using block partitioning methods are examined in [23,24,25,26]. Techniques for sparse Choleski factorization have even been developed for single-instruction-multiple-data (SIMD) computers such as the Thinking Machines CM-1 and the MasPar MPP [16]. This discussion is by no means an exhaustive literature survey, although it does represent a significant portion of the direct sparse matrix research performed for vector and multi-processor computers.
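Much of this literature describes column-oriented Choleski factorization in terms of two primitive tasks, cdiv(j) (scale column j) and cmod(j,k) (update column j with completed column k), which parallel implementations schedule across processors. The following left-looking sketch illustrates that task structure; it is a dense simplification under stated assumptions (dense storage, illustrative routine names, and a sample matrix not drawn from the cited papers), whereas a sparse code would restrict each cmod to the nonzero pattern of column k.

```c
#include <math.h>
#include <stdio.h>

#define N 4

/* cmod(j,k): update column j with the contribution of completed column k.
 * In a sparse code this loop would run only over the nonzeros of column k. */
static void cmod(double a[N][N], int j, int k)
{
    for (int i = j; i < N; i++)
        a[i][j] -= a[i][k] * a[j][k];
}

/* cdiv(j): take the square root of the diagonal and scale column j by it. */
static void cdiv(double a[N][N], int j)
{
    a[j][j] = sqrt(a[j][j]);
    for (int i = j + 1; i < N; i++)
        a[i][j] /= a[j][j];
}

/* Left-looking column Choleski: the lower triangle of the symmetric
 * positive definite matrix A is overwritten by the factor L. */
void choleski(double a[N][N])
{
    for (int j = 0; j < N; j++) {
        for (int k = 0; k < j; k++)
            cmod(a, j, k);
        cdiv(a, j);
    }
}

int main(void)
{
    double a[N][N] = {
        {4, 2, 2, 1},
        {2, 5, 3, 2},
        {2, 3, 6, 3},
        {1, 2, 3, 7},
    };
    choleski(a);           /* compile with -lm for sqrt() */
    for (int i = 0; i < N; i++) {
        for (int j = 0; j <= i; j++)
            printf("%8.4f ", a[i][j]);
        printf("\n");
    }
    return 0;
}
```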
References [9,10,11,14,23,24,25,26] adhere to a general two-step preprocessing paradigm for parallel sparse Choleski factorization: