LU factorization is a variant of Gaussian elimination, and itself has numerous variants that depend on the order of the calculations and on other implementation factors. There are numerous algorithms for LU factorization of dense matrices that have three nested for loops around the statement:
In this statement, the indices run:
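As a point of reference, the following minimal Python sketch shows one common ordering of the three loops, the kij form without pivoting. Which variant and which index ranges are intended here is an assumption; the function name `lu_kij` is hypothetical, and the sketch assumes every pivot is nonzero.

```python
def lu_kij(a):
    """In-place LU factorization without pivoting (kij loop ordering).

    On return, the strict lower triangle of `a` holds L (unit diagonal
    implied) and the upper triangle, including the diagonal, holds U.
    Assumes every pivot a[k][k] is nonzero.
    """
    n = len(a)
    for k in range(n - 1):                # pivot index
        for i in range(k + 1, n):         # rows below the pivot
            a[i][k] /= a[k][k]            # multiplier l_ik
            for j in range(k + 1, n):     # columns right of the pivot
                a[i][j] -= a[i][k] * a[k][j]
    return a


if __name__ == "__main__":
    print(lu_kij([[4.0, 3.0], [6.0, 3.0]]))
```

In this ordering the statement inside the innermost loop updates one entry of the trailing submatrix, and for a fixed pivot `k` the updates for different `j` are independent of one another.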
The most significant aspect of parallel sparse LU factorization is that the sparsity structure can be exploited to offer more parallelism than is available with dense matrix solvers. Parallelism in dense matrix factorization is achieved by distributing the data so that the calculations in one of the for loops in equation 4 can be performed in parallel. Due to precedence relationships in the algorithm, this is generally the innermost for loop. Sparse factorization algorithms perform too few calculations over the innermost index to support efficient parallelism; however, sparse matrices offer additional parallelism as a result of the nature of the data and the precedence rules governing the order of calculations. Instead of parallelizing only the innermost for loop, as in parallel dense matrix factorization, entire independent portions of a sparse matrix can be factored in parallel, especially when the sparse matrix has been ordered into block-diagonal-bordered form.
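The independence of the diagonal blocks can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the solver described here: the names `eliminate`, `factor_block`, and `factor_bdb` are hypothetical, blocks are stored as dense lists of lists, no pivoting is performed, and a thread pool stands in for a real distributed-memory implementation (CPython threads show the independence of the tasks rather than an actual speedup).

```python
from concurrent.futures import ThreadPoolExecutor


def eliminate(a, npivots):
    """Run the first `npivots` steps of in-place LU (no pivoting) on `a`."""
    n = len(a)
    for k in range(npivots):
        for i in range(k + 1, n):
            a[i][k] /= a[k][k]
            for j in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j]
    return a


def factor_block(block):
    """Factor one diagonal block together with its borders.

    `block` is (A_i, B_i, C_i): diagonal block, right border, lower border.
    Returns the factored work array and this block's update to the last block.
    """
    A, B, C = block
    n, m = len(A), len(C)
    # Assemble [[A_i, B_i], [C_i, 0]] and eliminate only the n pivots of A_i.
    work = [A[r] + B[r] for r in range(n)]
    work += [C[r] + [0.0] * m for r in range(m)]
    eliminate(work, n)
    # The trailing m x m corner now holds -C_i * inv(A_i) * B_i.
    update = [row[n:] for row in work[n:]]
    return work, update


def factor_bdb(blocks, last):
    """Factor a block-diagonal-bordered matrix given as blocks plus last block."""
    # Each diagonal block touches only its own data, so the tasks are independent.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(factor_block, blocks))
    m = len(last)
    # Sum the independent updates into the last block, then factor it.
    corner = [row[:] for row in last]
    for _, update in results:
        for r in range(m):
            for c in range(m):
                corner[r][c] += update[r][c]
    eliminate(corner, m)
    return [work for work, _ in results], corner


if __name__ == "__main__":
    # Represents the matrix [[2, 0, 1], [0, 4, 2], [4, 2, 5]].
    blocks = [([[2.0]], [[1.0]], [[4.0]]), ([[4.0]], [[2.0]], [[2.0]])]
    print(factor_bdb(blocks, [[5.0]]))
```

The only coupling between the blocks is the summation of their updates into the last diagonal block, which must finish before that block is factored. That dependency is the precedence relationship that remains once the matrix is in block-diagonal-bordered form.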