next up previous
Next: Parallelism in Block-Diagonal-Bordered Up: Available Parallelism Previous: Available Parallelism in

Available Parallelism in Block-Diagonal-Bordered Form Matrices for Iterative Methods

All parallelism in the Gauss-Seidel algorithm is derived from within the actual interconnection relationships between elements in the matrix. Ordering sparse matrices into block-diagonal-bordered form can offer substantial opportunity for parallelism, because the values of in entire sparse matrix partitions can be calculated in parallel without requiring communications. Because the sparse matrix is a single system of equations, all equations sharing off-diagonal variables are dependent. Dependencies within the linear system requires data movement from mutually independent partitions to those equations that couple the linear system. After we derive the formulation of the Gauss-Seidel algorithm for a block-diagonal-bordered matrix, the optimum data/processor assignments for an efficient parallel implementation are straightforward.

While much of the parallelism in this algorithm is made clearly visible as a result of the block-diagonal-bordered ordering of the sparse matrix, further ordering of the last diagonal block is required to provide parallelism in what would otherwise be a purely sequential portion of the algorithm. The last diagonal block represents the interconnection structure within the equations that couple the partitions in the block-diagonal portion of the matrix. These equations are rather sparse, often with substantially fewer off-diagonal matrix elements (graph edges) than diagonal matrix elements (graph nodes). Consequently, it is rather simple to color the graph representing this portion of the matrix. Separate graph colors represent rows where can be calculated in parallel, because within a color, no two nodes have any adjacent edges, and there is no precedence when performing the calculations. For the parallel Gauss-Seidel algorithm, a synchronization barrier is required between colors to ensure that all new values are distributed to the processors so that the strict precedence relation in the calculations are maintained.





David P. Koester
Sun Oct 22 17:27:14 EDT 1995