To clearly identify the available parallelism in the block-diagonal-bordered Gauss-Seidel method, we define a block diagonal partition on the matrix, apply that partition to formula 3, and equate terms to identify available parallelism. We must also define a sub-partitioning of the last diagonal block to identify parallelism after multi-coloring.
First, we define a partitioning of the system of linear equations , where
the permutation matrix
orders the matrix into
block-diagonal-bordered form.
Equation 3 divides the
matrix into a diagonal component
, a strictly lower diagonal
portion of the matrix
, and a strictly upper diagonal portion
of the matrix
such that:
Derivation of the block-diagonal-bordered form of the D, L,
and U matrices is straightforward. Equation 3
requires the calculation of , which also is
simple to determine explicitly, because this matrix has
block-diagonal-lower-bordered form. Given these partitioned matrices,
it is relatively straightforward to identify available parallelism by
substituting the partitioned matrices and partitioned
and
vectors into the definition of the
Gauss-Seidel method and then performing the matrix multiplications. As
a result we obtain:
We can identify the parallelism in the block-diagonal-bordered portion
of the matrix by examining equation 7. If we assign each
partition i, , to a separate processor the
calculations of
are independent and require no
communications. Note that the vector
is
required for the calculations in each partition, and there is no
violation of the strict precedence rules in the Gauss-Seidel because
it is calculated in the last step. After calculating
in the first m partitions, the values of
must be calculated using the lower border and last
block. From the previous step, the values of
would be available on the processors where they were calculated, so
the values of
can be readily
calculated in parallel. Only
matrix
vector
products, calculated in
parallel, are involved in the communications phase. Furthermore, if we
assign
then the formulation of looks similar to equation 3: