To clearly identify the available parallelism in the
block-diagonal-bordered Gauss-Seidel method, we define a block
diagonal partition on the matrix, apply that partition to
formula , and equate terms to identify available
parallelism. We must also define a sub-partitioning of the last
diagonal block to identify parallelism after multi-coloring.
First, we define a partitioning of the system of linear equations , where
the permutation matrix
orders the matrix into
block-diagonal-bordered form.
Equation divides the
matrix into a diagonal component
, a strictly lower diagonal
portion of the matrix
, and a strictly upper diagonal portion
of the matrix
such that:
Derivation of the block-diagonal-bordered form of the D,
L, and U matrices is straightforward. Equation
requires the calculation of
, which is
simple to determine explicitly, because this matrix has
block-diagonal-lower-bordered form. The diagonals in
are of the form:
and the only other non-zero terms are in the last row. These values are of the form:
Given these partitioned matrices, it is relatively straightforward to
identify available parallelism by substituting the partitioned
matrices and partitioned and
vectors into
the definition of the Gauss-Seidel method and then performing the
matrix multiplications. As a result we obtain:
We can identify the parallelism in the block-diagonal-bordered portion
of the matrix by examining equation . If we assign each
partition i,
, to a separate processor the
calculations of
are independent and require no
communications. Note that the vector
is
required for the calculations in each partition, and there is no
violation of the strict precedence rules in the Gauss-Seidel because
it is calculated in the previous step. After calculating
in the first m partitions, the values of
must be calculated using the lower border and last
block. From the previous step, the values of
would be available on the processors where they were calculated, so
the values of
can be readily
calculated in parallel. Only matrix
vector products, calculated in
parallel, are involved in the communications phase. Furthermore, if we
assign
then the formulation of looks similar to equation
: