The ordering imposed by the permutation matrix , includes
multi-coloring-based ordering of the last diagonal block that produces sub-partitions
with parallelism, We define the sub-partitioning as:
where are diagonal blocks and c is the number of
colors. After forming
and
, it is
straight forward to prove that:
Calculating in each sub-partition of
does not require values of
within the sub-partition, so we can calculate the
individual values within
in any order and
distribute these calculations to separate processors without concern
for precedence. In order to maintain the strict precedence in the
Gauss-Seidel algorithm, the values of
calculated in each step must be broadcast to all processors, and
processing cannot proceed for any processor until it receives the new
values of
from all other processors.
If the block-diagonal-bordered matrix partitions ,
, and
(
) are
assigned to the same processor, then there are no communications
until
is calculated. At that time, only
matrix
vector
products are sent to the processors that hold
the appropriate data in the last diagonal block. This processor/data
assignment to processors is defined by multi-coloring only the last
diagonal block.
Figure 2 describes the calculation steps in the parallel Gauss-Seidel for a block-diagonal-bordered sparse matrix. This figure depicts four diagonal blocks, and data/processor assignments (P1, P2, P3, and P4) are listed for the data block. Figure 3 illustrates the data/processor assignments in the last diagonal block.
Figure 2: Block-Bordered-Diagonal Form Gauss-Seidel Method
Figure 3: Multi-Colored Gauss-Seidel Method for the Last Diagonal Block