While an elimination graph offers intuition into the available parallelism in block-diagonal-bordered sparse matrices for direct methods, examining the mathematics of matrix partitioning identifies that parallelism precisely. Define a partition of the block-diagonal-bordered matrix $\mathbf{A}$ as

$$\mathbf{A} = \begin{bmatrix} \mathbf{A}_{11} & \mathbf{A}_{12} \\ \mathbf{A}_{21} & \mathbf{A}_{22} \end{bmatrix} = \mathbf{L}\mathbf{U} = \begin{bmatrix} \mathbf{L}_{11} & \mathbf{0} \\ \mathbf{L}_{21} & \mathbf{L}_{22} \end{bmatrix} \begin{bmatrix} \mathbf{U}_{11} & \mathbf{U}_{12} \\ \mathbf{0} & \mathbf{U}_{22} \end{bmatrix},$$
where the Schur complement of the partitioned matrices in this equation is $\hat{\mathbf{A}}_{22} = \mathbf{A}_{22} - \mathbf{A}_{21}\mathbf{A}_{11}^{-1}\mathbf{A}_{12}$. Here $\mathbf{A}_{11}$, $\mathbf{L}_{11}$, and $\mathbf{U}_{11}$ are of size $n \times n$; $\mathbf{A}_{12}$ and $\mathbf{U}_{12}$ are of size $n \times m$; $\mathbf{A}_{21}$ and $\mathbf{L}_{21}$ are of size $m \times n$; and $\mathbf{A}_{22}$, $\mathbf{L}_{22}$, and $\mathbf{U}_{22}$ are of size $m \times m$.
By equating blocks in this equation, we can identify how to solve for the partitions:

$$\begin{aligned}
\mathbf{A}_{11} &= \mathbf{L}_{11}\mathbf{U}_{11}, \\
\mathbf{U}_{12} &= \mathbf{L}_{11}^{-1}\mathbf{A}_{12}, \\
\mathbf{L}_{21} &= \mathbf{A}_{21}\mathbf{U}_{11}^{-1}, \\
\mathbf{L}_{22}\mathbf{U}_{22} &= \mathbf{A}_{22} - \mathbf{L}_{21}\mathbf{U}_{12} = \hat{\mathbf{A}}_{22}.
\end{aligned}$$
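As a concrete check on these identities, the following minimal numpy sketch (an illustration, not part of the original derivation) factors a small dense example without pivoting; the helper `lu_nopivot`, the block sizes, and the diagonally dominant test matrix are all assumptions made for the example.

```python
import numpy as np

def lu_nopivot(A):
    """Doolittle LU without pivoting: returns unit-lower L and upper U with A = L @ U."""
    n = A.shape[0]
    L, U = np.eye(n), A.astype(float)
    for k in range(n - 1):
        L[k + 1:, k] = U[k + 1:, k] / U[k, k]          # multipliers for column k
        U[k + 1:, k:] -= np.outer(L[k + 1:, k], U[k, k:])  # eliminate below the pivot
    return L, U

rng = np.random.default_rng(0)
n, m = 5, 3                                            # leading block size and border size
A = rng.random((n + m, n + m)) + (n + m) * np.eye(n + m)  # diagonally dominant, so no pivoting needed
A11, A12 = A[:n, :n], A[:n, n:]
A21, A22 = A[n:, :n], A[n:, n:]

L11, U11 = lu_nopivot(A11)                    # A11 = L11 U11
U12 = np.linalg.solve(L11, A12)               # U12 = L11^{-1} A12
L21 = np.linalg.solve(U11.T, A21.T).T         # L21 = A21 U11^{-1}
L22, U22 = lu_nopivot(A22 - L21 @ U12)        # L22 U22 = A22 - L21 U12 (Schur complement)

L = np.block([[L11, np.zeros((n, m))], [L21, L22]])
U = np.block([[U11, U12], [np.zeros((m, n)), U22]])
assert np.allclose(L @ U, A)                  # the partitioned factors reproduce A
```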
Before we can proceed and prove the theorem that the LU factors of a block-diagonal-bordered (BDB) position-symmetric sparse matrix are also in block-diagonal-bordered form, we must define additional matrix partitions in the desired form and prove a lemma that the LU factors of a block-diagonal (BD) matrix are also in block-diagonal form. At this point, we define additional partitions of $\mathbf{A}_{11}$, $\mathbf{A}_{12}$, and $\mathbf{A}_{21}$ that represent the block-diagonal-bordered nature of the original matrix:

$$\mathbf{A}_{11} = \begin{bmatrix} \mathbf{A}_1 & & \\ & \ddots & \\ & & \mathbf{A}_p \end{bmatrix}, \qquad
\mathbf{A}_{12} = \begin{bmatrix} \mathbf{A}_{12}^{(1)} \\ \vdots \\ \mathbf{A}_{12}^{(p)} \end{bmatrix}, \qquad
\mathbf{A}_{21} = \begin{bmatrix} \mathbf{A}_{21}^{(1)} & \cdots & \mathbf{A}_{21}^{(p)} \end{bmatrix},$$

where $\mathbf{A}_1, \dots, \mathbf{A}_p$ are the diagonal blocks and the border strips $\mathbf{A}_{12}^{(i)}$ and $\mathbf{A}_{21}^{(i)}$ are partitioned to match the sizes of the diagonal blocks.
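To make these partitions concrete, the short sketch below assembles a block-diagonal-bordered test matrix from $p$ diagonal blocks and border strips; the block sizes and random entries are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
block_sizes, m = [3, 4, 2], 2        # p = 3 diagonal blocks A_i plus an m x m last block
n = sum(block_sizes)

A11 = np.zeros((n, n))               # block-diagonal partition
offset = 0
for size in block_sizes:
    # make each diagonal block comfortably non-singular
    A11[offset:offset + size, offset:offset + size] = (
        rng.random((size, size)) + size * np.eye(size))
    offset += size

A12 = rng.random((n, m))             # right border: strips A12^(i) stacked vertically
A21 = rng.random((m, n))             # lower border: strips A21^(i) placed side by side
A22 = rng.random((m, m)) + m * np.eye(m)

A = np.block([[A11, A12], [A21, A22]])   # the full block-diagonal-bordered matrix
```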
Lemma --- The LU factors of a block-diagonal matrix are also in block-diagonal form.
Proof:
Let

$$\mathbf{A}_{11} = \begin{bmatrix} \mathbf{A}_1 & \mathbf{0} \\ \mathbf{0} & \mathbf{A}_2 \end{bmatrix} = \begin{bmatrix} \mathbf{L}_1 & \mathbf{0} \\ \mathbf{X} & \mathbf{L}_2 \end{bmatrix} \begin{bmatrix} \mathbf{U}_1 & \mathbf{Y} \\ \mathbf{0} & \mathbf{U}_2 \end{bmatrix},$$

where $\mathbf{X}$ and $\mathbf{Y}$ denote the candidate off-diagonal blocks of the factors. By applying the Schur complement relations to this equation, we obtain

$$\mathbf{Y} = \mathbf{L}_1^{-1}\,\mathbf{0} \quad \text{and} \quad \mathbf{X} = \mathbf{0}\,\mathbf{U}_1^{-1}.$$

If $\mathbf{A}_1$ is non-singular and has a numerical LU factorization, then $\mathbf{L}_1^{-1}$ and $\mathbf{U}_1^{-1}$ must exist; thus $\mathbf{Y} = \mathbf{0}$ and $\mathbf{X} = \mathbf{0}$, and the factors of $\mathbf{A}_{11}$ are block-diagonal.
This lemma can be applied recursively to a block-diagonal matrix with any number of diagonal blocks to prove that the factorization of a block-diagonal matrix preserves the block structure.
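A quick numerical illustration of the lemma, again a sketch under the assumption that factorization proceeds without pivoting (row exchanges could destroy the block structure); `lu_nopivot` and the block sizes are assumptions made for the example:

```python
import numpy as np
from scipy.linalg import block_diag

def lu_nopivot(A):
    """Doolittle LU without pivoting: A = L @ U."""
    n = A.shape[0]
    L, U = np.eye(n), A.astype(float)
    for k in range(n - 1):
        L[k + 1:, k] = U[k + 1:, k] / U[k, k]
        U[k + 1:, k:] -= np.outer(L[k + 1:, k], U[k, k:])
    return L, U

rng = np.random.default_rng(2)
blocks = [rng.random((k, k)) + k * np.eye(k) for k in (3, 4, 2)]
A = block_diag(*blocks)              # a block-diagonal matrix with three blocks
L, U = lu_nopivot(A)

# Off-diagonal blocks of the factors vanish: L and U inherit the block-diagonal form.
for edge in (3, 7):                  # block boundaries
    assert np.allclose(L[edge:, :edge], 0) and np.allclose(U[:edge, edge:], 0)
assert np.allclose(L @ U, A)
```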
Theorem --- The LU factors of a block-diagonal-bordered matrix are also in block-diagonal-bordered form. To restate this theorem, we must show that

$$\mathbf{A} = \begin{bmatrix} \mathbf{A}_{11} & \mathbf{A}_{12} \\ \mathbf{A}_{21} & \mathbf{A}_{22} \end{bmatrix} = \begin{bmatrix} \mathbf{L}_{11} & \mathbf{0} \\ \mathbf{L}_{21} & \mathbf{L}_{22} \end{bmatrix} \begin{bmatrix} \mathbf{U}_{11} & \mathbf{U}_{12} \\ \mathbf{0} & \mathbf{U}_{22} \end{bmatrix},$$

where $\mathbf{L}_{11}$ and $\mathbf{U}_{11}$ are block-diagonal, so that all fill outside the diagonal blocks is confined to the border partitions $\mathbf{L}_{21}$ and $\mathbf{U}_{12}$ and to the last diagonal blocks $\mathbf{L}_{22}$ and $\mathbf{U}_{22}$.
Proof:
First, the matrix partitions $\mathbf{A}_{12}$ and $\mathbf{A}_{21}$ have simply been further partitioned into strips that match the sizes of the diagonal blocks, while the matrix partition $\mathbf{A}_{22}$ has been left unchanged. In the lemma, we proved that the factors of $\mathbf{A}_{11}$ are block-diagonal if $\mathbf{A}_{11}$ is block-diagonal. Consequently, $\mathbf{L}_{11}$ and $\mathbf{U}_{11}$ are block-diagonal, and the factors $\mathbf{L}$ and $\mathbf{U}$ are in block-diagonal-bordered form.
As a result of this theorem, it is relatively straightforward to identify the available parallelism by performing the block matrix multiplication in a manner similar to the Schur complement. For each diagonal block $i = 1, \dots, p$ we obtain

$$\mathbf{A}_i = \mathbf{L}_i\mathbf{U}_i, \qquad
\mathbf{U}_{12}^{(i)} = \mathbf{L}_i^{-1}\mathbf{A}_{12}^{(i)}, \qquad
\mathbf{L}_{21}^{(i)} = \mathbf{A}_{21}^{(i)}\mathbf{U}_i^{-1},$$

and for the last block

$$\mathbf{L}_{22}\mathbf{U}_{22} = \mathbf{A}_{22} - \sum_{i=1}^{p} \mathbf{L}_{21}^{(i)}\mathbf{U}_{12}^{(i)}.$$

Each diagonal block and its associated border strips can therefore be factored independently; only the update of the last block couples the computations.
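A sketch of this block-level algorithm appears below; `factor_bdb` and `lu_nopivot` are hypothetical helper names, and the loop is written sequentially even though each iteration is independent and could be assigned to a separate processor, with only the Schur-complement accumulation requiring coordination.

```python
import numpy as np

def lu_nopivot(A):
    """Doolittle LU without pivoting: A = L @ U."""
    n = A.shape[0]
    L, U = np.eye(n), A.astype(float)
    for k in range(n - 1):
        L[k + 1:, k] = U[k + 1:, k] / U[k, k]
        U[k + 1:, k:] -= np.outer(L[k + 1:, k], U[k, k:])
    return L, U

def factor_bdb(diag_blocks, right_strips, lower_strips, A22):
    """Factor a block-diagonal-bordered matrix from its partitions.

    Each loop iteration touches only diagonal block A_i and its two border
    strips, so the iterations are independent; only the accumulation of the
    Schur complement for the last block couples them.
    """
    schur = A22.astype(float)
    factors = []
    for Ai, Bi, Ci in zip(diag_blocks, right_strips, lower_strips):
        Li, Ui = lu_nopivot(Ai)                 # A_i = L_i U_i
        Ui12 = np.linalg.solve(Li, Bi)          # U_12^(i) = L_i^{-1} A_12^(i)
        Li21 = np.linalg.solve(Ui.T, Ci.T).T    # L_21^(i) = A_21^(i) U_i^{-1}
        schur -= Li21 @ Ui12                    # the only shared update
        factors.append((Li, Ui, Li21, Ui12))
    L22, U22 = lu_nopivot(schur)                # factor the last block last
    return factors, L22, U22
```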
This derivation identifies the parallelism in the LU factorization step for a block-diagonal-bordered sparse matrix. The parallelism in the forward-reduction and backward-substitution steps also benefits from the same data/processor distribution. By assigning the data in a diagonal block and its associated border strips to the same processor, no communication is required in the forward-reduction phase (solving $\mathbf{L}\mathbf{y} = \mathbf{b}$) until the last partition of the vector is updated with the product of the dense vector partition $\mathbf{y}_1$ and the sparse border matrix $\mathbf{L}_{21}$ ($\mathbf{L}_{22}\mathbf{y}_2 = \mathbf{b}_2 - \mathbf{L}_{21}\mathbf{y}_1$). Likewise, no communication is required in the backward-substitution phase (solving $\mathbf{U}\mathbf{x} = \mathbf{y}$) after the values of $\mathbf{x}_2$ are distributed to all processors holding the matrix blocks $\mathbf{U}_{11}$ and $\mathbf{U}_{12}$ ($\mathbf{U}_{11}\mathbf{x}_1 = \mathbf{y}_1 - \mathbf{U}_{12}\mathbf{x}_2$).
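The following sketch mirrors that communication pattern in serial numpy code; `solve_bdb` and the tuple layout of `factors` are assumptions carried over from the factorization sketch above.

```python
import numpy as np

def solve_bdb(factors, L22, U22, b_parts, b_last):
    """Forward reduction and backward substitution for BDB factors.

    factors[i] = (Li, Ui, Li21, Ui12) for diagonal block i, as produced by a
    routine like the factor_bdb sketch above (hypothetical names). Per-block
    solves are independent; communication is needed only to update the last
    partition (forward) and to broadcast x_last (backward).
    """
    # Forward reduction, L y = b: each block solve is local to one processor.
    y_parts, y_last = [], b_last.astype(float)
    for (Li, _, Li21, _), bi in zip(factors, b_parts):
        yi = np.linalg.solve(Li, bi)        # local lower-triangular solve
        y_last -= Li21 @ yi                 # update of the last partition (communication)
        y_parts.append(yi)
    y_last = np.linalg.solve(L22, y_last)

    # Backward substitution, U x = y: after x_last is broadcast, all solves are local.
    x_last = np.linalg.solve(U22, y_last)
    x_parts = [np.linalg.solve(Ui, yi - Ui12 @ x_last)
               for (_, Ui, _, Ui12), yi in zip(factors, y_parts)]
    return x_parts, x_last
```

In practice, the dense `np.linalg.solve` calls would be replaced by triangular solves (e.g., `scipy.linalg.solve_triangular`) over sparse storage, but the communication pattern is the same.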
The figure below illustrates both the LU factorization steps and the reduction/substitution steps for a block-diagonal-bordered sparse matrix. In this figure, the strictly lower-diagonal portion of the matrix is $\mathbf{L}$, and the strictly upper-diagonal portion of the matrix is $\mathbf{U}$. The figure depicts four diagonal blocks, and processor assignments (P1, P2, P3, and P4) are listed with the data blocks. This figure represents the block-diagonal-bordered form matrix and data distribution for the data represented in the figures presented earlier.
Figure: Block-Diagonal-Bordered Form Sparse Matrix Solution Steps