
Available Parallelism in Block-Diagonal-Bordered Form Matrices for Direct Methods

While an elimination graph offers intuition into the available parallelism in block-diagonal-bordered sparse matrices for direct methods, it is possible to examine the theoretical mathematics of matrix partitioning to clearly identify available parallelism in this sparse matrix form. By partitioning the block-diagonal-bordered matrix $A$ into:

$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$$
and calculating the Schur complement [12], it is possible to identify available parallelism for direct methods by proving a theorem that states that the LU factors of a block-diagonal-bordered matrix are also in block-diagonal-bordered form. A supporting lemma stating that the LU factors of a block-diagonal matrix are also in block-diagonal form is required to complete the proof of the theorem. A similar derivation can be used to identify the parallelism in Cholesky factorization.

Define a partition of $A = LU$ as

$$\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} =
\begin{pmatrix} L_{11} & 0 \\ L_{21} & L_{22} \end{pmatrix}
\begin{pmatrix} U_{11} & U_{12} \\ 0 & U_{22} \end{pmatrix}$$

where:

$A_{11}$, $L_{11}$, and $U_{11}$ are of size $n \times n$,

$A_{12}$ and $U_{12}$ are of size $n \times m$,

$A_{21}$ and $L_{21}$ are of size $m \times n$,

$A_{22}$, $L_{22}$, and $U_{22}$ are of size $m \times m$.

The Schur complement of the partitioned matrices in the equation above can be calculated by simply performing the matrix multiplication on the partitions, which yields:

$$\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} =
\begin{pmatrix} L_{11}U_{11} & L_{11}U_{12} \\ L_{21}U_{11} & L_{21}U_{12} + L_{22}U_{22} \end{pmatrix}$$

By equating blocks in the equation above, we can easily identify how to solve for the partitions:

$$A_{11} = L_{11}U_{11}$$
$$U_{12} = L_{11}^{-1}A_{12}$$
$$L_{21} = A_{21}U_{11}^{-1}$$
$$L_{22}U_{22} = A_{22} - L_{21}U_{12}$$

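These four block identities can be checked numerically. The following is a minimal sketch (not from the original, which contains no code): it assumes NumPy and uses a hand-written Doolittle LU without pivoting, since partial pivoting would permute rows and obscure the block structure. The example matrix is a small diagonally dominant matrix chosen so that no pivoting is needed.

```python
import numpy as np

def lu_nopivot(A):
    """Doolittle LU factorization without pivoting: A = L @ U."""
    n = A.shape[0]
    L, U = np.eye(n), np.zeros_like(A)
    for k in range(n):
        U[k, k:] = A[k, k:] - L[k, :k] @ U[:k, k:]
        L[k+1:, k] = (A[k+1:, k] - L[k+1:, :k] @ U[:k, k]) / U[k, k]
    return L, U

# A small diagonally dominant matrix, partitioned into 2x2 blocks (n = 2).
A = np.array([[4., 1., 0., 1.],
              [1., 5., 1., 0.],
              [0., 1., 6., 1.],
              [1., 0., 1., 7.]])
L, U = lu_nopivot(A)

n = 2
A11, A12, A21, A22 = A[:n, :n], A[:n, n:], A[n:, :n], A[n:, n:]
L11, L21, L22 = L[:n, :n], L[n:, :n], L[n:, n:]
U11, U12, U22 = U[:n, :n], U[:n, n:], U[n:, n:]

# Verify the four block identities derived above.
assert np.allclose(A11, L11 @ U11)
assert np.allclose(U12, np.linalg.solve(L11, A12))
assert np.allclose(L21, A21 @ np.linalg.inv(U11))
assert np.allclose(L22 @ U22, A22 - L21 @ U12)
```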
Before we can proceed and prove the theorem that the factors of a block-diagonal-bordered (BDB) position-symmetric sparse matrix are also in block-diagonal-bordered form, we must define additional matrix partitions in the desired form and prove a lemma that the factors of a block-diagonal (BD) matrix are also in block-diagonal form. At this point, we must define additional partitions of $A$ that represent the block-diagonal-bordered nature of the original matrix, with $N$ diagonal blocks:

$$A_{11} = \begin{pmatrix} A_1 & & \\ & \ddots & \\ & & A_N \end{pmatrix}$$

$$L_{11} = \begin{pmatrix} L_1 & & \\ & \ddots & \\ & & L_N \end{pmatrix}$$

$$U_{11} = \begin{pmatrix} U_1 & & \\ & \ddots & \\ & & U_N \end{pmatrix}$$

$$A_{12} = \begin{pmatrix} A_{1,N+1} \\ \vdots \\ A_{N,N+1} \end{pmatrix}, \qquad
A_{21} = \begin{pmatrix} A_{N+1,1} & \cdots & A_{N+1,N} \end{pmatrix}$$

$$U_{12} = \begin{pmatrix} U_{1,N+1} \\ \vdots \\ U_{N,N+1} \end{pmatrix}, \qquad
L_{21} = \begin{pmatrix} L_{N+1,1} & \cdots & L_{N+1,N} \end{pmatrix}$$

Lemma --- The factors of a block-diagonal matrix are also in block-diagonal form

Proof:

Let:

$$A = \begin{pmatrix} A_1 & 0 \\ 0 & A_2 \end{pmatrix}$$

By applying the Schur complement to the equation above, we obtain:

$$U_{12} = L_1^{-1}\,0$$

and

$$L_{21} = 0\,U_1^{-1}.$$

If $A$ is non-singular and has a numerical factor, then $L_1^{-1}$ and $U_1^{-1}$ must exist and be non-zero; thus

$$U_{12} = 0 \quad \text{and} \quad L_{21} = 0.$$
This lemma can be applied recursively to a block-diagonal matrix with any number of diagonal blocks to prove that the factorization of a block-diagonal matrix preserves the block structure.
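The lemma can also be observed numerically. Below is a minimal sketch (not from the original, which contains no code), assuming NumPy and a no-pivot Doolittle LU: factoring a block-diagonal matrix yields factors whose off-diagonal blocks vanish, exactly as the lemma predicts.

```python
import numpy as np

def lu_nopivot(A):
    """Doolittle LU factorization without pivoting: A = L @ U."""
    n = A.shape[0]
    L, U = np.eye(n), np.zeros_like(A)
    for k in range(n):
        U[k, k:] = A[k, k:] - L[k, :k] @ U[:k, k:]
        L[k+1:, k] = (A[k+1:, k] - L[k+1:, :k] @ U[:k, k]) / U[k, k]
    return L, U

# Block-diagonal matrix with two 2x2 diagonal blocks and zero couplings.
A = np.array([[4., 1., 0., 0.],
              [1., 5., 0., 0.],
              [0., 0., 6., 1.],
              [0., 0., 1., 7.]])
L, U = lu_nopivot(A)

# The off-diagonal blocks of both factors are zero: L and U inherit
# the block-diagonal structure of A.
assert np.allclose(L[2:, :2], 0)
assert np.allclose(U[:2, 2:], 0)
assert np.allclose(L @ U, A)
```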

Theorem --- The factors of a block-diagonal-bordered matrix are also in block-diagonal-bordered form. To restate this theorem, we must show that $L_{11}$ and $U_{11}$ are in block-diagonal form.

Proof:

First, the matrix partitions $A_{12}$ and $A_{21}$ have simply been further partitioned to match the sizes of the diagonal blocks, while the matrix partition $A_{22}$ has been left unchanged. In the lemma, we proved that the factors of $A_{11}$ are block-diagonal if $A_{11}$ is block-diagonal. Consequently, $L$ and $U$ are in block-diagonal-bordered form.

As a result of this theorem, it is relatively straightforward to identify available parallelism by simply performing the matrix multiplication in a manner similar to the Schur complement. As a result we obtain:

  1. Diagonal Blocks: $A_i = L_iU_i$, $(i = 1, \ldots, N)$,
  2. Lower Border: $L_{N+1,i} = A_{N+1,i}U_i^{-1}$, $(i = 1, \ldots, N)$,
  3. Upper Border: $U_{i,N+1} = L_i^{-1}A_{i,N+1}$, $(i = 1, \ldots, N)$,
  4. Last Block:

     $L_{22}U_{22} = A_{22} - \sum_{i=1}^{N} L_{N+1,i}U_{i,N+1}$.
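The four factorization steps can be exercised block by block and checked against a factorization of the assembled matrix. This is a minimal sketch (not from the original): the block values are illustrative, NumPy is assumed, and a no-pivot Doolittle LU is used so the block structure is preserved. Steps 1-3 are independent per block; only step 4 combines contributions from all blocks.

```python
import numpy as np

def lu_nopivot(A):
    """Doolittle LU factorization without pivoting: A = L @ U."""
    n = A.shape[0]
    L, U = np.eye(n), np.zeros_like(A)
    for k in range(n):
        U[k, k:] = A[k, k:] - L[k, :k] @ U[:k, k:]
        L[k+1:, k] = (A[k+1:, k] - L[k+1:, :k] @ U[:k, k]) / U[k, k]
    return L, U

# Illustrative BDB matrix with N = 2 diagonal blocks.
blocks = [np.array([[4., 1.], [1., 5.]]),   # A_i, diagonal blocks
          np.array([[6., 1.], [1., 7.]])]
uppers = [np.array([[1.], [0.]]),           # A_{i,N+1}, upper border
          np.array([[0.], [1.]])]
lowers = [np.array([[1., 0.]]),             # A_{N+1,i}, lower border
          np.array([[0., 1.]])]
A_last = np.array([[8.]])                   # A_{22}, last block

schur = A_last.copy()
for A_i, B_i, C_i in zip(blocks, uppers, lowers):
    L_i, U_i = lu_nopivot(A_i)              # step 1: A_i = L_i U_i
    L_bi = C_i @ np.linalg.inv(U_i)         # step 2: lower border
    U_bi = np.linalg.solve(L_i, B_i)        # step 3: upper border
    schur -= L_bi @ U_bi                    # step 4: accumulate updates
L_last, U_last = lu_nopivot(schur)          # factor the last block

# The same last-block factor emerges from factoring the assembled matrix.
A = np.block([[blocks[0], np.zeros((2, 2)), uppers[0]],
              [np.zeros((2, 2)), blocks[1], uppers[1]],
              [lowers[0], lowers[1], A_last]])
L, U = lu_nopivot(A)
assert np.allclose(L[4:, 4:] @ U[4:, 4:], L_last @ U_last)
```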

If the matrix blocks $A_i$, $A_{N+1,i}$, and $A_{i,N+1}$ ($i = 1, \ldots, N$) are assigned to the same processor, then there are no communications until the last block is factored. At that time, only the sums of products of sparse matrices are sent to the processors that hold the appropriate data in the last block.

This derivation identifies the parallelism in the LU factorization step of a block-diagonal-bordered sparse matrix. The parallelism in the forward reduction and backward substitution steps also benefits from the aforementioned data/processor distribution. By assigning the data in a matrix block and its associated border sections to the same processor, no communications are required in the forward reduction phase until the partition corresponding to the last block is updated by the products of the dense vector partitions and the sparse border matrices $L_{N+1,i}$. No communications are required in the backward substitution phase after the values in the last block's solution are distributed to all processors holding the matrix blocks $U_i$ and $U_{i,N+1}$ ($i = 1, \ldots, N$).
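The communication pattern of the forward reduction can be sketched in the same style. This is an illustrative example, not code from the original: the factor blocks below are arbitrary unit-lower values, and NumPy is assumed. Each per-block solve is independent; the only communication is the accumulation of border products into the last block's right-hand side.

```python
import numpy as np

# Unit-lower BDB factor L with N = 2 diagonal blocks (values illustrative).
L1  = np.array([[1., 0.], [0.5, 1.]])   # diagonal block factor L_1
L2  = np.array([[1., 0.], [0.2, 1.]])   # diagonal block factor L_2
Lb1 = np.array([[0.3, 0.1]])            # lower border L_{N+1,1}
Lb2 = np.array([[0.4, 0.2]])            # lower border L_{N+1,2}
Ln  = np.array([[1.]])                  # last block factor
b1, b2, bn = np.array([1., 2.]), np.array([3., 4.]), np.array([5.])

# Independent per-block forward solves (no communication needed).
y1 = np.linalg.solve(L1, b1)
y2 = np.linalg.solve(L2, b2)
# Single communication step: border products update the last block.
yn = np.linalg.solve(Ln, bn - Lb1 @ y1 - Lb2 @ y2)

# Agrees with a forward solve on the assembled BDB factor.
L = np.block([[L1, np.zeros((2, 2)), np.zeros((2, 1))],
              [np.zeros((2, 2)), L2, np.zeros((2, 1))],
              [Lb1, Lb2, Ln]])
y = np.linalg.solve(L, np.concatenate([b1, b2, bn]))
assert np.allclose(np.concatenate([y1, y2, yn]), y)
```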

The figure below illustrates both the LU factorization steps and the reduction/substitution steps for a block-diagonal-bordered sparse matrix. In this figure, the strictly lower diagonal portion of the matrix is $L$, and the strictly upper diagonal portion of the matrix is $U$. This figure depicts four diagonal blocks, and processor assignments (P1, P2, P3, and P4) are listed with each data block. This figure represents the block-diagonal-bordered form matrix and data distribution for the data represented in the figures of the preceding sections.

 
Figure: Block Bordered Diagonal Form Sparse Matrix Solution Steps 






David P. Koester
Sun Oct 22 17:27:14 EDT 1995