This block-diagonal-bordered sparse Gauss-Seidel method uses implicit
data structures based on vectors of C programming language structures
to store and retrieve data efficiently within the sparse matrix.
These data structures provide good cache hit access, because non-zero
data values and column location indicators are stored in adjacent
physical memory locations. The hierarchical data structure is
composed of six separate parts that implicitly store the
block-diagonal-bordered sparse matrix and the last block. The
hierarchical nature of these structures store only non-zero values,
especially in the lower border where entire rows may be zero. Six
separate C language structures are employed to store the data in a
manner that can efficiently be accessed with minimal indexing
overhead. Static vectors of each structure type are used to store the
block-diagonal-bordered sparse matrix. Figure
graphically illustrates the relationships within the data structure,
where the distinct C structure elements presented in that figure are:
As illustrated in the figure, the block-diagonal structure, the
border-row structure, and the last-block-diagonal structure contain
pointers to the sparse row vectors. The second values in the two
diagonal pointers are the values of , while the second value
in the border-row structure is the destination processor,
, for the
vector
vector product from this border row used in
calculating values in the last diagonal block.
Figure: The Hierarchical Data Structure for Parallel Gauss-Seidel Methods
The hierarchical data structure for parallel Gauss-Seidel actually has three separate data structures that have pointers to corresponding data structures to store off-diagonal non-zero row vector elements. The first data structure of pointers to off-diagonal non-zero matrix elements represents the elements on the diagonal within blocks, excluding the last diagonal block. The last diagonal block is handled separately by the third set of data structures. The second data structure of pointers efficiently stores the non-zero values in the lower border. Because entire lower border rows or upper border columns may be sparse in a block, two layers are required to store this data in an efficient manner.
The data structures for the sparse block-diagonal-bordered Gauss-Seidel algorithm are much simpler than the data structures for the direct methods. Data structures for the parallel Gauss-Seidel method have one less layer than the data structures for direct methods --- it is possible to eliminate the block identifiers without causing additional effort to search for values. In addition, separate data storage for the upper borders has been eliminated. These values are entered into sparse column vectors that are denoted by the block identifiers. The extra loops for blocks and the upper border must be eliminated in order to reduce indexing overhead that would occur in the parallel implementation but not in the sequential implementation. During algorithm development, we discovered that this indexing overhead significantly degraded parallel algorithm performance, so enhanced data structures were developed that required a minimum of for loop instantiations.