
The Hierarchical Data Structure

This block-diagonal-bordered sparse Choleski solver uses implicit hierarchical data structures, based on vectors of C programming language structures, to efficiently store and retrieve data for a symmetric block-diagonal-bordered sparse matrix. These data structures provide good cache coherence because non-zero data values and row and column location indicators are stored in adjacent physical memory locations. The data structure is static; consequently, the locations of all fill-in must be determined before memory is allocated for the data structures. There is no requirement for pivoting in Choleski factorization algorithms, because the sparse matrices are by definition positive definite and numerically stable, so a static structure suffices. Because the data structure is static, explicit pointers to subsequent data locations can be embedded in it to reduce indexing overhead: row location indicators are stored explicitly, as are the pointers to subsequent values in a column that are required when updating values in the matrix. Additional memory in the data structures is thus traded for reduced indexing overhead. Modern distributed-memory multi-processors are available with substantial amounts of random access memory at each node, so this research examines data structures designed to optimize processing speed at the cost of increased memory usage when compared to other compressed storage formats. We compare the memory requirements for these data structures to the memory requirements for the more conventional compressed data structures below.

The hierarchical data structure is composed of five separate parts that implicitly store a block-diagonal-bordered sparse matrix. Its hierarchical organization stores only non-zero values, which matters especially in the borders, where entire rows may be zero. Five separate C language structures store the data in a manner that can be accessed efficiently with minimal indexing overhead, and static vectors of each structure type hold the block-diagonal-bordered sparse matrix. Figure 10 graphically illustrates the hierarchical nature of the data structure; the five separate C structure elements presented in that figure are listed below, and a sketch of possible declarations follows the figure:

  1. diagonal block identifiers,
  2. matrix diagonal elements,
  3. non-zero values in the lower triangular diagonal matrix blocks (arranged by rows),
  4. non-zero row identifiers in the lower border, and
  5. non-zero values in the lower border (arranged by rows).

 
Figure 10: The Hierarchical Data Structure 
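
The figure conveys the layout graphically; the declarations themselves are not given in the text. The following is a minimal C sketch of what the five structure types might look like, with every type and field name hypothetical:

    /* 1. Diagonal block identifier: delimits one independent block's
          run of diagonal elements (hypothetical field names). */
    typedef struct {
        int first_diag;                  /* index of the block's first diagonal */
        int last_diag;                   /* index of the block's last diagonal  */
    } BlockId;

    /* 2. Matrix diagonal element: the diagonal value plus pointers into
          the row and border storage for rapid access. */
    typedef struct {
        double value;                    /* value on the matrix diagonal        */
        int    row_start, row_len;       /* this row's lower triangular values  */
        int    border_start, border_len; /* this row's border values, if any    */
    } DiagElem;

    /* 3. Non-zero value in a lower triangular diagonal block
          (rows stored consecutively). */
    typedef struct {
        double value;
        int    row, col;                 /* explicit location indicators        */
        int    next_in_col;              /* next non-zero below in same column  */
    } TriElem;

    /* 4. Non-zero row identifier in the lower border. */
    typedef struct {
        int row;                         /* global row index of the border row  */
        int first;                       /* first stored value in that row      */
    } BorderRowId;

    /* 5. Non-zero value in the lower border (rows stored consecutively). */
    typedef struct {
        double value;
        int    col;
        int    next_in_col;
    } BorderElem;

Static vectors of each of these five types would then hold the entire matrix, matching the five parts enumerated above.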

At the top of the hierarchical data structure is the information on the storage locations of the independent diagonal blocks and the lower borders. The next layer in the data structure hierarchy consists of the matrix diagonal and the identifiers of non-zero border rows. Data values on the original matrix diagonal are stored in the diagonal portion of the data structure; however, most of the remaining information stored along with each diagonal element consists of pointers, so that data in related columns or rows can be accessed rapidly.
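
As a concrete illustration of this top-down access path, the fragment below walks the hierarchy using the hypothetical types sketched after Figure 10; it is an assumed access pattern, not code from the paper.

    /* Visit every diagonal element, block by block; each diagonal
       element directly addresses its row and border data. */
    void visit_blocks(const BlockId *block, int num_blocks,
                      const DiagElem *diag)
    {
        for (int blk = 0; blk < num_blocks; blk++) {
            for (int d = block[blk].first_diag; d <= block[blk].last_diag; d++) {
                double pivot    = diag[d].value;        /* diagonal entry     */
                int    row_p    = diag[d].row_start;    /* first row value    */
                int    border_p = diag[d].border_start; /* first border value */
                (void)pivot; (void)row_p; (void)border_p;
                /* ... factorization work on directly addressed data ... */
            }
        }
    }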

Data in the strictly lower triangular portion of the matrix is stored as sparse row vectors. Because the values in each lower triangular row are stored consecutively, this scheme minimizes the effort to find the pairs of non-zero values used to modify an element during factorization. However, column-oriented Choleski factorization algorithms also require access to the next non-zero value in the same column, so pointers are stored to permit direct access to those values without the searching that compressed storage formats require. The data structure thereby provides the benefits of a doubly linked data structure while minimizing indexing overhead. Each diagonal element carries pointers to the first non-zero element in its lower triangular row and to the first non-zero element in the lower border. This data structure trades memory utilization for speed by storing indicators to all non-zero column values. In addition, the combination of adjacent storage of non-zero row values and explicit storage of column identifiers greatly simplifies the forward reduction and backward substitution steps.
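
To make the two access patterns concrete, here is a hedged sketch using the hypothetical TriElem type above: a linear merge of two consecutively stored sparse rows to form an update inner product, and a pointer walk down a column (with -1 as an assumed chain terminator). Neither fragment is from the original implementation.

    /* Inner product of two sparse rows: because each row's non-zeros
       are stored consecutively with explicit column identifiers
       (assumed sorted by column), matching pairs are found by a
       single linear merge rather than a search. */
    double sparse_row_dot(const TriElem *tri,
                          int pj, int ej,   /* [pj, ej): non-zeros of row j */
                          int pk, int ek)   /* [pk, ek): non-zeros of row k */
    {
        double sum = 0.0;
        while (pj < ej && pk < ek) {
            if      (tri[pj].col < tri[pk].col) pj++;
            else if (tri[pj].col > tri[pk].col) pk++;
            else    sum += tri[pj++].value * tri[pk++].value;
        }
        return sum;
    }

    /* Column access: from a stored non-zero at index p, every later
       non-zero in the same column is reached by following the explicit
       links, with no searching. */
    void scan_column_below(const TriElem *tri, int p)
    {
        for (int q = tri[p].next_in_col; q != -1; q = tri[q].next_in_col) {
            /* tri[q].value is the next non-zero below in this column */
        }
    }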

Conventional compressed data formats require less storage than this data structure; the additional memory has been traded for reduced indexing overhead. The compressed data format requires approximately

    $S_{CDS} = (f + i)\nu + (f + i)n$

bytes to store the A matrix implicitly. Likewise, the hierarchical data structure used in this implementation requires approximately

    $S_{HDS} = (f + 3i)\nu + (f + 4i)n + 2ib + 2is$

bytes to store the same matrix implicitly, where:

$S_{CDS}$ is the storage requirement in bytes for the compressed data structure,

$S_{HDS}$ is the storage requirement in bytes for the hierarchical data structure,

$f$ is the length in bytes of a floating point data type,

$i$ is the length in bytes of an integer data type,

$\nu$ is the number of non-zero values in the matrix,

$n$ is the order of the matrix,

$b$ is the number of independent blocks, and

$s$ is the number of non-zero row and column segments in the borders.

For double precision floating point or single precision complex representations of the actual data values, and single-word integer representations of all pointers, the hierarchical data structure takes approximately twice the data storage of the compressed data structure. In exchange for the doubled storage, row data is available in sparse vectors for ready access when updating a column value, and subsequent values in a column are directly addressable. With conventional compressed data structures, indexing information is stored along only a single dimension, and values along the other dimension must be found by searching through the data structure. To find a value in a row or column, such a search requires, on average, a number of operations equal to one-half the number of values in that row or column. Because this costly search must be performed for nearly every non-zero value in the matrix, substantial indexing overhead is incurred when using the implicit compressed storage format. By using the hierarchical data structure and doubling the storage, there is a significant decrease in indexing overhead.
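
To illustrate the arithmetic behind the "approximately twice" claim, the program below evaluates the two storage expressions given earlier for an invented, power-system-scale matrix; the matrix order, non-zero count, block count, and segment count are assumptions for the example only, not measurements from the paper.

    #include <stdio.h>

    int main(void)
    {
        const long f = 8, i = 4;         /* double / single-word integer */
        const long n = 5000, nu = 20000; /* hypothetical matrix sizes    */
        const long b = 30, s = 400;      /* hypothetical blocks/segments */

        /* the two storage expressions from the text, in bytes */
        long s_cds = (f + i) * nu + (f + i) * n;
        long s_hds = (f + 3 * i) * nu + (f + 4 * i) * n
                   + 2 * i * b + 2 * i * s;

        printf("compressed:   %ld bytes\n", s_cds);
        printf("hierarchical: %ld bytes (ratio %.2f)\n",
               s_hds, (double)s_hds / (double)s_cds);
        return 0;
    }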






David P. Koester
Sun Oct 22 15:40:25 EDT 1995