
Empirical Results --- Ordering Power Systems Network Matrices into Block-Diagonal-Bordered Form

The performance of our parallel block-diagonal-bordered LU and Choleski solvers will be examined using five separate power systems network matrices.

Matrices BCSPWR09 and BCSPWR10 are from the Boeing-Harwell series and represent electrical power system networks from the Western and Eastern US, respectively. The EPRI-6K matrix is distributed with the Extended Transient-Midterm Stability Program (ETMSP) from EPRI. Matrices NiMo-OPS and NiMo-PLANS have been made available by the Niagara Mohawk Power Corporation, Syracuse, NY.

To illustrate the performance of the graph partitioning algorithm, we present pseudo-images of the sparsity of the matrices that represent the power systems networks. These pseudo-images show the locations of the non-zero values in the matrices, both the original non-zero values and those that become non-zero due to fillin during factorization. In the following pseudo-images, original non-zero values are shown in black and fillin values are shown in a lighter grey, and a bounding box has been placed around each sparse matrix.
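Although the pseudo-images in this section were generated directly from the power systems network matrices, the idea is simple enough to sketch. The following minimal example (not the tool used to produce the figures) assumes the original non-zero pattern and the fillin pattern are available as sets of (row, column) index pairs and prints a character grid with '#' for original non-zero values, '+' for fillin, and '.' for zero values.

    # Minimal sketch (not the tool used to generate the figures): render a small
    # sparse matrix as a text pseudo-image.  'original' and 'fillin' are sets of
    # (row, column) index pairs.
    def render_sparsity(n, original, fillin):
        for i in range(n):
            print(''.join('#' if (i, j) in original
                          else '+' if (i, j) in fillin
                          else '.'
                          for j in range(n)))

    # Example: eliminating column 0, which has off-diagonal entries in rows 1
    # and 4, produces fillin at (1, 4) and (4, 1).
    original = {(i, i) for i in range(5)} | {(0, 1), (1, 0), (0, 4), (4, 0)}
    fillin = {(1, 4), (4, 1)}
    render_sparsity(5, original, fillin)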

A detailed ordering analysis is presented here for the BCSPWR09 power systems network data to illustrate the capabilities of the node-tearing ordering algorithm described in section 6.1. To provide a baseline against which to measure the node-tearing algorithm, we show the original matrix in figure 26 and the sparse matrix after ordering with the minimum degree algorithm in figure 27. The original matrix has no fillin and is presented with the graph node identifiers as supplied in the Boeing-Harwell data distribution, without ordering. The minimum degree ordered matrix is most sparse in the upper left-hand corner and less sparse in the lower right-hand corner. When this matrix is factored, 2,168 zero values become non-zero.
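A fillin count for a given ordering can be reproduced with a generic symbolic elimination on the graph of a structurally symmetric matrix. The sketch below is such a generic routine, not the code used in this work, and it counts symmetric fillin pairs once, so conventions may differ from the tabulated figures.

    # Generic symbolic elimination ("elimination game") on the graph of a
    # structurally symmetric matrix.  adjacency: node -> set of neighbors;
    # order: nodes in elimination order.  Counts symmetric fillin pairs once.
    def count_fillin(adjacency, order):
        position = {node: k for k, node in enumerate(order)}
        adj = {v: set(nbrs) for v, nbrs in adjacency.items()}  # work on a copy
        fillin = 0
        for v in order:
            # Neighbors of v that are eliminated after v become a clique.
            later = [u for u in adj[v] if position[u] > position[v]]
            for i in range(len(later)):
                for j in range(i + 1, len(later)):
                    a, b = later[i], later[j]
                    if b not in adj[a]:
                        adj[a].add(b)
                        adj[b].add(a)
                        fillin += 1
        return fillin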

Figure 26: BCSPWR09 --- Original Matrix

 


Figure 27: BCSPWR09 --- Minimum Degree Ordering

Our parallel block-diagonal-bordered direct solver algorithms require that the power systems network matrix be ordered into block-diagonal-bordered form with the workload distributed uniformly across the processors. A single input parameter, the maximum partition size, determines the shape of the matrix after ordering by the node-tearing algorithm. Examples of applying the node-tearing algorithm to the BCSPWR09 matrix are presented in figures 28 through 31, for maximum diagonal block sizes of 16, 32, 64, and 96 nodes respectively. Statistics for the four matrix orderings are presented in table 2, which reports the number of rows/columns in the borders and last diagonal block of the ordered matrix and the amount of fillin. The table shows that the ordering with a maximum partition size of 32 has the least fillin, the fewest total operations, the largest percentage of operations in the mutually independent matrix partitions, and the best parallel direct linear solver performance.
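The node-tearing algorithm itself is described in section 6.1. As a rough illustration of the idea only, the sketch below grows tentative diagonal blocks of at most the maximum partition size by breadth-first search over the network graph, then tears every node adjacent to a different block into the coupling set, which becomes the borders and the last diagonal block. The actual algorithm tears a much smaller covering set of nodes and controls block contents far more carefully, so the sizes and fillin reported in table 2 should not be expected from this sketch.

    from collections import deque

    # Simplified illustration only; NOT the node-tearing algorithm of
    # section 6.1.  Grow diagonal blocks of at most max_partition_size by
    # breadth-first search, then move every node adjacent to another block
    # into the coupling set (borders and last diagonal block).
    def tear_network(adjacency, max_partition_size):
        block_of, blocks = {}, []
        unassigned = set(adjacency)
        for seed in adjacency:
            if seed not in unassigned:
                continue
            block, queue = [], deque([seed])
            while queue and len(block) < max_partition_size:
                v = queue.popleft()
                if v not in unassigned:
                    continue
                unassigned.discard(v)
                block.append(v)
                block_of[v] = len(blocks)          # index this block will get
                queue.extend(u for u in adjacency[v] if u in unassigned)
            blocks.append(block)
        # Tear: any node with a neighbor in a different block is coupled.
        coupling = {v for v, nbrs in adjacency.items()
                    if any(block_of[u] != block_of[v] for u in nbrs)}
        blocks = [[v for v in blk if v not in coupling] for blk in blocks]
        return [blk for blk in blocks if blk], sorted(coupling)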

 


Figure 28: BCSPWR09 --- Block-Diagonal-Bordered Form --- Maximum Partition Size = 16 --- Load Balanced for 8 Processors

 


Figure 29: BCSPWR09 --- Block-Diagonal-Bordered Form --- Maximum Partition Size = 32 --- Load Balanced for 8 Processors

 


Figure 30: BCSPWR09 --- Block-Diagonal-Bordered Form --- Maximum Partition Size = 64 --- Load Balanced for 8 Processors

 


Figure 31: BCSPWR09 --- Block-Diagonal-Bordered Form --- Maximum Partition Size = 96 --- Load Balanced for 8 Processors

 
Table 2: BCSPWR09 --- LU and Choleski Factorization Ordering Statistics  

Figures 28 through 31 illustrate that the size of the borders and last diagonal block can be manipulated by varying the single input parameter to the partitioning algorithm, the maximum partition size. The number of rows/columns in the borders and last diagonal block of these ordered matrices varies from 277 to 131 for maximum partition sizes of 16 and 96, respectively. Each of these four orderings has been load-balanced for eight processors. Figure 29 includes additional markings to illustrate how this matrix would be distributed to the eight processors, P1 through P8. Load balancing is a function of the number of operations, not the number of columns assigned to a processor. The load-balancing step is simply another permutation of the matrix that keeps the rows/columns within each partition together in the same order. Consequently, as the matrix is load-balanced for various numbers of processors, there is no change in the amount of fillin or the total number of operations.
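As an illustration of balancing by operation counts rather than column counts, the sketch below uses a common greedy rule, heaviest block first onto the least-loaded processor. The per-block operation estimates are assumed to be available, and this is not necessarily the exact procedure used to produce the figures.

    import heapq

    # Hedged sketch: distribute diagonal blocks over processors so that the
    # estimated operation counts (op_counts[k] for block k), not the column
    # counts, are balanced.  Greedy rule: heaviest block first, always onto
    # the currently least-loaded processor.
    def balance_blocks(op_counts, num_processors):
        heap = [(0, p) for p in range(num_processors)]   # (load, processor)
        heapq.heapify(heap)
        assignment = [[] for _ in range(num_processors)]
        for k in sorted(range(len(op_counts)),
                        key=lambda k: op_counts[k], reverse=True):
            load, p = heapq.heappop(heap)
            assignment[p].append(k)
            heapq.heappush(heap, (load + op_counts[k], p))
        return assignment

Because the resulting ordering simply concatenates whole diagonal blocks processor by processor, with the borders and last diagonal block kept at the end, it is just another symmetric permutation of the matrix, which is why neither the fillin nor the total operation count changes.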

Figure 32 illustrates the relationship between the maximum partition size and the size of the borders and last diagonal block for each of the five power systems networks used in this analysis. The partitioning results for the BCSPWR09 network are very similar to those for the Niagara Mohawk operations data; these matrices are similar in size and have similar numbers of edges per node. The larger matrices have significantly more rows in the last diagonal block, and there is significantly more variation in the number of rows in the last diagonal block for a maximum partition size of sixteen than for larger maximum partition sizes. This is empirical evidence that there are differences between data from operational analysis networks and larger planning networks. Additional evidence of these differences is discussed below, both when we present examples of the orderings for these matrices and when we discuss the performance of the parallel direct linear solvers.

 
Figure 32: Last Diagonal Block Size for Power Systems Matrices after Partitioning with the Node-Tearing Algorithm  

Note that the maximum size of the diagonal blocks is inversely related to the size of the last diagonal block in figure 32. This is intuitive: as diagonal matrix blocks are permitted to grow larger, multiple smaller blocks can be incorporated into a single block. Not only are the smaller blocks consolidated into a single larger block, but any elements in the coupling equations that are unique to those network partitions can also be moved into the larger block. Another interesting aspect of the relationship between the maximum size of the diagonal blocks and the size of the last block is that the percentage of non-zeros and fillin in the last diagonal block increases significantly as the size of the last block decreases. The empirical performance data for the parallel solvers show that the best parallel performance is closely correlated with the minimum number of operations. In tables 3 through 6, we present summary statistics for the remaining power systems networks used in this analysis. In each table, the maximum partition size that yielded the best parallel performance is marked.
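To make the comparison across maximum partition sizes concrete, a statistic like the percentage of operations in the mutually independent partitions can be tabulated from the column structures of the factor (original non-zeros plus fillin). The sketch below is a hypothetical helper, not the code behind tables 2 through 6; it uses one common column-Choleski operation estimate, and the input names are assumptions.

    # Hypothetical helper for tabulating a statistic like "percentage of
    # operations in the mutually independent partitions".
    # col_struct[j]: below-diagonal row indices that are non-zero in column j
    #                of the factor (original entries plus fillin).
    # coupling_cols: columns ordered into the borders / last diagonal block.
    # Estimate per column: s divisions plus s*(s+1)/2 multiply-add pairs,
    # where s = len(col_struct[j]).
    def operation_split(col_struct, coupling_cols):
        independent_ops, coupling_ops = 0, 0
        for j, struct in enumerate(col_struct):
            s = len(struct)
            ops = s + s * (s + 1) // 2
            if j in coupling_cols:
                coupling_ops += ops
            else:
                independent_ops += ops
        total = independent_ops + coupling_ops
        percent_independent = 100.0 * independent_ops / total if total else 0.0
        return total, percent_independent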

 
Table 3: BCSPWR10 --- LU and Choleski Factorization Ordering Statistics  

 
Table 4: EPRI6K --- LU and Choleski Factorization Ordering Statistics  

 
Table 5: NiMo-OPS --- LU and Choleski Factorization Ordering Statistics  

 
Table 6: NiMo-PLANS --- LU and Choleski Factorization Ordering Statistics  

In figures 33 through 40, we provide a visual reference to accompany the partitioning performance data presented in tables 3 through 6. We present two figures for each power systems network: the original matrix before ordering, and the matrix after partitioning and load balancing for eight processors. The partitioned matrices presented here use the value of the maximum partition size that yielded the best empirical parallel block-diagonal-bordered direct linear solver performance.

 


Figure 33: BCSPWR10 --- Original Matrix

 


Figure 34: BCSPWR10 --- Block-Diagonal-Bordered Form --- Maximum Partition Size = 32 --- Load Balanced for 8 Processors

 


Figure 35: EPRI6K --- Original Matrix

 


Figure 36: EPRI6K --- Block-Diagonal-Bordered Form --- Maximum Partition Size = 16 --- Load Balanced for 8 Processors

 


Figure 37: NiMo-OPS --- Original Matrix

 


Figure 38: NiMo-OPS --- Block-Diagonal-Bordered Form --- Maximum Partition Size = 32 --- Load Balanced for 8 Processors

 


Figure 39: NiMo-PLANS --- Original Matrix

 


Figure 40: NiMo-PLANS --- Block-Diagonal-Bordered Form --- Maximum Partition Size = 32 --- Load Balanced for 8 Processors

When examining the unordered matrices, there appear to be significant differences between the power systems networks from the Niagara Mohawk Power Corporation and the others: the Niagara Mohawk matrices have some block structure, while the Boeing-Harwell matrices and the EPRI matrix appear to have been ordered with a minimum degree ordering. This can be seen in figures 26 and 27: except for the fillin, the general pattern of the original matrix and the minimum degree ordered matrix appears the same. It is important to note that the block-diagonal-bordered forms of the BCSPWR09 and NiMo-OPS matrices look similar, as do those of the EPRI6K and NiMo-PLANS matrices. The BCSPWR09 and NiMo-OPS matrices are for operational networks that are homogeneous and have very similar voltage distributions throughout. Meanwhile, the EPRI6K and NiMo-PLANS matrices are from planning applications, and one subsection of these networks includes some lower voltage distribution lines. These matrices have enhanced detail in the local area, with less detail in areas around the power utility's area of interest. This causes additional rows/columns in the borders and the last diagonal blocks, but our parallel block-diagonal-bordered direct solvers appear to have little difficulty efficiently solving these matrices. The small, highly connected graph section can be seen in the lower right-hand corner of the matrices in figures 36 and 40.


