
Analyzing Algorithm Component Performance

We next present a detailed analysis of the performance of the component operations of the parallel block-diagonal-bordered Gauss-Seidel algorithm. We present graphs that show the time in milliseconds to perform each operation (a sketch of a single iteration follows the list):

  1. calculate unknowns in the diagonal blocks,
  2. update the last diagonal block using values in the lower border,
  3. calculate unknowns in the last diagonal block using the updated values,
  4. perform a convergence check.
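
To make the four operations concrete, here is a minimal sketch of one iteration over a block-diagonal-bordered system. The dense-matrix representation, the serial loops, and all variable names are illustrative assumptions; the measured implementation operates on distributed sparse data structures.

    import numpy as np

    def bdb_gs_iteration(diag_blocks, borders, A_last, b_blocks, b_last,
                         x_blocks, x_last):
        """One Gauss-Seidel iteration on a block-diagonal-bordered system.

        diag_blocks[i] -- dense stand-in for sparse diagonal block i
        borders[i]     -- lower-border block coupling x_blocks[i] to the last block
        A_last, b_last -- last diagonal block and its right-hand side
        """
        delta = 0.0

        # 1. calculate unknowns in the diagonal blocks
        #    (blocks are mutually independent, so each may run on its own processor)
        for A, b, x in zip(diag_blocks, b_blocks, x_blocks):
            for j in range(x.size):
                new = (b[j] - A[j] @ x + A[j, j] * x[j]) / A[j, j]
                delta = max(delta, abs(new - x[j]))
                x[j] = new

        # 2. update the last block's right-hand side using the lower border
        rhs = b_last - sum(B @ x for B, x in zip(borders, x_blocks))

        # 3. calculate unknowns in the last diagonal block
        for j in range(x_last.size):
            new = (rhs[j] - A_last[j] @ x_last
                   + A_last[j, j] * x_last[j]) / A_last[j, j]
            delta = max(delta, abs(new - x_last[j]))
            x_last[j] = new

        # 4. convergence check: largest change in any unknown this iteration
        return delta

Step 1 is where parallelism is easiest to exploit, since the diagonal blocks are mutually independent; in the parallel implementation, step 4 becomes a global reduction of delta across processors.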
Detailed parallel algorithm analysis will demonstrate that the preprocessing phase can effectively load balance the matrix for as many as 16 processors for all networks examined, and for as many as 32 processors for certain classes of data sets. We present graphs that illustrate algorithm component performance in figure 7.16. Each graph has four curves that show parallel Gauss-Seidel component performance for a single iteration.
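
The preprocessing phase's load balancing amounts to distributing diagonal blocks across processors so that per-processor work is as even as possible. One standard heuristic for this, shown below as an illustrative sketch rather than the thesis's exact scheme, is longest-processing-time-first assignment of blocks by estimated operation count.

    import heapq

    def balance_blocks(block_costs, num_procs):
        """Assign diagonal blocks to processors, largest cost first, always
        onto the currently least-loaded processor (LPT heuristic).
        block_costs[i] is an estimate of the work in diagonal block i."""
        loads = [(0.0, p) for p in range(num_procs)]   # (accumulated cost, proc)
        heapq.heapify(loads)
        assignment = {}
        for blk in sorted(range(len(block_costs)),
                          key=lambda i: block_costs[i], reverse=True):
            cost, proc = heapq.heappop(loads)
            assignment[blk] = proc
            heapq.heappush(loads, (cost + block_costs[blk], proc))
        return assignment

When a few blocks dominate the total work, no assignment can spread that work evenly over a large number of processors, which is one way imbalance of the kind discussed below can arise.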

 
Figure 7.16: Algorithm Component Timing Data --- Double Precision Gauss-Seidel 

These figures corroborate the results of the previous two sections, which identified load imbalance for the two planning networks --- EPRI6K and NiMo-PLANS. The graphs with performance data from the BCSPWR09, BCSPWR10, and NiMo-OPS power systems matrices show good load balancing for the diagonal blocks and lower border; however, the graphs for the EPRI6K and NiMo-PLANS data show degraded performance at 32 processors. Load imbalance is evident when the empirical performance data for calculating unknowns in the diagonal blocks does not yield a straight line. Load imbalance is also the likely reason that the curves for updating unknowns in the last diagonal block and for performing convergence checks do not have constant slope. Previous graphs showed little effect from increasing the computation-to-communications granularity, so any degraded performance must be due to sources of overhead other than communications overhead.
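
A simple way to quantify the imbalance these curves reveal is the ratio of the slowest processor's time to the mean, since a parallel step finishes only when its slowest processor does. The metric below is a generic diagnostic, not one taken from the thesis.

    def imbalance_factor(per_proc_times):
        """Ratio of the slowest processor's time to the mean time.
        1.0 means perfect balance; the parallel step runs at the speed
        of the slowest processor, so larger values mean wasted cycles."""
        mean = sum(per_proc_times) / len(per_proc_times)
        return max(per_proc_times) / mean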

The times to calculate unknowns in the multi-colored last diagonal block are always the least of the four operations for all five power systems networks, and for all but the planning matrices, the time to solve for these unknowns decreases monotonically. This is the algorithm component in which communications overhead, if it exists, would appear as the number of processors increases.
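
The multi-coloring that makes the last diagonal block solvable in parallel groups its unknowns so that no two unknowns of the same color are directly coupled; an entire color can then be updated simultaneously, with synchronization only between colors. The greedy coloring below is an illustrative sketch, not necessarily the ordering used in the thesis.

    def greedy_color(adjacency):
        """Color the unknowns of the last diagonal block so that coupled
        unknowns (off-diagonal nonzeros) never share a color.
        adjacency[v] is the set of unknowns directly coupled to v."""
        colors = {}
        # color high-degree unknowns first to help keep the color count small
        for v in sorted(adjacency, key=lambda u: len(adjacency[u]), reverse=True):
            used = {colors[u] for u in adjacency[v] if u in colors}
            colors[v] = next(c for c in range(len(adjacency) + 1) if c not in used)
        return colors

Each Gauss-Seidel sweep of the last block then proceeds color by color, with a synchronization (and, on a distributed-memory machine, communication) between colors, which is why this is the component in which communications overhead would surface.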

We draw the following conclusions from this detailed examination of the parallel Gauss-Seidel algorithm components:

  1. the preprocessing phase effectively load balances the diagonal blocks and lower border for as many as 16 processors on all networks examined, and for as many as 32 processors on the operations matrices;
  2. for the EPRI6K and NiMo-PLANS planning matrices, load imbalance at 32 processors, rather than communications overhead, is the principal source of degraded performance;
  3. calculating unknowns in the multi-colored last diagonal block is consistently the least expensive of the four component operations.




David P. Koester
Sun Oct 22 17:27:14 EDT 1995