The parallel block-diagonal-bordered Choleski algorithm presented in this paper addresses the most difficult of these applications to implement on multi-processors. Load-flow analysis has the smallest matrices and the fewest calculations, because the matrices are symmetric and no pivoting is required to ensure numerical stability. Load-flow calculations are also included in decoupled solutions to transient stability differential-algebraic equations. Parallel Choleski algorithms incur the same amount of interprocessor communications overhead as parallel LU algorithms, yet LU factorization offers twice as many floating-point operations. Consequently, relative speedup, the improvement in performance when a problem is solved on multiple processors, will be better for LU factorization, because the additional calculations amortize the many sources of overhead in the parallel algorithm, especially communications overhead.
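The effect of the extra floating-point work on relative speedup can be illustrated with a deliberately simple model. The sketch below is not the performance model used in this paper: it assumes that parallel time is the perfectly divided work plus a fixed overhead term, and the function name `relative_speedup`, the normalized work values, and the overhead value are all hypothetical choices made only for illustration.

```python
def relative_speedup(work, overhead, processors):
    """Toy model (assumption): parallel time = work / processors + overhead."""
    parallel_time = work / processors + overhead
    return work / parallel_time


# Normalized floating-point work; the 2x ratio follows the text,
# the absolute values are illustrative assumptions.
cholesky_work = 1.0
lu_work = 2.0 * cholesky_work

# Same communications overhead for both algorithms (illustrative value).
overhead = 0.1

for p in (2, 4, 8, 16):
    s_chol = relative_speedup(cholesky_work, overhead, p)
    s_lu = relative_speedup(lu_work, overhead, p)
    print(f"p={p:2d}  Choleski speedup={s_chol:.2f}  LU speedup={s_lu:.2f}")
```

Under these assumptions the LU speedup exceeds the Choleski speedup for every processor count greater than one, because the same overhead is spread over twice as much computation.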
The parallel block-diagonal-bordered LU algorithm, also presented later in this paper, is appropriate for solving the position-symmetric matrix that occurs in transient stability analysis when the differential equations representing the generator dynamics are solved decoupled from the linear network equations. The system of linear equations in this application is similar to that encountered in load-flow analysis, but with the values at generator buses modified to represent the effects of the dynamic state of the generators. The matrix is position symmetric: it has the same nonzero structure as a load-flow matrix, but values in symmetric locations may not be equal. Decoupled transient stability analysis encounters the same small matrix sizes as load-flow analysis, but requires nearly twice the number of calculations for double-precision LU factorization and six times the number of calculations for complex-variate LU factorization.
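The distinction between a symmetric and a position-symmetric matrix can be made concrete with a small check. The following is a minimal sketch using dense nested lists; the helper names and the 3-by-3 example values are hypothetical and are not taken from any power system network in this paper.

```python
def is_position_symmetric(a):
    """True if a[i][j] is nonzero exactly when a[j][i] is nonzero."""
    n = len(a)
    return all(
        (a[i][j] != 0) == (a[j][i] != 0)
        for i in range(n)
        for j in range(i + 1, n)
    )


def is_symmetric(a):
    """True if a[i][j] equals a[j][i] for every off-diagonal pair."""
    n = len(a)
    return all(a[i][j] == a[j][i] for i in range(n) for j in range(i + 1, n))


# Illustrative load-flow-like matrix: symmetric in structure and in value.
load_flow_like = [
    [4.0, -1.0, 0.0],
    [-1.0, 4.0, -2.0],
    [0.0, -2.0, 5.0],
]

# Same nonzero structure, but one off-diagonal value differs from its mirror,
# as may happen when generator-bus values are modified.
transient_like = [
    [4.0, -1.0, 0.0],
    [-3.0, 4.0, -2.0],
    [0.0, -2.0, 5.0],
]

print(is_symmetric(load_flow_like), is_position_symmetric(load_flow_like))
print(is_symmetric(transient_like), is_position_symmetric(transient_like))
```

A matrix like the second example cannot be factored with a Choleski algorithm, which relies on equal values in symmetric locations, but its structure can still be ordered and partitioned in the same way as the symmetric case.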