From kamala@spica.npac.syr.edu Mon Jun 20 00:01:50 1994
Date: Mon, 2 May 94 02:23:44 EDT
From: Kamala Anupindi
To: paulc@spica.npac.syr.edu
Subject: report

Hi Paul,

Here's the report.

----------------------------------

In ETMSP, the routine IMPINT is the main driver for the implicit
integration methods (using the trapezoidal method). The major
computations involved are:

1. If DC links are present, obtain the DC solution for the next time
   step by calling subroutine DCNET1 or DCNETS.
2. Predict the system states and voltages by calling subroutine PREDIC.
3. Calculate the system state derivatives for all dynamic devices by
   calling subroutines GXDOT, MXDOT, TCXDOT, SXDOT, RXDOT, TPXDOT, and
   DBXDOT.
4. Calculate the residues of the system states based on the
   trapezoidal rule.
5. Calculate the current injection vector, including all dynamic
   devices and nonlinear loads.
6. Solve the network equations with the calculated current injection
   vector by calling subroutines AUPDSC and ASOLSC.
7. Update the system state variables and the associated algebraic
   variables (Y-variables, internal currents, etc.) by calling
   subroutine UPSTAT.
8. After the calculations at the current time have converged, call
   subroutine ASOLSC again to compute all necessary network voltages
   and other quantities for output purposes.

Steps 3 to 7 above constitute the VDHN iteration loop. The convergence
test is performed at the end of step 5 for all current injections.
Steps 3, 4 and 7 involve the formation and solution of the
differential equations.

A profile of ETMSP indicates that about 55% of the total time is spent
setting up and solving the differential equations. This 55% is split
almost equally between setting up the equations and solving them.

GXDOT, MXDOT, TCXDOT, SXDOT, RXDOT, TPXDOT, and DBXDOT are the routines
used to calculate the derivatives of the dynamic devices for use in the
trapezoidal method.
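
For reference, the derivatives returned by these routines enter the
residue of step 4 roughly as sketched below. This is the generic
trapezoidal-rule form, not something taken from the ETMSP source: the
symbols x (device states), V (network voltages), f (state derivative
function), h (time step), and r (residue), as well as the sign
convention chosen for r, are assumptions made here for illustration.

```latex
% Assumed generic device equations: \dot{x} = f(x, V)
% Trapezoidal rule over one time step of length h:
x_{n+1} = x_n + \frac{h}{2}\left[ f(x_n, V_n) + f(x_{n+1}, V_{n+1}) \right]

% Residue at iteration k of the loop in steps 3 to 7; it tends to zero
% as the iterates x_{n+1}^{(k)}, V_{n+1}^{(k)} converge:
r^{(k)} = x_n + \frac{h}{2}\left[ f(x_n, V_n)
          + f\bigl(x_{n+1}^{(k)}, V_{n+1}^{(k)}\bigr) \right] - x_{n+1}^{(k)}
```

In this form the only term that changes from one iteration to the next
is the derivative evaluated at the current iterate, which is
consistent with the derivative routines being called once per pass of
the loop.
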
DGEN, DEMOTL, DETSCL, DESVC, DERAN, DESTP and DESDB are the
corresponding routines used to calculate the derivatives for use in the
Runge-Kutta method.

Of all these routines, GXDOT, which calculates the derivatives of
detailed and classical machines for use in the trapezoidal method,
takes about 15-20% of the total time, depending on the number of
generators present. It is also the most time-consuming routine in
ETMSP as a whole. Part of the GXDOT routine's code reads as follows:

( Line # 85 )
      DO 850 K=1,IPNODE
C
      KN=3*K
      NLF=NOLD(K)
      VOLTK=VOLT(K)
      GTV=CABS(VOLTK)
      AT=ATAN2(VIM,VR)
C
      L=0
  100 L=L+1
      N=N+1
      JG=JCONTA(1,N)
      JYG=MYVAR(1,N)

      more program deleted

( Line # 1401 )
C
      IF (L.LT.KTHGE(K)) GO TO 100
  850 CONTINUE

Here, IPNODE is the last detailed synchronous machine bus, and N is the
sequential number of the detailed generator. KTHGE is the cross
reference array for generator internal sequential number and bus
number, i.e., KTHGE(K) = N where K is the program bus number and N is
the internal sequential number.

As written, with the inner loop built from a counter and a backward
GO TO, this code is not in a form suited to parallel programming.
Rewriting it with minor changes (as given below), it can be made
parallel.

      DO 850 K=1,IPNODE
C
      KN=3*K
      NLF=NOLD(K)
      VOLTK=VOLT(K)
      GTV=CABS(VOLTK)
      AT=ATAN2(VIM,VR)
C
  100 DO 101 L = 1 , KTHGE(K)
      N=N+1
      JG=JCONTA(1,N)
      JYG=MYVAR(1,N)

      (part of program deleted)

C
  101 CONTINUE
  850 CONTINUE

Since all processors have all the data to begin with, each of them
holds the array KTHGE. Hence, the code described above is now
embarrassingly parallel at two levels: one at the IPNODE level and one
at the N level, i.e., one at the synchronous machine bus level and one
at the detailed generator level. (N is still a running counter in the
excerpt, so each processor has to work out its own starting value of N,
for example from a running sum of KTHGE, before it can start at an
arbitrary bus.) Similar changes were made in other routines relevant to
the setting up of generator equations.

On parallelizing the routine described above, the following problems
were encountered:

* ETMSP is an interactive program.
  Hence, when running on the SP1, one window pops up for each processor
  node allotted, each with its own interactive ETMSP session. For
  preliminary testing, this had to be changed so that the input values
  are set inside ETMSP itself.

* Once each processor node completes its job, the arrays involved in
  the computation need to be concatenated. The number of such
  arrays/variables for this routine (and for other routines too) runs
  into the hundreds. As of now, attempts are being made at 'crude
  concatenation', i.e., for each variable, explicit concatenation
  statements need to be added. This will lead to enormous communication
  overhead, which is bound to slow down the program.

(Paul, do we need to make a suggestion here for improvement or not?)...

Is this detailed enough? Would you like anything specific to be added?

Kamala