Next: Moment Matrix Fill Up: Implementation Previous: Parallel Setup

Precomputation

In the precomputation stage, six arrays must be computed. These are the position vectors, the basis functions, and the divergences of the basis functions at each integration point, computed separately for the source patches and the field patches. Each array is three- or four-dimensional, and one of its dimensions has the size of the maximum number of patches. For the problems we are interested in, the maximum number of patches ranges from one thousand to several thousand. Every array is required by the filling algorithm on every node. We could simply let each node compute all of the arrays independently. However, that repeats the same work on every node and is computationally expensive, so the arrays are computed in parallel instead.

Let and denote the position arrays of the source patch and field patch, respectively. Also and respectively denote the basis function arrays on the source patches and the field patches, and and denote the divergences of the basis functions on the source patches and the field patches, respectively, where , or 7, , and , the number of patches in the model.

To compute these arrays, we first divide the computational work almost evenly among the nodes. Each node works on its own portion and, when it has completed its work, broadcasts the result to the rest of the nodes. The arrays are partitioned along their largest dimension. Let , and , then the nodes whose index values are less than q compute components of the arrays, and the remaining nodes compute components. That is, node 0 computes , node 1 computes , node computes , node computes , and node p computes .
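The near-even split described above can be illustrated with a short sketch. It assumes the usual block partition, in which the remainder of the integer division of the patch count by the node count plays the role of q, so that nodes with index less than q take one extra component. The function name `block_partition` and its parameters are hypothetical and are not taken from the original implementation.

```python
def block_partition(n_patches, n_nodes):
    """Return one half-open (start, stop) range of patch indices per node.

    Assumption: q = n_patches mod n_nodes, and nodes with index < q
    receive one more component than the remaining nodes.
    """
    base, q = divmod(n_patches, n_nodes)
    ranges = []
    start = 0
    for node in range(n_nodes):
        # The first q nodes absorb the remainder, one extra patch each.
        count = base + 1 if node < q else base
        ranges.append((start, start + count))
        start += count
    return ranges


if __name__ == "__main__":
    # Ten patches on four nodes: two nodes get 3 patches, two get 2.
    for node, (start, stop) in enumerate(block_partition(10, 4)):
        print(f"node {node}: patches {start}..{stop - 1}")
```

With this choice the sizes of any two portions differ by at most one, which is what "almost evenly" requires.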

The details of the algorithm are shown in Figure 4.5.

The CMMD library provides a global broadcast function. In a broadcast, a message is sent from a single source node to all nodes, and all nodes must take part. One node must signal its intention to send the broadcast message; all other nodes must signal their readiness to receive it. At the hardware level, once a broadcast has been signaled, the system begins checking the responses. When all the receiving nodes have received the message, the system considers the broadcast complete and the broadcast call returns. Keep in mind that all nodes receive the data simultaneously and all receive the same amount of data in every broadcast. Therefore, every node must have sufficient buffer space to hold the broadcast message.
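The exchange pattern, in which each node computes its own portion and then broadcasts it so that every node ends up with the complete array, can be simulated in plain Python. This is only a sketch of the data movement: it does not call the CMMD library, the nodes are modeled as entries in a list, and the names `simulate_exchange` and `compute` are hypothetical. The partition rule (remainder spread over the lowest-numbered nodes) is likewise an assumption.

```python
def simulate_exchange(n_patches, n_nodes, compute):
    """Simulate compute-then-broadcast; return each node's full buffer."""
    base, q = divmod(n_patches, n_nodes)
    # bounds[i] is the first patch index owned by node i (assumed rule).
    bounds = [0]
    for node in range(n_nodes):
        bounds.append(bounds[-1] + base + (1 if node < q else 0))

    # Every node allocates space for the whole array, because it must be
    # able to hold each broadcast message it receives.
    buffers = [[None] * n_patches for _ in range(n_nodes)]

    # Local phase: each node fills in only its own portion.
    for node in range(n_nodes):
        for k in range(bounds[node], bounds[node + 1]):
            buffers[node][k] = compute(k)

    # Broadcast phase: one sender at a time, all other nodes receive.
    for sender in range(n_nodes):
        lo, hi = bounds[sender], bounds[sender + 1]
        message = buffers[sender][lo:hi]
        for receiver in range(n_nodes):
            if receiver != sender:
                buffers[receiver][lo:hi] = message
    return buffers


if __name__ == "__main__":
    result = simulate_exchange(10, 4, lambda k: k * k)
    print(result[0])
```

After the last broadcast every simulated node holds the same complete array, while each entry was computed only once.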





xshen@
Sat Dec 3 17:51:03 EST 1994