1 | The compiler decomposes the grid array across the processors in equal-sized pieces. |
2 | For each interior point, the algorithm must access the immediate neighbors according to a 5-point stencil, which requires communication with another processor whenever a neighbor lies in that processor's piece: |
3 | The algorithm can then be implemented by code that first communicates the internal boundaries and then loops over the interior points on each processor. |
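
The two phases described above (exchange the internal boundaries, then loop over the interior points) can be sketched in plain Python. This is only an illustration under stated assumptions, not the code a compiler would generate: the "processors" are simulated in a single process, the grid is split by rows, the stencil is taken to be a simple average of the four neighbors, and the names `split_rows`, `serial_step`, and `distributed_step` are hypothetical.

```python
def split_rows(n_rows, n_procs):
    """Return (start, stop) row ranges, one near-equal block per processor.

    Assumes n_rows >= n_procs so that every processor owns at least one row.
    """
    base, extra = divmod(n_rows, n_procs)
    ranges, start = [], 0
    for p in range(n_procs):
        stop = start + base + (1 if p < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def serial_step(grid):
    """Reference: one 5-point averaging sweep over the interior of the whole grid."""
    n, m = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return new


def distributed_step(grid, n_procs):
    """Same sweep, organized as 'exchange boundaries, then loop locally'."""
    n, m = len(grid), len(grid[0])
    ranges = split_rows(n, n_procs)
    # Each simulated processor holds a private copy of its own rows.
    blocks = [[row[:] for row in grid[a:b]] for a, b in ranges]

    # Phase 1: communicate the internal boundaries. Each processor receives
    # the last row of the block above it and the first row of the block below.
    halos = []
    for p in range(n_procs):
        upper = blocks[p - 1][-1][:] if p > 0 else None
        lower = blocks[p + 1][0][:] if p < n_procs - 1 else None
        halos.append((upper, lower))

    # Phase 2: each processor loops over its interior points, reading only
    # its own rows plus the received boundary rows.
    result = []
    for p, (a, b) in enumerate(ranges):
        upper, lower = halos[p]
        local = blocks[p]
        padded = (([upper] if upper is not None else [])
                  + local
                  + ([lower] if lower is not None else []))
        offset = 1 if upper is not None else 0
        for li, gi in enumerate(range(a, b)):
            row = local[li][:]
            if 0 < gi < n - 1:  # rows on the global boundary stay fixed
                k = li + offset
                for j in range(1, m - 1):
                    row[j] = 0.25 * (padded[k - 1][j] + padded[k + 1][j]
                                     + padded[k][j - 1] + padded[k][j + 1])
            result.append(row)
    return result
```

For any grid with at least as many rows as processors, `distributed_step(grid, n_procs)` should agree with `serial_step(grid)`, since each simulated processor sees exactly the neighbor values it needs once the boundary rows have been exchanged. On a real distributed-memory machine, Phase 1 would be message passing between processors rather than list copies.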