Conceptually, the process remains the same
-
Allocate memory, adjust indexing and loops, handle nonlocal data
-
New patterns require more elaborate methods to achieve this
|
SHADOW
-
Add extra space to allocation
-
Use that space for buffering of nonlocal data and adjusting indices
-
Ignore that space when adjusting loop bounds
-
Simplifies addressing, may avoid copying
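
The SHADOW steps above can be sketched as a small single-process simulation. This is a minimal illustration, not the compiler's generated code: the helper names (`block_bounds`, `allocate_with_shadow`, `fill_shadow`, `shadow_stencil`), the 0-based indexing, the even BLOCK split, the three-point stencil, and the zero value at the global edges are all assumptions made for the example. It also assumes a shadow width `w >= 1` and that every block owns at least `w` elements.

```python
def block_bounds(n, p, rank):
    """Half-open global range [lo, hi) owned by `rank` under an even BLOCK split."""
    size = -(-n // p)  # ceiling division
    lo = min(rank * size, n)
    hi = min(lo + size, n)
    return lo, hi


def allocate_with_shadow(global_a, p, w):
    """Allocate each local block with w extra (shadow) cells on both sides."""
    blocks = []
    for rank in range(p):
        lo, hi = block_bounds(len(global_a), p, rank)
        # Owned data sits at local indices w .. w + (hi - lo) - 1.
        blocks.append([0.0] * w + list(global_a[lo:hi]) + [0.0] * w)
    return blocks


def fill_shadow(blocks, w):
    """Buffer nonlocal data: copy neighbours' owned edge cells into shadow cells."""
    for rank, buf in enumerate(blocks):
        if rank > 0:
            left = blocks[rank - 1]
            # Last w owned cells of the left neighbour.
            buf[:w] = left[len(left) - 2 * w:len(left) - w]
        if rank + 1 < len(blocks):
            right = blocks[rank + 1]
            # First w owned cells of the right neighbour.
            buf[len(buf) - w:] = right[w:2 * w]


def shadow_stencil(global_a, p, w=1):
    """Three-point sum a[i-1] + a[i] + a[i+1], computed block by block."""
    blocks = allocate_with_shadow(global_a, p, w)
    fill_shadow(blocks, w)
    result = []
    for buf in blocks:
        # Loop bounds cover only the owned cells; the shadow cells are
        # read through the same indexing but never iterated over.
        for j in range(w, len(buf) - w):
            result.append(buf[j - 1] + buf[j] + buf[j + 1])
    return result
```

Because neighbour values live in the same array as owned values, the loop body uses one addressing scheme for local and nonlocal elements, which is the "simplifies addressing" point above.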
|
GEN_BLOCK
-
Keep table of block bounds on each processor
-
Search table to find home of nonlocal elements, adjust indices and loops
-
Allows some load balancing with locality
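
The GEN_BLOCK table lookup above can be sketched as follows. This is a hedged illustration under stated assumptions: the function names (`make_bounds`, `owner`, `to_local`, `local_loop`) are hypothetical, indices are 0-based rather than Fortran's 1-based, and binary search via Python's standard `bisect` module stands in for whatever search the compiler actually emits.

```python
from bisect import bisect_right
from itertools import accumulate


def make_bounds(sizes):
    """Table of block bounds: block r owns global [bounds[r], bounds[r + 1])."""
    return [0] + list(accumulate(sizes))


def owner(bounds, i):
    """Search the table for the processor that owns global index i."""
    if not 0 <= i < bounds[-1]:
        raise IndexError(i)
    return bisect_right(bounds, i) - 1


def to_local(bounds, i):
    """Adjust a global index to (owning rank, local index on that rank)."""
    r = owner(bounds, i)
    return r, i - bounds[r]


def local_loop(bounds, rank, lo, hi):
    """Local iteration range on `rank` for the global loop range [lo, hi)."""
    start = max(lo, bounds[rank])
    stop = min(hi, bounds[rank + 1])
    if start >= stop:
        return range(0)  # this rank owns no iterations of the loop
    return range(start - bounds[rank], stop - bounds[rank])
```

With block sizes `[2, 5, 3]` the table is `[0, 2, 7, 10]`, so global index 6 lives on rank 1 at local index 4. Uneven sizes like these are what allow work to be balanced while each processor still owns a contiguous block.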
|