In horizontal
-
Clump so that 4 points per task
-
Efficiency: communicate with 4 neighbors only
|
In vertical, clump all points in column
-
Performance: avoid communication
-
Modularity: Reuse physics modules
|
Resulting algorithm "reasonably scalable"
-
(Nx.Ny)/4 : at least 1250 tasks
|