Dot products and the convergence test always require global communication
-
The reduction is needed under any layout, so no reason to pick one DISTRIBUTE over another
|
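A minimal sketch of why the layout does not matter for the dot product. It assumes the vectors are stored as n-by-n grids and models a processor grid in plain Python. The helpers `block_bounds` and `distributed_dot` are hypothetical names introduced for illustration, not HPF library routines. Each simulated processor forms one partial sum over the elements it owns, and a single final sum stands in for the global reduction.

```python
def block_bounds(n, parts, k):
    """Half-open index range owned by part k when n indices are split into blocks."""
    size = -(-n // parts)  # ceiling division gives the block size
    return min(k * size, n), min((k + 1) * size, n)


def distributed_dot(x, y, prow, pcol):
    """Dot product of two n-by-n grids on a simulated prow-by-pcol processor grid.

    Returns (result, number_of_partial_sums). The final sum() over the
    partial sums stands in for the one global reduction.
    """
    n = len(x)
    partials = []
    for pr in range(prow):
        r0, r1 = block_bounds(n, prow, pr)
        for pc in range(pcol):
            c0, c1 = block_bounds(n, pcol, pc)
            # Purely local work: each processor touches only its own block.
            partials.append(sum(x[i][j] * y[i][j]
                                for i in range(r0, r1)
                                for j in range(c0, c1)))
    return sum(partials), len(partials)


n = 8
x = [[i + j for j in range(n)] for i in range(n)]
y = [[i * j + 1 for j in range(n)] for i in range(n)]

rows_only = distributed_dot(x, y, 4, 1)  # models (BLOCK,*) on 4 processors
two_d = distributed_dot(x, y, 2, 2)      # models (BLOCK,BLOCK) on 4 processors
```

Both layouts produce the same value from the same number of partial sums, so under this model the reduction cost is identical either way.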
Vector updates require no communication
-
Again, no reason to choose a particular DISTRIBUTE
|
Matrix-vector multiply is the one step that does care where its data come from
-
Here the advantages and disadvantages are the same as for Jacobi iteration
|
The bottom line
-
(BLOCK,*) on high-latency machines or small problem sizes
-
(BLOCK,BLOCK) on low-latency machines
|
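A rough cost model can make the bottom line concrete. It is a sketch under stated assumptions, not a measurement. It assumes the matrix-vector multiply is a nearest-neighbor stencil on an n-by-n grid, as in Jacobi iteration, and that sending one message of m elements costs `alpha + m * beta`. The names `halo_cost`, `cheaper_layout`, `alpha`, and `beta` are hypothetical. Under (BLOCK,*) an interior processor exchanges 2 boundaries of n elements. Under (BLOCK,BLOCK) it exchanges 4 boundaries of n / sqrt(p) elements.

```python
import math


def halo_cost(n, p, alpha, beta, layout):
    """Estimated boundary-exchange time per iteration for one interior processor.

    n: the grid is n-by-n; p: processor count (assumed a perfect square);
    alpha: per-message latency; beta: per-element transfer time.
    Both times are in the same arbitrary unit.
    """
    if layout == "(BLOCK,*)":
        messages, elements_each = 2, n
    elif layout == "(BLOCK,BLOCK)":
        messages, elements_each = 4, n / math.sqrt(p)
    else:
        raise ValueError(f"unknown layout: {layout}")
    return messages * (alpha + elements_each * beta)


def cheaper_layout(n, p, alpha, beta):
    """Return whichever of the two layouts has the lower modeled cost."""
    layouts = ("(BLOCK,*)", "(BLOCK,BLOCK)")
    return min(layouts, key=lambda name: halo_cost(n, p, alpha, beta, name))


low_latency = cheaper_layout(n=1024, p=16, alpha=100, beta=1)
high_latency = cheaper_layout(n=1024, p=16, alpha=10000, beta=1)
small_problem = cheaper_layout(n=64, p=16, alpha=100, beta=1)
```

In this model (BLOCK,BLOCK) sends twice as many messages but less data in total, so it wins only when latency is small relative to the data volume. Raising `alpha` or shrinking `n` tips the choice back to (BLOCK,*).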