Parallelism in O(N2) N Body Approach II
This type of performance should be compared with classic grid problems in d dimensions where overhead proportional to edge over area and goes like 1/n1/d
So that grain size in 3D local problems is “cube” of that needed in O(N2) problem
Of course reason that communication overhead is so small is that computation is exceptionally large!
- Every silver lining has a cloud
Note can “understand” 1/n dependence as write formula asSj Si Force on i = function(i and j) --
important that outer sum is j not i!
Each j is communicated to each node and is then used to increment force on n/2 (divided by two due to Newton’s law of action and reaction) particles in given node