Having fetched all needed data, can update all particles in processors with high efficiency seen in O(N2) problem.
-
Namely communicated data is re-used many times as needed by many particles in processor
-
In practice have some 10,000 or more particles per processor
|
In astrophysics, also need to construct tree and rebalance load at essentially each time step
|
There is a straightforward parallel tree building algorithm
|
These methods outperform O(N2) when more than a few thousand particles (unless problem very inhomogeneous)
|