Implementation of Collective Communication
Summations, broadcasts, and other common patterns are well-studied operations
- Implementers should use the best algorithm known
- Don’t pay vendors who don’t
Typically, the asymptotically best algorithms scale as log2(P), where P is the number of processes
- Small machines may do better with other algorithms
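
To make the log2(P) claim concrete, here is a minimal sketch that counts communication rounds for a broadcast from rank 0. It assumes a binomial-tree schedule, which is one common logarithmic pattern and not necessarily what any particular library uses. The function names are hypothetical, and the model counts rounds only, ignoring message size, latency, and network topology.

```python
def binomial_broadcast_rounds(p):
    """Simulate a binomial-tree broadcast from rank 0 among p ranks.

    Returns (rounds, has_data): the number of communication rounds and a
    list marking which ranks hold the data at the end.
    """
    has_data = [False] * p
    has_data[0] = True
    rounds = 0
    step = 1
    while step < p:
        # Ranks 0..step-1 already hold the data; each sends to rank + step.
        for src in range(step):
            dst = src + step
            if dst < p:
                has_data[dst] = True
        step *= 2      # the set of ranks holding the data doubles each round
        rounds += 1
    return rounds, has_data


def linear_broadcast_rounds(p):
    """Rounds needed if the root sends to every other rank one at a time."""
    return max(p - 1, 0)


if __name__ == "__main__":
    for p in (2, 4, 16, 1024):
        tree_rounds, _ = binomial_broadcast_rounds(p)
        print(p, tree_rounds, linear_broadcast_rounds(p))
```

In this model the tree needs ceil(log2(P)) rounds against P - 1 for the root-sends-to-all approach. The gap is small for very small P, which is consistent with the point that small machines may do better with other algorithms once real costs are taken into account.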