Gaussian Elimination: HPF Issues
Each step is classically data-parallel
- Update: a(i,j) = a(i,j) - a(i,K)/a(K,K)*a(K,j), for rows i > K
- Pivot: MAXLOC reduction, vector assignments
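A minimal serial sketch of one elimination step, in Python rather than HPF, may help show why both parts are data-parallel. The function names `eliminate_step` and `forward_eliminate` are illustrative, not from the slide, and partial pivoting with a row swap is assumed.

```python
def eliminate_step(a, k):
    """Apply step k of Gaussian elimination with partial pivoting, in place.

    Returns the index of the row chosen as pivot.
    """
    n = len(a)
    # Pivot search: location of the largest |a[i][k]| for i >= k.
    # This is the job a MAXLOC reduction does in HPF.
    p = max(range(k, n), key=lambda i: abs(a[i][k]))
    # Row swap: whole-row (vector) assignments.
    a[k], a[p] = a[p], a[k]
    # Temporary copy of the pivot row, read by every updated row.
    pivot_row = a[k][:]
    # Update: each (i, j) with i > k depends only on old values,
    # so all of them could be computed at once.
    for i in range(k + 1, n):
        m = a[i][k] / pivot_row[k]
        for j in range(k, len(a[i])):
            a[i][j] -= m * pivot_row[j]
    return p


def forward_eliminate(a):
    """Run every elimination step, leaving a in upper-triangular form."""
    for k in range(len(a) - 1):
        eliminate_step(a, k)
    return a
```

In HPF the two inner loops would typically collapse into a single array assignment or FORALL; the explicit loops here only spell out the index ranges.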
Communication is structured but nonlocal
- Reduction tree and broadcast tree
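The two tree patterns can be simulated on a list of per-processor values. This is a sketch under assumptions: a binary tree rooted at processor 0, and the hypothetical helpers `tree_maxloc` and `tree_broadcast`; a real HPF runtime chooses its own communication schedule.

```python
def tree_maxloc(local):
    """Combine per-processor (abs value, row index) candidates pairwise.

    Returns the winning candidate and the number of combining rounds.
    """
    vals = list(local)
    rounds = 0
    stride = 1
    while stride < len(vals):
        # In one round, processor q absorbs the candidate held by q + stride.
        for q in range(0, len(vals) - stride, 2 * stride):
            vals[q] = max(vals[q], vals[q + stride], key=lambda t: t[0])
        stride *= 2
        rounds += 1
    return vals[0], rounds


def tree_broadcast(value, nprocs):
    """Spread value from processor 0; return all copies and the round count."""
    have = [None] * nprocs
    have[0] = value
    rounds = 0
    stride = 1
    while stride < nprocs:
        # Every processor that already holds the value forwards it once.
        for q in range(stride):
            if q + stride < nprocs:
                have[q + stride] = have[q]
        stride *= 2
        rounds += 1
    return have, rounds
```

Both patterns finish in about log2(P) rounds for P processors, which is the sense in which the communication is structured even though it is not nearest-neighbor.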
Elements go inactive in later steps
- DISTRIBUTE CYCLIC for load balance
- ALIGN temporary with pivot row
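The load-balance argument can be checked by counting which processor owns the rows that are still active at step k. The sketch below assumes rows are the distributed dimension and uses a hypothetical helper `active_per_proc`; it models only the ownership rule, not HPF itself.

```python
def active_per_proc(n, nprocs, k, scheme):
    """Count rows i > k (still updated at step k) owned by each processor."""
    block = -(-n // nprocs)  # ceiling division: rows per processor under BLOCK
    counts = [0] * nprocs
    for i in range(k + 1, n):
        # BLOCK gives each processor one contiguous chunk of rows;
        # CYCLIC deals rows out round-robin.
        owner = i // block if scheme == "block" else i % nprocs
        counts[owner] += 1
    return counts


# Halfway through a 16-row problem on 4 processors:
print(active_per_proc(16, 4, 7, "block"))   # [0, 0, 4, 4]
print(active_per_proc(16, 4, 7, "cyclic"))  # [2, 2, 2, 2]
```

Under BLOCK, half the processors are already idle at this step, while CYCLIC keeps the remaining work spread evenly. Aligning the pivot-row temporary with the matrix rows places each element of the temporary with the elements it updates.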