Jacobi Iteration: OpenMP Mapping
Minimize forking and synchronization overhead
- One parallel region at highest possible level
- Mark outermost possible loop for work sharing
Keep each processor working on the same data from one sweep to the next
- Consistent schedule (same loop bounds, same schedule kind) for all work-shared DO loops
- Trust underlying system not to migrate threads for no reason
Lay out data to be contiguous
- Column-major ordering in Fortran
- Therefore, make the outermost work-shared loop run over columns (the last array index) and the inner loop over rows
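
The points above can be sketched in one small program. This is a minimal illustration, not a reference implementation. The grid size `n`, tolerance `tol`, iteration cap `max_iter`, the boundary values, and the names `u`, `unew`, `diff`, `my_diff`, and `iters_done` are all assumptions chosen for the example. It uses a single parallel region around the whole iteration, work-shares the outer (column) loop `j`, and gives both loops the same `schedule(static)` so each thread keeps the same columns on every sweep.

```fortran
program jacobi_omp
  implicit none
  ! Problem size, tolerance and iteration cap are illustrative assumptions.
  integer, parameter :: n = 64, max_iter = 10000
  double precision, parameter :: tol = 1.0d-6
  double precision :: u(0:n+1, 0:n+1), unew(0:n+1, 0:n+1)
  double precision :: diff, my_diff
  integer :: i, j, iter, iters_done

  u = 0.0d0
  u(:, n+1) = 1.0d0          ! assumed boundary condition: one hot edge
  unew = u
  diff = 0.0d0
  iters_done = 0

  ! One parallel region at the highest level: threads are forked once,
  ! not once per sweep.
  !$omp parallel default(none) shared(u, unew, diff, iters_done) &
  !$omp&         private(i, j, iter, my_diff)
  do iter = 1, max_iter
     !$omp single
     diff = 0.0d0
     iters_done = iter
     !$omp end single        ! implicit barrier

     ! Work-share the outermost loop (columns, j); the inner loop (rows, i)
     ! walks contiguous memory because Fortran is column-major.
     !$omp do schedule(static) reduction(max:diff)
     do j = 1, n
        do i = 1, n
           unew(i, j) = 0.25d0 * (u(i-1, j) + u(i+1, j) + u(i, j-1) + u(i, j+1))
           diff = max(diff, abs(unew(i, j) - u(i, j)))
        end do
     end do
     !$omp end do            ! implicit barrier; diff now holds the global max

     my_diff = diff          ! private copy, read before diff is reset

     ! Same bounds and same schedule: each thread copies the columns it
     ! just computed.
     !$omp do schedule(static)
     do j = 1, n
        do i = 1, n
           u(i, j) = unew(i, j)
        end do
     end do
     !$omp end do            ! implicit barrier

     ! Every thread sees the same my_diff, so all threads exit together.
     if (my_diff < tol) exit
  end do
  !$omp end parallel

  print *, 'sweeps:', iters_done, ' last max change:', diff
end program jacobi_omp
```

With GNU Fortran this should build with `gfortran -fopenmp jacobi_omp.f90`. The sketch does nothing about thread placement, in keeping with the advice to trust the system. If threads are seen to migrate, setting the standard `OMP_PROC_BIND` environment variable is one way to request binding.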