1 |
All new features represent useful information to the compiler
-
They do not change the meaning of the program if properly used
|
2 |
REDUCTION
-
Choose an efficient order of evaluation for the reduction tree
-
For scalar reductions, keep a local sum on each node and have a global combining phase at the end
-
Alternate implementation: critical region
-
For vector reductions, use same algorithm as XXX_SCATTER
|
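The two scalar strategies on the REDUCTION slide can be sketched in Python as a single-machine simulation. This is an illustration only, not the compiler's actual code generation: `block_partition`, `tree_combine`, `node_local_reduction`, and `critical_region_reduction` are hypothetical names, the "nodes" are plain loops or threads, and the pairwise combining order is just one possible reduction tree.

```python
import operator
import threading


def block_partition(n_items, n_nodes):
    """Split range(n_items) into n_nodes contiguous blocks (assumed layout)."""
    base, extra = divmod(n_items, n_nodes)
    blocks, start = [], 0
    for node in range(n_nodes):
        size = base + (1 if node < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def tree_combine(partials, op):
    """Global combining phase: merge partial results pairwise, level by level."""
    values = list(partials)
    while len(values) > 1:
        paired = [op(values[i], values[i + 1])
                  for i in range(0, len(values) - 1, 2)]
        if len(values) % 2:          # odd one out moves up unchanged
            paired.append(values[-1])
        values = paired
    return values[0]


def node_local_reduction(data, n_nodes, op=operator.add, identity=0):
    """Each simulated node keeps a local accumulator; combine once at the end."""
    partials = []
    for block in block_partition(len(data), n_nodes):
        local = identity             # private to this node
        for i in block:
            local = op(local, data[i])
        partials.append(local)
    return tree_combine(partials, op)


def critical_region_reduction(data, n_nodes):
    """Alternate: every update to the shared sum happens inside a critical region."""
    total = 0
    lock = threading.Lock()

    def worker(block):
        nonlocal total
        for i in block:
            with lock:               # one thread at a time updates the sum
                total += data[i]

    threads = [threading.Thread(target=worker, args=(b,))
               for b in block_partition(len(data), n_nodes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total
```

Both functions return the same sum for integer data. The local-accumulator version touches shared state once per node, while the critical-region version synchronizes on every element, which is presumably why the slide lists it as the alternate. For floating-point data the two may differ in the last bits, since the order of evaluation differs.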
3 |
ON HOME
-
Base the loop partitioning on the HOME expression
-
Invert subscripting function to derive loop bounds
-
Does not affect where communication/synchronization can be placed, but may change what must be communicated
-
Warning: You can outsmart the compiler this way, to your detriment
|
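As a rough illustration of "invert subscripting function to derive loop bounds", the sketch below assumes a one-dimensional array A(1:n) with a block distribution (block size ceil(n/P)) and a HOME expression of the affine form A(a*i + b) with a > 0. The function names `owned_block` and `local_loop_bounds` are hypothetical, and a real compiler would handle far more general subscripts and distributions.

```python
def owned_block(n, n_procs, p):
    """Inclusive 1-based range [lo, hi] of A(1:n) owned by processor p.

    Assumes a block distribution with block size ceil(n / n_procs);
    p is numbered from 0. The range is empty when lo > hi.
    """
    size = -(-n // n_procs)                  # ceiling division
    lo = p * size + 1
    hi = min(n, (p + 1) * size)
    return lo, hi


def local_loop_bounds(loop_lo, loop_hi, a, b, owned_lo, owned_hi):
    """Iterations i in [loop_lo, loop_hi] whose HOME element A(a*i + b) is owned.

    Inverts owned_lo <= a*i + b <= owned_hi for i, assuming a > 0.
    Returns inclusive bounds (i_lo, i_hi); the range is empty when i_lo > i_hi.
    """
    if a <= 0:
        raise ValueError("this sketch only handles a > 0")
    i_lo = max(loop_lo, -(-(owned_lo - b) // a))   # ceil((owned_lo - b) / a)
    i_hi = min(loop_hi, (owned_hi - b) // a)       # floor((owned_hi - b) / a)
    return i_lo, i_hi
```

For example, with n = 16 on 4 processors, a loop over i = 1..15, and HOME expression A(i+1), processor 0 owns A(1:4) and `local_loop_bounds(1, 15, 1, 1, 1, 4)` gives iterations 1 through 3. If the loop body also reads A(i), changing the HOME expression from A(i) to A(i+1) shifts which boundary elements each processor must fetch from a neighbor, which is one way the choice can change what must be communicated.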