Internally, all the iterator classes depend on the block member of the kernel range. In this section we show how to use this member directly to translate overall constructs. The scheme given here effectively inlines the iterator members used in the previous translations. It also inlines the offset member of the Map class, expressing it in terms of the lower-level disp and step members.
To apply the translation scheme described in this section we need some compile-time knowledge about the level of the range involved. The scheme works recursively: it expands an overall construct in the source program in terms of an overall construct for a range one level lower. The procedure can, if desired, be applied repeatedly. After one or two stages we reach an overall construct for a level-0 range. This base case has a different, simpler translation. For the scheme to be effective we must at least know in advance whether the original parametric range has level 0 (i.e., it already represents the base case). Ideally we should know the exact level, as a compile-time constant.
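The recursive structure described above can be illustrated with a small, self-contained Java model. This is only a sketch under stated assumptions, not the HPJava runtime: the names ToyRange, ProcessDim, BlockRange, Block and its fields, and the helper localIndices are all hypothetical, and the only distribution modelled is a simple contiguous block distribution. A range of level greater than 0 is handled by recursing on its kernel and then looping over the block selected by each kernel subscript; a level-0 range is handled by a single conditional on the local process coordinate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

final class RecursiveOverallSketch {

    /** Hypothetical record of the block selected by one kernel subscript. */
    static final class Block {
        final int count;   // number of locally held elements
        final int glbBas;  // global index of the first element
        final int glbStp;  // stride between successive global indices
        Block(int count, int glbBas, int glbStp) {
            this.count = count; this.glbBas = glbBas; this.glbStp = glbStp;
        }
    }

    /** Hypothetical range interface; level 0 is the base case. */
    interface ToyRange {
        int level();
        ToyRange ker();           // range one level lower; undefined at level 0
        Block block(int kerSub);  // block for a kernel subscript; undefined at level 0
    }

    /** Level-0 range: a process dimension, seen from the process at coordinate crd. */
    static final class ProcessDim implements ToyRange {
        final int size, crd;
        ProcessDim(int size, int crd) { this.size = size; this.crd = crd; }
        public int level() { return 0; }
        public ToyRange ker() { throw new UnsupportedOperationException("level 0"); }
        public Block block(int kerSub) { throw new UnsupportedOperationException("level 0"); }
    }

    /** Level-1 range: extent n split into contiguous blocks over a process dimension. */
    static final class BlockRange implements ToyRange {
        final int n; final ProcessDim dim;
        BlockRange(int n, ProcessDim dim) { this.n = n; this.dim = dim; }
        public int level() { return 1; }
        public ToyRange ker() { return dim; }
        public Block block(int kerSub) {
            int bs = (n + dim.size - 1) / dim.size;          // ceiling division
            int base = kerSub * bs;
            int count = Math.max(0, Math.min(bs, n - base)); // last block may be short
            return new Block(count, base, 1);
        }
    }

    /** Runs body once for every global index held locally (toy model of the scheme). */
    static void overall(ToyRange x, IntConsumer body) {
        if (x.level() == 0) {
            // Base case: at most one iteration, guarded by a local conditional.
            ProcessDim d = (ProcessDim) x;
            if (d.crd >= 0 && d.crd < d.size) body.accept(d.crd);
        } else {
            // Recursive case: outer construct over the kernel, inner loop over the block.
            overall(x.ker(), k -> {
                Block b = x.block(k);
                int glb = b.glbBas;
                for (int l = 0; l < b.count; l++) {
                    body.accept(glb);
                    glb += b.glbStp;
                }
            });
        }
    }

    /** Collects the global indices visited on the process at coordinate crd. */
    static List<Integer> localIndices(int n, int p, int crd) {
        List<Integer> seen = new ArrayList<>();
        overall(new BlockRange(n, new ProcessDim(p, crd)), seen::add);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(localIndices(10, 4, 1));
    }
}
```

In this toy model, a range of extent 10 over 4 processes gives the process at coordinate 1 the global indices 3, 4 and 5. Because the level is tested at run time here, the sketch does not show the benefit of knowing the level as a compile-time constant; a translator with that knowledge would emit only the branch that applies.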
Before applying the translation proper, the input program should be normalized as discussed in the previous section. Rather than discuss translation of the three uses of Index individually, we will go straight to the combined summary form, in the style of figure 7.5. For a parametric range of level greater than zero, the translation summary is given in figure 7.6.
Figure 7.6: Summary of recursive translation scheme for overall construct with level greater than 0.
The outer loop is an overall construct parametrized by x.ker(). The block member of x initializes variables describing the block selected by the current value of the kernel subscript. These variables correspond exactly to the fields of the Block component of LocBlocksIndex (and, of course, this is exactly how they are computed in the implementation of that class).
The offset operation is expanded in terms of disp and step operations, and an offset for a kernel range. If necessary the transformation can be applied recursively to eliminate the offset function altogether.
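The idea of replacing a per-element offset call by a displacement and a step can be sketched in plain Java. This is an illustration only: it assumes an affine map of the form offset(sub) = bas + str * sub, which the text does not state, and the names bas, str, subBas, subStp, viaOffsetCalls and viaDispAndStep are hypothetical. The contribution from the kernel range, which the scheme handles through a separate offset on the kernel, is deliberately left out.

```java
import java.util.Arrays;

final class OffsetExpansionSketch {

    /** Assumed affine map from a local subscript to a storage offset. */
    static int offset(int bas, int str, int sub) {
        return bas + str * sub;
    }

    /** Unexpanded form: one offset call per element of the block. */
    static int[] viaOffsetCalls(int bas, int str, int subBas, int subStp, int count) {
        int[] out = new int[count];
        for (int l = 0; l < count; l++) {
            out[l] = offset(bas, str, subBas + subStp * l);
        }
        return out;
    }

    /** Expanded form: displacement computed once per block, then a constant step. */
    static int[] viaDispAndStep(int bas, int str, int subBas, int subStp, int count) {
        int disp = bas + str * subBas;  // offset of the first element of the block
        int step = str * subStp;        // offset increment between successive elements
        int[] out = new int[count];
        int off = disp;
        for (int l = 0; l < count; l++) {
            out[l] = off;
            off += step;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(viaOffsetCalls(2, 3, 1, 2, 4)));
        System.out.println(Arrays.toString(viaDispAndStep(2, 3, 1, 2, 4)));
    }
}
```

Both methods produce the same sequence of offsets. The expanded form does its multiplications once per block rather than once per element, which is the usual motivation for this kind of inlining.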
The rest of the translation closely follows that of the previous section.
If the parametric range has level 0 (i.e., it is a process dimension, or a subrange of a process dimension), the summary is given in figure 7.7.
Figure 7.7: Summary of translation for overall construct of level 0.
One stage of recursion applied to the example of the previous section gives the translation in figure 7.8. If ranges x and y both have level 1, applying the rule for translating level-0 constructs then gives the translation in figure 7.9. Finally, figure 7.10 gives an optimized form on the assumption that the original ranges were level 1. We can remove the local conditional because every active process must contain an element of the kernel, and we can replace the value i1 initialized by local with d.crd(). Three other optimizations do not depend on the assumption of simplicity: additions of 0 are constant-folded away, x.ker().dim() is replaced with x.dim(), and the apg manipulations are removed, because the loop body contains no calls to collective operations.
Figure 7.8: Translation of example. Pass 1, assuming ranges x and y have level greater than zero.
Figure 7.9: Translation of example. Pass 2, assuming ranges x and y have level 1, so their kernels are level 0.
Figure 7.10: Translation of example. Optimizations assuming ranges x and y are level 1.
Because level-1 ranges are an important case, figure 7.11 summarizes the translation of the overall construct for level-1 parametric ranges.
Figure 7.11: Summary of translation for overall construct with level 1 x.