The first question to answer is: why use Java as a base language? In fact, the programming model embodied in HPJava is largely language independent. It could be bound to other languages such as C, C++ and Fortran. But Java is a convenient base language, especially for initial experiments, because it provides full object-orientation (convenient for describing complex distributed data) in a relatively simple setting that is conducive to the implementation of source-to-source translators. It has been noted elsewhere that Java has various features suggesting it could be an attractive language for science and engineering [7].
With Java as the base language, an obvious question is whether we can extend the language simply by adding packages, rather than by changing the syntax. There are two problems with doing this for data-parallel programming.
The first problem is that our baseline is HPF, and any package supporting parallel arrays as general as those of HPF is likely to be cumbersome to code with. The examples given earlier using the adJava interface illustrate this point. Our runtime system needs an (in principle) infinite series of class names
Array1dI, Array1cI, Array2ddI, Array2dcI, ...
to express the HPJava types
int [[]], int [[*]], int [[,]], int [[,*]] ...
as well as the corresponding series for char, float, and so on. To access an element of a distributed array in HPJava, one writes
a[i] = 3 ;
In the adJava interface, this must be written as
a.dat()[a.pos(i)] = 3 ;
And that is only simple subscripting. In section 3.2 we noted in passing that Fortran-90 style regular section construction looks even more complex when written with the raw class library interface.
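The subscripting contrast above can be made concrete with a small sketch. The class below is a hypothetical, single-process stand-in for an adJava-style one-dimensional int array: only the method names dat() and pos() are taken from the text, while the fields, the constructor, and the simple block layout are assumptions made for illustration (the real runtime class is not reproduced here). For simplicity the subscript is treated as a plain global integer index.

```java
// Hypothetical stand-in for an adJava-style 1-D int array (not the real class).
// Only the names dat() and pos() come from the text; the rest is assumed.
final class Array1dI {
    private final int[] data;  // elements held by this process
    private final int base;    // global index of the first local element (assumed layout)

    Array1dI(int base, int localSize) {
        this.base = base;
        this.data = new int[localSize];
    }

    // Local storage backing the distributed array.
    int[] dat() {
        return data;
    }

    // Translate a global index into an offset in the local storage.
    int pos(int globalIndex) {
        int local = globalIndex - base;
        if (local < 0 || local >= data.length) {
            throw new IndexOutOfBoundsException("index " + globalIndex + " is not held locally");
        }
        return local;
    }
}

final class SubscriptDemo {
    // What the HPJava statement "a[i] = value" looks like through the library interface.
    static int storeAndLoad(Array1dI a, int i, int value) {
        a.dat()[a.pos(i)] = value;
        return a.dat()[a.pos(i)];
    }

    public static void main(String[] args) {
        Array1dI a = new Array1dI(100, 10);  // holds global indices 100..109
        System.out.println(storeAndLoad(a, 104, 3));  // prints 3
    }
}
```

Even in this reduced form, every element access costs two method calls and an explicit address translation that the HPJava syntax a[i] hides.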
The second problem is that a Java program using a package like adJava in a direct, naive way will perform very poorly, because every local address of a global array element is computed by a call to a function such as pos. An optimization pass is needed to transform the offset computation into a more intelligent style, as suggested in section 3.3. If a preprocessor must perform these optimizations anyway, it makes most sense to design a syntax that expresses the concepts of the programming model more naturally.
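As a rough illustration of the kind of rewriting such an optimization pass might perform, the sketch below contrasts a naive loop, which calls pos on every iteration, with a version in which the offset computation is hoisted out of the loop and advanced incrementally. This is an assumed example of a standard loop optimization, not the translator's actual output; the class OffsetArray, its stride field, and both method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical local view of a distributed array with a constant stride (assumed layout).
final class OffsetArray {
    private final int[] data;
    private final int base;    // global index of the first local element
    private final int stride;  // distance between consecutive elements in local storage

    OffsetArray(int base, int localSize, int stride) {
        this.base = base;
        this.stride = stride;
        this.data = new int[localSize * stride];
    }

    int[] dat() { return data; }
    int stride() { return stride; }

    // Global index to local offset; called once per access in naive code.
    int pos(int globalIndex) { return (globalIndex - base) * stride; }
}

final class OffsetOptimization {
    // Naive style: two method calls and a full address computation per element.
    static void fillNaive(OffsetArray a, int lo, int hi, int v) {
        for (int i = lo; i < hi; i++) {
            a.dat()[a.pos(i)] = v;
        }
    }

    // Hoisted style: compute the starting offset once, then step by the stride.
    static void fillHoisted(OffsetArray a, int lo, int hi, int v) {
        int[] d = a.dat();
        int step = a.stride();
        int off = a.pos(lo);
        for (int i = lo; i < hi; i++, off += step) {
            d[off] = v;
        }
    }

    public static void main(String[] args) {
        OffsetArray x = new OffsetArray(10, 5, 2);
        OffsetArray y = new OffsetArray(10, 5, 2);
        fillNaive(x, 10, 15, 7);
        fillHoisted(y, 10, 15, 7);
        System.out.println(Arrays.equals(x.dat(), y.dat()));  // prints true
    }
}
```

Both loops write the same elements; the second simply avoids repeating work that is invariant across iterations, which is the sort of transformation a preprocessor is well placed to apply automatically.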