Data descriptor and global data

The concept of a data descriptor is not entirely new; it already exists in the Java language itself. The length field of a Java array shows that a Java array is accessed through a data descriptor.

On a single processor, an array variable can be described by simple values: a memory address and an int length. On a multiprocessor, a more complicated structure is needed to describe a distributed array. We call this structure a data descriptor as well.

The data descriptor records where the data is created and how it is distributed. The logical structure of a descriptor is shown in figure 2.

Figure 2: Descriptor
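
As a rough illustration of that logical structure, the following plain-Java sketch models a descriptor as an owner group plus one range per dimension. All names here (SketchGroup, SketchRange, SketchDescriptor, DescriptorDemo) are hypothetical and are not HPJava classes; the sketch only assumes, from the text, that a descriptor records where the data lives and how each dimension is distributed.

```java
// Hypothetical plain-Java model of a data descriptor; not HPJava's own classes.
final class SketchGroup {
    final int[] processIds;              // ranks of the processes in the group

    SketchGroup(int... processIds) {
        this.processIds = processIds.clone();
    }

    int size() {
        return processIds.length;
    }
}

final class SketchRange {
    final int extent;                    // number of global index values
    final boolean collapsed;             // true: dimension is not distributed

    SketchRange(int extent, boolean collapsed) {
        this.extent = extent;
        this.collapsed = collapsed;
    }
}

final class SketchDescriptor {
    final SketchGroup owner;             // where the data is created
    final SketchRange[] ranges;          // how each dimension is distributed

    SketchDescriptor(SketchGroup owner, SketchRange... ranges) {
        this.owner = owner;
        this.ranges = ranges.clone();
    }

    // Number of dimensions; a scalar has none.
    int rank() {
        return ranges.length;
    }

    // Total number of global elements described (1 for a scalar).
    long globalSize() {
        long n = 1;
        for (SketchRange r : ranges) {
            n *= r.extent;
        }
        return n;
    }
}

final class DescriptorDemo {
    public static void main(String[] args) {
        SketchGroup p = new SketchGroup(0, 1, 2, 3);
        SketchDescriptor scalar = new SketchDescriptor(p);
        SketchDescriptor array = new SketchDescriptor(p, new SketchRange(100, true));
        System.out.println("scalar rank: " + scalar.rank());
        System.out.println("array rank: " + array.rank() + ", size: " + array.globalSize());
    }
}
```

In this model a scalar is simply a descriptor with no ranges, which is one way to see why scalars and arrays can share the same descriptor idea.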

New syntax was added to HPJava to define data with descriptors.

  on(p)
    int # s = new int #;

creates a global scalar on the currently executing process group. In this statement, s is a data descriptor handle or, in HPJava terms, a global scalar reference, and the scalar holds an integer value. Global scalar references can be defined for each primitive type and class type in Java.

The symbol # on the right-hand side of the assignment indicates that a data descriptor is allocated when the scalar is created.

The same symbol is also used to access the int value, as in the following:

  on(p) {
    int # s = new int #;
    # s = 100;
  }

Note that the value of s is duplicated on each process in the currently executing group. Duplicated variables are different from replicated local variables: the descriptors they carry can be used to keep their values identical on each process during program execution.
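
One way to picture a duplicated global scalar in plain Java is one copy of the value per process in the owner group, with every assignment applied to all copies. The sketch below is a single-JVM simulation under that assumption; the class name DuplicatedInt is hypothetical, and nothing here describes how HPJava actually maintains consistency.

```java
// Single-JVM simulation of a duplicated global scalar.
// DuplicatedInt is a hypothetical name, not an HPJava class.
final class DuplicatedInt {
    private final int[] copies;          // copies[r] is the copy held by process r

    DuplicatedInt(int groupSize) {
        if (groupSize <= 0) {
            throw new IllegalArgumentException("group must be non-empty");
        }
        copies = new int[groupSize];
    }

    // Models an assignment such as "# s = value" executed by every process in the group.
    void set(int value) {
        for (int r = 0; r < copies.length; r++) {
            copies[r] = value;
        }
    }

    // Value as seen by one process.
    int get(int process) {
        return copies[process];
    }

    // True when every process holds the same value.
    boolean isConsistent() {
        for (int c : copies) {
            if (c != copies[0]) {
                return false;
            }
        }
        return true;
    }
}
```

A typical use would be to create `new DuplicatedInt(4)`, call `set(100)`, and observe that `get(r)` returns the same value for every process r in the group.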

The group inside a descriptor is called the data owner group; it defines where the global variable belongs.

  on(p)
    int # s = new int # on q;

will set the data owner field in the descriptor to group q instead of the default p.

When defining a global array, it is not necessary to allocate a data descriptor for each array element, so the syntax to define a global array is not derived directly from the one for a scalar.

  on(p) 
    float [[ ]] a = new float [[100]];

will create a global array of size 100 on group p. Here a is a descriptor handle that describes a one-dimensional float array. Its distribution format is collapsed, with its elements duplicated in the process dimension. We call this a collapsed dimension. In HPJava terms, a is also called a global or distributed array reference.

A distributed array can also be defined with the different kinds of ranges introduced earlier.

  on(q) 
    float [[#]] b = new float [[x]];

will create a global array with range x on group q. Again, b is a descriptor handle; it describes a one-dimensional float array of size 100, distributed with a block range.

When defining a global array, # is used to mark a non-collapsed dimension.
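
To make the idea of a block-distributed dimension concrete, the following plain-Java sketch computes which process would hold a given global index. It assumes a block size of ceil(n / p), a common convention for block distributions; whether HPJava's BlockRange uses exactly this formula is not stated here, and the class BlockSketch and its methods are hypothetical.

```java
// Block distribution arithmetic under an assumed ceil(n / p) block size.
// BlockSketch is a hypothetical helper, not part of HPJava.
final class BlockSketch {
    // Size of each block when n indices are split over p processes.
    static int blockSize(int n, int p) {
        return (n + p - 1) / p;
    }

    // Process coordinate that holds global index i.
    static int owner(int i, int n, int p) {
        return i / blockSize(n, p);
    }

    // Offset of global index i inside its owner's local segment.
    static int localIndex(int i, int n, int p) {
        return i % blockSize(n, p);
    }

    public static void main(String[] args) {
        int n = 100;
        int p = 4;
        for (int i : new int[] {0, 3, 24, 25, 99}) {
            System.out.println("global " + i + " -> process " + owner(i, n, p)
                    + ", local " + localIndex(i, n, p));
        }
    }
}
```

With 100 elements over 4 processes, each process holds a contiguous segment of 25 elements under this assumption, so neighbouring indices usually live on the same process.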

The access pattern for a global array element is not the same as for a global scalar reference, nor exactly the same as for a local array element. Since global arrays may carry position information in their dimensions, location references may be needed as indexes when the dimensions are not collapsed.

  Location i=x|3;
  at(i)
    b[i]=3;

Here the fourth element of array b is assigned the value 3. We leave the at construct and the details of accessing array elements to section 2.3, and look at a simpler example here.

When a global array is defined with a collapsed dimension, its elements are accessed as usual,

  for(int i=0; i<100; i++)
    a[i]=i;

will assign the loop index to each corresponding element in the array.

When defining a multi-dimensional global array, one descriptor can describe a rectangular array with any number of dimensions,

  Range x = new BlockRange(100, p.dim(0)) ;  
  Range y = new CyclicRange(100, p.dim(1)) ;
  float [[#,#]] c = new float [[x, y]];

will create a two-dimensional global array, with the first dimension block distributed and the second cyclically distributed. c is a global array reference; its elements can be accessed by putting two location references inside a single pair of brackets.
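
The combination of a block-distributed and a cyclically distributed dimension can be sketched in plain Java as a mapping from a global element (i, j) to a coordinate in a two-dimensional process grid. The sketch assumes a ceil(n / p) block size for the block dimension and a round-robin (j mod p) rule for the cyclic dimension; the class BlockCyclicSketch, its methods, and the 2 by 2 grid used in main are hypothetical illustrations, not HPJava's implementation.

```java
// Owner computation for a two-dimensional array whose first dimension is
// block distributed and whose second dimension is cyclically distributed.
// BlockCyclicSketch is a hypothetical helper, not part of HPJava.
final class BlockCyclicSketch {
    // Block rule: contiguous segments of assumed size ceil(n / procs).
    static int blockOwner(int i, int n, int procs) {
        int blockSize = (n + procs - 1) / procs;
        return i / blockSize;
    }

    // Cyclic rule: indices are dealt out round-robin.
    static int cyclicOwner(int j, int procs) {
        return j % procs;
    }

    // Grid coordinate {row, col} of the process holding element (i, j)
    // of an n-by-n array on a gridRows-by-gridCols process grid.
    static int[] ownerOf(int i, int j, int n, int gridRows, int gridCols) {
        return new int[] { blockOwner(i, n, gridRows), cyclicOwner(j, gridCols) };
    }

    public static void main(String[] args) {
        int[] where = ownerOf(75, 4, 100, 2, 2);
        System.out.println("element (75, 4) -> process (" + where[0] + ", " + where[1] + ")");
    }
}
```

Under these assumptions, consecutive indices in the first dimension tend to stay on one process, while consecutive indices in the second dimension alternate between processes.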

The global arrays introduced here are Fortran-style multi-dimensional arrays rather than C-like arrays of arrays, so it is always clear which dimensions the descriptor describes.
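
The difference between the two array styles can be seen in ordinary Java. In the sketch below, a Java array of arrays may have rows of different lengths, so no single shape describes it, whereas a rectangular array is fully described by its extents. The class RectFloat2D is a hypothetical illustration, and its row-major storage order is an arbitrary choice for this sketch, not a statement about HPJava's layout.

```java
// A rectangular two-dimensional array described by one (rows, cols) shape.
// RectFloat2D is a hypothetical class; row-major storage is an assumption.
final class RectFloat2D {
    final int rows;
    final int cols;
    private final float[] data;

    RectFloat2D(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.data = new float[rows * cols];
    }

    float get(int i, int j) {
        return data[index(i, j)];
    }

    void set(int i, int j, float value) {
        data[index(i, j)] = value;
    }

    // Maps a two-dimensional subscript to the flat storage offset.
    private int index(int i, int j) {
        if (i < 0 || i >= rows || j < 0 || j >= cols) {
            throw new IndexOutOfBoundsException("(" + i + ", " + j + ")");
        }
        return i * cols + j;
    }
}

final class ArrayShapeDemo {
    public static void main(String[] args) {
        // Array of arrays: each row is a separate object and may differ in length.
        float[][] jagged = new float[2][];
        jagged[0] = new float[3];
        jagged[1] = new float[5];
        System.out.println("row lengths: " + jagged[0].length + ", " + jagged[1].length);

        // Rectangular array: one shape covers every element.
        RectFloat2D rect = new RectFloat2D(2, 3);
        rect.set(1, 2, 7.5f);
        System.out.println("rect shape: " + rect.rows + " x " + rect.cols);
    }
}
```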

The array-of-arrays in Java is still useful. For example, one can define a distributed array of local arrays.

Guansong Zhang
Mon Feb 23 15:47:12 EST 1998