JSR HTML Template

General Instructions

This template has been designed to be easily filled out using an HTML editor. Please complete all sections. Don't forget to give the proposed specification a name.

E-mail the completed JSR to: jsr-submit@sun.com. Don't forget to include the name of the JSR in the subject line.

As per Section 1 of the Java Community Process, JSRs will only be accepted from Participants (and each Participant can only have 3 JSRs active at the same time).

JSR - Java Array package

Section 1. Identification

Submitting Participant: International Business Machines Corporation

Name of Contact Person: Jose E. Moreira

E-Mail Address: jmoreira@us.ibm.com

Telephone Number: 1-914-945-3987

Fax Number: 1-914-945-4425

(Optional) List of other Participants who endorse this JSR:

Java Grande Forum

Section 2: Request

2.1 Please describe the proposed Specification:

Multidimensional arrays are n-dimensional rectangular collections of elements. An array is characterized by its rank (number of dimensions or axes), its elemental data type (all elements of an array are of the same type), and its shape (the extents along its axes).

Elements of an array are identified by their indices along each axis. Let a d-dimensional array A of elemental type T have extent n_j along its j-th axis, j = 0,...,d-1. Then a valid index i_j along the j-th axis must be greater than or equal to zero and less than n_j. An attempt to reference an element A[i₀,i₁,...,i_{d-1}] with any invalid index i_j causes an ArrayIndexOutOfBoundsException to be thrown.
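The index rule can be illustrated with a minimal Java sketch. The class name IndexCheckSketch and the method checkIndices are hypothetical and not part of the proposed API; they only show one way the per-axis validity test could be expressed.

```java
final class IndexCheckSketch {
    /**
     * Hypothetical helper: throws if any index i_j falls outside 0 <= i_j < n_j,
     * where n_j = shape[j] is the extent along axis j.
     */
    static void checkIndices(int[] shape, int... indices) {
        if (indices.length != shape.length) {
            throw new IllegalArgumentException("rank mismatch");
        }
        for (int j = 0; j < shape.length; j++) {
            if (indices[j] < 0 || indices[j] >= shape[j]) {
                throw new ArrayIndexOutOfBoundsException(
                    "index " + indices[j] + " invalid for axis " + j
                    + " with extent " + shape[j]);
            }
        }
    }

    public static void main(String[] args) {
        int[] shape = {3, 4};
        checkIndices(shape, 2, 3);          // valid: 0 <= 2 < 3 and 0 <= 3 < 4
        try {
            checkIndices(shape, 3, 0);      // invalid along axis 0
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```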

Elements of an array are logically ordered with respect to each other according to the following definition. An element A[i₀,i₁,...,i_{d-1}] of a d-dimensional array A follows an element A[j₀,j₁,...,j_{d-1}] of the same array if and only if there exists a k greater than or equal to zero and less than d such that i_l=j_l for all l<k and i_k>j_k. In the usual nomenclature, this corresponds to row-major (C-style) ordering of the elements.
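As an illustration of this ordering, the sketch below computes the position of an element in row-major order and tests the "follows" relation directly from the definition. RowMajorSketch, offset, and follows are assumed names used only for this example.

```java
final class RowMajorSketch {
    /** Position of element idx in logical (row-major) order for the given shape. */
    static int offset(int[] shape, int[] idx) {
        int off = 0;
        for (int j = 0; j < shape.length; j++) {
            // Horner-style accumulation: the last axis varies fastest.
            off = off * shape[j] + idx[j];
        }
        return off;
    }

    /** True if element i follows element j: the first differing index is larger in i. */
    static boolean follows(int[] i, int[] j) {
        for (int k = 0; k < i.length; k++) {
            if (i[k] != j[k]) {
                return i[k] > j[k];
            }
        }
        return false; // identical indices denote the same element
    }

    public static void main(String[] args) {
        int[] shape = {2, 3, 4};
        System.out.println(offset(shape, new int[] {1, 2, 3}));                // prints 23
        System.out.println(follows(new int[] {1, 0, 0}, new int[] {0, 2, 3})); // prints true
    }
}
```

For valid indices, one element follows another exactly when its offset is larger, which is why a flat storage layout in this order is a natural fit.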

We propose the development of standard Java classes that implement multidimensional rectangular arrays. The rank and type of an array are defined by its class. That is, for each rank and type there is a different class. (This is necessary for traditional compiler optimizations, since Java does not support templates.) Supported types must include all of Java's primitive types (boolean, byte, char, short, int, long, float, and double), one or more complex types (at least the Complex class in the VNI JSR - Add Complex Class to Java), and Object. Supported ranks must include 0- (scalar), 1-, 2-, 3-, 4-, 5-, 6- and 7-dimensional arrays. (Rank 7 is the current standard limit for Fortran.) The class for a d-dimensional array of a given type would be named <type>Array<d>D. As an example, the class for a two-dimensional Array of doubles would be named doubleArray2D. Array classes are final classes.
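To make the one-class-per-rank-and-type idea concrete, here is a rough sketch of what a rank-2 class for doubles might look like internally. It is deliberately named DoubleArray2DSketch to avoid suggesting that it is the proposed doubleArray2D class; the flat storage layout and the method names rank, size, get, and set are assumptions for illustration.

```java
final class DoubleArray2DSketch {
    private final double[] data;  // flat storage in row-major order (an assumption)
    private final int rows;
    private final int cols;

    DoubleArray2DSketch(int rows, int cols) {
        if (rows < 0 || cols < 0) {
            throw new NegativeArraySizeException("extents must be non-negative");
        }
        this.rows = rows;
        this.cols = cols;
        this.data = new double[rows * cols];
    }

    /** The rank is fixed by the class itself. */
    int rank() { return 2; }

    /** Extent along the given axis (0 or 1). */
    int size(int axis) {
        if (axis == 0) return rows;
        if (axis == 1) return cols;
        throw new IllegalArgumentException("axis must be 0 or 1");
    }

    double get(int i, int j) {
        check(i, j);
        return data[i * cols + j];
    }

    void set(int i, int j, double value) {
        check(i, j);
        data[i * cols + j] = value;
    }

    private void check(int i, int j) {
        if (i < 0 || i >= rows || j < 0 || j >= cols) {
            throw new ArrayIndexOutOfBoundsException("(" + i + "," + j + ")");
        }
    }

    public static void main(String[] args) {
        DoubleArray2DSketch a = new DoubleArray2DSketch(2, 3);
        a.set(1, 2, 4.5);
        System.out.println(a.get(1, 2)); // prints 4.5
    }
}
```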

The array classes must fully support the concept of regular array sections. A regular array section corresponds to a subarray of another array, which we call the master array. Each element of an array section corresponds to exactly one element of its master array, and no two elements of a section correspond to the same master element; that is, the correspondence is one-to-one in the section-to-master direction. Referencing one element of an array section (for reading or writing) has the effect of referencing the corresponding element of the master array. Regular array sections have the same type as, and rank less than or equal to, their master arrays. Regular array sections behave exactly like regular arrays for all operations, including sectioning. (In fact, there are no separate classes for array sections.)
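One common way to obtain these semantics is to let a section share the master's storage and differ only in its offset, extents, and strides. The sketch below shows that idea for rank 2 under stated assumptions: SectionSketch and its section method (start, count, and step per axis, with positive steps only) are hypothetical and are not taken from the proposal or the strawman package.

```java
final class SectionSketch {
    private final double[] data;   // storage shared between master and sections
    private final int offset, rows, cols, rowStride, colStride;

    /** Creates a fresh master array with its own storage. */
    SectionSketch(int rows, int cols) {
        this(new double[rows * cols], 0, rows, cols, cols, 1);
    }

    private SectionSketch(double[] data, int offset, int rows, int cols,
                          int rowStride, int colStride) {
        this.data = data;
        this.offset = offset;
        this.rows = rows;
        this.cols = cols;
        this.rowStride = rowStride;
        this.colStride = colStride;
    }

    double get(int i, int j) { return data[index(i, j)]; }

    void set(int i, int j, double v) { data[index(i, j)] = v; }

    private int index(int i, int j) {
        if (i < 0 || i >= rows || j < 0 || j >= cols) {
            throw new ArrayIndexOutOfBoundsException("(" + i + "," + j + ")");
        }
        return offset + i * rowStride + j * colStride;
    }

    /** Regular section: nRows rows starting at r0 with step rStep, likewise for columns. */
    SectionSketch section(int r0, int nRows, int rStep, int c0, int nCols, int cStep) {
        checkRange(r0, nRows, rStep, rows);
        checkRange(c0, nCols, cStep, cols);
        // The result is the same class, so sections of sections work unchanged.
        return new SectionSketch(data, offset + r0 * rowStride + c0 * colStride,
                                 nRows, nCols, rowStride * rStep, colStride * cStep);
    }

    private static void checkRange(int start, int count, int step, int extent) {
        if (count < 0 || step <= 0 || start < 0
                || (count > 0 && start + (long) (count - 1) * step >= extent)) {
            throw new ArrayIndexOutOfBoundsException("section outside master array");
        }
    }

    public static void main(String[] args) {
        SectionSketch master = new SectionSketch(4, 4);
        SectionSketch s = master.section(1, 2, 2, 0, 2, 1); // rows 1 and 3, columns 0 and 1
        s.set(1, 1, 7.0);                                   // writes master element (3, 1)
        System.out.println(master.get(3, 1));               // prints 7.0
    }
}
```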

The array classes provide methods that implement Fortran-like functionality for arrays. In particular, the following operations must be provided:

Get and set the values of an array element, array regular section, or array irregular section.
Operations to query the rank and shape of an array.
Operations to reshape and transpose an array.
Elemental conversion functions (e.g., the equivalent of Fortran REAL and AIMAG, that convert complex arrays into double arrays).
Elemental transcendental functions.
Elemental relational functions (<, >, <=, >=, ==, !=).
Array reduction functions (sum, minval, etc.).
Array construction functions (merge, pack, spread, unpack).
Array manipulation functions (shift, spread).
Array location functions (maxloc, minloc).
Array scatter and gather operations.
Operations that correspond to array expressions (addition, multiplication, etc.).
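The flavor of a few of these operations can be sketched with plain Java arrays standing in for the Array classes, which are not defined here. The names sum, maxloc, and lessThan are borrowed from the list above; their signatures, and the choice to report the first maximum in row-major order, are assumptions made for this example.

```java
final class ArrayOpsSketch {
    /** Reduction: sum of all elements. */
    static double sum(double[][] a) {
        double s = 0.0;
        for (double[] row : a) {
            for (double v : row) {
                s += v;
            }
        }
        return s;
    }

    /** Location: indices of the first maximum in row-major order, or null if a is empty. */
    static int[] maxloc(double[][] a) {
        int[] loc = null;
        double best = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < a[i].length; j++) {
                if (loc == null || a[i][j] > best) {
                    best = a[i][j];
                    loc = new int[] {i, j};
                }
            }
        }
        return loc;
    }

    /** Elemental relational operation: result[i][j] = a[i][j] < b[i][j]. Assumes equal shapes. */
    static boolean[][] lessThan(double[][] a, double[][] b) {
        boolean[][] r = new boolean[a.length][];
        for (int i = 0; i < a.length; i++) {
            r[i] = new boolean[a[i].length];
            for (int j = 0; j < a[i].length; j++) {
                r[i][j] = a[i][j] < b[i][j];
            }
        }
        return r;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 5}, {5, 2}};
        System.out.println(sum(a));              // prints 13.0
        int[] loc = maxloc(a);
        System.out.println(loc[0] + "," + loc[1]); // prints 0,1
    }
}
```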

Finally, it must be possible to cast Java arrays into multidimensional Array objects of the same rank and vice-versa. As an example, it must be possible to convert back and forth between doubleArray2D and double[][]. The casting operators create new copies of the data to prevent exposing the internal structure of the Array classes.
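The copying behavior described here might look like the following sketch. CopyConversionSketch, fromJavaArray, and toJavaArray are hypothetical names; the point is only that both directions allocate fresh storage, so later changes to one side are not visible on the other.

```java
final class CopyConversionSketch {
    private final double[] data;  // private flat storage, never handed out directly
    private final int rows;
    private final int cols;

    private CopyConversionSketch(double[] data, int rows, int cols) {
        this.data = data;
        this.rows = rows;
        this.cols = cols;
    }

    /** Copies a rectangular double[][] into new internal storage. */
    static CopyConversionSketch fromJavaArray(double[][] src) {
        int rows = src.length;
        int cols = rows == 0 ? 0 : src[0].length;
        double[] data = new double[rows * cols];
        for (int i = 0; i < rows; i++) {
            if (src[i].length != cols) {
                throw new IllegalArgumentException("source array is not rectangular");
            }
            System.arraycopy(src[i], 0, data, i * cols, cols);
        }
        return new CopyConversionSketch(data, rows, cols);
    }

    /** Returns a newly allocated double[][] holding a copy of the elements. */
    double[][] toJavaArray() {
        double[][] out = new double[rows][cols];
        for (int i = 0; i < rows; i++) {
            System.arraycopy(data, i * cols, out[i], 0, cols);
        }
        return out;
    }

    double get(int i, int j) {
        if (i < 0 || i >= rows || j < 0 || j >= cols) {
            throw new ArrayIndexOutOfBoundsException("(" + i + "," + j + ")");
        }
        return data[i * cols + j];
    }

    public static void main(String[] args) {
        double[][] src = {{1, 2}, {3, 4}};
        CopyConversionSketch a = fromJavaArray(src);
        src[0][0] = 99;                     // does not affect the copy held by a
        System.out.println(a.get(0, 0));    // prints 1.0
    }
}
```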

Quite often, Java code using the Array package will have to interface with native code. The native code can access the elements of a multidimensional Array using the following mechanism, explained through an example. Let there be a native static void foo(doubleArray2D). The C code for foo needs to do the following:

    void foo(JNIEnv *env, jobject arr)
    {
        /*
         * Find the number of elements in the Array.
         */
        jsize len = GetMDArrayLength(env, arr);
        ...

        /*
         * Obtain a pointer to the elements of the Array.
         * The elements are in increasing element order.
         */
        jdouble *data = GetDoubleMDArrayElements(env, arr);

        /*
         * Operate on data as desired.
         */
        ...

        /*
         * Release the Array elements when done.
         */
        ReleaseDoubleMDArrayElements(env, arr, data);
        return;
    }

Note that the elements extracted by GetDoubleMDArrayElements are ordered according to the logical element order previously specified. The actions performed by GetDoubleMDArrayElements and ReleaseDoubleMDArrayElements are implementation dependent. In some cases it may be possible to just return a pointer to the (pinned) storage of the Array elements, if that storage is already in logical element order. In other situations it may be necessary to copy the elements to temporary storage. If the Array arr passed as a parameter is a section of another Array, then copying is in general the only alternative.
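The copying case can be pictured in Java terms. The sketch below assumes, purely for illustration, that a rank-2 section is described by an offset and two strides into shared flat storage; copyOut gathers its elements into a temporary buffer in logical element order, and copyBack writes them back, roughly what a get/release pair would need to do for a section. All names and the storage model are hypothetical.

```java
final class SectionCopySketch {
    /** Gathers the section's elements into a new buffer in logical (row-major) order. */
    static double[] copyOut(double[] storage, int offset, int rows, int cols,
                            int rowStride, int colStride) {
        double[] tmp = new double[rows * cols];
        int k = 0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                tmp[k++] = storage[offset + i * rowStride + j * colStride];
            }
        }
        return tmp;
    }

    /** Scatters the buffer back into the section's positions in the shared storage. */
    static void copyBack(double[] tmp, double[] storage, int offset, int rows, int cols,
                         int rowStride, int colStride) {
        int k = 0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                storage[offset + i * rowStride + j * colStride] = tmp[k++];
            }
        }
    }

    public static void main(String[] args) {
        double[] storage = new double[16];          // a 4x4 master array, row-major
        for (int n = 0; n < 16; n++) storage[n] = n;
        // Section made of rows 1 and 3, columns 0 and 1.
        double[] tmp = copyOut(storage, 4, 2, 2, 8, 1);
        System.out.println(java.util.Arrays.toString(tmp)); // prints [4.0, 5.0, 12.0, 13.0]
        tmp[3] = -1.0;
        copyBack(tmp, storage, 4, 2, 2, 8, 1);
        System.out.println(storage[13]);            // prints -1.0
    }
}
```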

The array classes can be implemented with no changes to the Java language or the JVM. However, it is essential that the get and set methods be implemented as efficiently as array indexing operations are in Fortran or C. Multidimensional arrays are extremely common in numerical computing, and hence we expect that efficient multidimensional array classes will be heavily used.

2.2 What is the target Java platform? (i.e., desktop, server, personal, embedded, card, etc.)

The Array package is targeted at both the desktop and server platforms.

2.3 What need of the Java community will be addressed by the proposed specification?

Multidimensional arrays are the most common data structures in scientific and engineering computing. The Java Array package provides the ability to represent multidimensional arrays in Java programs. It supports a set of array operations that lead to concise representation and efficient optimization of scientific and engineering codes.

2.4 Why isn't this need met by existing specifications?

Native Java arrays are strictly one-dimensional. Multidimensional arrays are simulated as "arrays of arrays". That means, for example, that each element of a double[][] array is a double[] array. Arrays of arrays are very general and therefore more difficult to optimize. For instance, for a double[][] a, rows a[i] and a[j] may have different lengths. Bounds checking optimization is an example of an optimization that can be better performed when arrays are known to be rectangular (as in a true multidimensional array). Furthermore, the "array of arrays" approach opens up more possibilities for aliasing. For instance, for a double[][] a and double[][] b, it is possible that a[i] and b[j] refer to the same row. In fact, it is possible that a[i] and a[j] (i != j) refer to the same row. Many advanced compiler optimizations rely on accurate aliasing disambiguation.
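Both effects are easy to reproduce with standard Java arrays. The short example below (the class and method names are only for illustration) builds a double[][] whose rows have different lengths and in which two rows, as well as a row of a second array, are the same double[] object.

```java
final class ArrayOfArraysSketch {
    /** Builds a ragged array in which rows 0 and 2 are the same object. */
    static double[][] buildAliased() {
        double[][] a = new double[3][];
        a[0] = new double[4];
        a[1] = new double[2];   // shorter than row 0: the array is not rectangular
        a[2] = a[0];            // rows 0 and 2 alias each other
        return a;
    }

    public static void main(String[] args) {
        double[][] a = buildAliased();
        double[][] b = { a[1] };          // b[0] aliases a[1] across two arrays

        a[2][1] = 5.0;                    // a write through row 2 ...
        System.out.println(a[0][1]);      // ... is visible through row 0: prints 5.0

        b[0][0] = 7.0;
        System.out.println(a[1][0]);      // prints 7.0

        System.out.println(a[0].length + " vs " + a[1].length); // prints 4 vs 2
    }
}
```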

The javax.vecmath package includes two classes that are relevant to this discussion: GVector and GMatrix. They implement one- and two-dimensional arrays of doubles, respectively. The purpose of these classes is similar in spirit to that of the Array package, but they offer much more restricted functionality: they support only one- and two-dimensional arrays of doubles, and they offer only very limited aggregate operations.

2.5 Please give a short description of the underlying technology or technologies:

Implementing multidimensional arrays as classes has been done before in the context of A++/P++ and POOMA. The mechanisms for mapping a multidimensional Cartesian space into a single-dimensional address space are well understood. The main challenge is performance.

A good implementation of the Array package will deliver high performance in the execution of aggregate methods, that is, methods that operate on many elements, such as adding two array sections or multiplying two matrices. The techniques for performing efficient aggregate operations on matrices, in particular for linear algebra operations, are well described in the literature.

To deliver good performance on user codes that use elementary operations on array elements, two compiler technologies are of utmost importance:

Bounds checking optimization through versioning: This technique creates safe regions of code where all array accesses are guaranteed to be valid. Aggressive optimizations that reorder code can be applied in these regions.
Semantic expansion: With the Array package, each elemental data access (set or get) operation is a method call. The method verifies that all the indices are valid (along each of the array axes) before proceeding to the actual data access. Exposing the semantics of these methods to the compiler is a necessary step for optimization. This is accomplished through the technique of semantic expansion.
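The two techniques can be pictured at the source level, although in practice a compiler would apply them internally. In the hand-written sketch below, checkedGet stands in for an Array get method whose semantics the compiler knows (semantic expansion), and sumRowVersioned shows the two loop versions that versioning would produce: a safe region with no per-element index tests, guarded by a single test before the loop, and a fallback that keeps the original checked accesses. All names and the flat storage layout are assumptions; the JVM's own checks on the underlying double[] still apply.

```java
final class VersioningSketch {
    /** Stand-in for an Array get method: validates both indices, then accesses storage. */
    static double checkedGet(double[] data, int rows, int cols, int i, int j) {
        if (i < 0 || i >= rows || j < 0 || j >= cols) {
            throw new ArrayIndexOutOfBoundsException("(" + i + "," + j + ")");
        }
        return data[i * cols + j];
    }

    /** Sums elements (i, jLo) .. (i, jHi - 1) of a rows x cols array in row-major storage. */
    static double sumRowVersioned(double[] data, int rows, int cols,
                                  int i, int jLo, int jHi) {
        double s = 0.0;
        if (i >= 0 && i < rows && jLo >= 0 && jHi <= cols) {
            // Safe version: one test up front proves every access in the loop valid,
            // so the loop body carries no index checks and can be reordered freely.
            int base = i * cols;
            for (int j = jLo; j < jHi; j++) {
                s += data[base + j];
            }
        } else {
            // Unsafe version: keep the original per-access checks so that an
            // exception is raised at exactly the same point as in the original code.
            for (int j = jLo; j < jHi; j++) {
                s += checkedGet(data, rows, cols, i, j);
            }
        }
        return s;
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 5, 6};                      // a 2x3 array
        System.out.println(sumRowVersioned(data, 2, 3, 1, 0, 3)); // prints 15.0
    }
}
```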

2.6 Is there a proposed package name for the API Specification? (i.e., javapi.something, org.something, com.something, etc.)

javax.math.array

2.7 Does the proposed specification have any dependencies on specific operating systems, CPUs, or I/O devices that you know of?

No. A strawman implementation of the Array package was written entirely in Java.

2.8 Are there any security issues that cannot be addressed by the current security model?

No.

2.9 Are there any internationalization or localization issues?

No.

2.10 Are there any existing specifications that might be rendered obsolete, deprecated, or in need of revision as a result of this work?

No other existing specifications are affected by this specification. The functionality of classes GVector and GMatrix of javax.vecmath will be largely superseded by the functionality of the equivalent classes (doubleArray1D and doubleArray2D, respectively) in the Array package. However, the interfaces are not compatible and, for the benefit of programs using the Java 3D API, both GVector and GMatrix should continue to be maintained.

There is a relationship between the specification of the Array package and of the Complex class (as being submitted by VNI). Complex arrays in the Array package must support the same elemental type as defined by the Complex class.

Section 3: Contributions

3.1 Please list any existing documents, specifications, or implementations that describe the technology. Please include links to the documents if they are publicly available.

A strawman for the Array package is available as package "com.ibm.math.array". It can be downloaded from "www.alphaWorks.ibm.com/tech/ninja".

A prototype research compiler that performs the optimizations of semantic expansion and bounds checking optimization has been implemented at IBM Research.

Two technical reports describing the design of the strawman Array package are available:

IBM Research Division RC21369. A Standard Java Array Package for Technical Computing. Jose Moreira, Sam Midkiff, Manish Gupta.
IBM Research Division RC21481. Java Programming for High Performance Numerical Computing. Jose Moreira, Sam Midkiff, Manish Gupta, Pedro Artigas, Marc Snir, Rick Lawrence.

Both these reports are available at "www.research.ibm.com/ninja".

3.2 Explanation of how these items might be used as a starting point for the work.

The strawman implementation of the Array package includes most of the functionality desired in the final product. The main features missing are reduction operations (sum and product of all elements of an array) and elementary math functions (sin, cos, exp, etc., applied to all elements of an array).

Section 4: Additional Information (Optional)

4.1 This section contains any additional information that the submitting Participant wishes to include in the JSR.

[PLEASE_FILL_IN]

[ This page was updated: 06-May-99 ]