HPCC and Java -- A Preliminary Report by The Parallel Compiler Runtime Consortium (PCRC)

Find this at: http://www.npac.syr.edu/users/gcf/hpjava.html

Latest Draft is: http://www.npac.syr.edu/users/gcf/hpjava3.html


General Approach:



Four areas are available in this document, as listed below:

Initial Table of Suggested Relevant Areas
General Comments are available
Current List of Section Numbers
Full Collection of Initial Text


Initial Table of Suggested Relevant Areas

These are plans made at the ARPA PI meeting on February 15, 1996.

1. What are Suitable Models for Parallel Computing and Java
speculations from Browne
comments from Indiana
2. Motivating Applications for HPCCJava
distributed simulation etc.
Syracuse and Maryland initial words
3. Lessons from and Relations to HPF/HPC++
Indiana
See also Maryland section 7.4.3
4. Language Enhancements(Annotations) and Classes to support HPCC and Java
Marina Chen, Jim Cowie initial words
Indiana Comment.
5. Architecture (Shared, Distributed Shared, and Distributed) Issues for HPCCJava
Rochester will supply initial words
Indiana Comment
6. Compilers and Java -- Both conventional and HPCC and on client(interpreter) and Server(javac)
Rice will give initial words
Indiana comment
7. High Level Runtime Issues
7.1 Dynamic Communication and Distributed Nameservers
Browne speculations
7.2 Parallel and Conventional CORBA Interface to Java
Gannon Initial words
7.3 Task Parallelism and Java for both Integration(Coarse grain) and fine grain examples
Fortran-M CC++, "channel" classes in Java (see Indiana remark)
Relation to MetaChaos and PCRC Interoperability
Maryland will give initial words -- see sections 7.4.2umd and 7.4.3umd
7.4 HPCC and Compute-Webs (Use of World Wide Web MetaComputer)
HPCC applications as "applets" -- Indiana comment
Maryland Initial words
Boston/CSC/Syracuse may add
7.5 HPCC Java and implications for security
Brokers as interfaces to Java native HPCC related classes
Gannon initial words: see the CORBA discussion in Section 7.2
8 Low Level Runtime Issues
8.1 The Current PCRC Runtime (Regular and Irregular) Implementation in Java
Full (portable) Java version
What is smallest set of PCRC routines that are needed
to be done in native mode (i.e. current C)
Syracuse will provide initial words
8.2 Distributed Memory Versions of Java Runtime and use of HPCC thread packages on SMP or distributed memory to support Java
Rochester and Gannon will get initial words here

General Comments



Current List of Section Numbers

1. What are Suitable Models for Parallel Computing and Java
2. Motivating Applications for HPCCJava
3. Lessons from and Relations to HPF/HPC++
See also Maryland section 7.4.3
4. Language Enhancements(Annotations) and Classes to support HPCC and Java
5. Architecture (Shared, Distributed Shared, and Distributed) Issues for HPCCJava
6. Compilers and Java -- Both conventional and HPCC and on client(interpreter) and Server(javac)
7. High Level Runtime Issues
7.1 Dynamic Communication and Distributed Nameservers
7.2 Parallel and Conventional CORBA Interface to Java
7.3 Task Parallelism and Java for both Integration(Coarse grain) and fine grain examples
Fortran-M CC++, "channel" classes in Java
Relation to MetaChaos and PCRC Interoperability
Maryland will give initial words -- see sections 7.4.2umd and 7.4.3umd
7.4 HPCC and Compute-Webs (Use of World Wide Web MetaComputer)
This is currently a collection of exemplar systems/ideas:
7.5 HPCC Java and implications for security
Brokers as interfaces to Java native HPCC related classes
Gannon initial words: see the CORBA discussion in Section 7.2

8 Low Level Runtime Issues

8.1 The Current PCRC Runtime (Regular and Irregular) Implementation in Java
8.2 Distributed Memory Versions of Java Runtime and use of HPCC thread packages on SMP or distributed memory to support Java


Full Collection of Initial Text

1. What are Suitable Models for Parallel Computing and Java


Texas Contribution (28 feb 96):

2. Motivating Applications for HPCCJava

Syracuse Contribution (23 feb 96):

2.0 Motivating Applications for HPJava: Meta-Challenges

2.1Syr) Introduction: First some preliminary remarks. We classify all (interesting) HPCC applications as belonging primarily to one of two domains: Grand Challenges or National Challenges -- the "HPCC-Challenges". Today's technology is blurring the distinction between parallel and distributed processing, e.g., the use of ATM-LANs by networks of workstations/PCs. The "traditional" high-performance applications require finely tuned and optimized implementations of runtime support (intrinsics and communications) on their host processors. The designer of a runtime system seeks to give the application programmer easier access to the high-performance potential of a given machine or architecture. Beyond speed, however, National Challenges can pose additional requirements such as trust or privacy, i.e., security, and here a runtime system designer must also consider metrics like "response time" or "transactions per second".

2.2Syr) Meta-Challenges: We define a new class of applications called "Meta-Challenges". These are loosely coupled sets of physically distributed instances of Grand or National Challenges. A DoD example of a Meta-Challenge is Distributed Interactive Simulation (DIS). DIS involves the modeling of forces (tanks, aircraft, satellites), environments, rules of engagement and tactics, etc. Simulation control and monitoring for red, blue, and white players are provided. A simulation may need to grow with the addition of "green", "yellow", or neutral forces, and may have to scale from peace-time/peace-keeping through low-intensity-conflict (LIC) to massive conflict. Chemical, biological, or nuclear (CBN) environments must also be considered, for example, modeling operations in a nuclear environment (OPINE).

2.3Syr) An Example Meta-Challenge: DIS: Java offers a means to enhance the existing DIS multi-user protocol. A new DIS protocol, Advanced Distributed Simulation (ADS), is an attempt to introduce scalability. ADS breaks apart the DIS into an active-reactive part and an interactive-communicating part. Java is consistent with such an evolutionary approach. Java applets could update models on-the-fly, as with dynamic weather, battle damage, or threat-responsive counter-countermeasure techniques. One of the greatest challenges to the DIS designer is finding ways to interconnect independently built simulations having disparate modeling granularity (i.e., the ALS protocol, Aggregate Level Simulation), and then to increase the number of models and improve fidelity with no greater demand on network bandwidth. Weapon system design has also been evolving from integrating interoperable subsystems to influence management among ubiquitous supersystems. (An example of the latter would be Information Warfare (IW). Another is counter-OSINT, where OSINT is open source intelligence.) BM/C3 is undergoing changes in definition: BM/C4 (computing), C3I with intelligence, and IS&R for surveillance and reconnaissance. Also, a fifth C (C5I) is sometimes added to emphasize the collaboration among the humans in any C3I system. (The relationship between an Air Traffic Controller, the ATC surveillance, tracking, processing and display system, and a pilot is another example.) From a user's (Commander's) perspective, Java is extremely attractive for simulating (or even building) these systems since its roots are in the distributed, interactive, object-oriented, collaborative user domain. It is rapidly becoming the language of choice for programming the user-interface of on-line commercial multimedia systems.

2.4Syr) Java for DIS: Java is intended to be machine independent and offers to challenge the need for hegemony among systems. For example, the ideal computing architecture for simulating military battle management, including command and control, is multiple instances of "loosely coupled sets of tightly coupled processors." Such an architecture always poses a severe configuration management problem: multiple hardware platforms, operating systems, and programming languages. Interoperability among systems of systems and the interfacing between "stovepipe" systems is a related challenge. Java could provide an approach for "wrapping" existing or planned systems with a functionality layer for interoperability while ameliorating the bandwidth problem by intelligently transmitting only those parts of a message (applet) that are immediately needed by an application (user).

Maryland Contribution(22 feb 96):
2.5umd) Remote sensing data fusion: There are many applications that are likely to require the generation of data products derived from multiple sensor databases. Such databases could include both static databases (such as databases that archive LANDSAT/AVHRR images) and databases that are periodically updated (like weather map servers). A client would direct (High Performance) Java programs to run on the database servers. These HPJ programs would make use of the servers' computational and data resources to carry out partial processing of data products. Portions of data processing that require the integration of data from several databases would be carried out on the client or via an HPJ program running on another computationally capable machine.

2.6umd) Interventional Epidemiology/ Medical crisis management: We are considering scenarios involving medical emergencies associated with armed combat, terrorist actions, inadvertent releases of toxic chemicals, or major industrial plant explosions. In such scenarios, the goal is to rapidly acquire and analyze medical data as the crisis unfolds in order to determine how best to allocate scarce medical resources. This scenario involves the discovery and integration of multiple sources of data that might be in different geographical locations. For instance, one may need to determine the best diagnosis and treatment for patients exposed to combinations of chemicals whose effects have not been well characterized. This class of scenarios is motivated by the recent ARPA-sponsored National Research Council workshop on crisis management.

A related class of scenarios involves discovering and exploiting clinical patterns in patient populations. These scenarios involve study of health records to determine risk factors associated with morbidity and mortality for specific subpopulations (e.g. various substrata of armed forces personnel serving in a particular region). In such scenarios, the goal is to provide high quality, lower cost care by exploring diverse clinical data sources for patterns and correlations and by ensuring that significant data for individual patients is not overlooked. For instance, a system of this type could rapidly track the spread of serious infectious diseases such as Hepatitis B or HIV. It could also be used to identify specific soldiers whose demographic profile or past medical history might place them at particularly high risk of contracting such a disease.

The systems software required for these medical scenarios should be very similar to the requirements motivated by the sensor data fusion scenario.

3. Lessons from and Relations to HPF/HPC++


See also Maryland section 7.4.3


Indiana Contributions 24 feb

4. Language Enhancements(Annotations) and Classes to support HPCC and Java

Cooperating Systems Contribution (29 feb 96)

Language extensions to Java can be broadly put into two categories, each with its own pros and cons.

Indiana Contribution (24 feb 96)


5. Architecture (Shared, Distributed Shared, and Distributed) Issues for HPCC Java


Rochester Contribution(28 Feb 96):

6. Compilers and Java -- Both conventional and HPCC and on client(interpreter) and Server(javac)


Rice Contribution(27 feb 96):

7. High Level Runtime Issues


7.1 Dynamic Communication and Distributed Nameservers


Texas Contribution (29 feb 96):


7.2 Parallel and Conventional CORBA Interface to Java

Indiana Contribution (24 feb 96):


7.3 Task Parallelism and Java for both Integration(Coarse grain) and fine grain examples


see sections 7.4.2umd and 7.4.3umd


Indiana Contribution (24 feb 96):

7.3.1) Limited Synchronization Capabilities of current Java: Following the exact semantics of Fortran-M channels might be a challenge in Java because of the limited synchronization features of Java.
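As a point of reference for this remark, the following is a minimal sketch of a "channel" class built only from Java's monitor primitives (synchronized, wait, notifyAll). It assumes a channel is a FIFO queue with a non-blocking send and a blocking receive; the names Channel, send and receive are hypothetical and are not taken from Fortran-M or from any existing Java library. It does not enforce the single-writer, single-reader discipline, which is one place where exact Fortran-M semantics would need more work.

```java
import java.util.LinkedList;

// Hypothetical sketch of a channel class; not an existing API.
// Assumption: FIFO delivery, non-blocking send, blocking receive.
class Channel<T> {
    private final LinkedList<T> queue = new LinkedList<T>();

    // Enqueue a message and wake any thread blocked in receive().
    public synchronized void send(T message) {
        queue.addLast(message);
        notifyAll();
    }

    // Block until a message is available, then return the oldest one.
    public synchronized T receive() throws InterruptedException {
        while (queue.isEmpty()) {
            wait(); // releases the monitor while waiting
        }
        return queue.removeFirst();
    }
}

class ChannelDemo {
    public static void main(String[] args) throws InterruptedException {
        final Channel<Integer> channel = new Channel<Integer>();

        // Producer thread sends three messages over the channel.
        Thread producer = new Thread(new Runnable() {
            public void run() {
                for (int i = 0; i < 3; i++) {
                    channel.send(Integer.valueOf(i));
                }
            }
        });
        producer.start();

        // The main thread acts as the single consumer.
        for (int i = 0; i < 3; i++) {
            System.out.println("received " + channel.receive());
        }
        producer.join();
    }
}
```

The while loop around wait() re-checks the queue after every wake-up, so the sketch stays correct even if several receivers are notified at once.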

7.4 HPCC and Compute-Webs (Use of World Wide Web MetaComputer)

There are several important possibilities here -- exemplars follow:


Maryland Contribution(22 feb 96):
7.4.1umd)Mobile Programs: The intent is to provide a single unified framework for remote access and remote execution in the form of itinerant programs. Itinerant programs can execute on and move between the user's machine, the servers providing the datasets to be processed or third-party compute servers available on the network. The motion is not just client-server; it can also be between servers. Programs move either because the current platform does not have adequate resources or because the cost of computation on the current platform or at the current location in the network is too high. Various parts of a single program can be executed at different locations, depending upon cost and availability of resources as well as user direction. The architecture would also provide support for plaza servers. Plaza servers provide facilities for (1) execution of itinerant programs, (2) storage of intermediate data, (3) monitoring cost, usage and availability of local and remote resources on behalf of itinerant programs and alerting them if these quantities cross user-specified bounds, (4) routing messages to and from itinerant programs, and (5) value-added access to other servers available on the network. Examples of value-added functionality include conversions to and from standard formats and operations on data in its native format. In addition to allowing cost and performance optimization, itinerant programs and plaza servers also provide a flexible framework for extending the interface of servers, in particular legacy servers with large volumes of data as well as servers that are not under the control of the user.
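The placement decision described above (move when resources are inadequate or when cost is too high) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the Site and Placement classes, the single "free memory" resource measure, and the single cost figure are hypothetical and do not describe the Maryland design.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical description of one place an itinerant program could run.
class Site {
    final String name;
    final double costPerUnit;  // assumed cost of computing at this site
    final int freeMemoryMb;    // assumed measure of available resources

    Site(String name, double costPerUnit, int freeMemoryMb) {
        this.name = name;
        this.costPerUnit = costPerUnit;
        this.freeMemoryMb = freeMemoryMb;
    }
}

class Placement {
    // Return the cheapest site with adequate resources, or null if none qualifies.
    static Site choose(List<Site> sites, int requiredMemoryMb) {
        Site best = null;
        for (Site s : sites) {
            if (s.freeMemoryMb < requiredMemoryMb) {
                continue; // inadequate resources: not a candidate
            }
            if (best == null || s.costPerUnit < best.costPerUnit) {
                best = s;
            }
        }
        return best;
    }
}

class PlacementDemo {
    public static void main(String[] args) {
        List<Site> sites = new ArrayList<Site>();
        sites.add(new Site("user-machine", 0.0, 64));
        sites.add(new Site("data-server", 2.0, 512));
        sites.add(new Site("compute-server", 1.0, 1024));

        Site chosen = Placement.choose(sites, 256);
        System.out.println(chosen == null ? "no site available" : chosen.name);
    }
}
```

A plaza server, as described, could supply the cost and availability figures and re-run such a decision when a user-specified bound is crossed; that monitoring part is not shown.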

7.4.2umd) Coupling sequential Java Programs: MPI should be bound to Java so that Java programs can communicate by message passing. We believe that applications will require an ability to process periodically generated data. Programs to carry this out could be written in MPI; however, a higher level library might prove to be useful.

Consider long-running itinerant programs that process periodically generated data; each program processes sequences of data from multiple sources with possibly different periodicity. An itinerant program can either visit each of the data sources in order or it can install a surrogate at each data source, which is activated every time the dataset is updated. A surrogate acts as a filter for the sequence of data generated by the data source and communicates appropriate updates to the parent itinerant program (possibly after preliminary processing). For complex processing on a number of datasets, a hierarchy of surrogates and result combining programs can be created. The result of such itinerant programs can either be a single accumulated result or a sequence of results. What we have in mind is a scheme that is an extension of our existing scheme for mapping and synchronizing multiple update sequences, which has been used in our program coupling effort. This scheme has been used to dynamically link multiple physical simulations being performed concurrently.
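The surrogate idea in this paragraph can be sketched as follows. This is an illustration under stated assumptions, not the Maryland scheme: the UpdateSink interface, the Surrogate and ItinerantParent classes, and the choice of "maximum value above a threshold" as the filter are all hypothetical, and the network transport between a surrogate and its parent is replaced by a direct method call.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface through which a surrogate reports to its parent program.
interface UpdateSink {
    void accept(String sourceName, double value);
}

// Hypothetical surrogate installed at one data source.
// The data source is assumed to call onUpdate() every time its dataset changes.
class Surrogate {
    private final String sourceName;
    private final double threshold;
    private final UpdateSink parent;

    Surrogate(String sourceName, double threshold, UpdateSink parent) {
        this.sourceName = sourceName;
        this.threshold = threshold;
        this.parent = parent;
    }

    void onUpdate(double[] dataset) {
        // Preliminary processing: reduce the dataset to its maximum value.
        double max = Double.NEGATIVE_INFINITY;
        for (double v : dataset) {
            max = Math.max(max, v);
        }
        // Filtering: forward only updates that reach the threshold.
        if (max >= threshold) {
            parent.accept(sourceName, max);
        }
    }
}

// Parent itinerant program that accumulates a sequence of results.
class ItinerantParent implements UpdateSink {
    final List<String> results = new ArrayList<String>();

    public void accept(String sourceName, double value) {
        results.add(sourceName + "=" + value);
    }
}

class SurrogateDemo {
    public static void main(String[] args) {
        ItinerantParent parent = new ItinerantParent();
        Surrogate weather = new Surrogate("weather", 10.0, parent);
        Surrogate imagery = new Surrogate("imagery", 0.5, parent);

        // Sources update with different periodicity; each update activates its surrogate.
        weather.onUpdate(new double[] {1.0, 2.0});   // filtered out
        imagery.onUpdate(new double[] {0.7, 0.2});   // forwarded
        weather.onUpdate(new double[] {5.0, 12.5});  // forwarded

        System.out.println(parent.results);
    }
}
```

A hierarchy of surrogates could be modeled by letting an intermediate result-combining object implement UpdateSink and forward its own summaries upward.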

7.4.3umd) Coupling HPJ programs with one another and with other data parallel programs (e.g. MPI, HPF, HPC++): We consider the problem of efficiently coupling multiple data-parallel programs at runtime. We propose an approach that establishes mappings between data structures in different data-parallel programs and implements a user-specified consistency model. Mappings are established at runtime and can be added and deleted while the programs being coupled are in execution. Mappings, or the identity of the processors involved, do not have to be known at compile-time or even link-time. Programs can be made to interact with different granularities of interaction without requiring any re-coding. A priori knowledge of consistency requirements allows buffering of data as well as concurrent execution of the coupled applications. Efficient data movement is achieved by pre-computing an optimized schedule. (This is already PCRC work and will appear in a paper at the ICS conference in May.)
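To make the idea of a runtime mapping with a pre-computed schedule concrete, here is a minimal single-process sketch. It is not the PCRC or Maryland implementation. It assumes one-dimensional arrays with a simple block distribution, simulates each "processor" as a row of a Java array, and replaces message passing with System.arraycopy; the class names BlockArray, Transfer and CouplingSchedule are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical block-distributed array: n elements over p simulated processors.
class BlockArray {
    final int blockSize;
    final double[][] local; // local[q] is the block owned by processor q

    BlockArray(int n, int p) {
        this.blockSize = (n + p - 1) / p; // ceiling division
        this.local = new double[p][blockSize];
    }

    int owner(int globalIndex)  { return globalIndex / blockSize; }
    int offset(int globalIndex) { return globalIndex % blockSize; }
}

// One contiguous copy between a source processor and a destination processor.
class Transfer {
    final int srcProc, srcOffset, dstProc, dstOffset, length;

    Transfer(int srcProc, int srcOffset, int dstProc, int dstOffset, int length) {
        this.srcProc = srcProc;
        this.srcOffset = srcOffset;
        this.dstProc = dstProc;
        this.dstOffset = dstOffset;
        this.length = length;
    }
}

class CouplingSchedule {
    final List<Transfer> transfers = new ArrayList<Transfer>();

    // Established at runtime: map src[srcStart, srcStart+count) onto dst[dstStart, dstStart+count).
    // The caller is assumed to pass ranges that lie inside both arrays.
    static CouplingSchedule build(BlockArray src, int srcStart,
                                  BlockArray dst, int dstStart, int count) {
        CouplingSchedule schedule = new CouplingSchedule();
        int done = 0;
        while (done < count) {
            int gs = srcStart + done;
            int gd = dstStart + done;
            // Longest run that stays inside one source block and one destination block.
            int run = Math.min(count - done,
                    Math.min(src.blockSize - src.offset(gs), dst.blockSize - dst.offset(gd)));
            schedule.transfers.add(new Transfer(
                    src.owner(gs), src.offset(gs), dst.owner(gd), dst.offset(gd), run));
            done += run;
        }
        return schedule;
    }

    // Reused each time the mapped data must be made consistent.
    void execute(BlockArray src, BlockArray dst) {
        for (Transfer t : transfers) {
            System.arraycopy(src.local[t.srcProc], t.srcOffset,
                             dst.local[t.dstProc], t.dstOffset, t.length);
        }
    }
}

class CouplingDemo {
    public static void main(String[] args) {
        BlockArray src = new BlockArray(8, 2); // program A: 2 processors
        BlockArray dst = new BlockArray(8, 4); // program B: 4 processors
        for (int g = 0; g < 8; g++) {
            src.local[src.owner(g)][src.offset(g)] = g;
        }
        CouplingSchedule schedule = CouplingSchedule.build(src, 0, dst, 0, 8);
        schedule.execute(src, dst);
        System.out.println("transfers in schedule: " + schedule.transfers.size());
    }
}
```

The schedule is built once and then executed repeatedly, which is the point of pre-computing it; in a distributed setting each Transfer would correspond to one message between the two processors involved.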

Syracuse Contribution(11 feb 96):

7.4.4syr) WebWork: Integrated Programming Environment Tools for National and Grand Challenges is our joint work with Boston University and Cooperating Systems; the paper was written before the implications of Java were as clear as they are now.

7.4.5syr) Computational Web Multiserver - a proposal for Web based HPCC that updates some of the ideas in WebWork


Indiana Comment (24 feb 96):

7.5 HPCC Java and implications for security

see the CORBA discussion in Section 7.2


8 Low Level Runtime Issues


8.1 The Current PCRC Runtime (Regular and Irregular) Implementation in Java


Syracuse Contribution(23 Feb 96):
8.1.1syr) Introduction: We consider using Java to also provide low-level runtime support for the Meta-Challenges and Distributed Interactive Simulation (DIS) discussed above, as well as for the traditional Grand and National Challenges, by coding the high-performance runtime system in Java (i.e., traditional PCRC-like, including Parti/Chaos). There are two basic sets of issues that a runtime system designer must consider for traditional HPC challenges:

8.1.2syr) Features for PCRC Runtime in Java: The Syracuse University Fortran 90D runtime consists of about 300 routines/functions. We expect that number to fall below 150 in the newer PCRC implementation for handling regular computations and communications. We add to this an estimated 60 functions from the Maryland CHAOS runtime for handling the irregular case. At present our Fortran 90D runtime (including its use of Parti/Chaos) requires only 23 different MPI functions. The same will hold for the newer PCRC version. We claim the following:

8.1.3syr) Issues for PCRC Runtime in Java: As we re-implement the PCRC runtime system for regular and irregular distributed arrays, we must address the following design issues:


8.2 Distributed Memory Versions of Java Runtime and use of HPCC thread packages on SMP or distributed memory to support Java

Rochester Contribution (28 feb 96):

Indiana Contribution (24 feb 96):