HPCC and Java -- A Report by The Parallel Compiler Runtime Consortium (PCRC)


Proposed Writing Responsibilities

Northeast Parallel Architectures Center at Syracuse University:
1.1, 1.2, 2.1, 2.2, 2.3, 2.7, 6.3, 8.2
Cooperating Systems:
4.1, 4.2, 4.3
Indiana University:
1.4, 3.3, 6.2, 7.2, 7.4, 8.3
Rice University:
5.1, 5.2(shared), 5.3, 5.4
University of Maryland:
2.4, 2.5, 3.4, 8.1
University of Rochester:
2.6, 5.2(shared), 6.4, 7.1, 8.4
University of Texas:
1.3, 3.1, 3.2, 6.1, 7.3

General Approach:



Some comments from Geoffrey Fox:


Proposed Outline of Final Report

Note: section numbers such as 2.1syr refer to text in the original document and are hyperlinked to it

1: Introduction
1.1 What is Java and What is HPCC and why they need each other [SYRACUSE]
1.2 The HPCC Application Motivation for Java -- Meta-Challenges [SYRACUSE]
[original text: 2.1syr) Introduction]
First, some preliminary remarks. We classify all (interesting) HPCC applications as belonging primarily to one of two domains: Grand Challenges or National Challenges -- the "HPCC-Challenges". Recognize that today's technology is blurring the distinctions between parallel and distributed processing, e.g., the use of ATM-LANs by networks of workstations/PCs. The "traditional" high-performance applications require finely tuned and optimized implementations of runtime support (intrinsics and communications) on their host processors. The current designer of runtime systems seeks to provide the application programmer with easier access to the high-performance potential of a given machine(s)/architecture. Beyond speed, however, National Challenges can pose additional requirements such as trust or privacy, i.e., security, and here a runtime system designer must also consider metrics like "response time" or "transactions per second".
1.3 The Nature of Java Applications and their need for HPCC [TEXAS]
[original text:1.1tex) The Nature of Java Applications]
[original text:,1.7ind) Current Strengths and Weaknesses of Java]
[also cf:sec 2.7 below]
1.4 Lessons from and Relation to HPF HPC++ Fortran-M and other HPCC Software Systems [INDIANA]
[original text:3.1ind) Comparison with HPF]
[original text:3.2ind) Comparison with HPC++]
[original text:3.3ind) Operator Overloading]
[original text:4.4csc) Comparison with C++]
2: Further Details of Motivating Applications
[If gets too long, details can go into appendices]
2.1 General Discussion of Meta-Challenges [SYRACUSE]
[original text:2.2syr) Meta-Challenges]
We define a new class of applications called "Meta-Challenges". These are loosely coupled sets of physically distributed instances of Grand or National Challenges. A DoD example of a Meta-Challenge is Distributed Interactive Simulation (DIS). DIS involves the modeling of forces (tanks, aircraft, satellites), environments, rules of engagement and tactics, and so on. Simulation control and monitoring for red, blue, and white players are provided. A simulation may need to grow with the addition of "green", "yellow", or neutral forces, and to scale from peacetime/peacekeeping through low-intensity conflict (LIC) to massive conflict. It must also be able to consider chemical, biological, or nuclear (CBN) environments, for example, modeling operations in a nuclear environment (OPINE).
2.2 An Example -- Distributed Interactive Simulation [SYRACUSE]
[original text:2.3syr) An Example Meta-Challenge: DIS]
Java offers a means to enhance the existing DIS multi-user protocol. A new DIS protocol, Advanced Distributed Simulation (ADS), is an attempt to introduce scalability. ADS breaks DIS apart into an active-reactive part and an interactive-communicating part. Java is consistent with such an evolutionary approach. Java applets could update models on the fly, as with dynamic weather, battle damage, or threat-responsive counter-countermeasure techniques. One of the greatest challenges to the DIS designer is finding ways to interconnect independently built simulations of disparate modeling granularity (e.g., the Aggregate Level Simulation (ALS) protocol), and then to increase the number of models and improve fidelity with no greater demand on network bandwidth. Weapon system design has also been evolving from integrating interoperable subsystems to influence management among ubiquitous supersystems. (An example of the latter is Information Warfare (IW); another is counter-OSINT -- open-source intelligence.) BM/C3 is undergoing changes in definition: BM/C4 (computing), C3I with intelligence, and IS&R for surveillance and reconnaissance. A fifth C (C5I) is sometimes added to emphasize the collaboration among the humans of any C3I system. (The relationship between an Air Traffic Controller, the ATC surveillance, tracking, processing and display system, and a pilot is another example.) From a user's (Commander's) perspective, Java is extremely attractive for simulating (or even building) these systems since its roots are in the distributed, interactive, object-oriented, collaborative user domain. It is rapidly becoming the language of choice for programming the user interface of on-line commercial multimedia systems.
[original text:2.4syr) Java for DIS]
Java is intended to be machine independent and offers to challenge the need for hegemony among systems. For example, the ideal computing architecture for simulating military battle management, including command and control, is multiple instances of "loosely coupled sets of tightly coupled processors." Such an architecture always poses a severe configuration management problem: multiple hardware platforms, operating systems, and programming languages. Interoperability among systems of systems and the interfacing between "stovepipe" systems is a related challenge. Java could provide an approach for "wrapping" existing or planned systems with a functionality layer for interoperability while ameliorating the bandwidth problem by intelligently transmitting only those parts of a message (applet) that are immediately needed by an application (user).
2.3 An Example -- Manufacturing [SYRACUSE]
[no text]
2.4 An Example -- Remote Sensing and Data Fusion [UMD]
[original text:2.5umd) Remote sensing data fusion]
There are many applications that are likely to require the generation of data products derived from multiple sensor databases. Such databases could include both static databases (such as databases that archive LANDSAT/AVHRR images) and databases that are periodically updated (like weather map servers). A client would direct (High Performance) Java programs to run on the database servers. These HPJ programs would make use of the servers' computational and data resources to carry out partial processing of data products. Portions of data processing that require the integration of data from several databases would be carried out on the client or via an HPJ program running on another computationally capable machine.
2.5 An Example -- Medical Interventional Informatics [UMD]
[original text:2.6umd) Interventional Epidemiology/ Medical crisis management]

We are considering scenarios involving medical emergencies associated with armed combat, terrorist actions, inadvertent releases of toxic chemicals, or major industrial plant explosions. In such scenarios, the goal is to rapidly acquire and analyze medical data as the crisis unfolds in order to determine how best to allocate scarce medical resources. This involves the discovery and integration of multiple sources of data that might be in different geographical locations. For instance, one may need to determine the best diagnosis and treatment for patients exposed to combinations of chemicals whose effects have not been well characterized. This class of scenarios is motivated by the recent ARPA-sponsored National Research Council workshop on crisis management.

This involves discovering and exploiting clinical patterns in patient populations. These scenarios involve the study of health records to determine risk factors associated with morbidity and mortality for specific subpopulations (e.g., various substrata of armed forces personnel serving in a particular region). In such scenarios, the goal is to provide high-quality, lower-cost care by exploring diverse clinical data sources for patterns and correlations and by ensuring that significant data for individual patients is not overlooked. For instance, a system of this type could rapidly track the spread of serious infectious diseases such as Hepatitis B or HIV. It could also be used to identify specific soldiers whose demographic profile or past medical history might place them at particularly high risk of contracting the disease.

The systems software required for these medical scenarios should be very similar to that motivated by the sensor data fusion scenario.

2.6 An Example -- Network-based data mining [ROCHESTER]
[new text from ROCHESTER:]

Having collected large volumes of routine data, many organizations are increasingly turning to the extraction of useful information from such databases. Such high-level inference processes may reveal patterns in large databases that are of particular interest to defense analysis.

Data mining is an emerging research area whose goal is to extract significant patterns or interesting rules from such large databases. Data mining is in fact a broad area which combines research in machine learning, statistics, and databases. It can be broadly classified into three main categories: Classification -- finding rules that partition the database into disjoint classes; Sequences -- extracting commonly occurring sequences in temporal data; and Associations -- finding the set of most commonly occurring groupings of items.

At Rochester, we have worked on the problem of mining association rules in parallel over basket data on both distributed shared-memory multiprocessors and a network of machines. Basket data usually consists of a record per customer with a transaction date, along with items bought by the customer. Many real-world applications can be formulated in such a manner. Our preliminary experimental results on a SGI Power Challenge shared-memory multiprocessor machine and a DEC Memory Channel cluster are very encouraging.

The next step is to consider scenarios where the database server is accessible via the Internet. The machine-independent nature of Java makes it ideally suited to writing both the server and the client code. The server could be a collection of machines: a single tightly coupled shared- or distributed-memory multiprocessor, a network of workstations, or any combination of these with heterogeneous processing elements and a heterogeneous interconnection network. The client side could be a heterogeneous collection of machines as well. We need compiler and run-time support to manage both the client and the server. The compiler should be able to insert code to exploit the local network configurations on both ends, while the run-time system monitors dynamic behavior and acts on the transient information. We have implemented a client-server data mining algorithm in Java; the preliminary results are discussed in more detail in Section 8.5.

Data mining algorithms tend to have program behaviors quite different from traditional Grand Challenge problems in terms of both data access patterns and control flow. Unlike the simple array structures used in scientific code, the data structures used in mining algorithms include hash trees, hash tables, and linked lists. Consider mining for associations: the first step consists of finding all the sets of items (called itemsets) that occur in the database with a certain user-specified frequency (called minimum support). Such itemsets are called large itemsets. An itemset of k items is called a k-itemset. The general structure of algorithms for mining associations is as follows. The initial pass over the database counts the support for all 1-itemsets. The large 1-itemsets are used to generate candidate 2-itemsets. The database is scanned again to obtain occurrence counts for the candidates, and the large 2-itemsets are selected for the next pass. This iterative process is repeated for k = 3, 4, ... until there are no more large k-itemsets to be found; a sketch of this loop appears below. Effective parallelization and networked computing require new research in algorithms, compilers, and runtime systems.
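
As an illustration of the loop just described, the following is a minimal Java sketch of the level-wise mining process. The Database and Itemset classes and their methods are assumptions made for this example, not part of any existing package.

   import java.util.Vector;

   public class AssociationSketch {
      // Returns all large itemsets, found level by level (Apriori-style).
      public static Vector mineLargeItemsets(Database db, double minSupport) {
         Vector allLarge = new Vector();
         // pass 1: count support for all 1-itemsets and keep the large ones
         Vector large = db.largeOneItemsets(minSupport);
         for (int k = 2; !large.isEmpty(); k++) {
            allLarge.addAll(large);
            // generate candidate k-itemsets from the large (k-1)-itemsets
            Vector candidates = Itemset.generateCandidates(large);
            // one scan of the database to count candidate occurrences
            db.countSupport(candidates);
            // keep only the candidates that meet the minimum support
            large = Itemset.selectLarge(candidates, minSupport);
         }
         return allLarge;
      }
   }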

2.7 Java based Web Server and Client Agents [SYRACUSE]
[An example of a conventional Java application/applet needing HPCC -- filtering VRML for the Visible Human database is one possibility -- cf: sec. 1.3]
3: Models for the HPJava Programming Environment
3.1 General Philosophy and Discussion of Possibilities [TEXAS]
[original text:1.2tex) Consistency with Current Language Philosophy and Feature Set]
[original text:1.3tex) Inferences]
[original text:4.2ind) Implications of Java Philosophy]
[cf: sec 1.3 above]
3.2 Object and Applet Parallelism [TEXAS]
[original text:1.2tex) Consistency with Current Language Philosophy and Feature set-- cf: sec 3.1]
[original text:1.3tex) Inferences-- cf: sec 3.1]
[original text:7.4.6ind) Applet view of HPCC Applications]
3.3 Task and Control Parallelism [INDIANA]
[original text:7.3.1ind) Limited Synchronization Capabilities of current Java]

Following the exact semantics of Fortran-M channels might be a challenge in Java because of its limited synchronization features.

[My impression is that the synchronisation facilities in Java are reasonably complete--once you track them down. DBC]
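
As the comment above suggests, channel-style communication can already be expressed with Java's built-in monitors. The following is a minimal sketch (not a proposal) of a one-slot blocking channel in the spirit of Fortran-M, written only with synchronized methods and wait/notifyAll.

   // One-slot blocking channel: send() waits until the previous message has
   // been consumed; receive() waits until a message is available.
   public class Channel {
      private Object slot;            // the single in-flight message
      private boolean full = false;

      public synchronized void send(Object msg) throws InterruptedException {
         while (full) wait();         // wait for the receiver to empty the slot
         slot = msg;
         full = true;
         notifyAll();                 // wake any waiting receiver
      }

      public synchronized Object receive() throws InterruptedException {
         while (!full) wait();        // wait for a message to arrive
         full = false;
         notifyAll();                 // wake any waiting sender
         return slot;
      }
   }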

3.4 Data Parallelism [UMD]
[original text:1.6tex) What should NOT be done]
[original text:1.6ind) Is Data Parallelism relevant for Java]
4: Possible Java Language Enhancements and Support Classes for HPCC
4.1 Current Java Features for Concurrent and Distributed Computing [CSC]
[original text:1.4tex) Current Features for Specification of Parallel Execution Behaviour]
[original text:1.4ind) Current Features of Java]
[original text:1.5ind) Extended Synchronization in Java]
[original text:1.7ind) Current Strengths and Weaknesses of Java -- cf: sec 1.3]
4.2 Language Enhancements [CSC]
[original text:1.5tex) A Proposal for Additional Specification of Parallelism in Java]
[original text:1.5ind) Extended Synchronization in Java -- cf: sec 4.1]
[original text:4.2csc) Extensions to the Java language ]
[original text:4.3csc) Implications of Immature Language]
[original text:4.2ind) Implications of Java Philosophy -- cf: sec 3.1]
4.3 Class and Interface Support [CSC]
[original text:4.1csc) Extensions via class libraries]
[original text:4.5ind) High Performance Classes]
5. Compilation/Interpretation Issues
5.1. Compiled vs. interpreted [RICE]
[original text:6.1rice) Compiled vs. Interpreted]
[original text:6.1ind) Performance Issues]
[original text:5.1roc) Generate native code for shared-memory architectures]
5.2. Compilation issues [RICE and ROCHESTER]
[original text:6.2rice) Compilation issues]
[original text:5.2roc) Java vs Bytecode -- omitted]
[new text from ROCHESTER:]

The current Java compiler takes the hybrid approach of compilation and interpretation. At Rochester, we would like to exploit more options to generate efficient code for High Performance Java applications. We consider three scenarios:

(i) Compiling Java to bytecode.
In this approach HPCC Java could be interpreted. The compiler would generate Java VM code suitable for HPCC problems. In addition to HPCC-specific optimizations, this may require generating (or applying user-provided) annotations. There is no consensus yet on whether HPCC-specific annotations are desirable.
(ii) Compiling Java to native code.
Here, we would treat Java programs in the same way as we treat programs in other languages (C++, Fortran). Due to the dynamic nature of Java, different approaches are possible when all source files are available and when we compile one class at a time. In the first case we may be able to eliminate some overheads (e.g., type checks, array range checks, virtual method invocations). In the latter case, only local optimizations would be possible.
This scenario gives us the most flexibility, but it is not acceptable in some circumstances. In "network computing" we want applications to be distributed in bytecode format; in that case we have to use approach (i) or (iii), or a combination of both. An orthogonal issue is how to handle Java class libraries that are implemented implicitly in the interpreter (if any); these will probably have to be implemented in some HLL or directly in native code. This is also an issue in scenario (iii) below.
(iii) Compiling bytecode to native code.
This seems to promise a lot of power. We expect that interpreting the bytecode (even with quick variants of VM instructions or localized just-in-time compilation) will be inherently slower than running native code; our early experiments with data mining in Java confirm this observation (cf. Sec. 8.5). The benefits are especially great if we can compile a large part of the application (ideally, the whole application) at once, since we can then apply optimizations such as inlining or compile-time type checking which would otherwise be available only in scenario (ii).
A variant of this technique would transform bytecode to bytecode, adding quick native methods for common operations and native implementations of common data structures (cf. Sec. 7.1). Such data structures may include arrays or lists. Each would be implemented with an internal representation that native methods for the most common operations can use directly, and an interface would be provided to access the data structure from Java (see the sketch below).
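
To make this concrete, the sketch below shows what a Java-visible front for a natively implemented data structure might look like. The class, its native methods, and the library name "hpcarray" are assumptions made purely for illustration.

   // Hypothetical Java interface to an integer array whose storage and most
   // common operations live in native code.
   public class NativeIntArray {
      static { System.loadLibrary("hpcarray"); }   // assumed native library

      private long handle;                         // opaque pointer held by native code

      public NativeIntArray(int length) { handle = alloc(length); }

      private native long alloc(int length);       // allocate native storage
      public  native int  get(int index);          // element access in native code
      public  native void set(int index, int value);
      public  native long sum();                   // a common operation done natively
      public  native void free();                  // release native storage
   }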

At Rochester, we are implementing a compiler infrastructure with multiple front- and back-ends which treats all of the scenarios mentioned above uniformly (cf. Section 8.4). We expect to include traditional optimizations such as cache optimization, register allocation, etc.

5.3. Interaction with runtime [RICE]
[original text:6.3rice) Interaction with runtime]
5.4. Tools interface [RICE]
[original text:6.4rice) Tools Interface]
6. Runtime Support
6.1. Dynamic communication and distributed nameservers [TEXAS]
[original text:7.1.1tex) Communication Requirements of Java Applications]
[original text:7.1.2tex) The Current Model of Communication]
[original text:7.1.3tex) Desirable Properties of Model of Communication for Java]
[original text:7.1.4tex) Proposed Model of Communication]
[original text:7.1.5tex) Integration/ Implementation Concepts]
[original text:7.1.6tex) Comments]
6.2. Parallel and conventional CORBA interface to Java [INDIANA]
[original text:7.2.1ind) CORBA Model]
[original text:7.2.2ind) Current Activities]
[original text:7.2.3ind) HPJava Issues]
6.3. Low level runtime issues -- PCRC Distributed Memory [SYRACUSE with Ranka at FLORIDA]
[original text:8.1.1syr) Introduction]

We consider using Java also to provide low-level runtime support for the Meta-Challenges and Distributed Interactive Simulations (DIS) discussed above, as well as for the traditional Grand and National Challenges: coding the high-performance runtime system in Java (i.e., traditional PCRC-like support, including Parti/Chaos). There are two basic sets of issues that a runtime system designer must consider for traditional HPC challenges:

[original text:8.1.2syr) Features for PCRC Runtime in Java]

The Syracuse University Fortran 90D runtime consists of about 300 routines/functions. We expect that number to drop below 150 in the newer PCRC implementation for handling regular computations and communications. To this we add an estimated 60 functions from the Maryland CHAOS runtime for handling the irregular case. At present our Fortran 90D runtime (including its use of Parti/Chaos) requires only 23 different MPI functions, and the same will hold for the newer PCRC version. We claim the following:

[original text:8.1.3syr) Issues for PCRC Runtime in Java]

As we re-implement the PCRC runtime system for regular and irregular distributed arrays, we must address the following design issues:

[original text:8.2.4ind) Java Distributed Memory Model]
6.4 Low level runtime issues -- Threads [ROCHESTER]

[original text:8.2.1roc) How to implement the thread packages on which Java is based across a range of architectures]

[original text:8.2.2roc) How to unify shared-memory and message-passing portions of web-spanning applications.]

[new text from ROCHESTER:]

Java supports threads for concurrency. This is useful from the user's perspective when response time is important, e.g., in the HotJava browser, where a page can be scrolled while an image is downloading. Java does not currently support threads running on separate processors. Our main objective in this section is to support parallel threads at no expense to the Java programmer: we would prefer not to change, modify, or limit the language as perceived by the user.

There seem to be two possible scenarios:

1) Replace the current Java thread package: implement a completely new native threads package for each architecture to be supported. Use this package instead of the Java thread class.
The advantage of this approach is that no special compiler support is required, and it is less susceptible to changes in the language. The disadvantage is that it might not be possible: if the interpreter implicitly "implements" threads by simply branching from one context to another on a yield or a scheduling call, then replacement might not be feasible. Deciding this requires a more complete knowledge of how the interpreter behaves with respect to thread library calls.
2) Add a new parallel thread package: implement a parallel thread package for each architecture that is independent of the Java thread class. Maintaining the original classes may be useful for relatively cheap multi-threading on a single processor -- and we may not have a choice!
The advantage is that we may be able to implement the parallel thread mechanism independently of the interpreter, to the extent that a Java compiler can insert calls to the parallel thread package directly when appropriate (a sketch of such an interface appears below). The disadvantage is that the compiler has to do more work to distinguish parallel threads from normal Java threads and to decide when it is legal to replace the latter with the former.
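
The sketch below illustrates what the interface in scenario 2) might look like: a hypothetical ParallelThread class that mirrors the java.lang.Thread interface but is backed by a native, truly parallel thread package. The class, its methods, and the library name "parthreads" are assumptions made for this example.

   // Hypothetical parallel thread class, independent of java.lang.Thread.
   public class ParallelThread {
      static { System.loadLibrary("parthreads"); } // assumed native thread package

      private Runnable body;

      public ParallelThread(Runnable body) { this.body = body; }

      public native void start();   // create a native thread that runs body.run()
      public native void join();    // wait for the native thread to finish

      // called back from native code once the thread is running
      private void run() { body.run(); }
   }

A compiler could then rewrite selected occurrences of new Thread(r).start() into new ParallelThread(r).start() wherever such a replacement is legal.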

At Rochester, we are particularly interested in implementing a parallel thread package for Java with special compiler support on distributed shared-memory (DSM) machines. On scalable multiprocessors, communication is much more expensive than computation. Many different protocols have been proposed for behavior-driven locality management. None of them work best for all types of data sharing. Performance would be increased if the system could adapt to the program's memory access patterns. An example could be the use of multithreading to alleviate memory latency (especially on remote accesses) aggravated by delays introduced by the memory coherency protocol. We propose to analyze programs at compile time and annotate them with data access summary information. Exploiting this information will require the ability to switch among multiple data coherence protocols at run time.

[original text:8.2.3ind) Performance of Current Java Threads]


7. Architecture Issues for Integration of Java with HPCC
7.1. Shared Memory Architecture Issues [ROCHESTER]
[original text:5.1roc) Generate native code for shared-memory architectures -- cf: sec 5.1]
[original text:5.2roc) Java vs Bytecode -- cf: sec 5.2]
[original text:5.3roc) Locality optimizations for (distributed) shared-memory architectures]
[original text:5.4roc) How to integrate shared-memory environments across a heterogenous collection of machines and interconnects, respecting dramatically different latencies and bandwidths]
7.2. Heterogeneity [INDIANA]
[original text:5.5roc) Compiling Java for heterogenous networks of machines]
[original text:5.6ind) General Issues]
7.3. Fault-tolerance and Reconfigurability [TEXAS]
[original text:7.1.6tex) Comments -- cf: sec 6.1]
7.4 Security [INDIANA]
[original text:7.2.3ind) HPJava Issues -- cf: sec 6.2]
8. Some Early Experiments.
Details could go into appendices
8.1. At Maryland -- Maryland Responsible for Name(s) [UMD]
[original text:7.4.1umd) Mobile Programs]

The intent is to provide a single unified framework for remote access and remote execution in the form of itinerant programs. Itinerant programs can execute on and move between the user's machine, the servers providing the datasets to be processed or third-party compute servers available on the network. The motion is not just client-server; it can also be between servers. Programs move either because the current platform does not have adequate resources or because the cost of computation on the current platform or at the current location in the network is too high. Various parts of a single program can be executed at different locations, depending upon cost and availability of resources as well as user direction. The architecture would also provide support for plaza servers. Plaza servers provide facilities for (1) execution of itinerant programs, (2) storage of intermediate data, (3) monitoring cost, usage and availability of local and remote resources on behalf of itinerant programs and alerting them if these quantities cross user-specified bounds, (4) routing messages to and from itinerant programs, and (5) value-added access to other servers available on the network. Examples of value-added functionality include conversions to and from standard formats and operations on data in its native format. In addition to allowing cost and performance optimization, itinerant programs and plaza servers also provide a flexible framework for extending the interface of servers, in particular legacy servers with large volumes of data as well as servers that are not under the control of the user.

[original text:7.4.2umd) Coupling sequential Java Programs]

MPI should be bound to Java so that Java programs can communicate by message passing. We believe that applications will require the ability to process periodically generated data. Programs to carry this out could be written directly with MPI; however, a higher-level library might prove useful.
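
Purely as an illustration of what such a binding might offer (the MPI class and its methods below are assumptions for this sketch, not a reference to any existing binding), two Java processes could exchange a periodically generated dataset as follows:

   // Hypothetical Java MPI binding: the producer (rank 0) ships each new
   // dataset to the consumer (rank 1).
   public class PeriodicExchange {
      public static void main(String[] args) throws Exception {
         MPI.init(args);                              // cf. MPI_Init
         int rank = MPI.commRank(MPI.COMM_WORLD);     // cf. MPI_Comm_rank
         double[] update = new double[4096];
         if (rank == 0) {
            MPI.send(update, MPI.DOUBLE, 1 /* dest */, 0 /* tag */, MPI.COMM_WORLD);
         } else if (rank == 1) {
            MPI.recv(update, MPI.DOUBLE, 0 /* src */, 0 /* tag */, MPI.COMM_WORLD);
         }
         MPI.finish();                                // cf. MPI_Finalize
      }
   }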

Consider long-running itinerant programs that process periodically generated data; each program processes sequences of data from multiple sources with possibly different periodicity. An itinerant program can either visit each of the data sources in order or it can install a surrogate at each data source, which is activated every time the dataset is updated. A surrogate acts as a filter for the sequence of data generated by the data source and communicates appropriate updates to the parent itinerant program (possibly after preliminary processing). For complex processing on a number of datasets, a hierarchy of surrogates and result combining programs can be created. The result of such itinerant programs can either be a single accumulated result or a sequence of results. What we have in mind is a scheme that is an extension of our existing scheme for mapping and synchronizing multiple update sequences, which has been used in our program coupling effort. This scheme has been used to dynamically link multiple physical simulations being performed concurrently.

[original text:7.4.3umd) Coupling HPH programs with one another and with other data parallel programs (e.g. MPI, HPF, HPC++)]

We consider the problem of efficiently coupling multiple data-parallel programs at runtime. We propose an approach that establishes mappings between data structures in different data-parallel programs and implements a user-specified consistency model. Mappings are established at runtime and can be added and deleted while the programs being coupled are in execution. Mappings, or the identity of the processors involved, do not have to be known at compile-time or even link-time. Programs can be made to interact with different granularities of interaction without requiring any re-coding. A-priori knowledge of consistency requirements allows buffering of data as well as concurrent execution of the coupled applications. Efficient data movement is achieved by pre-computing an optimized schedule. (This actually is already PCRC work and will appear in a paper at the ICS conference in May).

8.2. At Syracuse -- Syracuse Responsible for Section Name(s) [SYRACUSE]
[original text:7.4.4syr) WebWork: Integrated Programming Environment Tools for National and Grand Challenges]

This is our joint work with Boston University and Cooperating Systems, with the paper written before the implications of Java were as clear as they are now!

[original text:7.4.5syr) Computational Web Multiserver - a proposal for Web based HPCC]

This updates some of the ideas in WebWork.

8.3. At Indiana -- Indiana Responsible for Name(s) [INDIANA]
[original text:8.2.5ind) Implications of Current Indiana Research on HPC++ and CORBA]

[new text from ROCHESTER:]

8.4 (Rochester) Early Experiments -- Java Compiler Infrastructure [ROCHESTER]

As part of our effort to better understand the compilation issues, at Rochester we are developing a Java compiler which could be used to optimize Java programs.

The compiler would use a common intermediate representation and a collection of front- and back-ends to facilitate re-use of components and independence of optimization passes.

There are two possible front-ends: one would read Java source, the other one would read bytecode.

A set of back-ends would include Java source generation, bytecode generation, and "native" code generation (either in the form of some target HLL such as C, or actual native code for some architectures).


                                  /=====> Java
Java       ----\                 /
                ======> IR  =====-------> bytecode
bytecode   ====/                 \
                                  \-----> C (or other "native")

Our current development concentrates on a bytecode front-end and a Java back-end (shown above with double lines). The choice was made because using bytecode as input is more general (if we have access to source, we also have access to its bytecode form -- the converse is not true), and using Java as output is a big advantage in the development phase.

The implementation is not complete: only a subset of legal bytecode constructs is implemented and the definition of our IR is not yet stable. However, preliminary results are promising. We are able to identify implicit high-level constructs such as loops with good success, and our high-level IR makes the implementation of code transformations relatively simple.

Code instrumentation is our first application of the compiler framework. We are developing a pass which inserts instrumentation calls. Currently we instrument loops, if-then-else statements, and methods. We plan to examine the performance of a range of Java applications. The objective is to determine whether there are performance issues unique to Java as a language (the fact that it is interpreted) and/or to applications written in Java (it may be that only certain kinds of applications are worth writing in Java).

An Example.

We are given a bytecode representation of a method corresponding to the following disassembled code.

Method void paintAll(java.awt.Graphics)
   0 iconst_0
   1 istore_2
   2 goto 19
   5 aload_0
   6 aload_0
   7 getfield #7 <Field PCRCexample.CellList [I>
  10 iload_2
  11 iaload
  12 aload_1
  13 invokevirtual #6 <Method PCRCexample.plotCell(ILjava/awt/Graphics;)V>
  16 iinc 2 1
  19 iload_2
  20 aload_0
  21 getfield #4 <Field PCRCexample.n I>
  24 if_icmplt 5
  27 aload_0
  28 aload_1
  29 invokevirtual #3 <Method PCRCexample.updateStatusLine(Ljava/awt/Graphics;)V>
  32 return

We can parse the class file and recover a high-level intermediate representation which corresponds to the following source (this was generated automatically by our Java-source back-end).


   public void paintAll (
         java.awt.Graphics a1
      )
   {
      int v0;
      v0 = 0;
      while((v0 < this.n)) {
         this.plotCell(this.CellList[v0], a1);
         v0 = (v0 + 1);
      } //while
      this.updateStatusLine(a1);
      return;
   } // paintAll

Since our IR explicitly identifies high-level constructs (e.g., loops), it is easy to instrument the code and/or perform optimizing transformations.
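
For illustration, the output of a hypothetical instrumentation pass over the method above might look like the following; the Instrument class and its methods are assumed names for the inserted runtime calls.

   public void paintAll (
         java.awt.Graphics a1
      )
   {
      int v0;
      Instrument.enterMethod("PCRCexample.paintAll");
      v0 = 0;
      Instrument.enterLoop(0);               // loop id assigned by the compiler
      while((v0 < this.n)) {
         Instrument.loopIteration(0);
         this.plotCell(this.CellList[v0], a1);
         v0 = (v0 + 1);
      } //while
      Instrument.exitLoop(0);
      this.updateStatusLine(a1);
      Instrument.exitMethod("PCRCexample.paintAll");
      return;
   } // paintAll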

8.5 (Rochester) Early Experiments -- Network-based Data Mining [ROCHESTER]

We have implemented a preliminary version of network-based data mining at Rochester. Currently, the server is a passive entity in terms of the data mining computation: its only job is to accept client connections and ship off requested data. The mining process is carried out at the client end. Both the client and the server are currently sequential.
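
A minimal sketch of this client/server split appears below. The host name, port, and on-the-wire format (a count-prefixed stream of integer transactions) are assumptions made for the example; the mining step itself, done entirely on the client, is elided.

   import java.io.DataInputStream;
   import java.net.Socket;

   public class MiningClient {
      public static void main(String[] args) throws Exception {
         // hypothetical host and port for the database server
         Socket s = new Socket("dbserver.example.edu", 9000);
         DataInputStream in = new DataInputStream(s.getInputStream());

         int numTrans = in.readInt();          // server first sends the record count
         int[][] transactions = new int[numTrans][];
         for (int t = 0; t < numTrans; t++) {
            int size = in.readInt();           // number of items in this transaction
            transactions[t] = new int[size];
            for (int i = 0; i < size; i++)
               transactions[t][i] = in.readInt();
         }
         s.close();

         // the association-mining computation is then carried out locally
      }
   }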

We now present some timing results compared with a C++ implementation of the algorithm that does not transmit any data over the network. Below we show the two synthetic databases from IBM that we examined:


Database      Num_Trans  Avg_Trans_Size  Total_Size
-----------------------------------------------------------
T5.I2.D100K   100,000          5         2.615 Mbytes
T10.I4.D100K  100,000         10         4.31 Mbytes

These databases are accessible only to the server, through a file server.

We now present the total execution time for the algorithm. Java_Time includes the server disk-read time, transmission time, and client computation time; C++_Time includes only the disk-read time and computation time.

Database      Total_Size    Java_Time   C++_Time  
--------------------------------------------------
T5.I2.D100K   2.615 Mbytes    367.7s    15.3s     
T10.I4.D100K  4.31 Mbytes     1847.2s   78.9s

The breakdown of the execution time is as follows:

Database      Total_Size     Java: Reading  Computation   C++: Reading  Computation
------------------------------------------------------------------------------------
T5.I2.D100K   2.615 Mbytes         292.6s        75.1s           8.8s         6.5s
T10.I4.D100K  4.31 Mbytes          478.4s      1368.8s          13.8s        65.1s

In Java, the net disk-read and data-shipping rate is about 9 Kb/sec, versus a disk-read rate of 0.31 Mb/sec in C++.

From the experiments, we make two observations:

We will address these issues in our future research.