HPCC and Java -- A Report by The Parallel Compiler Runtime Consortium (PCRC)
- Northeast Parallel Architectures Center at Syracuse University: Geoffrey Fox, Bryan Carpenter, Wojtek Furmanski, Don Leskiw, Xiaoming Li
- Cooperating Systems: Marina Chen, James Cowie
- Harvard University: Thomas Cheatham, Amr Fahmy
- Indiana University: Dennis Gannon, Rajeev Raje
- Rice University: Ken Kennedy, Chuck Koelbel, John Mellor-Crummey
- University of Maryland: Joel Saltz, Alan Sussman
- University of Rochester: Wei Li, Michael Scott
- University of Texas at Austin: Jim Browne
Proposed Writing Responsibilities
- Northeast Parallel Architectures Center at Syracuse University:
- 1.1, 1.2, 2.1, 2.2, 2.3, 2.7, 6.3, 8.2
- Cooperating Systems:
- 4.1, 4.2, 4.3
- Indiana University:
- 1.4, 3.3, 6.2, 7.2, 7.4, 8.3
- Rice University:
- 5.1, 5.2(shared), 5.3, 5.4
- University of Maryland:
- 2.4, 2.5, 3.4, 8.1
- University of Rochester:
- 2.6, 5.2(shared), 6.4, 7.1, 8.4
- University of Texas:
- 1.3, 3.1, 3.2, 6.1, 7.3
General Approach:
- The preliminary planning and proposed draft is available -- this contains lots of useful words and references (links)
- Current discussion is available using mailing list pcrc-java@npac.syr.edu
or with Newsgroups and database support on our SP2.
- March 13: 2-deep outline with assignments proposed
- Mid April -- all assignments complete -- editing etc. starts
- Draft must be ready for public but friendly reading by the CRPC meeting in mid-May
- Based on the manifesto we will call a meeting to decide what areas PCRC
will focus on
Some comments from Geoffrey Fox:
- There is some debate as to the name we should use for the extensions of Java
to link it to HPCC. Possibilities suggested are:
- HPCC and Java
- HPCCJava, HPJ, or
- HPJava
- Java-powered HPCC or HPCC powered Java
- Parallel Computing and Java
- Unless changes are suggested, I will use "HPCC and Java" in high-level places like titles, and the acronym HPJava where nifty short designations are appropriate.
- The discussion of data parallelism in original contributions showed some
disagreement as to whether this is a good, bad, feasible, infeasible idea.
I suggest that this particular document make no decisions but simply state
the pros and cons of particular approaches in the short and long term.
Proposed Outline of Final Report
Note section numbers such as 2.1syr refer to text in original document and
are hyperlinked to it
- 1: Introduction
- 1.1 What is Java and What is HPCC and why they need each other [SYRACUSE]
- 1.2 The HPCC Application Motivation for Java -- Meta-Challenges [SYRACUSE]
- [original text: 2.1syr) Introduction]
- First some preliminary remarks. We classify all (interesting) HPCC applications
as belonging primarily to one of two domains: Grand Challenges or National
Challenges -- the "HPCC-Challenges". Recognize that today's technology is
blurring the distinctions between parallel and distributed processing, e.g.,
the use of ATM-LANs by networks of workstations/PCs. The "traditional" high-performance
applications require finely tuned and optimized implementations of runtime
support (intrinsics and communications) on their host processors. The current
designer of runtime systems seeks to provide the application programmer
with easier access to the high performance potential of a given machine(s)/
architecture. Beyond speed, however, National Challenges can pose additional
requirements such as trust or privacy, i.e., security, and here a runtime
system designer must also consider metrics like "response time" or "transactions
per second".
- 1.3 The Nature of Java Applications and their need for HPCC [TEXAS]
- [original text:1.1tex) The Nature of Java Applications]
- The specifications for parallel structuring in Java should be determined
by the requirements of the applications implemented in Java. While the full
spectrum of applications for which Java will be used is not well characterized,
we can infer something from its history and goals and from its current applications.
There are two characterizations of the nature of applications. One is "What
is the functionality of the applications?" and the other is "What is the
structure of the applications?"
- Java is primarily being used for construction of interactive interfaces
for human/machine, human/human and machine/machine interactions. (The current
applications of Java as an "applet" language are closely akin to its historical
origins as a language for implementing interactive interfaces for consumer
electronics.) The extensions this group is considering would extend the
range of interactions to include computational or data-intensive simulations
of complex systems. Some of the potentially most interesting applications
(simulations of large scale human-machine-terrain systems) will involve
coordination among several large scale computational or data-driven simulators
and very large numbers of agents.
- Java programs typically implement a protocol for interactions among "intelligent"
entities or agents. These protocols are typically composed from existing
components. That is, Java programs are typically structures of invocations
of methods of objects where the pattern of invocations follows a stimulus-response
protocol.
- We conclude that the application requirements for parallel structuring specifications
for Java are largely for coordination and sequencing of interacting entities
(object or task parallelism). We can further conclude that the applications
involving high performance computation may involve many agents.
- [original text:1.7ind) Current Strengths and Weaknesses of Java]
- In general it may be best to strengthen Java's existing parallel and distributed
computing mechanisms rather than impose data parallel constructs on the
language. In our opinion, the areas to strengthen are:
- a) The thread model. Make sure it is possible to create lots of threads
efficiently (an array of thread class objects, for example) and make sure
the Java abstract machine can handle them.
- b) Object name spaces and remote objects. Java needs a model that allows
an applet to talk to an object running on another compute server. <more on
this in 7.2 below>
[also cf:sec 2.7 below]
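Point (a) above can be sketched concretely. The following is a minimal illustration, not a proposal: it creates an array of Thread objects, starts them, and joins them all. The class and method names here are invented for this example.

```java
// Hedged sketch of point (a): an array of Thread objects created,
// started, and joined. All names here are illustrative only.
public class ThreadArrayDemo {
    // Spawns n threads; thread i stores i*i into a shared result array.
    public static int[] squares(int n) {
        final int[] results = new int[n];
        Thread[] workers = new Thread[n];
        for (int i = 0; i < n; i++) {
            final int id = i;
            workers[i] = new Thread(() -> results[id] = id * id);
            workers[i].start();
        }
        for (Thread t : workers) {
            try { t.join(); }                      // wait for every worker
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
        return results;
    }
}
```

Whether a Java abstract machine can schedule very large numbers of such threads efficiently is exactly the open question raised above.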
1.4 Lessons from and Relation to HPF, HPC++, Fortran-M and other HPCC Software
Systems [INDIANA]
[original text:3.1ind) Comparison with HPF]
- On the HPF side it is possible to create a class of distributed objects
that are similar to the HPF distributed array. Because it can be defined
as a class, there would be no need to create a special extension of the
array data type or an annotation to support distribution. This is what is
done in HPC++. In our opinion, the HPF model of data parallelism may not
be the most interesting thing to do with Java. If you want HPF features,
then use HPF.
[original text:3.2ind) Comparison with HPC++]
- Because Java is derived from a simple model of C++, some of the HPC++ ideas
could move to Java. For example, annotations for parallel loops. Other ideas,
like the HPC++ parallel STL could only move to Java if
- 1) There was a template mechanism added to Java. This may not be a bad idea.
- 2) Because Java has no pointers, another mechanism must be found to replace
the STL concept of a parallel iterator. This might be a smart array index
object in Java.
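A "smart array index object" of the kind suggested in item 2 might look roughly as follows. This is only a sketch under our own naming; it traverses a strided array section the way an STL parallel iterator would via pointer arithmetic.

```java
// Hedged sketch of a "smart array index" object standing in for an
// STL-style parallel iterator. Every name here is invented.
class ArraySection {
    private final int lo, hi, stride;   // half-open, strided section [lo, hi)
    ArraySection(int lo, int hi, int stride) {
        this.lo = lo; this.hi = hi; this.stride = stride;
    }
    // Sums the selected elements of a -- the kind of traversal an STL
    // iterator would expose through pointer increments.
    double sum(double[] a) {
        double s = 0.0;
        for (int i = lo; i < hi; i += stride) s += a[i];
        return s;
    }
}
```

Several such section objects over disjoint index ranges could then be handed to separate threads.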
[original text:3.3ind) Operator Overloading]
- Also, Java does not provide an overloaded operator mechanism, so building
a parallel array class is not as easy as in HPC++ (or even Fortran 90).
[original text:4.4csc) Comparison with C++]
- In C++, two flavors of extensions appeared: small tweaks for sugaring the
syntax of parallel constructs, and deep changes for modifying the semantics
of the language to support concurrency. In HPJava, a limited preprocessor
which implements some shallow (sugaring) syntax changes may be a reasonable
project at a later stage, after the library approach has matured enough
to identify the most egregious kinds of syntactic clutter.
- 2: Further Details of Motivating Applications
- [If gets too long, details can go into appendices]
- 2.1 General Discussion of Meta-Challenges [SYRACUSE]
- [original text:2.2syr) Meta-Challenges]
- We define a new class of applications called "Meta-Challenges". These are
loosely coupled sets of physically distributed instances of Grand or National
Challenges. A DoD example of a Meta-Challenge is Distributed Interactive
Simulation (DIS). DIS involves the modeling of forces (tanks, aircraft,
satellites), environments, rules of engagement and tactics, etc. Simulation
control and monitoring for red, blue, and white players are provided. A
simulation may need to grow with the addition of "green", "yellow", or neutral
forces, and have to scale from peace-time/peace-keeping through low-intensity-conflict
(LIC), to massive conflict. Requirements for considering chemical, biological,
or nuclear (CBN) environments are needed, for example, modeling operations
in a nuclear environment (OPINE).
- 2.2 An Example -- Distributed Interactive Simulation [SYRACUSE]
- [original text:2.3syr) An Example Meta-Challenge: DIS]
- Java offers a means to enhance the existing DIS multi-user protocol. A new
DIS protocol, Advanced Distributed Simulation (ADS), is an attempt to introduce
scalability. ADS breaks apart the DIS into an active-reactive part and an
interactive-communicating part. Java is consistent with such an evolutionary
approach. Java applets could update models on-the-fly, as with dynamic weather,
battle damage, or threat-responsive counter-countermeasure techniques. One
of the greatest challenges to the DIS designer is finding ways to interconnect
independently built simulations having disparate modeling granularity (i.e.,
the ALS protocol, Aggregate Level Simulation), and then increase the number
of models and improve fidelity, but with no greater demand on network bandwidth.
Weapon system design has also been evolving from integrating interoperable
subsystems to influence management among ubiquitous supersystems. (An example
of the latter would be Information Warfare (IW). Another is counter-OSINT
-- open source intelligence). BM/C3 is undergoing changes in definition:
BM/C4 (computing), C3I with intelligence, and IS&R for surveillance and
reconnaissance. Also, a fifth-C, C5I is sometimes added for emphasizing
the collaboration among humans of any C3I system. (The relationship between
an Air Traffic Controller, the ATC surveillance, tracking, processing and
display system, and a pilot is another example). From a user's (Commander's)
perspective, Java is extremely attractive for simulating (or even building)
these systems since its roots are in the distributed, interactive, object-oriented,
collaborative user domain. It is rapidly becoming the language of choice
for programming the user-interface of on-line commercial multimedia systems.
- [original text:2.4syr) Java for DIS]
- Java is intended to be machine independent and offers to challenge the need
for hegemony among systems. For example, the ideal computing architecture
for simulating military battle management including command and control
is multiple instances of "loosely coupled sets of tightly coupled processors."
Such an architecture always poses a severe configuration management problem:
multiple hardware platforms, operating systems, and programming languages. Interoperability among systems
of systems and the interfacing between "stovepipe" systems is a related challenge.
Java could provide an approach for "wrapping" existing or planned systems
with a functionality layer for interoperability while ameliorating the bandwidth
problem by intelligently transmitting only those parts of a message (applet)
that are immediately needed by an application (user).
- 2.3 An Example -- Manufacturing [SYRACUSE]
- [no text]
- 2.4 An Example -- Remote Sensing and Data Fusion [UMD]
- [original text:2.5umd) Remote sensing data fusion]
- There are many applications that are likely to require the generation of
data products derived from multiple sensor databases. Such databases could
include both static databases (such as databases that archive LANDSAT/AVHRR
images) and databases that are periodically updated (like weather map servers).
A client would direct (High Performance) Java programs to run on the database
servers. These HPJ programs would make use of the servers' computational
and data resources to carry out partial processing of data products. Portions
of data processing that required the integration of data from several databases
would be carried out on the client or via an HPJ program running on another
computationally capable machine.
- 2.5 An Example -- Medical Interventional Informatics [UMD]
- [original text:2.6umd) Interventional Epidemiology/ Medical crisis management]
We are considering scenarios involving medical emergencies associated with
armed combat, terrorist actions, inadvertent releases of toxic chemicals,
or major industrial plant explosions. In such scenarios, the goal is to
rapidly acquire and analyze medical data as the crisis unfolds in order
to determine how best to allocate scarce medical resources. This scenario
involves the discovery and integration of multiple sources of data that
might be in different geographical locations. For instance, one
may need to determine the best diagnosis and treatment for patients exposed
to combinations of chemicals whose effects have not been well characterized.
This class of scenarios is motivated by the recent ARPA-sponsored National
Research Council workshop on crisis management.
This involves discovering and exploiting clinical patterns in patient populations.
These scenarios involve study of health records to determine risk factors
associated with morbidity and mortality for specific subpopulations (e.g.
various substrata of armed forces personnel serving in a particular region).
In such scenarios, the goal is to provide high quality, lower cost care
by exploring diverse clinical data sources for patterns and correlations
and by ensuring that significant data for individual patients is not overlooked.
For instance, a system of this type could rapidly track the spread of serious
infectious diseases such as Hepatitis B or HIV. It could also be used to
identify specific soldiers whose demographic profile or past medical history
might place them at particularly high risk of contracting the disease.
The systems software required for these medical scenarios should be very
similar to the requirements motivated by the sensor data fusion scenario.
2.6 An Example -- Network-based data mining [ROCHESTER]
[new text from ROCHESTER:]
With large volumes of routine data having been collected, many organizations
are increasingly turning to the extraction of useful information from such
databases. Such high-level inference processes may provide information on
patterns from large databases of particular interest to defense analysis.
Data mining is an emerging research area, whose goal is to extract significant
patterns or interesting rules from such large databases. Data mining is
in fact a broad area which combines research in machine learning, statistics
and databases. It can be broadly classified into three main categories:
Classification -- finding rules that partition the database into disjoint
classes; Sequences -- extracting commonly occurring sequences in temporal
data; and Associations -- finding the set of most commonly occurring groupings
of items.
At Rochester, we have worked on the problem of mining association rules
in parallel over basket data on both distributed shared-memory multiprocessors
and a network of machines. Basket data usually consists of a record per
customer with a transaction date, along with items bought by the customer.
Many real-world applications can be formulated in such a manner. Our preliminary
experimental results on a SGI Power Challenge shared-memory multiprocessor
machine and a DEC Memory Channel cluster are very encouraging.
The next step is to consider scenarios where the database server is accessible
via the Internet. The machine independent nature of Java makes it ideally
suited to writing both the server and client code. The server could be a
collection of machines, i.e., it could be a single tightly-coupled multi-processor
shared or distributed memory machine, it could be a network of workstations,
or any combination of these with heterogeneous processing elements, and
heterogeneous interconnection network. The client side could be a heterogeneous
collection of machines as well. We need compiler and run-time support to
manage both the client and the server. The compiler should be able to insert
code to exploit the local network configurations on both ends, while the
run-time system monitors dynamic behavior and executes different actions
according to the transient information. We have implemented a client-server
data mining algorithm in Java. The preliminary results will be discussed
in more detail in Section 8.5.
Data mining algorithms tend to have program behaviors quite different from
traditional Grand Challenge problems in terms of both data access patterns
and control flow. Unlike simple array structures used in scientific code,
the data structures used in mining algorithms include hash trees, hash tables,
and linked lists. Consider mining for associations: the first step consists
of finding all the sets of items (called itemsets) that occur in the database
with a certain user-specified frequency (called minimum support). Such itemsets
are called large itemsets. An itemset of k items is called a k-itemset.
The general structure of algorithms for mining associations is as follows.
The initial pass over the database counts the support for all 1-itemsets.
The large 1-itemsets are used to generate candidate 2-itemsets. The database
is scanned again to obtain occurrence counts for the candidates, and the
large 2-itemsets are selected for the next pass. This iterative process
is repeated for k = 3, 4, ... until there are no more large k-itemsets to
be found. Effective parallelization and networked computing require new
research in algorithms, compilers and runtime systems.
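The iterative k-itemset process described above can be sketched in Java roughly as follows. This is a sequential toy version only -- nothing like the tuned parallel implementations discussed -- and for brevity it generates candidates by simple unions, without the usual subset pruning. All names are ours.

```java
import java.util.*;

// Hedged, sequential sketch of the association-mining loop described
// in the text: count 1-itemsets, generate candidates, rescan, repeat.
public class AprioriSketch {
    // Returns the large itemsets grouped by size k (index 0 holds k=1).
    public static List<Set<Set<String>>> mine(List<Set<String>> db, int minSupport) {
        List<Set<Set<String>>> result = new ArrayList<>();
        // Initial pass: count support for all 1-itemsets.
        Map<Set<String>, Integer> counts = new HashMap<>();
        for (Set<String> t : db)
            for (String item : t)
                counts.merge(Set.of(item), 1, Integer::sum);
        Set<Set<String>> large = filter(counts, minSupport);
        while (!large.isEmpty()) {
            result.add(large);
            // Generate candidate (k+1)-itemsets from large k-itemsets
            // (no subset pruning here, for brevity).
            Set<Set<String>> candidates = new HashSet<>();
            for (Set<String> a : large)
                for (Set<String> b : large) {
                    Set<String> u = new TreeSet<>(a);
                    u.addAll(b);
                    if (u.size() == a.size() + 1) candidates.add(u);
                }
            // Scan the database again for candidate occurrence counts.
            counts = new HashMap<>();
            for (Set<String> t : db)
                for (Set<String> c : candidates)
                    if (t.containsAll(c)) counts.merge(c, 1, Integer::sum);
            large = filter(counts, minSupport);
        }
        return result;
    }

    private static Set<Set<String>> filter(Map<Set<String>, Integer> counts, int minSupport) {
        Set<Set<String>> out = new HashSet<>();
        for (Map.Entry<Set<String>, Integer> e : counts.entrySet())
            if (e.getValue() >= minSupport) out.add(e.getKey());
        return out;
    }
}
```

The repeated database scans are exactly where the parallelization and networked-computing research mentioned above would apply.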
2.7 Java based Web Server and Client Agents [SYRACUSE]
[An example of conventional Java application/applet needing HPCC -- filtering
VRML for Visible Human database is one possibility -- cf: sec. 1.3]
- 3: Models for the HPJava Programming Environment
- 3.1 General Philosophy and Discussion of Possibilities [TEXAS]
- [original text:1.2tex) Consistency with Current Language Philosophy and Feature Set]
- The second major determining factor in defining a model of parallel computation
for Java is consistency with its basic philosophy and its current feature
set.
- Java is primarily a compositional and coordination language. That is, the
main thrust of the feature set is not computations or data transformations
but rather composition of coherent program structures and well-ordered traversal
of those structures.
- Minimality and simplicity of representation is another major element of
the Java philosophy.
- Java's object-oriented features such as inheritance also place some constraints
on possible models of parallel computation.
[original text:1.3tex) Inferences]
- a) Large scale parallel computations will not be programmed directly in
Java. Java will incorporate modules which implement HPC-based simulations
of physical systems but these modules will be programmed in other languages
and linked together with Java programs. Java programs may, however, need
to interact with these modules on a fairly fine-grained level.
- b) The model of parallel computation should focus on specification of interactions
among modules and should not impede composition of program structures.
- c) The parallel structuring capabilities should be scalable.
[original text:4.2ind) Implications of Java Philosophy]
- One of the principles in the design of Java has been simplicity; Rajeev
feels that language annotations may go against the original philosophy.
Gannon is not sure.
[cf: sec 1.3 above]
3.2 Object and Applet Parallelism [TEXAS]
[original text:1.2tex) Consistency with Current Language Philosophy and Feature set-- cf: sec 3.1]
[original text:1.3tex) Inferences-- cf: sec 3.1]
[original text:7.4.6ind) Applet view of HPCC Applications]
- An HPCC application on a large MPP server or distributed over multiple servers
could contain access/control and modify methods.
- To access the application is to download a part of the application as an
applet.
3.3 Task and Control Parallelism [INDIANA]
[original text:7.3.1ind) Limited Synchronization Capabilities of current Java]
Following the exact semantics of Fortran-M channels might be a challenge
in Java because of the limited synchronization features of Java
[My impression is that the synchronisation facilities in Java are reasonably
complete--once you track them down. DBC]
3.4 Data Parallelism [UMD]
[original text:1.6tex) What should NOT be done]
- There is no purpose in creating a data parallel capability for Java. Leave
that for the linked modules.
- There is no purpose in embedded messages or locks or other communication
and/or synchronization mechanisms which will interfere with composition,
inheritance and reuse.
[original text:1.6ind) Is Data Parallelism relevant for Java]
- In general, we feel that adding data parallel constructs to Java may be
counter productive. It is more important that Java be able to communicate
with objects that run on MPP platforms and are implemented in HPF or HPC++.
This implies that we want a mechanism to "wrap up" data parallel application
into object classes that implement a Java interface. Once this is done,
Java can treat the object like any other Object.
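One way to read the "wrap up" suggestion is as a plain Java interface fronting the data-parallel code. The sketch below is ours and all names are invented; a real wrapper would forward calls to an HPF or HPC++ process, perhaps through native methods or sockets, rather than compute locally.

```java
// Hedged sketch of wrapping a data-parallel application behind a Java
// interface. Names are invented; a real implementation would forward
// to an HPF or HPC++ process via native methods or sockets.
interface DataParallelKernel {
    double[] apply(double[] input);   // run the wrapped computation
}

// A local stand-in "kernel" so the interface can be exercised: it
// scales every element, as a trivial data-parallel kernel might.
class ScaleKernel implements DataParallelKernel {
    private final double factor;
    ScaleKernel(double factor) { this.factor = factor; }
    public double[] apply(double[] input) {
        double[] out = new double[input.length];
        for (int i = 0; i < input.length; i++) out[i] = factor * input[i];
        return out;
    }
}
```

Once behind such an interface, Java code indeed treats the wrapped application like any other object.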
- 4: Possible Java Language Enhancements and Support Classes for HPCC
- 4.1 Current Java Features for Concurrent and Distributed Computing [CSC]
- [original text:1.4tex) Current Features for Specification of Parallel Execution Behaviour]
- Java currently has explicit threads of control and the "synchronized" property
for methods as its means of specification of concurrent behavior. Attaching
the "synchronized" attribute to a method causes any execution of a method
on an object instance to be blocked if another method is active against
the instance. This is equivalent to a "lock" on the instance.
- Use of threads without adequate control over scheduling and sequencing is
fraught with potential for inconsistency.
- The "synchronized" attribute for modules can lead to deadlock and induces
the "inheritance anomaly." The inheritance anomaly occurs when an inheritance
hierarchy which is valid under sequential execution becomes invalid under
parallel execution. (Note that under sequential execution methods interact
only at the times of invocation and termination while under parallel execution
methods may interact at any location in a program where a direct or indirect
reference to a shared or potentially shared resource occurs.) The inheritance
anomaly occurs when a redefinition of a computational specification in a
method leads to a change in its interactions with other methods under parallel
execution. It is easy to generate examples of the inheritance anomaly in
Java.
- PREDICTION - Any extensive use of the current Java with any degree of complex
parallel structuring will lead to non-trivial problems. The current mechanisms
are not scalable even without concern for integration of HPC-based modules.
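The instance-lock semantics described above can be shown concretely. This is standard Java behavior, though the Counter class itself is our own example: because both methods are synchronized, concurrent increments on one instance never interleave.

```java
// Standard-Java illustration of the instance lock: a synchronized
// method holds the lock on 'this', so two threads cannot execute
// synchronized methods on the same Counter at once.
class Counter {
    private int value = 0;
    public synchronized void increment() { value++; }  // locks the instance
    public synchronized int get() { return value; }

    // Spawns t threads, each incrementing n times; the instance lock
    // guarantees the final count is exactly t * n (no lost updates).
    static int count(int t, int n) {
        final Counter c = new Counter();
        Thread[] ts = new Thread[t];
        for (int i = 0; i < t; i++) {
            ts[i] = new Thread(() -> { for (int j = 0; j < n; j++) c.increment(); });
            ts[i].start();
        }
        for (Thread th : ts) {
            try { th.join(); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
        return c.get();
    }
}
```

Removing the synchronized keyword from increment() would make the same test fail intermittently, which is the deadlock/anomaly territory the text warns about.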
[original text:1.4ind) Current Features of Java]
- Since transparency is one of the key factors behind Java, any model of
parallelism should not pose too great a problem for Java.
- That given, as you all know, there is already a model of parallelism based
on the thread class library. However, this is based on a shared object name
space model. It is not yet clear if any real multiprocessors allow true
multithreaded implementations of the Java abstract machine.
[original text:1.5ind) Extended Synchronization in Java]
- Synchronization is based on the use of "synchronized" methods. For large
scale parallel applications it *may* be necessary to add more synchronization
types.
[original text:1.7ind) Current Strengths and Weaknesses of Java -- cf: sec 1.3]
4.2 Language Enhancements [CSC]
[original text:1.5tex) A Proposal for Additional Specification of Parallelism in Java]
- a) Add an "Interactions" section to the definition of a class. The "interactions"
specifications will give the conditions under which a method can be invoked
or should not be invoked. This can be viewed as an extension of the "synchronized"
method attribute. The interactions could be specified in terms of the data
state (guard) or execution state (event expression).
- b) Extend the Interface Specifications to include any interactions which
can take place under parallel execution, i.e. access to shared data or shared
resources during method execution.
- c) Consider a dynamic model of communication. This topic will be discussed
in a separate position paper.
- These extensions are fully consistent with the philosophy and features of
Java but add greatly to its power for coordinating parallel interactions
without interfering with composition and reuse.
- This proposal is intended as a thought piece. The details may differ but
the principles are sound.
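The data-state "guard" in proposal (a) can be approximated with today's primitives. The BoundedCell class below is our own hand-coded wait/notify version; the point of the proposal is that such conditions would be declared in an "interactions" section rather than hand-written like this.

```java
// Hedged approximation of a data-state "guard" using current Java
// wait/notifyAll. The proposal above would make these conditions
// declarative; here they are hand-coded. BoundedCell is our example.
class BoundedCell {
    private Integer value = null;
    public synchronized void put(int v) throws InterruptedException {
        while (value != null) wait();   // guard: invoke only when empty
        value = v;
        notifyAll();
    }
    public synchronized int take() throws InterruptedException {
        while (value == null) wait();   // guard: invoke only when full
        int v = value;
        value = null;
        notifyAll();
        return v;
    }
}
```

The while-loop re-check after every wakeup is exactly the boilerplate a declarative guard specification would remove.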
[original text:1.5ind) Extended Synchronization in Java -- cf: sec 4.1]
[original text:4.2csc) Extensions to the Java language ]
- Three categories:
- (a) syntactic sugar for obvious dataparallel constructs,
- (b) annotations to flag concurrency and suggest distribution strategies,
and
- (c) deep language additions to extend Java's support for explicitly managed
concurrency.
- These three alternatives generally mirror the history of proposed C++ language
extensions.
- Pros: Language extensions give designers more freedom to set "optimal syntax"
for new parallel constructs, and enable some optimizations for the implementation
of those constructs.
- Cons: Obviously, the need to distribute a HPJava compiler, and the obligation
to make that compiler (and hopefully its associated debugger) track the
rapid evolution of the Java language and the Java virtual machine. It may
not be too difficult to come up with a prototype. But on the Web, people
expect more than research prototypes. Users will demand assurances (whether
reasonable or not) that the additional HPJava compilation process will be
at the same level of production-quality as the underlying Java compiler,
and that the new HPJava features will not degrade Java's security or restrict
the portability of their code.
[original text:4.3csc) Implications of Immature Language]
- Because the Java language will continue to evolve, any preprocessor or compiler
approach will have to track Java's evolution. There's a danger that a real
and robust compiler cannot be delivered quickly, and that by the time HPJava
extensions are defined, implemented in a preprocessor, and distributed,
those tools will be at least one generation out of sync with the latest
Java standards. Debugger support for Java is already fairly poor, even compared
with existing HPC languages, so adding nonstandard extensions can only make
the situation worse by confounding Java source/object code correspondence.
[original text:4.2ind) Implications of Java Philosophy -- cf: sec 3.1]
4.3 Class and Interface Support [CSC]
[original text:4.1csc) Extensions via class libraries]
- Use Java's OO features to add new functionality in the form of redistributable
Java classes.
- Pros: Most significantly, no need to distribute a new Java compiler or preprocessor,
and no need to modify the "standard" process for compiling and linking Java
code. Because Java is class-extensible by design, library-based approaches
shorten the development time for Java extensions.
- Cons: Java lacks operator overloading (but history may ultimately view this as
a gain for readability) and templates. As a result, library-based approaches
may result in inherently cluttered code (no sugared syntax for the obvious
data-parallel constructs). Lack of templates may make it hard to define
collective classes with polymorphic base types, but the fact that all Java
objects derive from Object may still make possible polymorphic approaches
with reasonable overhead. Library-based extensions may defer some program
analysis to runtime which could have been performed once at compile time.
[original text:4.5ind) High Performance Classes]
- As there are lots of standard classes (java.net, java.awt, java.util, etc.),
it is very intuitive to create high performance classes which can be precompiled
and imported in the user programs. Also, users can extend these standard
high performance classes to suit their needs.
- Various data distribution schemes (supported by HPF) can be implemented
as part of these high performance classes.
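As a toy instance of such a distribution class, an HPF-style BLOCK distribution can be computed by a plain Java class like the following. The class and method names are ours, not from any existing library.

```java
// Hedged sketch of one HPF-style distribution (BLOCK) as an ordinary
// Java class that users could extend. All names are invented here.
class BlockDistribution {
    private final int n, procs;   // global extent, number of processors
    BlockDistribution(int n, int procs) { this.n = n; this.procs = procs; }

    // Processor p owns the half-open block of global indices [lo(p), hi(p)).
    int lo(int p) { return (int) ((long) n * p / procs); }
    int hi(int p) { return (int) ((long) n * (p + 1) / procs); }

    // Owner of global index i under this distribution (inverse of lo/hi).
    int owner(int i) { return (int) (((long) (i + 1) * procs - 1) / n); }
}
```

CYCLIC or BLOCK-CYCLIC schemes would be sibling classes with the same lo/hi/owner interface, which is how users could "extend these standard high performance classes to suit their needs."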
- 5. Compilation/Interpretation Issues
- 5.1. Compiled vs. interpreted [RICE]
- [original text:6.1rice) Compiled vs. Interpreted]
- [original text:6.1ind) Performance Issues]
- Performance is the issue; hence, shouldn't the client as well as the server
have efficient compilers?
- Is there a special reason for considering an interpreter on the client side?
[original text:5.1roc) Generate native code for shared-memory architectures]
- Parallelization of Java programs on shared-memory parallel architectures.
- The first step is to generate parallel code for shared-memory multiprocessor
workstations.
- The second step is to compile for Shared Virtual Memory systems like Cashmere
being developed at Rochester.
5.2. Compilation issues [RICE and ROCHESTER]
[original text:6.2rice) Compilation issues]
- Analysis Techniques
- On-the-fly Compilation
- Back-end Optimizations
- Register Allocation
- Scheduling
- Optimization vs. Garbage Collection
- Preserving Security
[original text:5.2roc) Java vs Bytecode -- omitted]
[new text from ROCHESTER:]
The current Java compiler takes the hybrid approach of compilation and interpretation.
At Rochester, we would like to exploit more options to generate efficient
code for High Performance Java applications. We consider three scenarios:
- (i) Compiling Java to bytecode.
- In this approach HPCC Java could be interpreted. The compiler would generate
Java VM code suitable for HPCC problems. In addition to HPCC-specific optimizations,
this may require generating (or applying user-provided) annotations. There
is no consensus yet on whether using HPCC-specific annotations is desirable.
- (ii) Compiling Java to native code.
- Here, we would treat Java programs in the same way as we treat programs
in other languages (C++, Fortran). Due to the dynamic nature of Java, different
approaches are possible when all source files are available and when we
compile one class at a time. In the first case we may be able to eliminate
some overheads (e.g., type checks, array range checks, virtual method invocations).
In the latter case, only local optimizations would be possible.
- This scenario gives us the most flexibility, but is not acceptable in some
circumstances. In "network computing" we want applications to be distributed
in the bytecode format. In that case we have to use approach (i) or (iii)
or a combination of both. An orthogonal issue is how to handle Java class
libraries, which (if any) are implemented implicitly in the interpreter;
these will probably have to be implemented in some HLL or directly in native
code. This is also an issue in scenario (iii) below.
- (iii) Compiling bytecode to native code.
- This seems to promise a lot of power. We expect that interpreting the bytecode
(even with quick variants of VM instructions or localized just-in-time compilation)
will be inherently slower than running native code. Our early experiments
with data mining in Java have confirmed this observation (cf. Sec. 8.5).
This is especially true if we can compile a large part of the application
(ideally, the whole application) at the same time. Then, we can apply optimizations
like inlining or compile-time type checking which otherwise would be available
only in scenario (ii).
- A variant of this technique would transform bytecode to bytecode, with
the addition of quick native methods for common operations and native
implementations of common data structures (cf. Sec. 7.1). Such data structures
may include arrays or lists; they could use an internal representation
operated on directly by native methods implementing the most common operations.
An interface would be provided to access the data structure from Java.
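The idea of a natively backed data structure could be sketched as follows. All names here are invented for illustration; in the scheme above, reduceSum() would be declared `native` and implemented by the HPCC runtime, but a plain Java body is given so the sketch is self-contained.

```java
// Hypothetical sketch of a data structure with a native fast path.
class DistArray {
    private final int[] data;   // internal representation shared with native code

    DistArray(int n) { data = new int[n]; }

    void set(int i, int v) { data[i] = v; }
    int get(int i) { return data[i]; }

    // Candidate "common operation": would be declared
    //   native int reduceSum();
    // and implemented directly against the internal representation.
    int reduceSum() {
        int s = 0;
        for (int v : data) s += v;
        return s;
    }
}
```

The get/set methods form the interface through which ordinary Java code would access the natively managed structure.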
At Rochester, we are implementing a compiler infrastructure with multiple
front- and back-ends which treats all the scenarios mentioned above uniformly
(cf. Section 8.4). We expect to include traditional optimizations such as
cache optimizations, register allocation, etc.
5.3. Interaction with runtime [RICE]
[original text:6.3rice) Interaction with runtime]
- Function Dispatch
- Coexistence with Interpreted Code
- Method Caching
- Dynamic Updating of Applications
- Source-to-Source transformation
5.4. Tools interface [RICE]
[original text:6.4rice) Tools Interface]
- 6. Runtime Support
- 6.1. Dynamic communication and distributed nameservers [TEXAS]
- [original text:7.1.1tex) Communication Requirements of Java Applications]
- The requirements for communication mechanisms in Java should be driven
by the applications of Java. The general nature of these applications
is discussed in the position paper "A Model of Parallel Computation for
Java." The point emphasized here is that many HPC-based Java applications
will involve large dynamic sets of interacting agents.
- Therefore, any extension of Java for support of HPC-integrated applications
should endeavor to make communication among large dynamic sets easy and
convenient.
[original text:7.1.2tex) The Current Model of Communication]
- The model of communication for Java is implicitly defined. It is based on
assigning a unique name to every communicating agent and/or every object
used in communication.
- The benefits of this model of communication are that it is familiar and
that a large body of technology supports its implementation.
[original text:7.1.3tex) Desirable Properties of Model of Communication for Java]
- a) It must be simple and uniform.
- b) It should give convenient support for communication among dynamically
constituted sets of agents.
- c) It should support continuous dynamic mapping of Java implemented agents
and objects to dynamic resource configurations.
- d) It must be consistent with the philosophy and current features of Java
and implementable with consistent syntax and semantics.
[original text:7.1.4tex) Proposed Model of Communication]
- The proposed model of communication is state-based communication. The elements
are as follows.
- a) Each class will have another declaration called a "profile" declaration.
The profile declaration will define the set of communications which an instance
of the class is prepared to accept at a given time.
- Profiles are dynamic and reflect the state of each object at a given time.
The profile declaration includes methods for initializing and modifying
profiles.
- b) Each set-oriented communication will have two parts: a selector and some
data. A selector is a template for a predicate over state variables which
will occur in profiles.
- c) A communication is initiated by an object creating a communications object
which includes a selector and the associated data. Creation of the communication
object invokes an associative broadcast of the communication object. An
associative broadcast determines all of the object profiles which match
the selector and delivers the data to them.
- A detailed definition of associative broadcast can be found in Bayerdorffer
[BAY94].
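The elements above can be sketched in ordinary Java. This is only an illustration under invented names: the proposal's "profile" declarations would be language syntax, modeled here as a plain map of state variables, and the selector as a predicate over profiles.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// A profile records the state an object currently exposes for matching.
class Profile {
    final Map<String, Object> state = new HashMap<>();
    void set(String var, Object val) { state.put(var, val); }
    Object get(String var) { return state.get(var); }
}

// A communicating agent has a dynamic profile and can receive data.
interface Agent {
    Profile profile();
    void deliver(Object data);
}

// Creating a communication (selector + data) triggers an associative
// broadcast: the data is delivered to every agent whose profile matches.
class AssociativeBroadcast {
    private final List<Agent> agents = new ArrayList<>();
    void register(Agent a) { agents.add(a); }
    void send(Predicate<Profile> selector, Object data) {
        for (Agent a : agents)
            if (selector.test(a.profile())) a.deliver(data);
    }
}
```

Because selection is over state rather than names, the sender never needs to know the identities of the receiving agents.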
[original text:7.1.5tex) Integration/ Implementation Concepts]
- This model of communication can readily be integrated into the conceptual
framework for system services provided by Java. One possible integration follows.
- a) This model of communication can be implemented in Java using the "implements"
attribute, in the same way as is done for threads with the class "Runnable."
- b) The interpretive model of execution allows for ready mapping of communication
objects to mechanisms appropriate to local or remote communication.
[original text:7.1.6tex) Comments]
- a) This model of communication is familiar. It is a run-time implementation
of a name server. Linda [GER86] uses a model of communication with the same
class of capabilities but with a different set of implementation concepts.
- b) Distributed resource management algorithms such as fault recovery, reconfiguration,
addition and deletion of system resources are much simpler in a state-based
model of communication than in a unique name- based model of communication.
6.2. Parallel and conventional CORBA interface to Java [INDIANA]
[original text:7.2.1ind) CORBA Model]
- CORBA uses the following model: a client program (read: applet here) requests
a "pointer" to a remote object by making the request to an "object request
broker" (ORB). The ORB locates the object and returns a "pointer" (in Java
this is implemented as a reference to a local "proxy" object for the remote
object; any request made to the proxy is carried out by the remote object).
- All objects in CORBA have interfaces that are defined in terms of IDL (interface
definition language). These interfaces are very similar to the "Interface"s
in Java.
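The ORB/proxy arrangement described above can be sketched in Java. All names here (Account, Orb, resolve) are invented for illustration: the client asks the ORB for a reference and receives a local proxy, and each call on the proxy is carried out by the remote object.

```java
// Plays the role of an IDL-defined interface.
interface Account {
    int balance();
}

// The remote object, living somewhere on the network.
class AccountImpl implements Account {
    public int balance() { return 42; }
}

// The local proxy the client actually holds.
class AccountProxy implements Account {
    private final Account remote;   // stands in for the wire connection
    AccountProxy(Account remote) { this.remote = remote; }
    public int balance() {
        // A real proxy would marshal the request, send it through the ORB,
        // and unmarshal the reply; here we simply forward the call.
        return remote.balance();
    }
}

class Orb {
    // resolve() would normally locate the object on the network;
    // here it just wraps the server-side object in a proxy.
    static Account resolve(Account serverSide) {
        return new AccountProxy(serverSide);
    }
}
```

Note how the Java interface doubles as the contract both sides share, which is exactly the HORB-style use of a Java interface as the IDL.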
[original text:7.2.2ind) Current Activities]
- Rajeev and Gannon have been investigating this for about a month now.
- There are already two very interesting efforts to build a link between Java
and CORBA, the industry standard for distributed object management. One
is JIDL from Sandia and the other is HORB from ETL in Japan. You should
all look at both of these.
- JIDL => see this => http://herzberg.ca.sandia.gov/jidl/
- HORB => see this => http://ring.etl.go.jp/openlab/horb/.
- The difference between JIDL and HORB can be characterized by the way
the IDL mechanism works. In HORB, Java objects talk to other Java objects by
using the Java interface as the IDL. In JIDL, Java applets talk directly
to a real CORBA ORB.
[original text:7.2.3ind) HPJava Issues]
- For HPJava we see several important issues:
- a) A link with standard CORBA objects is essential.
- That is, a CORBA object on the network would be accessed and manipulated
by a local Java proxy object in the applet.
- If this is the case, we need to translate from CORBA IDL to Java. There
are some technical issues to solve:
- How could the Java bytecode interact with formats used in CORBA implementations?
- Indeed, are all the IDL types expressible in Java? For example, arrays
and streams?
- b) How can we design an ORB interface that satisfies the Java security model? (More on that later.)
6.3. Low level runtime issues -- PCRC Distributed Memory [SYRACUSE with Ranka at FLORIDA]
[original text:8.1.1syr) Introduction]
We consider using Java to also provide low-level runtime support for the
Meta-Challenges and Distributed Interactive Simulations (DIS) discussed
above, as well as the traditional Grand and National Challenges: coding the
high-performance runtime system in Java (i.e., traditional PCRC-like, including
Parti/Chaos). There are two basic sets of issues that a runtime system designer
must consider for traditional HPC challenges:
- ** the interface with the application language, e.g., Fortran with message
passing (HPF) or C/C++ with message passing (HPC/C++). Note: the application
could be coded in Java (or VRML)
- ** the interface with the underlying process control and communications
library, e.g., PVM or MPI.
[original text:8.1.2syr) Features for PCRC Runtime in Java]
The Syracuse University Fortran 90D runtime consists of about 300 routines/functions.
We expect that number to drop below 150 in the newer PCRC implementation
for handling regular computations and communications. To this we add an
estimated 60 functions from the Maryland CHAOS runtime for handling
the irregular case. At present our Fortran 90D runtime (including its use
of Parti/Chaos) requires only 23 different MPI functions. The same will
hold for the newer PCRC version. We claim the following:
- ** All address translation routines can be implemented in Java (since they
usually do not involve communication)
- ** All data movement routines can be implemented in Java, provided we establish
a smooth interface between Java and underlying communication system (e.g.,
MPI)
- ** At present, certain computational functions may have higher performance
in other languages: e.g., MATMUL may initially be left in Fortran, since
Fortran still has better number-crunching performance and more easily handles
situations like COMPLEX data types.
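To make the first claim concrete, here is a minimal sketch of one address-translation routine of the kind that could move into Java: mapping the global index of a block-distributed array to a (processor, local index) pair. The class and method names are illustrative, not part of any existing runtime.

```java
// Block distribution: each processor owns one contiguous block of the array.
class BlockDist {
    final int blockSize;

    BlockDist(int globalSize, int numProcs) {
        // ceiling division so the last block may be partially full
        blockSize = (globalSize + numProcs - 1) / numProcs;
    }

    int owner(int globalIndex) { return globalIndex / blockSize; }  // which processor
    int local(int globalIndex) { return globalIndex % blockSize; }  // offset on it
}
```

As the text notes, such translation involves no communication, so nothing stands in the way of a pure-Java implementation.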
[original text:8.1.3syr) Issues for PCRC Runtime in Java]
As we re-implement the PCRC runtime system for regular and irregular distributed
arrays, we must address the following design issues:
- ** Java currently provides facilities for concurrent processing via threads
and monitors. Should we directly extend Java's (native) functionality to
include more parallel-processing support?
- ** The interface between Java and low-level communication systems must be
better understood and benchmarked. For instance, Java can currently call a C
function when that function has been specially written to be called by Java.
We need to experiment and evaluate how the Java runtime environment (java)
can collaborate with the MPI runtime environment (mpirun).
- ** The PCRC runtime has deeper roots in "true" compiled languages than in
interpreted ones. We prototyped the Fortran 90D interpreter (demonstrated
at SC 94) after developing the Fortran 90D compiler and runtime. Thus we
believe experimentation with prototype interfaces between a PCRC runtime
system (for regular and irregular distributed arrays) coded in Java and
the native Java runtime for interpreted environments is needed.
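One shape such a Java-to-MPI interface might take is sketched below. The class and method names are invented; they echo MPI point-to-point calls, and in a real binding they would be declared `native` and implemented over the C MPI library (real MPI calls also carry datatype, tag, and communicator arguments). A trivial in-process mailbox stands in here so the sketch is self-contained.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

class Mpi {
    // One mailbox per rank; a real binding would hand buffers to MPI instead.
    private static final Map<Integer, Deque<byte[]>> mailboxes = new HashMap<>();

    // would be: static native void send(byte[] buf, int destRank, ...);
    static void send(byte[] buf, int destRank) {
        mailboxes.computeIfAbsent(destRank, k -> new ArrayDeque<byte[]>())
                 .add(buf.clone());
    }

    // would be: static native byte[] recv(int myRank, ...);
    static byte[] recv(int myRank) {
        return mailboxes.get(myRank).remove();
    }
}
```

Benchmarking exactly this boundary (how byte arrays cross from the Java runtime into the MPI library and back) is the experiment proposed above.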
[original text:8.2.4ind) Java Distributed Memory Model]
- As mentioned above, we can do HPF-style partitioned arrays, but a more
interesting thing from the PCRC view might be the Metachaos shared objects.
We have been looking at this from the CORBA side and realized that doing it
right requires a substantial extension of the CORBA semantics. Kate Kaehey
at Indiana has worked this out as part of the HPC++ CORBA extensions (she
will present this at POOMA next week).
- From the Java perspective, here is the model used in the HPC++ CORBA
extensions. (The Maryland gang can comment on "true" Metachaos; this is only
how we have come to view the subject.)
6.4 Low level runtime issues -- Threads [ROCHESTER]
[original text:8.2.1roc) How to implement the thread packages on which Java is based across
a range of architectures]
[original text:8.2.2roc) How to unify shared-memory and message-passing portions of web-spanning
applications.]
[new text from ROCHESTER:]
Java supports threads for concurrency. This is useful as perceived by a
user when response time is important, e.g. in the HotJava browser, where
you can scroll a page while down-loading an image. Java does not currently
support threads running on separate processors. Our main objective in this
section is to support parallel threads at no expense to the Java programmer.
We would prefer not to change, modify, or limit the language as perceived
by the user.
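For reference, the current Java thread model works as follows: concurrency is expressed by implementing Runnable and handing the object to a Thread, which the runtime multiplexes on a single processor. The class below is an invented example in the spirit of the image-downloading scenario above.

```java
// A background task in the current (single-processor) Java thread model.
class ImageLoader implements Runnable {
    volatile boolean done = false;
    public void run() {
        // stands in for down-loading an image while the user keeps scrolling
        done = true;
    }
}
```

A caller writes `new Thread(new ImageLoader()).start()`; the scenarios below ask how this same programmer-visible interface could map onto truly parallel threads.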
There seem to be two possible scenarios:
- 1) Replace the current Java thread package: implement a completely new native threads package for each architecture
to be supported. Use this package instead of the Java thread class.
- The advantage of this approach is that no special compiler support is required.
It is less susceptible to changes in the language. The disadvantage is that
this might not be possible. If the interpreter actually implicitly "implements"
threads where it just branches from one context to another on a yield or
a scheduling call, then replacement might not be feasible. This obviously
requires a more complete knowledge of how the interpreter behaves with
respect to thread library calls.
- 2) Add a new parallel thread package: implement a parallel thread package for each architecture that is independent
of the Java thread class. Maintaining the original classes may be useful
for relatively cheap multi-threading on a single processor and also we may
not have a choice!
- The advantage is that we may be able to implement the parallel threads mechanism
independent of the interpreter to the extent that a Java compiler can insert
calls to the parallel threads package directly when appropriate. The disadvantage
is that the compiler has to do more work distinguishing between parallel
threads and normal Java threads and when it is legal to replace the latter
with the former.
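Scenario (2) might look like the following sketch: a thread class independent of java.lang.Thread which a compiler could target directly. All names are invented; on a uniprocessor it simply delegates to a Java thread, while a parallel implementation would create the thread on another processor.

```java
// A parallel-thread abstraction separate from the Java thread class.
class ParThread {
    private final Thread impl;                 // single-processor fallback

    ParThread(Runnable body) { impl = new Thread(body); }

    // A parallel implementation would also choose a target processor here.
    void fork() { impl.start(); }

    void join() {
        try { impl.join(); } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }
}
```

Because ParThread is a separate class, the compiler's job is exactly the one described above: deciding when a program's Java threads may legally be replaced by ParThread instances.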
At Rochester, we are particularly interested in implementing a parallel
thread package for Java with special compiler support on distributed shared-memory
(DSM) machines. On scalable multiprocessors, communication is much more
expensive than computation. Many different protocols have been proposed
for behavior-driven locality management. None of them work best for all
types of data sharing. Performance would be increased if the system could
adapt to the program's memory access patterns. An example could be the use
of multithreading to alleviate memory latency (especially on remote accesses)
aggravated by delays introduced by the memory coherency protocol. We propose
to analyze programs at compile time and annotate them with data access summary
information. Exploiting this information will require the ability to switch
among multiple data coherence protocols at run time.
[original text:8.2.3ind) Performance of Current Java Threads]
- This is a clear win for Java. The question is whether there is an "enhanced"
thread interface that is better than the current "Runnable" in Java. We
will do some SMP experiments to measure the current Java runtime on SMP
systems. We have both the IBM and SGI versions.
- 7. Architecture Issues for Integration of Java with HPCC
- 7.1. Shared Memory Architecture Issues [ROCHESTER]
- [original text:5.1roc) Generate native code for shared-memory architectures -- cf: sec 5.1]
- [original text:5.2roc) Java vs Bytecode -- cf: sec 5.2]
- [original text:5.3roc) Locality optimizations for (distributed) shared-memory architectures]
- Interesting optimizations are:
- (a) data mapping and transformations,
- (b) cache optimizations, and
- (c) register allocation. Java provides new opportunities and challenges
that are different from those in other languages like Fortran.
[original text:5.4roc) How to integrate shared-memory environments across a heterogenous
collection of machines and interconnects, respecting dramatically different
latencies and bandwidths]
- How to compile for hierarchical shared-memory systems composed of cache-coherent
MPs at the top, remote-memory networks in the middle, and message-passing
networks at the bottom.
7.2. Heterogeneity [INDIANA]
[original text:5.5roc) Compiling Java for heterogenous networks of machines]
- There is heterogeneity in various aspects of parallel programming: program,
processor, memory, and network. A heterogeneous program has parallel loops
with a different amount of work in each iteration; heterogeneous processors
have different speeds; heterogeneous memory refers to the different amounts
of user-available memory on the machines; and a heterogeneous network has
different costs of communication between processors. We need both compiler
and runtime support.
[original text:5.6ind) General Issues]
- Interoperability/transparency is a strong point behind the design of Java
(creation of bytecode, etc.); HPJava should suit any target architecture well.
- Different precompiled classes could be a way to tackle this.
7.3. Fault-tolerance and Reconfigurability [TEXAS]
[original text:7.1.6tex) Comments -- cf: sec 6.1]
7.4 Security [INDIANA]
[original text:7.2.3ind) HPJava Issues -- cf: sec 6.2]
- 8. Some Early Experiments.
- Details could go into appendices
- 8.1. At Maryland -- Maryland Responsible for Name(s) [UMD]
- [original text:7.4.1umd) Mobile Programs]
The intent is to provide a single unified framework for remote access and
remote execution in the form of itinerant programs. Itinerant programs can
execute on and move between the user's machine, the servers providing the
datasets to be processed or third-party compute servers available on the
network. The motion is not just client-server; it can also be between servers.
Programs move either because the current platform does not have adequate
resources or because the cost of computation on the current platform or
at the current location in the network is too high. Various parts of a single
program can be executed at different locations, depending upon cost and
availability of resources as well as user direction. The architecture would
also provide support for plaza servers. Plaza servers provide facilities
for (1) execution of itinerant programs, (2) storage of intermediate data,
(3) monitoring cost, usage and availability of local and remote resources
on behalf of itinerant programs and alerting them if these quantities cross
user-specified bounds, (4) routing messages to and from itinerant programs,
and (5) value-added access to other servers available on the network. Examples
of value-added functionality include conversions to and from standard formats
and operations on data in its native format. In addition to allowing cost
and performance optimization, itinerant programs and plaza servers also
provide a flexible framework for extending the interface of servers, in
particular legacy servers with large volumes of data as well as servers
that are not under the control of the user.
[original text:7.4.2umd) Coupling sequential Java Programs]
MPI should be bound to Java so that Java programs can communicate by message
passing. We believe that applications will require an ability to process
periodically generated data. Programs to carry this out could be written
in MPI; however, a higher-level library might prove to be useful.
Consider long-running itinerant programs that process periodically generated
data; each program processes sequences of data from multiple sources with
possibly different periodicity. An itinerant program can either visit each
of the data sources in order or it can install a surrogate at each data
source, which is activated every time the dataset is updated. A surrogate
acts as a filter for the sequence of data generated by the data source and
communicates appropriate updates to the parent itinerant program (possibly
after preliminary processing). For complex processing on a number of datasets,
a hierarchy of surrogates and result combining programs can be created.
The result of such itinerant programs can either be a single accumulated
result or a sequence of results. What we have in mind is a scheme that is
an extension of our existing scheme for mapping and synchronizing multiple
update sequences, which has been used in our program coupling effort. This
scheme has been used to dynamically link multiple physical simulations being
performed concurrently.
[original text:7.4.3umd) Coupling HPH programs with one another and with other data parallel
programs (e.g. MPI, HPF, HPC++)]
We consider the problem of efficiently coupling multiple data-parallel programs
at runtime. We propose an approach that establishes mappings between data
structures in different data-parallel programs and implements a user-specified
consistency model. Mappings are established at runtime and can be added
and deleted while the programs being coupled are in execution. Mappings,
or the identity of the processors involved, do not have to be known at compile-time
or even link-time. Programs can be made to interact with different granularities
of interaction without requiring any re-coding. A priori knowledge of consistency
requirements allows buffering of data as well as concurrent execution of
the coupled applications. Efficient data movement is achieved by pre-computing
an optimized schedule. (This actually is already PCRC work and will appear
in a paper at the ICS conference in May).
8.2. At Syracuse -- Syracuse Responsible for Section Name(s) [SYRACUSE]
[original text:7.4.4syr) WebWork: Integrated Programming Environment Tools for National
and Grand Challenges]
This is our joint work with Boston University and Cooperating Systems, with
the paper written before the implications of Java were as clear as they are now!
[original text:7.4.5syr) Computational Web Multiserver - a proposal for Web based HPCC]
updates some of the ideas in WebWork
- Web technology development has so far focused on the client side (Netscape2,
plugins, Java applets, JavaScript)
- Web servers, so far minimal (e.g. NCSA httpd), are expected to expand now
with the growing demand for backend processing (database, collaboratory,
imaging, compression, rendering, simulation, televirtuality).
- Java offers a promising development platform for computationally extended
Web servers. Current activities include:
- Java is also an attractive platform for PCRC module packaging and unification
of HP languages. Advantages of a Java-based HP runtime include:
- * secure, architecture neutral, OO but simpler than C++ framework
- * runtime modules dynamically downloadable (via portable opcodes)
- * native class support for performance critical (data parallel) ops
- * built-in OO support for networking and multithreading
- * Java compiler written in Java => portable HP extensibility
- Computational Web server = Java based Web server with native HPCC kernel
support (e.g. for matrix operations), extensible by PCRC modules
- * HPCC builds on pervasive Web technologies
- * Web browser based interactive insight into HPCC runtime
- A higher, fully interpreted layer with intuitive scripting syntax is required
for rapid prototyping and high-level runtime communication. JavaScript/LiveWire
(now formalized/standardized by W3C) is one possible candidate.
- However, a more attractive approach is to develop a high-level VM on top of
the low-level Java VM to support a variety of interpreted protocols. (One
existing example: the postfix-syntax-based low Telescript layer.) We call it
WebScript and view JavaScript, LiveWire, and VRML as specific instances/little
languages.
- Computational Web Multiserver Architecture:
- * runtime given by a mesh of computational web servers
- * several communication layers possible (HTTP, Java opcode passing, Java
class passing, E-object message passing, WebScript passing)
- * applications written in conventional languages (HPF/C++) or HPJava, HPJavaScript
(HPWebScript)
- * base runtime support server-resident, application-specific runtime libraries
downloaded dynamically and shared
- Summary of software layers:
- a) Java VM
- b) base Java classes (as in JDK 1.0)
- c) native HPCC classes (compiler directive handlers, matrix algebra)
- d) PCRC classes (Java or Java wrappers)
- e) WebScript VM (in Java)
- f) HP-WebScript instances (e.g. HP-JavaScript)
- g) applications
- At NPAC, we developed early prototypes of some of these concepts (with ARPA
support) and demonstrated them at SC'93. The MOVIE system was an early example
of a computational "web" multiserver ("web" didn't exist before '93..). MOVIE
+ F90D based HPFI was an early example of a high-performance interpreted
computational multiserver. More specifically, the layers a) - g) above were
implemented as follows:
- a) MOVIE/TCE kernel (C with object-based organization)
- b) built-in (C coded) objects (string, dictionary, hashtable etc.)
- c) matrix algebra objects (fields = polymorphic n-dim arrays)
- d) F90D runtime, wrapped and linked to MOVIE via UNIX shared mem
- e) MovieScript VM (part of MOVIE kernel)
- f) HPMS (High Performance MovieScript = MovieScript + interpreted F90)
- g) HPF test applications, converted by modified F90D frontend to HPMS
8.3. At Indiana -- Indiana Responsible for Name
[original text:8.2.5ind) Implications of Current Indiana Research on HPC++ and CORBA]
- Kaehey's Collective Shared Objects are objects that live on the net. They
can be distributed or localized on a particular server. These objects can
have data members, which may also be distributed or localized, and they have
two types of member functions: regular and collective. Regular ones are
normal member functions as in standard Java. Collective member functions
implement collective operations such as reductions, scans, and broadcasts
between participating clients.
- A client applet must "register" with such an object before it is able
to do collective member operations. It turns out that this is an
interesting way to model distributed databases and Metachaos-style shared
arrays. It was the essential component of our I-way project for SC 95.
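A sequential sketch may make the register-then-operate protocol concrete. All names below are invented; a real collective shared object would live on a server and coordinate many clients, whereas this stand-in only enforces the registration rule and performs a reduction.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// A toy "collective shared object": clients register, contribute values,
// and a collective member function reduces over the contributions.
class SharedSum {
    private final Set<String> members = new HashSet<>();
    private final List<Integer> contributions = new ArrayList<>();

    void register(String client) { members.add(client); }

    void contribute(String client, int value) {
        if (!members.contains(client))
            throw new IllegalStateException("client must register first");
        contributions.add(value);
    }

    int reduceSum() {               // collective operation across clients
        int s = 0;
        for (int v : contributions) s += v;
        return s;
    }
}
```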
[new text from ROCHESTER:]
8.4 (Rochester) Early Experiments -- Java Compiler Infrastructure [ROCHESTER]
As part of our effort to better understand the compilation issues, at Rochester
we are developing a Java compiler which could be used to optimize Java programs.
The compiler would use a common intermediate representation and a collection
of front- and back-ends to facilitate re-use of components and independence
of optimization passes.
There are two possible front-ends: one would read Java source, the other
one would read bytecode.
A set of back-ends would include Java source generation, bytecode generation,
and "native" code generation (either in the form of some target HLL like C,
or actual native code for some architectures).
                         /=====> Java
Java     ----\          /
              ======> IR -------> bytecode
bytecode ====/          \
                         \-----> C (or other "native")
Our current development concentrates on a bytecode front-end and a Java
back-end (shown above with double lines). The choice was made because using
bytecode as input is more general (if we have access to the source, we also
have access to its bytecode form -- the converse is not true). Using Java
as output has a big advantage in the development phase.
The implementation is not complete: only a subset of legal bytecode constructs
is implemented and the definition of our IR is not yet stable. However,
preliminary results are promising. We are able to identify implicit high-level
constructs such as loops with good success. Also, our high-level IR makes the
implementation of code transformations relatively simple.
Code instrumentation is our first application of the compiler framework.
We are developing a pass which inserts instrumentation calls. Currently
we instrument loops, if-then-else statements, and methods. We plan to
look into the performance of a range of Java applications. The objective
is to determine whether there are any performance issues unique to Java
the language (the fact that it is interpreted) and/or to applications
written in Java (it could be the case that only certain kinds of applications
are worth writing in Java).
An Example.
We are given a bytecode representation of a method corresponding to the
following disassembled code.
Method void paintAll(java.awt.Graphics)
0 iconst_0
1 istore_2
2 goto 19
5 aload_0
6 aload_0
7 getfield #7 <Field PCRCexample.CellList [I>
10 iload_2
11 iaload
12 aload_1
13 invokevirtual #6 <Method PCRCexample.plotCell(ILjava/awt/Graphics;)V>
16 iinc 2 1
19 iload_2
20 aload_0
21 getfield #4 <Field PCRCexample.n I>
24 if_icmplt 5
27 aload_0
28 aload_1
29 invokevirtual #3 <Method PCRCexample.updateStatusLine(Ljava/awt/Graphics;)V>
32 return
We can parse the class file and recover a high-level intermediate representation
which corresponds to the following source (this was generated automatically
by our Java-source back-end).
public void paintAll (
java.awt.Graphics a1
)
{
int v0;
v0 = 0;
while((v0 < this.n)) {
this.plotCell(this.CellList[v0], a1);
v0 = (v0 + 1);
} //while
this.updateStatusLine(a1);
return;
} // paintAll
Since our IR explicitly identifies high-level constructs (e.g., loops), it
is easy to instrument the code and/or perform optimizing transformations.
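To illustrate, here is a hand-written sketch of what the instrumentation pass could emit for a paintAll-like method. The Instr class and the simplified method body are invented (the awt calls are replaced by no-ops) so that the sketch is self-contained and runnable.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical runtime support the pass would call into.
class Instr {
    static final Map<String, Integer> counts = new HashMap<>();
    static void event(String tag) { counts.merge(tag, 1, Integer::sum); }
}

// Simplified stand-in for the PCRCexample class from the listing above.
class PaintAllSketch {
    int n = 3;
    int[] CellList = {10, 20, 30};
    void plotCell(int cell) { /* stand-in for PCRCexample.plotCell */ }
    void updateStatusLine() { /* stand-in */ }

    public void paintAll() {
        Instr.event("enter:paintAll");              // inserted: method entry
        int v0 = 0;
        while ((v0 < this.n)) {
            Instr.event("iter:paintAll.while");     // inserted: loop iteration
            this.plotCell(this.CellList[v0]);
            v0 = (v0 + 1);
        } //while
        this.updateStatusLine();
        Instr.event("exit:paintAll");               // inserted: method exit
    }
}
```

Because the loop is an explicit node in the IR rather than a pattern of gotos, placing the iteration counter is a purely local transformation.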
8.5 (Rochester) Early Experiments -- Network-based Data Mining [ROCHESTER]
We have implemented a preliminary version of network-based data mining at
Rochester. Currently, the server is a passive entity in terms of the data
mining computation. Its only job is to accept client connections and ship
off requested data. The mining process is carried out at the client end.
Both the client and server are currently sequential.
We now present some timing results vis-a-vis a C++ implementation of the
algorithm which doesn't transmit any data over the network. Below we show
the two synthetic databases from IBM that we looked at:
Database Num_Trans Avg_Trans_Size Total_Size
-----------------------------------------------------------
T5.I2.D100K 100,000 5 2.615 Mbytes
T10.I4.D100K 100,000 10 4.31 Mbytes
These databases are accessible only to the server through a file server.
We now present the total execution time for the algorithm. The
Java_Time includes the server disk-read time, transmission time and
client computation time. C++_Time only gives the disk-read time and
computation time.
Database Total_Size Java_Time C++_Time
--------------------------------------------------
T5.I2.D100K 2.615 Mbytes 367.7s 15.3s
T10.I4.D100K 4.31 Mbytes 1847.2s 78.9s
The break-up of the execution time is as follows:
Database Total_Size Java:Reading Computation C++:Reading Computation
------------------------------------------------------------------------
T5.I2.D100K 2.615 Mbytes 292.6s 75.1s 8.8s 6.5s
T10.I4.D100K 4.31 Mbytes 478.4s 1368.8s 13.8s 65.1s
In Java the net disk-read and shipping rate for the data is 9 KB/sec, vs.
a 0.31 MB/sec disk-read rate in C++.
From the experiments, we make two observations:
- (1) interpretation is too slow for computationally intensive applications
(in the case of database T10.I4.D100K, 1368s vs 65s);
- (2) we should pay the network communication cost only when the server is
too overloaded to perform the computation.
We will address these issues in our future research.