Forwarded: Thu, 02 May 1996 20:08:41 -0400
Forwarded: leskiwd@npac.syr.edu
Received: from ringworm.cs.UMD.EDU (ringworm.cs.umd.edu [128.8.128.41]) by postoffice.npac.syr.edu (8.7.5/8.7.1) with ESMTP id NAA28090 for <gcf@npac.syr.edu>; Mon, 29 Apr 1996 13:56:17 -0400 (EDT)
Received: by ringworm.cs.UMD.EDU (8.7.5/UMIACS-0.9/04-05-88)
	id NAA01017; Mon, 29 Apr 1996 13:56:15 -0400 (EDT)
From: als@cs.UMD.EDU
Date: Mon, 29 Apr 1996 13:56:15 -0400 (EDT)
Message-Id: <199604291756.NAA01017@ringworm.cs.UMD.EDU>
To: gcf@npac.syr.edu
CC: saltz@cs.UMD.EDU
Subject: HPCC and Java report
Content-Type: text
Content-Length: 12887

Here are the Maryland sections in HTML, including the headers from your
version (mainly the old section numbers).

Alan

----------

<DD><B>2.4 An Example -- Remote Sensing and Data Fusion<I> [UMD]</I></B> 
<DD><I>[original text:</I> <A HREF="http://www.npac.syr.edu/users/gcf/hpjava.html#sec2.5umd">2.5umd) Remote sensing data fusion</A><I>]</I> 
<DD>There are many applications that are likely to require the generation of
data products derived from multiple sensor databases. Such databases could
include both static databases (such as databases that archive LANDSAT or AVHRR
images) and databases that are periodically updated (like weather map
servers).

<P>Two examples of remote sensing defense applications are <STRONG>battle
management</STRONG> and <STRONG>distributed simulation</STRONG>.  A battle
management application
could take advantage of multiple sensing devices, both airborne and on
satellites.  Data from multiple remote sensing devices would be integrated
to look for various untoward events, such
as a collection of tanks on the move, aircraft taking off from
an enemy runway, etc.  A contemporary example would involve real-time
tracking of rocket attacks in northern Israel.  One distributed
simulation application would be to train soldiers in a hybrid
environment, consisting of
 a combination of simulated and actual troops and equipment. For
instance, a commander could command some number of real troops, tanks and
planes,
as well as many more simulated troops and pieces of equipment. Developing
the technology to allow for such a hybrid training environment
is already part of an important, well-funded ARPA program.
At the last ARPA meeting, it was claimed that some form of hybrid
battle training had been successfully tested in Europe.


<P>The execution model for these applications is for a client to direct
(High Performance) Java programs to run on the database servers. The Java
programs would make use of the server's computational and data resources to
carry out (at least) partial processing of data products. Portions of the
processing that require the integration of data from several databases
would be carried out on the client or by a Java program running on another
computationally capable machine.  The main question for such a model is
whether the programs executing on the servers are solely Java programs, or
Java programs that utilize the services of programs written in other
languages (e.g. High Performance Fortran).
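<P>The execution model above can be sketched in a few lines of present-day Java.
This is only an illustration under stated assumptions: <CODE>SensorServer</CODE>,
<CODE>FusionClient</CODE> and the threshold-based partial product are hypothetical names
invented for the example, and the "servers" are simulated in one process rather than
reached over a network.

```java
import java.util.List;
import java.util.function.Function;

/** Hypothetical stand-in for a database server that runs client-supplied code next to its data. */
class SensorServer {
    private final double[] readings; // data that stays on the server

    SensorServer(double[] readings) {
        this.readings = readings;
    }

    /** Runs the client's program locally and returns only the (small) partial product. */
    <R> R run(Function<double[], R> program) {
        return program.apply(readings.clone());
    }
}

/** Client side: direct the same program to every server, then integrate the partial products. */
class FusionClient {
    /** Partial processing done on a server: count and sum of readings above a threshold. */
    static double[] partial(double[] data, double threshold) {
        double count = 0, sum = 0;
        for (double v : data) {
            if (v > threshold) {
                count++;
                sum += v;
            }
        }
        return new double[] { count, sum };
    }

    /** Integration done on the client: mean of all readings above the threshold. */
    static double fusedMean(List<SensorServer> servers, double threshold) {
        double count = 0, sum = 0;
        for (SensorServer s : servers) {
            double[] p = s.run(d -> partial(d, threshold));
            count += p[0];
            sum += p[1];
        }
        return count == 0 ? Double.NaN : sum / count;
    }
}
```

<P>Only the two-number partial product leaves each server; the raw readings do not.
The open question raised above (pure Java on the server versus Java calling code in
other languages) would only change what happens inside <CODE>run</CODE>.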

<DD><B>2.5 An Example -- Medical Interventional Informatics<I> [UMD]</I></B> 
<DD><I>[original text:</I> <A HREF="http://www.npac.syr.edu/users/gcf/hpjava.html#sec2.6umd">2.6umd) Interventional Epidemiology/Medical crisis management</A><I>]</I> 
</DL>
<P>The first set of applications applies to medical emergencies associated with
armed combat, terrorist actions, inadvertent releases of toxic chemicals,
or major industrial plant explosions. In such scenarios, the goal is to
rapidly acquire and analyze medical data as the crisis unfolds in order
to determine how best to allocate scarce medical resources. The applications
involve the discovery and integration of multiple sources of data that
might be in different geographical locations.  A doctor could, for instance, 
determine the best diagnosis and treatment for patients exposed
to combinations of chemicals whose effects have not been well characterized.
This class of scenarios is motivated by a recent ARPA-sponsored National
Research Council workshop on crisis management.

<P>Another set of applications involves discovering and exploiting clinical patterns in patient populations.
These scenarios involve study of health records to determine risk factors
associated with morbidity and mortality for specific subpopulations (e.g.
various substrata of armed forces personnel serving in a particular region).
In such scenarios, the goal is to provide high quality, low cost care
by exploring diverse clinical data sources for patterns and correlations
and by ensuring that significant data for individual patients is not
overlooked. For instance, a system of this type could rapidly track the
spread of serious
infectious diseases such as hepatitis B or HIV. It could also be used to
identify specific soldiers whose demographic profile or past medical history
might place them at particularly high risk of contracting such a disease.

<P>The systems software requirements for these medical scenarios should be very
similar to the requirements motivated by the sensor data fusion scenario,
namely employing a client program to direct Java programs to run on the
servers for the relevant databases.  The data mining aspects of these
applications could also require the client to direct Java programs to run on
computational servers, which may not be at the same site as the
database servers.

<P>
<DD><B>3.4 Data Parallelism<I> [UMD]</I></B> 
<DD><I>[original text:</I> <A HREF="http://www.npac.syr.edu/users/gcf/hpjava.html#sec1.6tex">1.6tex) What should NOT be done</A><I>]</I> 
<DD><I>[original text:</I> <A HREF="http://www.npac.syr.edu/users/gcf/hpjava.html#sec1.6ind">1.6ind) Is Data Parallelism relevant for Java</A><I>]</I> 

<P>There are several ways in which Java could be used to support data
parallel computing.  We assume that
Java may be extended in one of several ways. 
In what follows, we also 
regard any closely coupled set of machines as a parallel architecture.

<UL>
<LI> Java could be used as a 
mechanism for invoking data parallel codes on
remote compute or data servers. For instance, assume
that a user has access to a number of parallel machines.
On each of these machines, the user has already compiled a parallel
code. The user would make use of an agent written in an
extended version of Java to
find a machine with the necessary available computational resources or
necessary data, move the
required data to that machine, run the data parallel program and
move the computed results to the user's machine or to another
designated location. 

<P>If the user's code has not already
been compiled on a given parallel machine, a Java agent
could migrate the parallel source code and
compile the user code on the remote machine.
This scenario assumes that all necessary compilers, runtime libraries, etc.
are either already available on the remote parallel machine or 
that the agent is able to install the necessary system software.
<P>

<LI> A Java code could be used to invoke data parallel objects and libraries
on a parallel architecture. Assume that a known set
of parallel libraries and parallel objects are available on
each machine. Data parallel computing on a given
machine would be carried out by running a copy of a Java bytecode program
on each processor (SPMD). The SPMD Java program would make a series
of calls to parallel libraries and objects.
<P>

<LI> A Java code could carry out a set of computations that make
use of several different parallel machines. Java agents would
target a collection of parallel architectures. The Java agents
would identify necessary parallel libraries and objects on
each architecture (and if necessary, install libraries and objects
that were not already available on a given machine). Java agents
would coordinate the computations carried out on the different
parallel architectures as well as make use of communication libraries
(such as Maryland's Meta-Chaos library)
that are able to directly move data between parallel programs, even those
running on different parallel architectures.
<P>

<LI> The Java language and Java compilers could be modified so that
users could employ an extended version of Java to write data parallel programs.
In this scenario, it would be necessary to define a data parallel dialect of
Java. It appears that many of the lessons learned in developing
and implementing High Performance Fortran would carry over to 
Java. The task of implementing a data parallel Java is simplified
relative to what would be required for a data parallel dialect of C or C++,
because the Java language does not include explicit pointers.

</UL>
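<P>As a rough illustration of the second scenario (one copy of a Java program per
processor, calling a parallel library), the sketch below simulates processors with
threads in a single JVM. The names <CODE>SpmdLibrary</CODE>, <CODE>SpmdSum</CODE> and
<CODE>allReduceSum</CODE> are assumptions made for this example; they do not refer to
any existing parallel library.

```java
import java.util.concurrent.CyclicBarrier;

/** Hypothetical stand-in for a parallel library shared by all SPMD copies. */
class SpmdLibrary {
    private final double[] slots;
    private final CyclicBarrier barrier;

    SpmdLibrary(int size) {
        this.slots = new double[size];
        this.barrier = new CyclicBarrier(size);
    }

    /** Global sum: every rank contributes a value and every rank receives the total. */
    double allReduceSum(int rank, double local) throws Exception {
        slots[rank] = local;
        barrier.await();                 // all contributions are now visible
        double total = 0;
        for (double v : slots) total += v;
        barrier.await();                 // keep slots stable until every rank has read them
        return total;
    }
}

class SpmdSum {
    /** The SPMD program: each copy sums its own block, then calls the library. */
    static double program(int rank, int size, double[] global, SpmdLibrary lib) throws Exception {
        int chunk = (global.length + size - 1) / size;   // block distribution
        int lo = Math.min(global.length, rank * chunk);
        int hi = Math.min(global.length, lo + chunk);
        double local = 0;
        for (int i = lo; i < hi; i++) local += global[i];
        return lib.allReduceSum(rank, local);
    }

    /** Launches one copy of the program per simulated processor and returns each copy's result. */
    static double[] run(int size, double[] global) {
        SpmdLibrary lib = new SpmdLibrary(size);
        double[] results = new double[size];
        Thread[] copies = new Thread[size];
        for (int r = 0; r < size; r++) {
            final int rank = r;
            copies[r] = new Thread(() -> {
                try {
                    results[rank] = program(rank, size, global, lib);
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            });
            copies[r].start();
        }
        for (Thread t : copies) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return results;
    }
}
```

<P>Every copy executes the same <CODE>program</CODE> and differs only in its rank, which
is the essence of SPMD. On a real parallel machine the library call would be backed by
native communication rather than a shared array.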

<DD><B>8.1. At Maryland -- Migrating Programs <I>[UMD]</I></B> 
<DD><I>[original text:</I> <A HREF="http://www.npac.syr.edu/users/gcf/hpjava.html#sec7.4.1umd">7.4.1umd) Mobile Programs</A><I>]</I> 

<DT> <B>Mobile Programs</B>
<DD> The University of Maryland is pursuing a project designed
to provide a single unified framework for remote access and remote
execution in the form of a type of agent we call an itinerant
program. Itinerant programs can execute on and move between the
user's machine, the servers providing the datasets to be processed,
and third-party compute servers available on the network. The motion
is not just client-server; it can also be between servers. Programs
move either because the current platform does not have adequate
resources or because the cost of computation on the current platform
or at the current location in the network is too high. Various
parts of a single program can be executed at different locations,
depending upon cost and availability of resources as well as user
direction.
<P> The architecture also provides support for plaza
servers. Plaza servers provide facilities for:
<OL>
<LI> execution of itinerant programs,

<LI> storage of intermediate data,

<LI> monitoring cost, usage and availability of local and remote resources
on behalf of itinerant programs and alerting them if these quantities cross
user-specified bounds,

<LI> routing messages to and from itinerant programs, and

<LI> value-added access to other servers available on the network.
</OL>
Examples of value-added functionality
include conversions to and from standard formats and operations
on data in its native format. In addition to allowing cost and
performance optimization, itinerant programs and plaza servers
also provide a flexible framework for extending the interface
of servers, in particular legacy servers with large volumes of
data as well as servers that are not under the control of the
user. The University of Maryland has a prototype system that
supports plaza servers and itinerant programs. Program migration
is currently carried out either in response to an external signal
or when a potentially itinerant program invokes the necessary
runtime support. We are currently adapting
resource monitoring software so that we can demonstrate the ability
to carry out program migration in response to fluctuations in
processor and network load.
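<P>The combination of user-specified bounds and load-driven migration described above
might be sketched as follows. This is not the Maryland prototype: <CODE>PlatformStatus</CODE>,
<CODE>Bounds</CODE> and <CODE>MigrationPolicy</CODE> are hypothetical names, and the
"cheapest platform within bounds" rule is an assumed policy chosen for illustration.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical snapshot of a platform as a plaza server's monitor might report it. */
class PlatformStatus {
    final String host;
    final double costPerHour;   // assumed cost unit
    final double load;          // fraction of processor capacity in use, 0.0 to 1.0

    PlatformStatus(String host, double costPerHour, double load) {
        this.host = host;
        this.costPerHour = costPerHour;
        this.load = load;
    }
}

/** User-specified bounds; crossing either one is what triggers an alert. */
class Bounds {
    final double maxCostPerHour;
    final double maxLoad;

    Bounds(double maxCostPerHour, double maxLoad) {
        this.maxCostPerHour = maxCostPerHour;
        this.maxLoad = maxLoad;
    }

    boolean crossedBy(PlatformStatus s) {
        return s.costPerHour > maxCostPerHour || s.load > maxLoad;
    }
}

class MigrationPolicy {
    /** Returns the host to migrate to, or empty if the program should stay where it is. */
    static Optional<String> chooseTarget(PlatformStatus current,
                                         List<PlatformStatus> others,
                                         Bounds bounds) {
        if (!bounds.crossedBy(current)) {
            return Optional.empty();     // no alert, so no migration
        }
        return others.stream()
                .filter(s -> !bounds.crossedBy(s))   // only platforms within bounds
                .min(Comparator.comparingDouble((PlatformStatus s) -> s.costPerHour))
                .map(s -> s.host);
    }
}
```

<P>If the current platform crosses a bound but no other platform is within bounds, the
sketch also returns empty, so the program stays put rather than moving somewhere worse.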
<P>

<DT> <B>Coupling Data Parallel Programs</B>
<DD> Maryland has developed prototype software to demonstrate methods
designed to flexibly and efficiently couple multiple data parallel
and sequential programs at runtime.  The program coupling software
is used to allow the coordinated use of multiple separately
developed parallel programs in solving complex problems. The program
coupling software will also be used to couple itinerant programs.

<P>Maryland has developed a layer of runtime support (called Meta-Chaos)
to establish mappings and to carry out communication between data
structures in different sequential or data parallel programs.
Efficient data movement is achieved by pre-computing an optimized
communication schedule.  Our approach is to define a set of functions that each
runtime library should export and to build Meta-Chaos on top of
this minimal API. 
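<P>To make the idea of a pre-computed communication schedule concrete, here is a minimal
sketch for one-dimensional arrays. It is not the Meta-Chaos API: the
<CODE>Distribution</CODE> interface stands in for the kind of function a runtime library
might export (an assumption), and <CODE>BlockDistribution</CODE>, <CODE>Transfer</CODE>
and <CODE>Schedule</CODE> are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical minimal interface a runtime library would export: where does a global index live? */
interface Distribution {
    int owner(int globalIndex);
    int localOffset(int globalIndex);
}

/** Block distribution of n elements over p processors. */
class BlockDistribution implements Distribution {
    private final int chunk;

    BlockDistribution(int n, int p) {
        this.chunk = Math.max(1, (n + p - 1) / p);
    }

    public int owner(int g) { return g / chunk; }
    public int localOffset(int g) { return g % chunk; }
}

/** One contiguous transfer between a source and a destination processor. */
class Transfer {
    final int srcProc, srcOffset, dstProc, dstOffset;
    int length;

    Transfer(int srcProc, int srcOffset, int dstProc, int dstOffset, int length) {
        this.srcProc = srcProc;
        this.srcOffset = srcOffset;
        this.dstProc = dstProc;
        this.dstOffset = dstOffset;
        this.length = length;
    }
}

class Schedule {
    /** Precomputes the transfers that copy elements [0, n) from the src layout to the dst layout. */
    static List<Transfer> build(int n, Distribution src, Distribution dst) {
        List<Transfer> plan = new ArrayList<>();
        Transfer last = null;
        for (int g = 0; g < n; g++) {
            int sp = src.owner(g), srcOff = src.localOffset(g);
            int dp = dst.owner(g), dstOff = dst.localOffset(g);
            boolean extendsLast = last != null
                    && last.srcProc == sp && last.dstProc == dp
                    && last.srcOffset + last.length == srcOff
                    && last.dstOffset + last.length == dstOff;
            if (extendsLast) {
                last.length++;                     // merge into one larger message
            } else {
                last = new Transfer(sp, srcOff, dp, dstOff, 1);
                plan.add(last);
            }
        }
        return plan;
    }

    /** Executes the schedule; reusable as long as the layouts do not change. */
    static void execute(List<Transfer> plan, double[][] srcLocal, double[][] dstLocal) {
        for (Transfer t : plan) {
            System.arraycopy(srcLocal[t.srcProc], t.srcOffset,
                             dstLocal[t.dstProc], t.dstOffset, t.length);
        }
    }
}
```

<P>The schedule is built once from the two layouts and can then be executed repeatedly,
which is where the saving comes from. In this sketch the per-processor arrays live in one
address space; in a real coupling each <CODE>Transfer</CODE> would become a message.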

<P>At the level above Meta-Chaos, interaction between programs
occurs through a user-specified consistency model. Mappings are
established at runtime and can be added and deleted while the
programs being coupled are in execution. Neither the mappings nor the identity
of the processors involved has to be known at compile-time
or even link-time. A priori knowledge of consistency requirements by a
runtime library
allows buffering of data as well as concurrent execution of the
coupled applications. 
<P>

<DT><B> Periodic Data and Computation Servers</B>
<DD> The integration of program coupling software into itinerant
programs will make it straightforward to develop long-running
itinerant programs that process periodically generated data. We
anticipate that a collection of itinerant programs will process
sequences of data from multiple sources where each source may
have a different periodicity. An itinerant program can either
visit each of the data sources in order or it can install a surrogate
at each data source, which is activated every time the dataset
is updated. A surrogate acts as a filter for the sequence of data
generated by the data source and communicates appropriate updates
to the parent itinerant program (possibly after preliminary processing).
For complex processing on a number of datasets, a hierarchy of
surrogates and result combining programs can be created.  The
result of such itinerant programs can either be a single accumulated
result or a sequence of results.
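<P>The surrogate mechanism described above can be sketched as follows. The class names
(<CODE>Surrogate</CODE>, <CODE>ParentProgram</CODE>), the choice of "maximum value" as the
preliminary processing, and the use of a direct method call in place of network
communication are all assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoubleConsumer;
import java.util.function.DoublePredicate;

/** Hypothetical surrogate installed at a data source by an itinerant program. */
class Surrogate {
    private final DoublePredicate filter;   // decides which updates are worth reporting
    private final DoubleConsumer parent;    // channel back to the parent itinerant program

    Surrogate(DoublePredicate filter, DoubleConsumer parent) {
        this.filter = filter;
        this.parent = parent;
    }

    /** Called by the data source every time its dataset is updated. */
    void onUpdate(double[] dataset) {
        // Preliminary processing at the source: reduce the dataset to its maximum.
        double max = Double.NEGATIVE_INFINITY;
        for (double v : dataset) max = Math.max(max, v);
        if (filter.test(max)) {
            parent.accept(max);              // forward only the appropriate updates
        }
    }
}

/** Parent itinerant program: collects what its surrogates let through. */
class ParentProgram {
    private final List<Double> sequence = new ArrayList<>();

    synchronized void receive(double value) {
        sequence.add(value);
    }

    /** The result as a sequence of results, in arrival order. */
    synchronized List<Double> sequenceOfResults() {
        return new ArrayList<>(sequence);
    }

    /** The result as a single accumulated value (here, the overall maximum). */
    synchronized double accumulatedResult() {
        double max = Double.NaN;
        for (double v : sequence) {
            if (Double.isNaN(max) || v > max) max = v;
        }
        return max;
    }
}
```

<P>Each data source can fire <CODE>onUpdate</CODE> at its own rate, which accommodates
sources with different periodicities. A hierarchy of surrogates would follow by passing
another combining object's <CODE>receive</CODE> method as the parent channel.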