Given by Geoffrey C. Fox at Sun Microsystems Java Day at CMU Pittsburgh on Sept 26 1998. Foils prepared Sept 30 1998
Summary of Material
We describe Java Grande -- definition, motivation and current status
|
The Java Grande Forum has numerical and distributed computing working groups and projects include
|
Discuss Java for Parallel Computing including message passing (MPI) and data parallelism |
Give example of implied multi tier service architecture |
Java Day CMU September 25 98 |
Geoffrey Fox |
Northeast Parallel Architectures Center |
Syracuse University |
111 College Place |
Syracuse NY |
gcf@npac.syr.edu |
http://www.javagrande.org |
http://www.npac.syr.edu/users/gcf/jgcmusept98 |
Use of Java for: |
High Performance Network Computing |
Scientific and Engineering Computation |
(Distributed) Modeling and Simulation |
Parallel and Distributed Computing |
Data Intensive Computing |
Communication and Computing Intensive Commercial and Academic Applications |
HPCC Computational Grids ........ |
Very difficult to find a "conventional name" that doesn't get misunderstood by some community! |
These exist from both a computer science and user point of view |
Grande applications are very complex but field is small (1% or so of total computing world)
|
The field needs Java as it provides a wonderful distributed computing software infrastructure on which to build applications and tools
|
Not clear that Java needs the field and so Grande field needs to be humble and persuasive in its requests |
Currently the Grande field is intrigued but skeptical due to poor Java performance |
Java community marching on in commercially critical areas |
Need to bring communities together |
Set of Workshops with increasing interest
|
Topics include compilation issues; applications; algorithms (math libraries); benchmarking; Java based programming environments (visualization); and parallel computing. The largest set of papers is in distributed systems |
Next meeting will be just before JavaOne 99 (Java developers conference) in March 99 to enhance interaction between Grande community and mainstream Java world |
Java Grande Forum to act as a focus for Grande community activities and coordinate the (feeble 1%) voice into mainstream! |
The Java Language has several good design features
|
Java has a very good set of libraries covering everything from commerce, multimedia, images to math functions (under development at http://math.nist.gov/javanumerics) |
Java has best available electronic and paper training and support resources |
Java is rapidly getting best integrated program development environments |
Parallel Computing is a special case of distributed computing |
Java is naturally integrated with the network, and its universal machine supports a potentially powerful "write once-run anywhere" model |
There is a large and growing trained labor force |
Can we exploit this in Grande Applications? |
So existing Grande codes are written in Fortran, C and C++ with a clearly unattractive and comparatively unproductive programming environment |
These current languages and tools are sufficient, but it does not seem likely that we can build much better environments around them
|
Five years ago, it looked as though C++ could become language of choice (perhaps with Fortran as inner core) but this appears stalled
|
So there is no competition -- Java is currently our only hope
|
It has some natural advantages due to its Internet base, with threads and distributed computing built in |
It is a young language and we can take steps now to avoid unproductive proliferation of libraries and parallel constructs
|
It could have expressivity and object oriented advantages of C++ combined with performance levels of C and Fortran |
It can use its clear GUI advantages as an entrée into other aspects of Grande programming |
[Figure: three-tier architecture with tiers numbered 1, 2, 3. Labels: Geographically Distributed Grande Computer Resources; Enterprise Middleware Gateway System; Geographically Distributed Users and Consultants; Java Applets; Java Language; Java Servers] |
Java has potential to be a better environment for "Grande application development" than any previous languages such as Fortran and C++ |
The Forum Goal is to develop community consensus and recommendations for either changes to Java or establishment of standards (frameworks) for "Grande" libraries and services |
These Language changes or frameworks are designed to realize "best ever Grande programming environment" |
First Meeting Mar 1 Palo Alto at Java 98 -- 200 Attendees set Agenda -- 30 permanent people and further meetings May 9-10, Aug 6-7 |
Public Discussion SC98 Orlando November 13 (3 hour panel) |
http://www.npac.syr.edu/projects/javaforcse |
http://www.javagrande.org |
1) Most important in the near term -- encourage Sun to make a few key changes in Java to allow it to be a complete efficient Grande Programming Language
|
2) As a community, recognize that sometimes standards are more appropriate than creativity and pool results of experiments to produce a Java Grande framework covering libraries and computer access
|
1) requires us to work with the computing mainstream -- 2) is internal to community |
Two major working groups promoting standards and community actions |
Numerics: Java as a language for mathematics led by Ron Boisvert and Roldan Pozo from NIST
|
Java Grande 98 Feb 28 98 |
So Java not only will run anywhere but can be expected to get same answers everywhere
|
Natural tension between performance (both in terms of speed and precision) and reproducibility
|
Java has particularly bad floating point performance due to
|
Solution requires "Change in Java Rules" and better compilers |
IBM (Marc Snir) considers C = A*B where all are 64 by 64 matrices with triple DO loop to execute C[i][j] = C[i][j] + A[i][k] * B[k][j] running on RS6000 model 590 |
Initial disappointment:
|
Runtime checks (array indices in bounds, null pointer checks) could essentially be removed by better compiler
|
Using rectangular arrays (not vector of vectors) gives 44 megaflops (off by a factor of 6) |
Using Hardware fused multiply add gives 64 megaflops (factor of 4) |
Remaining factor requires use of associativity to block computation |
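The two array layouts compared above can be sketched as follows. This is a minimal illustration, not the IBM benchmark code: the class and method names are hypothetical, and the "rectangular array" is emulated with one flat row-major `double[]`, since Java itself has only arrays of arrays. No timing is attempted, so the megaflop figures quoted above are not reproduced here.

```java
/** Hypothetical sketch; names are illustrative and not from the talk. */
final class MatMulSketch {

    /** C = C + A*B using Java arrays of arrays (every row is a separate object). */
    static void multiplyNested(double[][] a, double[][] b, double[][] c, int n) {
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                for (int k = 0; k < n; k++) {
                    c[i][j] = c[i][j] + a[i][k] * b[k][j];
                }
            }
        }
    }

    /** Same computation on flat row-major storage: element (i, j) lives at index i*n + j. */
    static void multiplyFlat(double[] a, double[] b, double[] c, int n) {
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                double sum = c[i * n + j];          // keep the accumulator in a local
                for (int k = 0; k < n; k++) {
                    sum += a[i * n + k] * b[k * n + j];
                }
                c[i * n + j] = sum;
            }
        }
    }

    public static void main(String[] args) {
        int n = 64;                                  // matrix size used in the talk
        double[][] a = new double[n][n];
        double[][] b = new double[n][n];
        double[][] c = new double[n][n];
        double[] af = new double[n * n];
        double[] bf = new double[n * n];
        double[] cf = new double[n * n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                a[i][j] = af[i * n + j] = i + j;
                b[i][j] = bf[i * n + j] = (i == j) ? 1.0 : 0.0;   // identity matrix
            }
        }
        multiplyNested(a, b, c, n);
        multiplyFlat(af, bf, cf, n);
        System.out.println(c[3][5] + " " + cf[3 * n + 5]);
    }
}
```

The flat version avoids one level of indirection and one bounds check per element access, which is the kind of saving the rectangular-array measurement points to.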
We propose three modes of floating point |
strictfp: Reproducible results as is now default |
default: Exploit natural hardware (extended exponent in Intel and fused multiply add) |
associatefp: Allow conventional compiler optimizations |
With the new Intel algorithm, strictfp will be a little more than 2 times slower than default; associatefp gains will be algorithm dependent (regular problems will have greatest gain) |
Less important are following niceties: |
indigenous keyword specifies maximum precision format supported by hardware |
anonymous {float, double, indigenous} specifies floating point format to be used in calculating expressions
|
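The tension between reproducibility and hardware features such as fused multiply add can be seen in a small sketch. It assumes a Java 9 or later runtime, because it uses `Math.fma`, which postdates this talk; the class and method names are hypothetical. The operands are chosen so that the exact product is 1 - 2^-54, which rounds to 1.0 when the multiply is rounded separately.

```java
/** Hypothetical sketch; requires Java 9+ for Math.fma. */
final class FmaSketch {

    /** Multiply, round to double, then add: two roundings. */
    static double separate(double a, double b, double c) {
        return a * b + c;
    }

    /** Fused multiply add: the exact product a*b is added to c with a single rounding. */
    static double fused(double a, double b, double c) {
        return Math.fma(a, b, c);
    }

    public static void main(String[] args) {
        double a = 1.0 + Math.scalb(1.0, -27);   // 1 + 2^-27, exactly representable
        double b = 1.0 - Math.scalb(1.0, -27);   // 1 - 2^-27, exactly representable
        double c = -1.0;
        System.out.println("separate rounding: " + separate(a, b, c));
        System.out.println("fused multiply add: " + fused(a, b, c));
    }
}
```

The two methods return different values for the same inputs, so a compiler that is free to use the fused instruction gives up bit-for-bit reproducibility across machines in exchange for speed and a more accurate result.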
Distributed and Parallel Computing led by Dennis Gannon and Denis Caromel (INRIA, France)
|
Development of Grande Application benchmarks |
So good news is that RMI has enabled very active distributed computing research and indeed development as in JavaSpaces from Sun |
Performance is reasonable but insufficient for some applications
|
Forum suggests (optional) changes in several areas including
|
[Figure: Java ORBs (JacORB, JWORB, ORBIX, RMI) transmitting variable size integer arrays; axis: array size; annotations: Best, Worst] |
[Figure: Java ORBs (RMI, JacORB, ORBIX, JWORB) transferring variable size arrays of structures; RMI is slowed by serialization; axis: array size; annotations: Best, Worst] |
[Figure: Arrays of integers, RMI (fastest Java) versus omniORB (C++); C++ is about 20 times faster than Java; axis: array size; annotations: Best, Worst] |
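One reason an array of structures costs more than an integer array under RMI is visible from Java object serialization alone. The sketch below is an assumption-laden illustration, not the benchmark behind the plots: it only compares serialized byte counts using `ObjectOutputStream`, and the `Cell` class and all method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

/** Hypothetical sketch; names are illustrative and not from the talk. */
final class SerializationSketch {

    /** A one-field "structure" wrapping a single int. */
    static final class Cell implements Serializable {
        private static final long serialVersionUID = 1L;
        final int value;
        Cell(int value) { this.value = value; }
    }

    /** Number of bytes standard Java serialization produces for the payload. */
    static int serializedSize(Object payload) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    static int[] intArray(int n) {
        int[] data = new int[n];
        for (int i = 0; i < n; i++) data[i] = i;
        return data;
    }

    static Cell[] cellArray(int n) {
        Cell[] data = new Cell[n];
        for (int i = 0; i < n; i++) data[i] = new Cell(i);
        return data;
    }

    public static void main(String[] args) {
        int n = 1000;
        System.out.println("int[" + n + "] bytes:  " + serializedSize(intArray(n)));
        System.out.println("Cell[" + n + "] bytes: " + serializedSize(cellArray(n)));
    }
}
```

Each `Cell` carries per-object framing in the stream on top of its four bytes of data, while the `int[]` is written as one block of primitives; the extra bytes and per-object work are in addition to any transport cost.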
Don't need to rewrite existing codes in Java!
|
Conduct suitable experiments in using Java in complete Grande applications |
Make certain your interests are represented in Java Grande Forum |
Does this change research agenda? (different types of compilers, service-based architectures, re-use commodity technologies -- don't roll your own with federal funds ...) |
Retrain your staff in Java Web and distributed object technologies |
Put "High Performance Grande Forum compliant" Java support into your RFP's for hardware and software |
JWORB - Java Web Object Request Broker - multi-protocol middleware network server (HTTP + IIOP + DCE RPC + RMI transport) |
Current prototype integrates HTTP and IIOP i.e. acts as Web Server and CORBA Broker
|
Next step: add DCE RPC support to include Microsoft COM |
JWORB - our trial implementation of Pragmatic Object Web |
Use our book "Building Distributed Systems on the Pragmatic Object Web" in your class |
[Figure: Gateway supporting a Java framework for computing. Labels: Database; Matrix Solver; Optimization Service; MPP; Parallel DB Proxy; NEOS Control Optimization; Origin 2000 Proxy; NetSolve Linear Alg. Server; IBM SP2 Proxy; Agent-based Choice of Compute Engine; Multidisciplinary Control (WebFlow); Data Analysis Server] |
Is this really sensible?
There are several forms of parallelism
|
In a Nutshell, Java is better than previous languages for a) and b) and no worse for c)
|
Thus "Java plus message passing" form of parallel computing is actually somewhat easier than in Fortran or C.
|
Coarse grain parallelism very natural in Java and we have described how to use this with RMI (see WebFlow example) |
"Data parallel" language features are NOT in Java and have to be added, extending ideas from HPF, HPC++, etc.
|
Java has built in "threads" and a given Java Program can run multiple threads at a time
|
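The built-in thread support mentioned above can be sketched with a shared-memory parallel sum. This is a minimal illustration with hypothetical names, using only `java.lang.Thread`; it is not mpiJava and does no message passing.

```java
/** Hypothetical sketch; names are illustrative and not from the talk. */
final class ThreadSumSketch {

    /** Sums data by giving each of threadCount threads one contiguous chunk. */
    static long parallelSum(final int[] data, int threadCount) {
        final long[] partial = new long[threadCount];   // one slot per thread, no sharing
        Thread[] workers = new Thread[threadCount];
        int chunk = (data.length + threadCount - 1) / threadCount;   // ceiling division

        for (int t = 0; t < threadCount; t++) {
            final int id = t;
            final int from = Math.min(data.length, t * chunk);
            final int to = Math.min(data.length, from + chunk);
            workers[t] = new Thread(() -> {
                long sum = 0;
                for (int i = from; i < to; i++) sum += data[i];
                partial[id] = sum;
            });
            workers[t].start();
        }

        long total = 0;
        for (int t = 0; t < threadCount; t++) {
            try {
                workers[t].join();          // join makes the worker's write visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for worker", e);
            }
            total += partial[t];
        }
        return total;
    }

    public static void main(String[] args) {
        int[] data = new int[100];
        for (int i = 0; i < data.length; i++) data[i] = i + 1;   // 1..100
        System.out.println(parallelSum(data, 4));
    }
}
```

Each thread writes only its own slot of `partial`, so no locking is needed; the combining step runs after `join`.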
[Figure: mpiJava performance, C versus Java (J), shared memory; WMPI on PC with NT and MPICH on Sun Solaris; labels: PC using C, Sparc using Java; annotations: Best, Worst] |
[Figure: mpiJava performance, C versus Java (J), distributed memory; WMPI on PC with NT and MPICH on Sun Solaris; labels: PC using C, Sparc using Java; annotations: Best, Worst] |
Both working groups have made substantial progress
|
We are initiating Community actions
|
Join us at SC98 November 13 |
Note European involvement has been excellent so far
|
So real computer users are not so interested in fancy metacomputing but rather in being able to run their jobs in a seamless way that does not keep changing as backend computer resources are upgraded |
Viewing computing as a distributed (object) service, need to define a "Java Framework for Computing Services" |
This enables development of Web Interfaces to run a given job on any computer with any data source compliant with this framework just as JDBC gives a universal interface to any relational database
|
The Computing Services Framework will allow vendors to compete on either User Front End (GUI) or back end services with the JavaCS framework providing universal linkage |
A "Framework" is a set of Java Calls (mainly Interfaces and not methods) to capabilities expressed in implementation neutral form
|
Drivers convert these general calls to vendor specific implementation of service
|
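The interface-plus-driver idea described above, by analogy with JDBC, might look like the sketch below. Everything in it is hypothetical: `ComputeService`, `ComputeDriver`, `ComputeManager`, the URL schemes, and the toy driver are invented for illustration and are not part of any agreed Java Grande framework.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a computing services framework; not an agreed API. */
final class ComputingFrameworkSketch {

    /** Implementation-neutral view of a compute resource. */
    interface ComputeService {
        String submit(String jobName);
    }

    /** A driver converts the neutral call into one vendor's implementation. */
    interface ComputeDriver {
        boolean accepts(String url);
        ComputeService connect(String url);
    }

    /** Registry that picks a driver by URL, in the spirit of JDBC's DriverManager. */
    static final class ComputeManager {
        private final List<ComputeDriver> drivers = new ArrayList<>();

        void register(ComputeDriver driver) {
            drivers.add(driver);
        }

        ComputeService connect(String url) {
            for (ComputeDriver driver : drivers) {
                if (driver.accepts(url)) return driver.connect(url);
            }
            throw new IllegalArgumentException("No driver for " + url);
        }
    }

    /** Toy driver standing in for a vendor back end; it only echoes the job name. */
    static final class ToyDriver implements ComputeDriver {
        private final String scheme;

        ToyDriver(String scheme) { this.scheme = scheme; }

        public boolean accepts(String url) { return url.startsWith(scheme + ":"); }

        public ComputeService connect(String url) {
            return jobName -> scheme + " ran " + jobName;
        }
    }

    public static void main(String[] args) {
        ComputeManager manager = new ComputeManager();
        manager.register(new ToyDriver("sp2"));       // hypothetical scheme names
        manager.register(new ToyDriver("origin"));
        ComputeService service = manager.connect("origin://host/queue");
        System.out.println(service.submit("cfd-run"));
    }
}
```

Front-end code depends only on `ComputeService`, so a vendor could swap in a different driver without changing the user interface, which is the linkage role the framework is meant to play.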
Requires agreement by "suitable interested parties" on
|
Abstract ideas were developed in Condor, Globus, Legion and PACE, POEMS, PetaSIM for the harder problems (metacomputing / performance specification), and for the seamless problem by Sweb (Cornell), WebSubmit (NIST) and UNICORE (Europe) |
Grande Resource Discovery, Allocation and Scheduling
|
We are defining methods and properties of computers and programs viewed as distributed objects
|
Compiling, Executing, Specification of features needed for execution optimization
|
Accounting -- integrate with Web commerce technology? |
Authentication, Security (especially hard in metacomputing as it links several different management policies)
|