Given by Geoffrey Fox, Nancy McCracken at Tutorial: ITEA HPCC Conference Aberdeen Md. on July 13 98. Foils prepared July 15 1998
Summary of Material
This tutorial covers basic technologies underlying modern Large Scale Enterprise Systems which enable productive services built around web, database and distributed object technologies |
This talk surveys the structure of the tutorial and defines the pragmatic object web approach that unifies these as a multi-tier server architecture |
The rest of the tutorial presents a set of examples plus basic surveys of technologies: JDBC, CORBA, RMI, LiveWire, ColdFusion, Lotus Notes, JWORB, PL/SQL |
ITEA Tutorial HEAT Center Aberdeen July 13 1998 |
Geoffrey Fox, Nancy McCracken, |
Chao Wei Ou, Shrideep Pallickara, Tom Pulikal |
Deepak Ramanathan, Mehmet Sen, Yuping Zhu |
Northeast Parallel Architectures Center |
Syracuse University |
111 College Place |
Syracuse NY |
gcf@npac.syr.edu |
(Business) Logic can be in a client, Middleware Server or specialized service layer |
Choices in distributed object (database record is "just" a distributed object) specification |
Different transport protocols |
Client |
Student Record Database and Grading System
|
Document Systems: Web Site Management and SCCS Technical Reports, Robot Search System
|
Web Log Access Storage and Visualization System
|
WebWisdom NT "Virtual University" Server
|
A Simple Distributed Collaborative Environment
|
JWORB Multi-protocol server
|
Lotus Notes Examples illustrating asynchronous groupware
|
ColdFusion high level interface to database
|
Pragmatic Object Web - Integrate with Web competing models for distributed objects: Java, CORBA, COM, WOM |
POW is middleware for multi-tier distributed enterprise applications
|
High Performance commodity computing - traditional HPC modules managed by POW on new commodity clusters (PC with NT, Linux or Solaris OS) using Distributed Computing Concepts (HLA,RTI) at coarse grain and classic HPCC for computational kernels |
|
W is Web Server |
PD Parallel Database |
DC Distributed Computer |
PC Parallel Computer |
O Object Broker |
N Network Server e.g. Netsolve |
T Collaboratory Server |
Clients |
Middle Layer (Server Tier) |
Third Backend Tier |
3-(or more)-tier architecture - Web browser front-ends, legacy (e.g. databases, HPC modules) backends; fat (1+tier) middleware |
Alternative / competing Middleware models:
|
Each model has different tradeoffs |
POW attempts to integrate the various models and services through multi-protocol middleware servers (JWORB) |
Basic Vision: The current incoherent but highly creative Web will merge with distributed object technology in a multi-tier client-server-service architecture with Java based combined Web-ORB's |
COM(Microsoft) and CORBA(world) are competing cross platform and language object technologies
|
Need to abstract entities (Web Pages, database entries, simulations) and services as objects with methods(interfaces)
|
How do we do this while the infrastructure is still being designed? |
Major Commercial Java Activity today is on Server NOT Client |
One can anticipate this by building systems in terms of Java objects e.g. develop Web-based databases with Java objects using standard JDBC (Java Database Connectivity) interfaces
|
Even better, use (Enterprise) JavaBeans, which are Java's componentware for the middle tier (Enterprise JavaBeans) or the client (JavaBeans), offering visual interfaces, containers (here they are consistent with the CORBA standard) and standard software engineering interfacing rules
|
Confused? Read "Building Distributed Systems on the Pragmatic Object Web" -- Book of class I teach to CS/CE students at Syracuse http://www.npac.syr.edu/users/shrideep/book |
Documents -- URL |
"General Programs including database invocations"
|
Middle Server Tier |
Basic HTTP/CGI Web Server |
Java Web Server |
Transaction Processing Server |
Business Transaction Management |
Javabean |
Enterprise Javabean |
Old and New Useful Backend Software |
Object Broker |
Back-end Tier |
The Services |
Client |
Front-end Tier |
Server: e.g. |
Proprietary |
Database |
Lotus Notes |
Web or ORB |
Service e.g. Database Repository or file systems accessed by Web Servers |
Client |
Now in POW style, we add modular capabilities to get 3, 4 or more tiers |
Back End Server: e.g. |
Proprietary |
Database |
Service e.g. Database Repository |
ThickClient e.g. Java Applet GUI |
Middle Tier Server with "Business Logic" e.g. map user objects to relational tables as in Java Blend |
We get 4 tier by refining client .... |
Back End Server: e.g. |
Proprietary |
Database |
Thin Client e.g. pure HTML viewer |
Middle Tier Server with "Business Logic" e.g. map user objects to relational tables as in Java Blend |
Java Web Server e.g. with servlet converting Java GUI into HTML |
But Middle Tier can be a plethora of servers linked in a dataflow model |
Client |
Middle Tiers |
Back End |
Thin Client |
Old way: Use an Object Database |
Current Approach: Use a Relational Database and business logic in EJB |
Object Database |
Application using data objects |
Backend relational database such as Oracle |
Enterprise Javabean mapping user object to backend persistent data model |
Application using data objects |
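A minimal sketch of the middle-tier mapping idea, assuming a hypothetical StudentRecord object and a table named student (neither is defined in the foils, and this is not a description of how Java Blend or an Enterprise JavaBean container works internally). The mapper turns a user-level object into a parameterized SQL string plus an ordered list of values, which is the form one would normally hand to a JDBC PreparedStatement.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical user-level object; the field names are illustrative only.
final class StudentRecord {
    final int id;
    final String name;
    final String grade;

    StudentRecord(int id, String name, String grade) {
        this.id = id;
        this.name = name;
        this.grade = grade;
    }
}

// Middle-tier "business logic": maps an object onto one relational row.
final class StudentRecordMapper {
    static final String TABLE = "student"; // assumed table name

    // Column name -> value, kept in a stable column order.
    static Map<String, Object> toRow(StudentRecord r) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", r.id);
        row.put("name", r.name);
        row.put("grade", r.grade);
        return row;
    }

    // Builds a parameterized INSERT with one "?" placeholder per column.
    static String insertSql(StudentRecord r) {
        Map<String, Object> row = toRow(r);
        List<String> marks = new ArrayList<>();
        for (int i = 0; i < row.size(); i++) {
            marks.add("?");
        }
        return "INSERT INTO " + TABLE
                + " (" + String.join(", ", row.keySet()) + ")"
                + " VALUES (" + String.join(", ", marks) + ")";
    }

    // Values in the same order as the placeholders.
    static List<Object> insertParams(StudentRecord r) {
        return new ArrayList<>(toRow(r).values());
    }

    public static void main(String[] args) {
        StudentRecord r = new StudentRecord(7, "Ada", "A");
        System.out.println(insertSql(r));
        System.out.println(insertParams(r));
    }
}
```

The application only ever sees StudentRecord objects; the relational layout is confined to the mapper, so the backend schema can change without touching client code.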
Middle Tier |
Clients and their servers |
Middle Tier Custom Servers |
Back End Servers and |
their services |
The backend servers would include CORBA objects from Educom's IMS projects; Video servers and Oracle database defined curricula pages from NPAC |
The front end servers would include distributed students with mirror sites to get performance |
In the middle tier, we have JDBC query processing and XML servlet parsers mapping original data in optimal fashion to match needs of student -- choosing from pure HTML or Interactive Java Whiteboard views of a given object
|
Educational Objects i.e. |
Data Defining Content of Curricula Pages |
Server side |
Java(JDBC) or |
LiveWire |
Metadata |
Web Server |
Conventional HTML Pages |
Dynamically Generated |
Including XML syntax Dublin Core (IMS) |
Web Browser |
XML Templates Defining How educational data stored in Pages |
Systems like Tango or Habanero built around Java Servers integrate a group of multiple clients as a "Service" at the middle Java Server level |
Building systems in this way automatically includes "people in the loop" -- Computational Steering, Education, Multidisciplinary collaborative design |
Group of collaborating clients |
and client applications |
Database |
Object Broker |
MPP |
NPAC Web Server |
JSU Web Server |
JSU Tango Server |
... |
Audio Video Conferencing Chat Rooms etc. |
Address at JSU of Curriculum Page |
Teacher's View of Curriculum Page |
Student's View of Curriculum Page |
Participants at JSU |
Teacher/Lecturer at NPAC |
Java Server |
Java application
|
100% maintenance free and Industry-strength stability |
Platform-independent
|
Java for servers is the dominant industry development, as it supports thin clients, which are preferred as
|
Java Tango |
Server |
Netscape Browser |
Tango |
Daemon |
Shared Applet 1 |
Shared Applet 2 |
Shared |
Java/C++/.. |
Application |
Socket Connections |
Client Side Bus |
Netscape's |
LiveConnect |
Typical Client |
Other |
Collaborating |
Clients |
Shared |
JavaScript/ Web Page |
Tango CA |
We have multiple supercomputers in the backend -- one doing CFD simulation of airflow, another doing structural analysis -- while in more detail there are linear algebra servers (NetSolve); optimization servers (NEOS); image processing filters (Khoros); databases (NCSA Biology Workbench); visualization systems (AVS, CAVEs)
|
All linked to collaborative information systems in a sea of middle tier servers(as on previous page) to support design, crisis management, multi-disciplinary research |
Database |
Matrix Solver |
Optimization Service |
MPP |
MPP |
Parallel DB Proxy |
NEOS Control Optimization |
Origin 2000 Proxy |
NetSolve Linear Alg. Server |
IBM SP2 Proxy |
Gateway Control |
Agent-based Choice of Compute Engine |
Multidisciplinary Control (WebFlow) |
Data Analysis Server |
High Performance Computing and Communication Tier |
Clients |
Gateway Systems |
Seamless Interface -- an Enterprise Javabean which processes input from user's Java Applet interface and maps user generic commands to those on specific machine
|
Resource management of heterogeneous MPP backend (linked to seamless interface) |
Database and Object Brokers |
Collaboration Servers including Tango, Lotus Notes and other commercial systems |
Visualization Servers |
"Business Logic" to map user data view (e.g. objects) to persistent store (e.g. Oracle database) and simulation engine (MPP) preferred format |
Most of a Command and Control Application |
Several FMS and IMT Applications |
Some I/O Intensive applications |
High value services with modest computational needs e.g. grid generation and other pre-processing, data manipulation and other post-processing |
Video Servers for Training |
Design and Planning Tools |
"Glue" for Multidisciplinary Interactions |
Control of metacomputing applications |
Distributed Computing becomes a commodity article (driven by Web Technologies) |
Market niches for orthodox MPP style HPC are shrinking |
NT clusters become a viable and more cost effective alternative to classic high performance systems |
HLA/RTI from the distributed simulation community is natural for coarse grain, while MPI/HPF/... is natural for fine grain -- the two must be integrated, which we claim can be done using a multi-tier architecture |
Web/Commodity software (Pragmatic Object Web) - promising base to build new HPcc (commodity computing) |
HPCC has developed good research ideas but cannot implement them, as it is trying to solve computing's hardest problem with 1 percent of the funding
|
We have learnt to use commodity hardware either
|
Let us do the same with software and design systems with maximum possible commodity software basis |
The world is building a wonderful distributed computing (information processing) environment using Web (dissemination) and distributed object (CORBA COM) technologies |
This includes Java, Web-linked databases and the essential standards such as HTML(documents), VRML(3D objects), JDBC (Java database connectivity).
|
We will "just" add high performance to this commodity distributed infrastructure
|
The alternative strategy starts with HPCC technologies (such as MPI,HPF) and adds links to commodity world. This approach does not easily track evolution of commodity systems and so has large maintenance costs |
Larry Smarr and the NCSA collaboration have stressed the analogy between the deployment of computer/communication technology and the impact that the electrical and transportation grids had
|
The transportation system was built using lessons from and feed up/down from Sports cars, Cadillacs, Model T's, Ford Escorts etc. |
Computational Grid will be shaped by and shape all applications and technologies |
Proposed Education program Internetics expresses synergy between high-end and commodity approaches |
A computational grid is a metacomputer or a "high performance distributed computer system" which must be influenced by and influence the "Object Web" which is here defined as "mass-market"/business IntraNet (low to low) use of Internet/distributed Information Systems |
The essential idea is to consider a three-tier model
|
Preserve the first two tiers as a high functionality commodity information processing system and confine HPCC to the third (lowest) tier.
|
1) Simple Server Approach: data and control both at the server tier |
2) Classic HPCC Approach: data only, exchanged at the HPCC level |
3) Hybrid Approach: control only at the server, with data transfer at the HPCC level |
Diagram labels: CFD, Structures, CFD Server, Structures Server |
Server Tier: Source Control, Listener |
High Performance Tier: Data Source, Data Sink (Observers) |
1) Register Observers with Listener, and register Listeners with Master Source |
2) Prepare Message Event in Source Control |
3) Source calls back Listener with Message Event |
4) Invoke High Performance Message Transfer between Observers and Sources specified in Message Event |
5) Actual Data Transfer |
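The numbered steps of this diagram can be sketched in Java as follows. This is an illustration under stated assumptions, not the tutorial's implementation: all class names (MessageEvent, HpcSource, HpcSink, FastTransfer, TransferListener, SourceControl) are hypothetical, and the "high performance transfer" is simulated by an in-memory array copy where a real system would use something like MPI.

```java
import java.util.ArrayList;
import java.util.List;

// High performance tier endpoints (stand-ins for e.g. CFD and structures codes).
final class HpcSource {
    final double[] payload;
    HpcSource(double[] payload) { this.payload = payload; }
}

final class HpcSink {
    double[] received = new double[0];
}

// Control message used at the server tier: it names who talks to whom,
// but does not carry the bulk data itself.
final class MessageEvent {
    final HpcSource source;
    final List<HpcSink> observers;
    MessageEvent(HpcSource source, List<HpcSink> observers) {
        this.source = source;
        this.observers = observers;
    }
}

// Stand-in for the high performance transport.
final class FastTransfer {
    static void send(HpcSource from, HpcSink to) {
        to.received = from.payload.clone(); // step 5: actual data transfer
    }
}

// Server tier listener holding the registered observers (data sinks).
final class TransferListener {
    private final List<HpcSink> observers = new ArrayList<>();

    void registerObserver(HpcSink sink) { observers.add(sink); } // step 1

    List<HpcSink> observers() { return observers; }

    void onMessage(MessageEvent e) {
        // step 4: invoke the transfer between the parties named in the event
        for (HpcSink sink : e.observers) {
            FastTransfer.send(e.source, sink);
        }
    }
}

// Server tier source control holding the registered listeners.
final class SourceControl {
    private final List<TransferListener> listeners = new ArrayList<>();

    void registerListener(TransferListener l) { listeners.add(l); } // step 1

    void publish(HpcSource source) {
        for (TransferListener l : listeners) {
            MessageEvent e = new MessageEvent(source, l.observers()); // step 2
            l.onMessage(e);                                           // step 3
        }
    }
}
```

Only the small MessageEvent travels through the server tier; the payload moves directly between source and sink, which is the point of the hybrid approach.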
Client (Tier 1): Java Graph Editor for Webflow or interpreted debugger (DARP) linked to Java Visualizer SciVis
|
Middle Tier 2: Network of Java Servers linking UNIX and Windows NT systems with "all" services |
Back-end Tier 3: Globus where available. In early 98, this is high performance UNIX system links with no databases and no NT |
Note this is a good high performance I/O architecture whether file system, CORBA or database based |
Next foil shows
|
Client Tier |
IIOP High Functionality |
Middle Tier |
Future Globus |
Globus |
Future Parallel I/O |
JavaBeans are Java's implementation of "component-based" visual programming |
This modern software engineering technique produces a new approach to libraries which become a "software component infrastructure(SCI)" |
A visual interface supports discovery of, setting of values of, and display of information about the parameters used in a particular software component |
JavaBeans uses the event model of JDK1.1 to communicate between components
|
One expects Javabeans to become the CORBA component interface |
The visual interface allows inspection and implementation of both individual beans and their linkage. This visual construction of linkage allows one to form nontrivial programs with multiple communicating components
|
Apart from the event mechanism which is a communication/linkage mechanism, ComponentWare (and JavaBeans in particular) "just" give a set of universal rules (needed for interoperability) for rather uncontroversial (albeit good) object-oriented and visual programming practices
|
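As an illustration of the JavaBeans event mechanism, here is a minimal sketch of two linked components using the standard java.beans.PropertyChangeSupport class. The bean names (SolverBean, PlotBean) and the gridSize property are hypothetical, and the lambda syntax is a modern convenience that JDK 1.1 did not have; the listener pattern itself is the JDK 1.1 event model.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;

// Hypothetical source bean exposing one bound property, "gridSize".
class SolverBean {
    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
    private int gridSize = 16;

    public int getGridSize() { return gridSize; }

    public void setGridSize(int newSize) {
        int old = gridSize;
        gridSize = newSize;
        // Notifies listeners; no event is fired when old and new are equal.
        changes.firePropertyChange("gridSize", old, newSize);
    }

    public void addPropertyChangeListener(PropertyChangeListener l) {
        changes.addPropertyChangeListener(l);
    }

    public void removePropertyChangeListener(PropertyChangeListener l) {
        changes.removePropertyChangeListener(l);
    }
}

// Hypothetical sink bean that reacts to changes in the solver.
class PlotBean {
    private int lastSeen = -1;

    public int getLastSeen() { return lastSeen; }

    // The "linkage" a visual builder would draw: plot listens to solver.
    public void link(SolverBean source) {
        source.addPropertyChangeListener(evt -> {
            if ("gridSize".equals(evt.getPropertyName())) {
                lastSeen = (Integer) evt.getNewValue();
            }
        });
    }
}
```

The get/set naming and the add/remove listener methods are the "universal rules" that let a builder tool discover properties and wire components without knowing the classes in advance.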
Currently WebFlow uses a Java Server and manipulates Java applications which can be front ends with native methods to Fortran C or C++ routines |
Change Java Server to JWORB -- server integrating HTTP and IIOP (Web and CORBA) |
Change Java Applications to JavaBeans and non-Java apps to CORBA objects |
Change linkage in WebFlow to respect JavaBean event mechanism |
Then we get HPComponentware |
and, using our multi-tier model, high performance CORBA |
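One way a single server port could serve both Web and CORBA traffic is to inspect the first bytes of each connection. The sketch below is an assumption about how such dispatch might be done, not a description of JWORB's code; the class name ProtocolSniffer is hypothetical. It relies on two protocol facts: GIOP messages (which IIOP carries over TCP) begin with the four ASCII bytes "GIOP", and HTTP requests begin with a method name followed by a space.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper that classifies a connection by its leading bytes.
final class ProtocolSniffer {
    enum Protocol { IIOP, HTTP, UNKNOWN }

    // Common HTTP request methods, each followed by the separating space.
    private static final String[] HTTP_METHODS = {
        "GET ", "POST ", "HEAD ", "PUT ", "DELETE ", "OPTIONS "
    };

    static Protocol classify(byte[] firstBytes) {
        // Only the first few bytes are needed to tell the protocols apart.
        int n = Math.min(firstBytes.length, 8);
        String head = new String(firstBytes, 0, n, StandardCharsets.US_ASCII);
        if (head.startsWith("GIOP")) {
            return Protocol.IIOP;
        }
        for (String method : HTTP_METHODS) {
            if (head.startsWith(method)) {
                return Protocol.HTTP;
            }
        }
        return Protocol.UNKNOWN;
    }

    public static void main(String[] args) {
        byte[] request = "GET /index.html HTTP/1.0\r\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(classify(request)); // prints "HTTP"
    }
}
```

A real server would then hand the connection to the matching handler (Web server or ORB), so that both protocols share one middle-tier process.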
WebFlow is the HPCC version of a typical visual interface for JavaBeans |
This combines TANGO for collaboration with WebFlow to link server side applications |
If necessary WebFlow could support high performance inter-module communication as in structures-CFD Linkage example but it would always implement control at middle tier and this allows TANGO integration with server side computation
|
WebFlow communication model is a dynamic dataflow |
Of course other server side compute models are possible and in general need (web-linked) data bases, file systems, object brokers etc., |
On the client one can share tools such as the CAD systems CATIA or AUTOCAD, so Tango interfaces with the API of these systems and drives "slaves" from state extracted from the linkage to the master. |
WebFlow supports dataflow model in middle tier where user must supply routines to process input of data that drives module and output of data for other modules |
TANGO supports shared state and user supplies routines that read or write either
|
One can write Tango linkage for applications like AUTOCAD as vendor supplies necessary API |
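The dataflow idea, where each module consumes the output of the previous one and the user supplies the processing routine, can be sketched as below. This is a deliberately simplified linear chain with a hypothetical class name (DataflowPipeline); WebFlow itself links modules in a general graph and across servers, which this sketch does not attempt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical in-process stand-in for a chain of dataflow modules.
final class DataflowPipeline {
    // Each module maps its input data to the output passed to the next module.
    private final List<Function<double[], double[]>> modules = new ArrayList<>();

    // Modules can be appended at any time, so the chain is built dynamically.
    DataflowPipeline then(Function<double[], double[]> module) {
        modules.add(module);
        return this;
    }

    // Pushes one data set through every module in order.
    double[] run(double[] input) {
        double[] data = input;
        for (Function<double[], double[]> module : modules) {
            data = module.apply(data);
        }
        return data;
    }

    public static void main(String[] args) {
        DataflowPipeline pipeline = new DataflowPipeline()
                // user-supplied routine 1: scale every value by 2
                .then(in -> {
                    double[] out = new double[in.length];
                    for (int i = 0; i < in.length; i++) out[i] = 2.0 * in[i];
                    return out;
                })
                // user-supplied routine 2: reduce to a single sum
                .then(in -> {
                    double sum = 0.0;
                    for (double v : in) sum += v;
                    return new double[] { sum };
                });
        System.out.println(pipeline.run(new double[] {1.0, 2.0, 3.0})[0]); // prints 12.0
    }
}
```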
CFD |
Structures |
DoD modeling and simulation (FMS,IMT) community is currently evolving towards the HLA(High level Architecture) framework with the RTI (Run Time Infrastructure) based communication bus. |
The goal of HLA/RTI is to enhance interoperability across more diverse simulators than in the DIS realm, ranging from real-time to time-stepped to event-driven paradigms. |
HLA defines a set of rules governing how simulators (federates) interact with each other. Federates describe their objects via the Object Model Template (OMT) and agree on a common Federation Object Model (FOM). |
The overall HLA/RTI model is strongly influenced by the CORBA architecture and in fact the current prototype development is indeed CORBA based. |
Building HPCC on the Object Web implies that we can use a common framework for both distributed (event driven) simulations and classic time stepped parallel computing |
We can support any given paradigm at either high functionality (web server) or high performance (backend) level |
HPCC Messaging could be a Java/RMI middle tier MPI or Nexus/Optimized Machine specific MPI at backend |
JWORB supports CORBA based RTI already and we can bridge to high performance event driven simulation systems like SPEEDES at the high performance backend layer |
However most problems can be thought of as a set of coarse grain entities which are internally data parallel, while the coarse grain structure is "functional" parallelism |
So HLA/RTI is especially natural as tier 2 management level of these coarse entities |
Entities can be time synchronized simulations and use MPI(HPF?) at either middle or back end tier or in fact as in DMSO simulations a federate running a custom discrete event simulation |
Resource Management typically breaks down into either
|
So a) is all at middle tier and should use commodity solutions -- there are many queuing systems such as Condor, Codine, LSF which we can "wrap" and Microsoft does not yet have a fully scalable commodity solution
|
So it is still embryonic but we suggest adopting the HLA/RTI framework as this supports job placement, interdependencies (time management) and hierarchical systems of federations --> federates |
Optimized data placement has been largely solved as a mathematical problem by HPCC but not packaged broadly. Our approach suggests how to invoke it as backend support for a commodity service |
So we have a hierarchy of entities Federation --> Federates --> Objects where one can have many tiers in each category |
A Federation could be the set of all jobs to be run on a particular site |
A Federate could be a job consisting of multiple possibly shared objects |
Objects are just data structures in HLA -- you send interaction events instead of invoking methods |
These aspects are organized by Federation, Object and Ownership management services |
We can classify both jobs and computers as separate federations |
Declaration Management corresponds to publication and subscription model of matching services and needs
|
Time Management corresponds to scheduling of sequenced events in discrete event simulations -- more generally it will support dependencies between jobs, e.g. the CAVE visualization system must be used after the simulation |
Data Management is the classic "load-balancing" problem of parallel computing, where you map objects optimally to computers to minimize communication cost and load imbalance |
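The job-dependency aspect of time management (visualization must follow simulation) can be illustrated with a plain topological sort. This is a generic sketch using Kahn's algorithm, not the HLA Time Management service or any RTI API; the class name JobOrdering and the job names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: orders jobs so that each runs after its prerequisites.
final class JobOrdering {
    // deps maps a job to the list of jobs that must finish before it starts.
    static List<String> order(Map<String, List<String>> deps) {
        Map<String, Integer> unmet = new TreeMap<>();          // job -> unmet prerequisites
        Map<String, List<String>> dependents = new HashMap<>(); // job -> jobs waiting on it
        for (Map.Entry<String, List<String>> e : deps.entrySet()) {
            unmet.putIfAbsent(e.getKey(), 0);
            for (String prerequisite : e.getValue()) {
                unmet.putIfAbsent(prerequisite, 0);
                unmet.merge(e.getKey(), 1, Integer::sum);
                dependents.computeIfAbsent(prerequisite, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        // Start with every job that has nothing to wait for.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : unmet.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }
        List<String> schedule = new ArrayList<>();
        while (!ready.isEmpty()) {
            String job = ready.remove();
            schedule.add(job);
            // Finishing this job may release the jobs that depend on it.
            for (String next : dependents.getOrDefault(job, Collections.emptyList())) {
                if (unmet.merge(next, -1, Integer::sum) == 0) {
                    ready.add(next);
                }
            }
        }
        if (schedule.size() != unmet.size()) {
            throw new IllegalArgumentException("cyclic dependency between jobs");
        }
        return schedule;
    }
}
```

For example, with "visualization" depending on "simulation" and "simulation" depending on "grid generation", the returned schedule runs grid generation first and visualization last.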