Given by Geoffrey C. Fox at GEM Meeting NPAC on June 24-25 99. Foils prepared June 24 99
Summary of Material
Abstract and Overview |
Portals to Computing and Business
|
Role of XML in Portal Construction
|
Collaboration |
Portal Programming |
How to get High Performance in a multi tier model |
GEM Tutorial June 24 99 Geoffrey Fox |
Tom Haupt |
NPAC |
Syracuse University |
gcf@npac.syr.edu |
http://www.npac.syr.edu/users/gcf/gemcompsciencejune99/index.html |
There is a "philosophy/architecture" called building "Portals to X"
|
There are distributed object technologies to label, register and look up objects
|
There is a language, Java, which is more productive than previous languages such as Fortran or C++
|
There is a data structure metalanguage called XML which will allow the generation of application specific data specifications |
There are a set of services that are either
|
One tries to construct a "toolkit" and "template" for building portals |
NCSA Biology Workbench http://biology.ncsa.uiuc.edu was one of the first computational portals |
Gateway is one of the most technically advanced portal toolkits, with a specific chemistry instantiation |
Java Grande at http://www.javagrande.org is a community activity to encourage use of Java in "real computing" and the necessary changes in Java infrastructure |
Distance Education and Object web curricula may be found at http://www.npac.syr.edu/projects/admijune99/ |
Tango Interactive at http://www.npac.syr.edu/tango supports "synchronous" sharing of objects designed otherwise for asynchronous use
|
SV2 (Scientific Visualization -- the 2nd effort) is a Java3D visualization system supported by Tango 2 |
There are two versions of Tango: Tango 1.4 (production and supporting Netscape on UNIX and Windows) and Tango 2 (in alpha testing) |
Internet Explorer will be supported on Tango 2 (September 1 99?) |
Macintosh may be supported on Tango 2 |
Using Tango? The best platform is Windows NT, as Java/multimedia is flaky on UNIX
|
One has
|
b) and c) can be defaulted to NPAC or set up specially for each community |
Tango has a well-defined API that allows any application written in C++, Java or JavaScript to be shared |
NPAC offers tutorials to users or system administrators |
"Portal to X" is essentially the name for an Object Web system designed to address a particular application X |
Portal to Syracuse University is by definition http://www.syr.edu |
Portal to NPAC is http://www.npac.syr.edu |
Portal to the world is http://www.yahoo.com/ or http://my.netscape.com/ |
Portal to latest news is http://www.cnn.com |
Portal to computational chemistry is http://www.osc.edu/~kenf/theGateway/PSEactivities/CCM.html |
Portal to stock trading is http://quote.yahoo.com/ |
http://www.npac.syr.edu/restricted/ is a portal to NPAC internal information |
http://www.ibm.com is the external portal to IBM for customers. There will also be an internal portal for IBM employees used to "run the company" |
http://www.amazon.com is a portal to books (and more) |
Kodak is interested in portals to family memorabilia |
More generally a portal is a web entrance to a set of resources and consists of a mix of information, computer simulations and various services |
For businesses, portals generalize the concept of a company Intranet and encompass the domain of IBM mainframes, Lotus Notes etc. |
For computing, portals are called Problem Solving Environments |
Market Value $6.6B |
Annual Sales $154M |
Annual Revenue Growth 210% |
Area                    1998     2002     CAGR |
Content Management      $1.2B    $4.7B    40% |
Business Intelligence   $2B      $7.3B    38% |
Data Warehouse/Marts    $0.99B   $2.6B    27% |
Data Management         $0.18B   $0.36B   18% |
Totals                  $4.4B    $14.9B   36% |
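The growth column can be checked against the 1998 and 2002 figures, assuming CAGR here means the conventional compound annual growth rate over the four years, (value in 2002 / value in 1998)^(1/4) - 1. The sketch below is only that arithmetic; the function and variable names are illustrative and Python is used purely for brevity.

```python
def cagr(start, end, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end / start) ** (1.0 / years) - 1.0

# Market sizes in $B for 1998 and 2002, copied from the table above.
markets = {
    "Content Management": (1.2, 4.7),
    "Business Intelligence": (2.0, 7.3),
    "Data Warehouse/Marts": (0.99, 2.6),
    "Data Management": (0.18, 0.36),
    "Totals": (4.4, 14.9),
}

# Assumption: the CAGR column spans the four years from 1998 to 2002.
rates = {
    area: cagr(v1998, v2002, 2002 - 1998)
    for area, (v1998, v2002) in markets.items()
}

for area, rate in rates.items():
    print(f"{area:<24}{rate:6.1%}")
```

Under this assumption the computed rates agree with the quoted percentages to within about one point; Data Management, which exactly doubles, comes out near 18.9%.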
http://www.sagemaker.com/company/lynch.htm or |
http://www.sagemaker.com/company/downloads/eip_indepth.pdf |
We can identify a set of tools that enable the construction of portals |
These are roughly equivalent to the tools needed to build a general application based on "object web technologies" |
There is also an architecture (explained ad nauseam later) implying multi-tier systems with standards-compliant interfaces
|
A common portal architecture means that portals can be conveniently linked together
|
So we can discuss some special portals
|
The latter include several projects aimed at harnessing the power of the web to do computing
|
NPAC has focussed on networks of web servers and these fit the portal model well |
However, most of the computing power lies in collections of web clients |
A server accepts input and produces output
|
IIOP and HTTP are two common protocols (formats of control data) for inter program messages |
A Web browser (Netscape or Microsoft) can access any server at "the click of a button", with data from the user refining the action |
Similar to invoking a web page |
"CORBA" or "WIDL" (pure XML CGI specification) is just CGI done right ...... |
Object Broker |
Fortran Simulation Code on Sequential or Parallel Machine |
Convert Generic Run Request into Specific Request on Chosen Computer |
Fortran Program is an Important Type of Object |
It can be built up from smaller objects, e.g. a Matrix library could be an object |
But perhaps more interestingly computing portals involve building a web based problem solving environment to link together all the capabilities needed to compute |
run programs and dynamically access the status of jobs and computers -- in particular allow a uniform interface to running a given job on one of many backend compute servers |
compile and debug programs |
link diverse data sources with computations run on multiple backend machines |
visualize results |
web-based help systems and collections of related scientific papers |
computational steering i.e. interacting with a job (change parameters) based on dynamic results such as visualized results |
See http://www.osc.edu/~kenf/theGateway/ and http://www-fp.mcs.anl.gov/~gregor/datorr/ |
Application Integration |
Visualization Server |
Seamless Access |
Collaboration |
Security Lookup |
Registration |
Agents/Brokers |
Backend Services |
Middleware |
Bunch of Web Servers and Object Brokers |
Basic Vision: The current incoherent but highly creative Web will merge with distributed object technology in a multi-tier client-server-service architecture with Java based combined Web-ORB's |
Need to abstract entities (Web Pages, database entries, simulations) and services as objects with methods (interfaces)
|
COM(Microsoft) and CORBA(world) are competing cross platform and language object technologies
|
Javabeans plus RMI and JINI is 100% pure Java distributed object technology |
W3C says you should use XML which defines a better IDL and perhaps an object model -- certainly does for documents |
How do we do this while technology is still changing rapidly! |
Client Tier |
Javabean Enterprise Javabean |
Old and New Useful Backend Systems |
Back-end Tier |
Services |
Middle Tier |
Servers |
Need to use mix of approaches -- choosing what is good and what will last |
For example develop Web-based databases with Java objects using standard JDBC (Java Database Connectivity) interfaces
|
Use XML to record small databases in flat files |
Use XML to define all interfaces |
Use CORBA to wrap existing applications |
Use COM to access components of major PC applications such as Microsoft Excel and Word |
Use Java to build all Middleware |
Use Jini to implement dynamic registration of objects |
Use HTML and JavaScript to render everything |
1) Rendering of (Multiple) Objects 2) Proxy to some backend capability used to render input and output to and from service |
Database, MPP, Telescope, File System |
1) Server acts as a broker and control layer 2) Same software as client but higher performance, multi-user 3) Again service represented as proxy used as a token for control logic |
Services with specialized software and capabilities |
The Proxies and actual instantiation are linked by messages whose semantic content is defined (best) in XML |
The lower system level format can be HTTP RMI IIOP or ... |
The client proxy is for rendering input and output including specification of object |
The middle tier proxy allows choice of backend provider and functional integration (the user can specify integration at client proxy level) |
Real Capability |
XML |
XML |
Objects (at "logical backend") can be on client of course |
Front end can define a generic (proxy for a) object. The middle control tier brokers a particular instantiation |
Broker or Server |
XML |
Result |
XML Query |
Rendering Engine |
Browser |
Rendering Engine |
HTML |
Universal Interfaces |
IDL or Templates |
XML Request for service followed by return of XML result |
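The request/result exchange can be sketched as a tiny broker that parses an XML query, invokes the named service and returns an XML result. This is only an illustration under stated assumptions: the element names (`query`, `param`, `result`), the `SERVICES` registry and the `add` service are all hypothetical, and Python stands in for the Java the deck proposes for middleware.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry mapping service names to backend callables.
SERVICES = {
    "add": lambda params: str(sum(float(v) for v in params.values())),
}

def handle_request(xml_query):
    """Parse an XML query, invoke the named service, return an XML result."""
    query = ET.fromstring(xml_query)
    name = query.get("service")
    params = {p.get("name"): p.text for p in query.findall("param")}

    result = ET.Element("result", {"service": name or ""})
    service = SERVICES.get(name)
    if service is None:
        # Unknown services are reported in the XML result, not raised.
        result.set("status", "error")
        result.text = "unknown service"
    else:
        result.set("status", "ok")
        result.text = service(params)
    return ET.tostring(result, encoding="unicode")

reply = handle_request(
    '<query service="add">'
    '<param name="a">2</param><param name="b">3.5</param>'
    "</query>"
)
print(reply)
```

Because both directions are XML, the same handler could sit behind HTTP, RMI or IIOP without changing the message content.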
Need version 5 browsers with good XML support to properly implement this |
We draw three tiers as the minimum architecture but, as the previous diagram suggests, one is likely to build a more structured system with the middle tier having more layers |
Network computer breaks client tier into two with simple HTML at user and Java servlets providing heavier weight capability on a different machine
|
Note the hardware can be as little as 1 PC on your desk |
More interestingly it is your 64-PC Linux or Windows NT cluster, up to the cluster of 64 128-node SGI Origins at Los Alamos
|
Software divides into several types |
Fortran Program |
PLSQL Database |
or ..... |
HTML Rendering |
Java/CORBA/WIDL Wrapper |
Style Sheets and Page Design |
"Glue" with (multiple) tier servers and XML inter tier communication |
The front end is some document consisting of a mix of HTML and XML
|
We will NOT discuss either how to code backend in PLSQL or Fortran or how to compose final rendered document in HTML |
The backend software can be parallel or sequential and simulation or information based
|
We need to define in XML its interface needed to
|
This backend program interface is defined as an XML file e.g.
<program name="physicssimulation1">
  <run domain="npac" machine="maryland" type="pc" os="nt" >c:\myprogs\a.out</run>
  <input type="htmlform" >
    <name>userinput</name>
    <field default="10" >iterations</field>
    ..........
  </input>
  <output> ...</output>
</program> |
Becomes HTML form with name userinput and text field iterations with default value 10 on client |
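The mapping from the `<input type="htmlform">` element to a client-side form can be sketched as follows. This is a minimal illustration, not the Gateway implementation: it uses a trimmed copy of the specification above (the elided parts are omitted, not guessed), the function name `render_form` and the exact HTML layout are assumptions, and Python is used for brevity where the deck would use Java.

```python
import xml.etree.ElementTree as ET
from html import escape

# Trimmed copy of the program specification; elided parts are left out.
PROGRAM_XML = r"""
<program name="physicssimulation1">
  <run domain="npac" machine="maryland" type="pc" os="nt">c:\myprogs\a.out</run>
  <input type="htmlform">
    <name>userinput</name>
    <field default="10">iterations</field>
  </input>
</program>
"""

def render_form(program_xml):
    """Turn the <input> element of a program spec into an HTML form."""
    program = ET.fromstring(program_xml)
    spec = program.find("input")
    form_name = spec.findtext("name", default="")

    lines = [f'<form name="{escape(form_name)}">']
    for field in spec.findall("field"):
        label = escape(field.text or "")
        default = escape(field.get("default", ""))
        # One text box per <field>, pre-filled with its default value.
        lines.append(
            f'  {label}: <input type="text" name="{label}" value="{default}">'
        )
    lines.append('  <input type="submit" value="Run">')
    lines.append("</form>")
    return "\n".join(lines)

form_html = render_form(PROGRAM_XML)
print(form_html)
```

The form carries only what the XML declares, so adding a `<field>` to the specification adds a text box on the client without touching the rendering code.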
For this example (running a physics program), we could use a specific machine as defined on previous foil (the Windows NT PC maryland) or a generic machine <run domain="any" machine="any" type="pc" os="nt" > |
In this case, the middle tier broker would look in a database (XML file) to find which machines are available and choose one, either automatically or with user input. |
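A minimal sketch of that brokering step follows, assuming a machine list in the XML style used in this deck and treating the value "any" as a wildcard. The function name `candidates` and the matching rules are assumptions for illustration, written in Python for brevity.

```python
import xml.etree.ElementTree as ET

# Sample machine list in the deck's style (one group only).
MACHINES_XML = """
<machines domain="npac" type="pc">
  <machine os="nt" cpu="pentium2" memory="128">maryland</machine>
  <machine os="nt" cpu="pentium3" memory="256">georgia</machine>
  <machine os="95" cpu="mmx" memory="128">foxport1</machine>
</machines>
"""

def candidates(run_request, machines_xml):
    """Return the names of machines that satisfy a <run> request."""
    run = ET.fromstring(run_request)
    group = ET.fromstring(machines_xml)

    # Group-level attributes (domain, type) apply to every machine in it.
    for key in ("domain", "type"):
        wanted = run.get(key, "any")
        if wanted != "any" and wanted != group.get(key):
            return []

    names = []
    for machine in group.findall("machine"):
        name = (machine.text or "").strip()
        if run.get("machine", "any") not in ("any", name):
            continue
        if run.get("os", "any") not in ("any", machine.get("os")):
            continue
        names.append(name)
    return names

generic = '<run domain="any" machine="any" type="pc" os="nt"/>'
specific = '<run domain="npac" machine="maryland" type="pc" os="nt"/>'
print(candidates(generic, MACHINES_XML))
print(candidates(specific, MACHINES_XML))
```

The broker could then pick the first candidate automatically or present the list to the user, which are the two options described above.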
Both Software and Hardware are defined in XML |
Note databases and XML files are logically equivalent |
JDBC can access either an Oracle (or Microsoft) database or an XML ASCII file |
More generally XML can be thought of as general object serialization
|
The XML File
<machines domain="npac" type="pc" >
  <machine os="nt" cpu="pentium2" memory="128" >maryland</machine>
  <machine os="nt" cpu="pentium3" memory="256" >georgia</machine>
  <machine os="95" cpu="mmx" memory="128" >foxport1</machine>
  .....
</machines>
<machines domain="cis" >
  <machine os="solaris" cpu="sparcXX" > top</machine>
  .....
</machines> |
is equivalent to database tables such as |
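As an illustration of this equivalence, the sketch below loads the npac group into a relational table and queries it. It assumes Python's built-in `sqlite3` module as a stand-in for Oracle accessed through JDBC; the table layout and column names are illustrative, and the elided entries of the XML file are left out.

```python
import sqlite3
import xml.etree.ElementTree as ET

MACHINES_XML = """
<machines domain="npac" type="pc">
  <machine os="nt" cpu="pentium2" memory="128">maryland</machine>
  <machine os="nt" cpu="pentium3" memory="256">georgia</machine>
  <machine os="95" cpu="mmx" memory="128">foxport1</machine>
</machines>
"""

def load_machines(machines_xml, connection):
    """Flatten one <machines> group into rows of a `machines` table."""
    connection.execute(
        "CREATE TABLE machines (name TEXT, domain TEXT, machine_type TEXT,"
        " os TEXT, cpu TEXT, memory INTEGER)"
    )
    group = ET.fromstring(machines_xml)
    rows = [
        (
            (m.text or "").strip(),
            group.get("domain"),   # group attributes repeat on every row
            group.get("type"),
            m.get("os"),
            m.get("cpu"),
            int(m.get("memory")),
        )
        for m in group.findall("machine")
    ]
    connection.executemany(
        "INSERT INTO machines VALUES (?, ?, ?, ?, ?, ?)", rows
    )

connection = sqlite3.connect(":memory:")
load_machines(MACHINES_XML, connection)
nt_machines = [
    row[0]
    for row in connection.execute(
        "SELECT name FROM machines WHERE os = 'nt' ORDER BY memory DESC"
    )
]
print(nt_machines)
```

Each XML attribute becomes a column and each `<machine>` element becomes a row, so the same question can be asked either by walking the XML or by a SQL query.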
Every field has data of special significance -- for field xxxxxx, we imagine a group of standards realized in XML and used for exchange of information. We call this xxxxxxML
|
http://www.xml.com/xml/pub/submlist lists some standards currently proposed for XML |
The Portal for xxxxxx must support xxxxxxML |
For businesses, perhaps one needs special support for "excelML" "SAPML" (XML export format for EXCEL or SAP) as well as support for more general forms of information
|
This we define as a group of defined formats that support scientific data, note taking and sketches |
XSIL (Scientific data Interchange) defines metadata needed to specify scientific data files including high level parameters and methods needed to read data
|
VML is the Vector Markup Language (vector graphics) |
DrawML is designed to support simple technical drawings (easier than VML but VML should be able to do this?) |
VRML (3D scenes) re-implemented in XML as X3D (http://www.vrml.org/news/pr990210-content.html) |
MathML Mathematical Expressions |
CML Support Chemistry -- not clear if adopted widely |
Presumably this allows scientists to make notes and record thoughts in a way that supports important scientific constructs |
At its simplest this is an authoring tool like Microsoft Word, PowerPoint or Framemaker
|
One useful utility would be a whiteboard that supported scientific notes using ScienceML |
Such a collaborative whiteboard (implemented in Tango for instance) would be useful in research and teaching
|
Example comes from WebWisdomDB for storing training and education curricula objects -- PowerPoint, HTML, Photos .. |
XML embedded as part of HTML "Web Page" |
URL of Web Page invokes Servlet |
e.g. http://witch.npac.syr.edu/servlet/TDLServlet/witch/index.tdl?USID=67 |
Servlet leaves HTML untouched but calls Java Methods to process XML tags |
This Servlet processes <WW_... tags -- others are left for other servers or browser |
<!-- WW_CONNECT is a tag that establishes a connection to the database. Connection string may be supplied in the CONNECTION attribute, or if it is omitted (empty string) the default connection string is taken from configuration file for current template.--> |
<WW_CONNECT CONNECTION=""/> |
<html> <head> <title> Separate IMAGE for LOCAL foil |
<!-- WW_FOILNUMBER tag takes the position of the current foil in the current presentation. --> |
<WW_FOILNUMBER PARENTID="${PID}" FOILID="${FID}"/> |
<!-- WW_TITLE is a tag that allows to insert title of a presentation or foil. The presentation/foil is identified by an ID supplied in FOILID attribute. PARENTID is not currently used by the WW_TITLE tag, but can be later used, e.g. to check the formatting properties, which can be defined on the presentation level. FID and PID were supplied by the servlet from query attributes of the URL --> |
Yellow is ordinary HTML, Green XML, White Comments |
So this mix of HTML and XML is processed by a special Java Server and XML is translated as shown in green |
The yellow HTML is passed through untouched |
The URL handed to the servlet uniquely specifies the page in this case. Better, one could pass an XML specification of the page |
<html> <head> <title> Separate IMAGE for LOCAL foil 17 while this part came from database</title> |
<body bgfile="whatyouwant.gif" > |
<tt><a href="#image">Image</a><a href="#buttons"> Buttons</a> </tt> <b> |
<a href="currenthelp.html"> HELP!</a> * GREY=local</b><tt> LOCAL IMAGE version of Foils prepared |
May 23 1999 </tt> |
<WW_TITLE FOILID="${FID}"/> |
</title> </head> |
<!-- WW_BODYIMAGE inserts a 'body' HTML tag with background image typical for foil files. --> |
<WW_BODYIMAGE/> |
<tt><a href="#image">Image</a><a href="#buttons"> Buttons</a> </tt> <b> |
<WW_LINK FILENAME="temphelp.tdl" ATTR="">HELP!</WW_LINK> * GREY=local</b><tt> LOCAL IMAGE version of Foils prepared |
<!-- WW_MODIFICATIONDATE inserts modification date of the current foil --> |
<WW_MODIFICATIONDATE FOILID="${FID}"/> </tt> |
...... And So On! |
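The processing model in this example (expand the `WW_` tags you know, pass everything else through) can be sketched as below. This is not the TDLServlet code: the handlers, the `FOILS` lookup table standing in for the database, and the foil title are hypothetical, only self-closing tags are handled (so a paired tag such as `WW_LINK` is out of scope), and Python is used for brevity in place of a Java servlet.

```python
import re

# Hypothetical stand-in for the database rows keyed by foil ID.
FOILS = {"17": {"title": "Example foil title", "modified": "May 23 1999"}}

# Hypothetical handlers: each maps a tag's attributes to replacement HTML.
HANDLERS = {
    "WW_TITLE": lambda attrs: FOILS[attrs["FOILID"]]["title"],
    "WW_MODIFICATIONDATE": lambda attrs: FOILS[attrs["FOILID"]]["modified"],
}

TAG = re.compile(r'<(WW_\w+)((?:\s+\w+="[^"]*")*)\s*/>')
ATTR = re.compile(r'(\w+)="([^"]*)"')
VAR = re.compile(r"\$\{(\w+)\}")

def process(template, query):
    """Expand known self-closing WW_ tags; pass everything else through."""
    def expand(match):
        handler = HANDLERS.get(match.group(1))
        if handler is None:
            return match.group(0)  # left for another server or the browser
        # Substitute ${NAME} variables from the URL query attributes.
        attrs = {
            key: VAR.sub(lambda var: query.get(var.group(1), ""), value)
            for key, value in ATTR.findall(match.group(2))
        }
        return handler(attrs)
    return TAG.sub(expand, template)

template = (
    '<html><head><title><WW_TITLE FOILID="${FID}"/></title></head>'
    '<tt>Foils prepared <WW_MODIFICATIONDATE FOILID="${FID}"/></tt>'
    "<WW_BODYIMAGE/></html>"
)
page = process(template, {"FID": "17"})
print(page)
```

Here `WW_BODYIMAGE` has no registered handler, so it survives unchanged, mirroring the rule that tags this servlet does not process are left for other servers or the browser.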
Tango is a fully Web integrated system that can share general client side applications, web pages and output from databases and other servers |
TangoInteractive is being deployed to support: |
Distance Education and Training which links asynchronous and synchronous modes with database or web server backend
|
Collaborative Visualization and Computing
|
Shared Client Side Java applets for "Rapid Prototyping" Phase |
Shared emacs for John Salmon .... |
Links Interactive (synchronous) and self-paced (asynchronous) collaborative models
|
Runs on Windows 95,98,NT UNIX (Irix, Solaris, Linux) |
Supports multi-language shared server and client applications in Java, Javabeans, C++ and JavaScript (W3C DOM) |
Supplement existing very general tools (audio-video conferencing, chats, whiteboard etc.) by specialized applications to support particular collaborative activities
|
Upgrade core technology in areas such as browser independence, security, multiple community support, archiving |
Improve User Support and documentation |
So between the XML/HTML document and the backend Fortran sits a bunch of servers linked by XML Messages |
Software of this glue (business logic) is built in Java |
Servers can either be commercial Web Servers (Apache with servlets), CORBA brokers or custom Servers
|
Java/CORBA/WIDL Wrapper |
Style Sheets and Page Design |
We can copy a much reviled model -- Microsoft Word or PowerPoint -- Problem Solving Environments for document preparation -- to get PSE for Computing |
XML Widgets are organized into Toolbars .... |
Computing is abstracted as a set of hierarchical Toolbars. Toolbars are defined in XML and rendered in HTML for the user interface. The XML is interpreted on the middle tier as some suitable service. |
Computing Toolbars include user profile, results, visualization (where "command" could be AVS), collaboration, programming model, HPF, Dataflow, resource specification, resource status, code (application specific) |
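A hierarchical toolbar definition and its HTML rendering can be sketched as follows. The XML vocabulary here (`toolbar`, `command`, the `service` attribute) and the `/service/...` link scheme are invented for illustration, since the deck does not give the actual toolbar schema; Python is used for brevity.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical toolbar definition; element and attribute names are assumed.
TOOLBARS_XML = """
<toolbar name="Computing">
  <toolbar name="Resources">
    <command service="resource-status">Resource status</command>
    <command service="resource-spec">Resource specification</command>
  </toolbar>
  <toolbar name="Visualization">
    <command service="avs">AVS</command>
  </toolbar>
</toolbar>
"""

def render(node, depth=0):
    """Render a toolbar tree as nested HTML list items (list of lines)."""
    pad = "  " * depth
    if node.tag == "command":
        # A command becomes a link that the middle tier maps to a service.
        href = "/service/" + escape(node.get("service", ""))
        label = escape(node.text or "")
        return [f'{pad}<li><a href="{href}">{label}</a></li>']
    name = escape(node.get("name", ""))
    lines = [f"{pad}<li>{name}", f"{pad}  <ul>"]
    for child in node:
        lines.extend(render(child, depth + 2))
    lines += [f"{pad}  </ul>", f"{pad}</li>"]
    return lines

root = ET.fromstring(TOOLBARS_XML)
toolbar_html = "\n".join(["<ul>"] + render(root, 1) + ["</ul>"])
print(toolbar_html)
```

The same XML could be handed to a different renderer, or interpreted directly by the middle tier, without changing the toolbar definition.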
1)Simple Server Approach 2)Classic HPCC Approach |
Data and Control |
CFD |
Structures |
Data Only |
CFD Server |
Structures Server |
Control Only |
3) Hybrid Approach with control at server and data transfer at HPCC level |
Can switch at each mpi_init or at each MPI message |