Given by Geoffrey Fox at CPS616 Basic Information Track for Computational Science on Winter-Spring Semester 96. Foils prepared 22 January 1996
This surveys "old" Web Technology characterized by passive browsers and CGI enhanced servers. This is contrasted with the major new Web Technologies including VRML, PERL5, Java and JavaScript and illustrated by Netscape 2.0 |
We discuss the integration of the best technologies from "other computing arenas" (from PC to HPCC) including database, collaboration, Compression, GIS, Security, Network Protocols, CORBA, Multimedia Servers as well as the many physical infrastructures of importance. |
Emerging overall Web Concepts such as WebWindows, WebWork and WebScript |
Further major changes with the support of full televirtuality are expected with the evolution of interactive 3D worlds in VRML 2.0 |
Instructor: Geoffrey Fox |
teamed with Wojtek Furmanski, Nancy McCracken |
Syracuse University |
111 College Place |
Syracuse |
New York 13244-4100 |
Firstly we can use this technology to implement HPCC on a broad technology base
|
Secondly we can use this technology to implement a Virtual University to teach internally and across the Globe |
Thirdly we can teach our students about these concepts
|
Application Specific NII Specific Services for
|
Browsers have SAME interface on ALL Computers |
CGI Programs are typically written in PERL but can be essentially ANY UNIX Process and so do simulation, database access, advanced document processing etc. |
There are evolving/confusing/overlapping capabilities ... |
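As a rough illustration of the CGI idea above (any process the server can launch may act as a server extension), here is a minimal sketch in Python rather than PERL. The form field `name` and the function `handle_request` are hypothetical names chosen for this example; the only parts taken from the CGI convention itself are the `QUERY_STRING` environment variable and the header block followed by a blank line.

```python
import os
from html import escape
from urllib.parse import parse_qs


def handle_request(environ):
    """Build a CGI response (headers + HTML body) from CGI environment variables."""
    # The Web server passes the part of the URL after '?' in QUERY_STRING.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    # 'name' is a hypothetical form field used only for this illustration.
    name = params.get("name", ["world"])[0]
    body = "<html><body><h1>Hello, {}</h1></body></html>".format(escape(name))
    # A CGI response is a header block, a blank line, then the document.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # When launched by a Web server, the response goes to standard output.
    print(handle_request(os.environ), end="")
```

The same structure would apply to a simulation or database front end: only the body of `handle_request` changes, while the browser still sees ordinary HTML.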
Clients (such as Mosaic and Netscape) support browsing of hyperlinked documents but have no internal interactive/compute capability |
Servers read HTTP and deliver requested service to client |
HTML -- a document format supporting hyperlinks |
HTTP -- a Transport Protocol defining Interaction between Web servers and Clients |
MIME -- a data format allowing agent-like (extended email) communication |
CGI -- a standard interface allowing sophisticated server extensions |
PERL -- a rapid prototyping language(script) aimed at text and file manipulation |
Web Search engines such as YAHOO, HARVEST, WAIS -- early distributed database access technology supporting search and indexing |
net.Thread, WebTools, RealAudio are early Web Interactive services |
Relational databases -- Oracle, DB2 have Web Interfaces |
Collaboration from Console Units (PictureTel, CLI), Desktop (SGI InPerson) to MOOs |
Compression from MPEG and Wavelet to host of proprietary solutions -- a factor of 20 to 200 saving in space and bandwidth |
Geographical Information Systems |
Security will enable commerce on the Internet -- essential for Defence as well |
Produced by Gang Cheng April 1995 |
Oracle 7 Interface to Usenet -- Prepared October 27, 1995 |
Associated material may be found starting at Oracle-Web Interface to Usenet and other Services |
Natural Storage Format for particular type of Information |
Optimal Format for network transmission incorporating synchronization as in audio and video streams as well as compression |
Local Client formatting to (HTML,VRML) needed for standard browser display standards |
Combines strengths of Web and Database Information models to eliminate many weaknesses of each |
Uses Oracle's WOW Web-Oracle-Web Interface |
Many capabilities demonstrated in NPAC's implementations with mh mail, newsgroups, education databases, remote data entry |
Important for research, education and industry |
Characteristics
|
Strengths
|
Weaknesses
|
Characteristics
|
Strengths
|
Weaknesses
|
Database techniques used in Web technology: data storage; data caching; index searching; data processing |
Networking techniques used in distributed database technology: distributed database; two-phase commit; data replication; client/server model |
Web server integrated with database is enhanced with:
|
Database server linked to web server is enhanced with:
|
Mail databases: internal corporate utility
|
Usenet Newsgroups: http://asknpac.npac.syr.edu/
|
Education databases
|
Health care: demo patient record database |
Oracle SQL*TextRetrieval full text search of 3 online books |
Corporate product databases (under development) |
Education
|
Research
|
Industry
|
Note: the gateway wowstub program simply passes PL/SQL program name and input parameters gathered from forms to DB server. |
The DB server does both SQL query and HTML processing/formatting |
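The division of labour in the note above (a thin gateway that only forwards a procedure name and form parameters, with the database side doing both the query and the HTML formatting) can be sketched as follows. This is not the WOW code: it assumes SQLite as a stand-in for Oracle, a Python function as a stand-in for a stored PL/SQL procedure, and a hypothetical `articles` table and `PROCEDURES` registry invented for the example.

```python
import sqlite3
from html import escape
from urllib.parse import parse_qs


def list_articles(db, params):
    """Stand-in for a stored procedure: runs the query AND formats the HTML."""
    group = params.get("group", [""])[0]
    rows = db.execute(
        "SELECT subject FROM articles WHERE newsgroup = ? ORDER BY id", (group,)
    ).fetchall()
    items = "".join("<li>{}</li>".format(escape(subject)) for (subject,) in rows)
    return "<ul>{}</ul>".format(items)


# Hypothetical registry playing the role of procedures stored in the database.
PROCEDURES = {"list_articles": list_articles}


def gateway_stub(db, path_info, query_string):
    """Forward the procedure name and form parameters; do no formatting here."""
    procedure = PROCEDURES.get(path_info.strip("/"))
    if procedure is None:
        return "<p>Unknown procedure</p>"
    return procedure(db, parse_qs(query_string))


def make_demo_db():
    """Create a small in-memory table so the sketch can be run on its own."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE articles (id INTEGER PRIMARY KEY, newsgroup TEXT, subject TEXT)"
    )
    db.executemany(
        "INSERT INTO articles (newsgroup, subject) VALUES (?, ?)",
        [
            ("comp.lang.perl", "PERL5 objects"),
            ("comp.lang.java", "Applet threads"),
            ("comp.lang.perl", "CGI <forms>"),
        ],
    )
    return db
```

Calling `gateway_stub(make_demo_db(), "/list_articles", "group=comp.lang.perl")` returns an HTML list built entirely on the "database" side, which is the point of the design: the gateway stays trivial and all application logic lives next to the data.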
We describe the general architecture and major components of a Web Search System |
(a short version prepared for SC'95) |
See longer HPDC95 Version for more details |
Information Discovery - Locate Relevant Sources (URLs) with Reasonable Efforts/Time |
A Centralized Web Data Repository- Cache/Replicate Information to Alleviate Regional Network and Server Overhead |
A Unified Internet Search Interface - Search for Various Information Sources, HTTP, FTP, Gopher, WAIS, Usenet Newsgroups, Archive, On-line Databases and Libraries, etc. |
Data Volume
|
Data Diversity
|
User Base
|
There are at least 30 web search systems on the net |
InfoSeek - free service for web search (text database indexed from 400K URLs, total 2GB), paid-service for 15,000 USENET newsgroups (most recent 4 weeks, 2 million articles, total 7GB) and other on-line databases. Full-text indexing. Database and web servers run on 8 SUN10s |
Lycos - free service for web search (database indexed from ~10 million URLs, 1.8 GB summary text, 1.1 GB inverted index (10-20% of full text), run on 7 replicated workstations) |
OpenText - free service (text from ~1 million URLs, 985 million words, run on a workstation cluster). Full-text indexing. |
WebCrawler - free service for web search. Partial-text indexing. |
Yahoo - hierarchical listing of URLs by topics. A web site, not a search service (custom-made database system and web servers, run on several SGI Indy's and Pentium-based PCs running UNIX) |
Gather WWW pages/files from remote web servers and filter them into indexed text database |
Use 'Web Robot' or 'Web Agent' technology - a class of programs that automatically traverse network hosts and bring back information via various network protocols (e.g. HTTP) |
Major issues - direct impact on database size, search coverage and performance
|
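The gathering step described above can be sketched as a breadth-first traversal of hyperlinks. This is a minimal illustration, not the robot used by any of the systems listed: the `fetch` callable, the `max_pages` limit and the class and function names are assumptions made for the example, and a real robot would also need politeness rules (robot exclusion, rate limiting) that are omitted here.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href targets of <a> tags in one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first traversal; `fetch(url)` returns HTML text or None."""
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # unreachable or non-HTML resource: skip it
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages
```

Passing `fetch` in as a parameter keeps the traversal logic separate from the network protocol, so the same loop could sit on top of HTTP, FTP or Gopher retrieval; `max_pages` is one simple handle on the database size versus coverage trade-off.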
How text of web documents/files is internally stored/indexed in the text database to efficiently and effectively support searching |
Common approach - 'inverted index' |
Major issues - direct impact on database size and search performance
|
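The 'inverted index' approach named above maps each word to the documents (and positions) where it occurs, instead of mapping each document to its words. The sketch below is a minimal in-memory version under simple assumptions: words are lowercase alphanumeric runs, every word is indexed (full-text indexing, no stop list), and the function names are invented for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(documents):
    """Map each word to {url: [positions]} given a dict of url -> text."""
    index = defaultdict(dict)
    for url, text in documents.items():
        for position, word in enumerate(tokenize(text)):
            index[word].setdefault(url, []).append(position)
    return dict(index)
```

Storing positions makes phrase and proximity queries possible but enlarges the index; dropping them, or indexing only part of each document, is the kind of choice that drives the database size and search performance issues noted above.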
Built on the indexed database |
Basic functions/algorithms - keyword-based search
|
Advanced functions - concept-based search
|
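A keyword-based search over such an index can be sketched as a Boolean AND of posting lists followed by a simple ranking. The scoring rule (total occurrence count) and the index layout (word to `{url: count}`) are assumptions chosen for brevity, not the method of any system listed earlier; concept-based search is not attempted here.

```python
def keyword_search(index, query):
    """Return URLs containing every query word, best match first.

    `index` maps word -> {url: occurrence count}.
    """
    words = query.lower().split()
    if not words:
        return []
    postings = [index.get(word, {}) for word in words]
    # Boolean AND: keep only URLs present in every word's posting list.
    matches = set(postings[0])
    for posting in postings[1:]:
        matches &= set(posting)
    # Score by total occurrences; break ties by URL for a stable order.
    scored = [(sum(p[url] for p in postings), url) for url in matches]
    return [url for _, url in sorted(scored, key=lambda item: (-item[0], item[1]))]
```

Because each query touches only the posting lists of its own words, independent queries (or independent partitions of the index) can be served in parallel.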
Form-based CGI - integration of a Web server and the backend database search engine |
Requires high-performance server to support large number of concurrent users - parallel technology can play a big role here ! |
Major issues
|
ATM, ISDN, Wireless, Satellite will form the hybrid physical implementation of the NII |
CORBA, OpenDoc, OLE, SGML, HyTime are critical file and document standards |
High Performance Multimedia servers to enable digital information delivery on demand |
Data transport from MPI/MSGWAY/PVM to AAL to CBR/VBR |
Windows95/NT -- the last of the non-social (i.e. non-Web) operating systems -- will follow dinosaurs (IBM mainframes) into extinction except as WebServer/Client platforms with only base operating system services |
Personal Digital Assistants -- WebNewtons done right -- Learn from Telescript (agent based communication) and Magic Cap operating system |
WebWindows -- the open nonproprietary operating system of the future supplanting UNIX, Windows95/NT, Apple etc. -- manages with a single interface all machines either individually or collectively on the NII |
WebWork -- Implements Computing for both Simulation and Information on top of WebWindows -- the correct implementation of HPCC ideas such as HPF, MPI with pervasive technologies and good software engineering |
WebScript -- The evolving Middleware of scripted languages including PERL5, Java, Telescript, MOVIE (NPAC early prototype), domain specific Problem Solving Environments |
This will lead up to the Ultimate Goal! Televirtuality -- All Web Users are linked into a single virtual world |
Java -- an interpreted C++ like language (script) allowing fully interactive clients which execute applets. Has a full set of classes to make clients such as HotJava. Licensed by Netscape |
VRML -- a 3 dimensional HTML allowing universal description of physical objects and allowing interchange of virtual worlds, commercial product designs etc. |
PERL5 -- an extension of PERL4 with full object oriented characteristics and extended pointer(array) constructs -- allows construction of Web Software obeying good software engineering practices |
Telescript -- forced into semiopen by Java (!?) -- dynamic Web Transport and Server technology replacing HTTP,MIME .. |
Multithreaded WebServers integrating current Web, Compute and digital multimedia delivery services -- future Enterprise Systems |
An example of HotJava applet that makes essential use of Java multithreading. |
Three different sorting algorithms are visualized on a single HotJava page. |
Each algorithm can be started independently or they can all run concurrently. |
Concurrent mode allows for real-time visual comparison of various algorithms and their performance. |
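The structure of such a demonstration (one thread per algorithm, all working on the same input so their progress can be compared) can be sketched without graphics. The example is in Python rather than Java, and the three algorithms chosen (bubble, insertion and selection sort) are an assumption, since the foil does not name them; the `on_step` callback marks where an applet would redraw its display.

```python
import threading


def bubble_sort(data, on_step):
    items = list(data)
    for end in range(len(items) - 1, 0, -1):
        for i in range(end):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                on_step()
    return items


def insertion_sort(data, on_step):
    items = list(data)
    for i in range(1, len(items)):
        j = i
        while j > 0 and items[j - 1] > items[j]:
            items[j - 1], items[j] = items[j], items[j - 1]
            on_step()
            j -= 1
    return items


def selection_sort(data, on_step):
    items = list(data)
    for i in range(len(items) - 1):
        smallest = min(range(i, len(items)), key=items.__getitem__)
        if smallest != i:
            items[i], items[smallest] = items[smallest], items[i]
            on_step()
    return items


def run_concurrently(data, algorithms):
    """Run each algorithm in its own thread; return {name: (sorted, swaps)}."""
    results = {}

    def worker(algorithm):
        swaps = [0]

        def on_step():
            swaps[0] += 1  # a visual applet would redraw its panel here

        results[algorithm.__name__] = (algorithm(data, on_step), swaps[0])

    threads = [threading.Thread(target=worker, args=(a,)) for a in algorithms]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Each worker sorts its own copy of the input, so the threads never share mutable data; the swap counts returned give a crude numerical counterpart to the visual comparison described above.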
Latest results prepared for HPDC95 Tutorial August 1, 1995 |
HotJava Demonstration |
VRML illustrates how one can store real world objects in a universal fashion |
Game vendors can build modules that interact and enable development of amazing profitable virtual worlds! |
Manufacturers can use VRML as basis of universal product definitions enabling collaborations between several vendors needed for Multidisciplinary analysis and design cf: PDES/STEP standards |
The Web "levels" the playing field for all software products
|
For instance VRML allows new powerful versions of Geographical Information Systems |
Living SchoolBook Material for SC95 San Diego Dec 95 |
Using San Diego VRML Viewer Webview |
Little Neck Bay in Northern Long Island (altitude exaggerated by factor 7) |
From Living Schoolbook Project |
Hot buttons linking to weather page in Albany area |
From Living Schoolbook Project |
WebTools -- Early NPAC Prototype of WebWindows Equivalent to Program Manager with Navigation, File manipulation, Mail |
WebDeskTop Publishing -- an early killer application under WebWindows supplanting Word, WordPerfect, Lotus 1-2-3, Persuasion etc. Java allows a clear, powerful implementation. |
WebRDBMS -- Integration of Relational and Distributed databases with both agent based heuristics, formal indices and free text search |
Metadata -- Common attributes to allow integration and search of heterogeneous databases |
WebSpace -- Televirtual implementation of full 3D MOO like environment building on LabSpace at Argonne for the virtual scientific laboratory |
WebFlow -- NPAC prototype of Web based extended Khoros/AVS supporting dataflow linkage of computers for simulation and people and data for workflow management |
WebScript -- the evolving Middleware of scripted languages including extended PERL5, Java, Telescript, MOVIE(NPAC compute oriented script) etc. |
In future one will NOT write software for either
|
Rather one will write software for WebWindows defined as the operating environment for World Wide Web |
WebWindows builds on top of Web Servers and Web Client open interfaces as in
|
Applications written for WebWindows will be portable to all computers running Web Servers or Clients which hide hardware and native O/S specifics |
Further WebWindows Software will be modular and allow plug and play insertion of capabilities developed around the Web World -- not a bunch of isolated stovepipe solutions
|
As an example, NPAC's WebTools implements UNIX shell/PC file manager capabilities in terms of CGI scripts -- this allows universal access to these capabilities, including powerful Web-based mh mail |
NPAC's WebFoil is an open HotJava-based replacement for PowerPoint/Persuasion |
Particular Application areas (Business, Healthcare, Education) will be built on top of generic NII services so that for instance
|
From foilset WebTools (Spring '95) |
Like UNIX or MS-DOS or Windows 3.1(NT,95), WebWindows is an operating system for a "computer" |
The "computer" is a metacomputer consisting of the 50,000 Webservers (currently--eventually hundreds of millions) on Internet for the World Wide Web |
WebWindows can also be used for the metacomputer (collection of heterogeneous networked computers) which is a business enterprise system
|
WebWindows is a multi-client multi-server technology
|
It does not provide multi-threading/multi-user support, memory management, device drivers and such base services -- these are supplied by UNIX, Windows or Mac O/S |
Rather it provides equivalent of higher level O/S services such as available under UNIX shell or applications supplied under Windows |
In the future one will build applications for WebWindows not UNIX / PC windows etc. |
Very interesting is a WebWindows version of Lotus Notes to support Business Enterprise systems -- built from Web components such as those prototyped in WebTools
|
Persuasion and PowerPoint are rather similar monolithic packages which, for instance, can only be clumsily ported to UNIX because one cannot access the internal data structures defining foils |
WebFoil (NPAC prototype WebWindows presentation package) has |
Extended open HTML source manipulated by powerful PERL5 scripts allowing global changes and linkages of foils from many sources
|
WebFoil uses HotJava to display HTML with full Web Power including applets to enable Multimedia and dynamic presentations |
Initial WebFoil 0.1 release: Halloween 1995 |
The WebTop Productivity environment will be built in a more modular fashion than current PC Windows or Macintosh arena
|
Java is key to understanding how WebWindows application/service software will look as it allows balanced client server applications to be built |
Note that this requires open display software, so that one can produce appropriately customized interfaces for browsing, presenting, word processing etc. |
Java may or may not be accepted by the Web Community, and Sun/Netscape may or may not allow it to be used openly |
However the concept is essential and roughly right -- one or more such open technologies will become available and used on the Web |
Will Windows NT take over the world and swamp UNIX?
|
The WebWindows concept says that NT versus UNIX isn't the key issue -- rather most software will not be written for NT, UNIX, MVS, VMS etc but rather to "Web Interfaces" |
One can expect that a new class of optimized operating systems will be developed that are designed solely to support web interfaces and web technology
|
Timing of these trends is unclear and could be critical |
Here each letter N S U O O represents a module |
Each green O is a separate "plug-in" or module or applet enhancing client |
Each yellow O is a CGI PERL (or, in future, Java) server-side enhancement |
The set of N's represents a monolithic client with many bundled capabilities |
The set of S's represents a monolithic (HTTP) server |
The set of U's represents a monolithic (UNIX) operating system |
In future the set of green O's represents a modular client-side system including a customizable modular browser |
There is an unclear server-client boundary as model is in fact server-server |
Now the yellow O's represent a corresponding modular server |
Supported by a new "WebUnix" or "WebNT" operating system optimized to support Web technology and interfaces |
Users ONLY talk to Web Clients and Servers |