Given by Gang Cheng, C.W. Ou, Geoffrey C. Fox at CPS616 Basic Information Track for Computational Science on Winter-Spring Semester 96. Foils prepared 4 April 1996
Abstract * Foil Index for this file
See also color IMAGE
The Strengths, Weaknesses and Synergy of Web and Database Technologies |
Architectures of Web and Oracle RDBMS Integration |
A Technical Overview of The Oracle-Web Integration |
using wowstub and PL/SQL |
Examples of a Web-based Search Interface for the Phone List Database |
This table of Contents Abstract
Instructor: Geoffrey Fox |
Version 3 April 96 |
teamed with Gang Cheng, Chao-Wei Ou |
Syracuse University |
111 College Place |
Syracuse |
New York 13244-4100 |
The Strengths, Weaknesses and Synergy of Web and Database Technologies |
Architectures of Web and Oracle RDBMS Integration |
A Technical Overview of The Oracle-Web Integration |
using wowstub and PL/SQL |
Examples of a Web-based Search Interface for the Phone List Database |
Characteristics
|
main WWW components (before Java, VRML, LiveMedia etc.):
|
Established Internet as the major vehicle in networking industry |
Universal, hyperlinked information access and dissemination |
Transparent networking navigation and GUI with multimedia information access for information dissemination--- a killer networking application |
Static, browser-oriented client (unless Java or JavaScript enabled) |
Tedious to manage and update (HTML) files on the server side difficult to keep pages of data updated automatically --- modification usually done by a human using word processor or editing tools |
A plain UNIX file system as "database" only supports primitive functions such as open,read/write and close |
More powerful operations on files are required for an information system which are supposed to provide information instead of raw data. |
e.g. need: low-level data manipulation operators such as those provided in RDBMS: select, insert, update, commit, rollback,etc |
Meaningful information needs to be generated on-the-fly from raw data |
Characteristics
|
main RDBMS components
|
Strengths
|
Weaknesses
|
Database techniques used in Web technology
|
Networking techniques used in database technology
|
Web server integrated with database is enhanced with:
|
Database server linked to web server is enhanced with:
|
A powerful backend text searching engine supporting complex queries an efficient, reliable and sophisticated data management system supporting representation and organization of logical relationships among information entities |
A dynamic data processing engine, provided with opportunity for optimized searching performance in large information systems |
A time-dimension added HTML programming engine which displays latest available data. |
A transparent networking interface for remote database server access |
A universal GUI-based search interface for hypertext-added data retrieval |
A truly globally-shared data system for maximum information dissemination
|
An integrated multi-media information system |
Mail databases: internal corporate utility
|
Usenet Newsgroups: http://asknpac.npac.syr.edu/
|
Education databases
|
Health care: demo patient record database |
Oracle SQL*TextRetrieval full text search of 3 online books |
Corporate product databases (under development) |
Education
|
Research
|
Industry
|
Basic features in HTTP protocol -- a client-server model:
|
The Web is Stateless
|
Fill-out forms are used for the server to accept clients run-time input from a HTML page |
Pre-edited HTML files (links) ---
|
CGI scripts --- programs (shell,perl,programming languages) to
|
CGI is Interface for running external programs on a Web server
|
SQL is the standard way to extract(query) data out from the database
|
Most commonly used one: Embedded SQL in C but we discussed PL/SQL and oraperl |
SQL programs: static (compiled) SQL and dynamic (interpreted) SQL |
Clients send SQL statements to server ---> |
Server parses, executes the SQL against the current database ---> |
returned rows are buffered in server's core memory ---> |
Server fetches buffered rows and sends them to clients |
Embedded SQL in C + CGI (C program) |
Form-based relational querying and text searching |
Different approaches such as PL/SQL or oraperl for different vendor's RDBMS systems (Sybase,Oracle,Informix, DB2, etc) |
More details can be found at "http://greatwall.npac.syr.edu:1963/PDB/web-db.html" |
Web Server and DB client must be on the same machine |
DB client and DB server can be on different machine and a proprietary networking protocol (e.g., SQL*NET for Oracle7) is used |
Inter-process Comunication (IPC) is used for the Web server to communicate with DB client on the same machine |
Integrates Oracle7 databases with the Web |
Manage Web information effectively |
Provides transaction logic and integrity to Web applications |
Uses PL/SQL stored procedures to build dynamic HTML documents on the fly |
Bring all the benefits of Oracle7 to the Web, including portability, scalability, manageability, and gateway technology |
Facilitates the creation of dynamic HTML documents via PL/SQL |
Provides an easy way to publish existing Oracle data on the World Wide Web |
Web data can be stored in an Oracle DB which will make it much simpler to manage and manipulate |
Provides utilities which take care of the repetitive tasks associated with building a CGI application |
An OCI (Oracle Call Interface) program called wowstub (or owa in Oracle WebServer) which serves as the actual CGI program |
Two PL/SQL packages (HTP and HTF) that contain utilities to automatically generate HTML tags |
(optional to Oracle WebServer)
|
Data and SQL Engine
|
Procedural and programmable mechanism |
Use PL/SQL for data manipulation and processing. PL/SQL:
|
Performance: tends to execute faster than regular SQL code becuse it doesn't need to be parsed and interpreted before executing |
Portable across all Oracle platforms: resides completely inside the database thus is 100% portable to any platform that Oracle7 runs on |
Modularity: has procedure,package etc mechanisms for software development in modules |
Why not? Not portable to other databases! |
Web pages created on the fly to include data dynamically obtained from Oracle7 |
Oracle7 stored Web related information |
Web pages are therefore always up to date |
Typical URL - http://www.npac.syr.edu/cgi-bin/wow/mypkg.myplsql |
http - specifies protocol to use |
www.npac.syr.edu - specifies hostname |
cgi-bin/wow - specific the OCI program that needs to be executed |
mypkg - specifies the PL/SQL package that wow will invoke |
myplsql - specifies the PL/SQL procedure that wow will invoke |
Extract the PL/SQL package and procedure name embedded in URL and log on to the DB |
Set up the parameters passed from the Web server so that they are accessible to the PL/SQL procedure |
Execute the PL/SQL procedure which extracts data from DB and formats it into an HTML document stored in a PL/SQL table |
Take the HTML document and write it to standard output |
utility packages that simplify the generation of HTML documents from within PL/SQL (HTP and HTF packages) |
each HTML tag has a corresponding function and procedure |
the procedure generates a line of output in a PL/SQL table |
when the entire document is done, the OCI program (wowstub) takes the contents of the PL/SQL table and writes it to standard output |
HyperText Procedures (HTP)
|
HTF (HyperText Functions) is the corresponding package which contains PL/SQL functions for generating HTML tags |
i.e. it returns HTML string whereas HTP writes HTML
|
Details about syntax and use of each procedure and function in HTP and HTF can be found from the handout |
A PL/SQL package called "example" defined by a Oracle user 'cps616' with password 'xyz'.
|
Assume a host machine runs a Web server (http://myhost) and a Oracle RDBMS server (ORACLE_SID = myoracle_server, ORACLE_HOME=/usr/oracle) |
The user 'cps616' has a CGI path as http://myhost/cgi-bin/cps616 in which a CGI script named 'wow' is set up as on following foil: |
#!/bin/csh |
# Specify which Oracle server to use |
setenv ORACLE_HOME /usr/oracle |
setenv ORACLE_SID nodb |
# specify Oracle user and password |
setenv WOW_UID 'cps616/xyz' |
# point to where the Web Agent executable is located |
/usr/oracle/wow-src/wowstub $1 |
exit |
CREATE PACKAGE EXAMPLE is procedure get_phoneno_by_name(name IN VARCHAR2); -- only one procedure |
END EXAMPLE; |
CREATE PACKAGE BODY EXAMPLE is
|
BEGIN
|
/* print out the query result */ |
htp.p(htp.item||'The ' || ptype ||' phone no. of ' || fname || ' ' |
|| lname || ': ' || phone); |
END LOOP; |
CLOSE person_cur; -- close the cursor after it is done |
htp.ulistClose; |
END get_person_by_name; |
END EXAMPLE; |
A HTML form page to accept user search input and invoke the CGI script on the web server to access the database
|
Query results are displayed in another Web page as (if user typed in 'geoffrey') |
<html><header> |
<title>Query Results</title></header> |
<body><h1>Query Results</h1> |
<ul> |
<li>The office phone no. of Geoffrey Fox: 3154434889 |
<li>The home phone no. of Geoffrey Fox: 3154238422 |
</ul> |
</body> |
</html> |
Details of the phone-list example can be found from the handout or at http://pacman.npac.syr.edu:1996 |
Further application examples can be found at
|
Overview - The Oracle WebServer is a HyperText Transfer Protocol (HTTP) Internet Server with unprecedented database integration and a powerful development environment. |
The Web Listener - This is the portion of the WebServer that interfaces to the local network or the World Wide Web. |
The Secure Sockets Layer - Use's SSL with public key encryption. |
The Web Server Manager - The set of Web pages you can use to perform most WebServer administration functions. |
The CGI Interface - The standard Web mechanism for executing applications on a Web server. |
The Web Request Broker (WRB) - The core of the WebServer. An asynchronous request broker with an open API that Oracle WebServer uses to execute applications on the server. |
The PL/SQL Agent - The program the Oracle WebServer uses to execute procedures written in PL/SQL, Oracle's application development language, on the Oracle7 Server. |
Java - The language for developing distributed network applications. Oracle WebServer enables you to execute Java either on the WebServer itself or on the client's browser. |
LiveHTML - A way to embed dynamic content in Web pages. This content can be either other Web pages or the output of scripts run by the Operating System. LiveHTML is an Oracle extension of the NCSA standard Server Side Includes functionality. |
implemented using wow |
another Web interface to MH mailbox (similar to xmh but under web) |
a enhanced backend mail archives server with advanced search capability |
basic components
|
Note: field definition of each table is not shown here; |
Inserting a parsed MH mail into the database will affect the tables: |
folder,msg header,msg body and people |
home page --- generated by a PL/SQL procedure which first stores the client's information of the current connection to a 'visitor' table and then generates links of query options. clients history can be easily pull over from the database. (fig. 1) |
query the mailbox --- find out the summary information about the mailbox (fig. 2) |
query the folder --- choose a folder from all folders currently available in the database (fig. 3), and find out summary information of the chosen folder (fig. 4) |
All the above queries are summary information which are meta-data (accumulated or statistic data about basic tables) and produced at the time when clients requested (done by simple SQL calculation statements) |
Produced by Gang Cheng April 1995 |
There is a larger Better Quality Image available |
Produced by Gang Cheng April 1995 |
There is a larger Better Quality Image available |
Produced by Gang Cheng April 1995 |
There is a larger Better Quality Image available |
Produced by Gang Cheng April 1995 |
There is a larger Better Quality Image available |
Client chooses all folders or a folder currently available in the database (fig. 5) |
browse all mail headers (show msgid,date,sender,size in bytes and subject) (fig. 6) |
show chosen mail's body after the mail header browsing (fig. 14) |
optionally show the complete original MH mail in UNIX mailbox (fig. 15) |
Note that full table scan only on the msg header table is required to list all the mail header in a folder (msg body is not touched during the header browsing). |
This is Optimized browsing performance, compared to MH header browsing where header and body are stored in a single file and are not indexed |
Produced by Gang Cheng April 1995 |
There is a larger Better Quality Image available |
Produced by Gang Cheng April 1995 |
There is a larger Better Quality Image available |
Produced by Gang Cheng April 1995 |
There is a larger Better Quality Image available |
Produced by Gang Cheng April 1995 |
There is a larger Better Quality Image available |
Most recent five mails currently in the db |
mail related to now (the exact time as clients click the query-by-date page) in all folders:
|
mail related to today (the same day as clients click the query-by-date page) in selected folders:
|
mail at any specified time or within any specified time range. Choose from hour(1-12), am/pm, day(1-31), month(1-12), year (fig. 8) |
system clock will be consulted to form above queries |
Produced by Gang Cheng April 1995 |
There is a larger Better Quality Image available |
Produced by Gang Cheng April 1995 |
There is a larger Better Quality Image available |
Query by a keyword --- return mails containing the entered keyword (non-indexed pattern matching, future option: context searching using Oracle TextServer search engine) (fig. 9, 10) |
the keyword can be a typed in or chosen from a table of previously entered keywords |
search in all folders or a selected folder from currently available folders in the DB |
search keyword occurrence in:
|
search with either case insensitive or sensitive for the keyword |
typein keyword will or will not be added to the DB table for future use or keyword in the current DB table will or will not be removed from the DB |
database insert/delete operations are used to add/remove keywords |
in the table --- the power of remembering clients interactions on the web |
Produced by Gang Cheng April 1995 |
There is a larger Better Quality Image available |
Produced by Gang Cheng April 1995 |
There is a larger Better Quality Image available |
Query all mail sent from a selected sender, based on all senders gathered in the current mail database (fig. 11) |
sender is uniquely identified by his/her userid |
a specified folder can be used to narrow the searching, as in many other query options |
Note the sender list is not pre-entered or edited. It is automatically built when a new mail is being incorporated into the database from the MH mail folder by the parser --- another dynamic feature of web-db integration |
Produced by Gang Cheng April 1995 |
There is a larger Better Quality Image available |
For security and database consistence reason, userid/password at the database level (not Mosaic page-level) is enforced to update the database (fig. 12) |
Two kinds of folders: MH folder in UNIX file system and folder table in the database system. They hold the same information but in different places. |
Original MH mail (in MH format) in a folder must be converted into the database tables (in Oracle table format) to enable all the above query options on the Oracle db server |
Assume each database folder must have a unique MH folder in UNIX file system. The database folder is created when the first mail in the UNIX MH folder is being incorporated in the database |
Produced by Gang Cheng April 1995 |
There is a larger Better Quality Image available |
Before incorporating new mail from a UNIX folder, the PL/SQL procedure first finds out current maximum msg id in the corresponding DB folder and only try to parse/incorporate those MH mails with msg id greater than the maximum msg id in DB (fig. 13) |
If the the MH folder contains subfolders, new mails in all subfolders will be considered to incorporate, and this is a recursive process. |
Implemented by both a perl script (for parsing) and PL/SQL procedures |
A time-intensive process as insertions may be occurred on up to four DB tables which are indexed. But this process can be done in batch ! |
Incorporating can be done at the same time when other query options are requested by other clients on the web. The DB server will ensure the query concurrency and data consistency --- a reliable information system ! |
Produced by Gang Cheng April 1995 |
There is a larger Better Quality Image available |
List of mail headers satisfying the query (show msgid,date,sender,size in bytes and subject) (fig. 6) |
Show chosen mail's body after the mail header browsing (fig. 14) |
Optionally show the complete original MH mail in UNIX mailbox (fig. 15) |
Produced by Gang Cheng April 1995 |
There is a larger Better Quality Image available |
Produced by Gang Cheng April 1995 |
There is a larger Better Quality Image available |
Produced by Gang Cheng April 1995 |
There is a larger Better Quality Image available |
Query the People |
Query By Mail Type |
Query By Date and Sender |
Query By Date and Subject |
Query By Date and Sender and Subject |
Interface the mail database with WebMail tools (CPS600 server) |
Another example using WOW for a similar database can be found at |
http://kayak.npac.syr.edu:1963 under link TravelVenture Demo for "travel agency" application |
Use Oracle Database Server for Well-structured Information Entities |
WWW Interface to access Oracle-based EIS |
Central New York Tourism Image Database -- Living Schoolbook |
Patient Record Database Prototype -- will enhance with customizable Java frontend |
See http://kayak.npac.syr.edu:1963 |
Use Oracle RDBMS Server and Oracle's Context Server Technology as Indexing and Search Engines |
Use High Performance and Intelligent Web agents to gather local and/or remote Web Pages |
Customizable (form based) Web Search Interface |
See example as on-line HTML book search (Solving Problems on Concurrent Processors) |
Search programming/close captioning attached to digital video database (Living Schoolbook) |
Search traditional text based CD-Rom's (NewsBank in Living SchoolBook) |
and search on 30,000 URL's in NPAC Web Site (http://kayak.npac.syr.edu:1963/search |
Oracle RDBMS server as core engine for relational data |
Oracle's Context Server to deal with non-structures full-text data |
Example is mail or searchable USENET Newsgroups archives -- |
see http://asknpac.npac.npac.syr.edu |
Here To,From,Date,Subject are structured fields |
Body of mail message is unstructured |
Migrating many of above projects to SP2 -- currently 4 wide nodes |
Use Parallel Oracle7 on IBM SP2 (V7.3 beta, Parallel Server and query Options) |
i.e. use Parallelism from multiple users and within query |
Multiple Web Servers and Oracle RDBMS Instances supporting single client image for database and search interface |
See http://128.230.3.39:1963 |