Given by Yuping Zhu at the NCSA PET Webmaster Meeting and the ITEA HPCC Tutorial, Aberdeen, Md., on Feb 3 and July 13, 1998. Foils prepared July 9, 1998
Summary of Material
We summarize 3 NPAC Projects using Web-linked Oracle databases |
Robot for Web Search |
Web Site Management System |
Technical Report Database |
Full - Text Web Search System |
Web Site Management System |
Yuping Zhu |
Northeast Parallel Architecture Center |
Syracuse University |
The growth of the Internet |
Web space is growing dramatically |
Users need to locate documents quickly |
Webmasters need information about their Web sites |
Centralize the documents |
Oracle 7.3.2 |
Oracle ConText Option |
Oracle Web Server |
TCP/IP and HTTP (CGI, HTML) |
Shared memory |
Semaphore management |
Gather Subsystem: Robot |
Gather, Inspector, Loader, Agent |
Index Subsystem |
Search Subsystem |
Search package, Web interfaces |
Robot Manager |
Web Information package |
Specific Web Server |
http://apollo.wes.hpc.mil |
Specific Directory |
http://www.npac.syr.edu/users/gcf |
Several Servers under An Organization |
http://*.wes.army.mil |
Specific Technology Domain |
Grid Generation Search Engine |
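
The first three scopes above (one server, one directory, several servers under an organization) can be read as a host pattern plus a path prefix. The following is a minimal sketch of such a scope filter, not the robot's actual code; the function name `in_scope` and its parameters are hypothetical.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def in_scope(url, host_pattern, path_prefix="/"):
    """Return True if the URL's host matches host_pattern (shell-style
    wildcards allowed) and its path lies under path_prefix.

    Hypothetical helper: the foils do not say how the robot tests scope.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path = parts.path or "/"
    if not fnmatchcase(host, host_pattern.lower()):
        return False
    # Match the directory itself or anything below it, so that
    # "/users/gcf" does not accidentally match "/users/gcfoo".
    prefix = path_prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


# Specific Web server
print(in_scope("http://apollo.wes.hpc.mil/index.html", "apollo.wes.hpc.mil"))
# Specific directory
print(in_scope("http://www.npac.syr.edu/users/gcf/foils.html",
               "www.npac.syr.edu", "/users/gcf"))
# Several servers under an organization
print(in_scope("http://a.wes.army.mil/", "*.wes.army.mil"))
```

A technology-domain scope such as the Grid Generation Search Engine would need a content-based test instead of a URL test, which this sketch does not attempt.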
Summary |
Total pages and size, longest and shortest page size, dead links, discarded links, number of covered Web servers |
Structure |
Page locations in the Web server's file system, based on gathered data |
Show dead / discarded links |
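
The summary figures listed above could be derived from the gathered page records roughly as follows. This is an illustrative sketch only: the `Page` record, its field names and the status values are assumptions, since the foils do not describe the database schema.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class Page:
    # Hypothetical record layout, not the system's real table.
    url: str
    size: int     # page size in bytes (0 when nothing was fetched)
    status: str   # "ok", "dead" or "discarded"


def summarize(pages):
    """Compute the summary items from a list of Page records."""
    ok = [p for p in pages if p.status == "ok"]
    sizes = [p.size for p in ok]
    return {
        "total_pages": len(ok),
        "total_size": sum(sizes),
        "longest_page": max(sizes, default=0),
        "shortest_page": min(sizes, default=0),
        "dead_links": sum(p.status == "dead" for p in pages),
        "discarded_links": sum(p.status == "discarded" for p in pages),
        "covered_servers": len({urlsplit(p.url).hostname for p in ok}),
    }


sample = [
    Page("http://www.npac.syr.edu/a.html", 1200, "ok"),
    Page("http://www.npac.syr.edu/b.html", 300, "ok"),
    Page("http://apollo.wes.hpc.mil/c.html", 4500, "ok"),
    Page("http://apollo.wes.hpc.mil/gone.html", 0, "dead"),
    Page("http://apollo.wes.hpc.mil/big.ps", 0, "discarded"),
]
print(summarize(sample))
```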
Access Control |
User ID, password (can be changed) |
Configure Robot |
Set directory, DB ID/password, max depth, Web site domain, agent |
Check Robot On-line Status |
Gathered pages, discarded pages, dead links, covered servers, waiting pages |
Start URLs |
Add new start URLs, check old start URLs |
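
To illustrate how start URLs and the max depth setting interact, here is a small depth-limited traversal over an in-memory link graph. It is a sketch under assumptions: the foils do not state the robot's traversal order, so breadth-first is chosen for simplicity, and `crawl` and the `links` dictionary are hypothetical stand-ins for real HTTP fetching.

```python
from collections import deque


def crawl(start_urls, links, max_depth):
    """Breadth-first traversal limited to max_depth hops from a start URL.

    links maps each reachable URL to its outgoing links; a URL that is
    missing from links is counted as a dead link (an assumption that
    replaces a real HTTP request in this sketch).
    """
    starts = list(dict.fromkeys(start_urls))   # drop duplicate start URLs
    waiting = deque((u, 0) for u in starts)    # the "waiting pages" queue
    seen = set(starts)
    gathered, dead = [], []
    while waiting:
        url, depth = waiting.popleft()
        if url not in links:
            dead.append(url)
            continue
        gathered.append(url)
        if depth == max_depth:
            continue                           # do not follow links any deeper
        for nxt in links[url]:
            if nxt not in seen:
                seen.add(nxt)
                waiting.append((nxt, depth + 1))
    return gathered, dead


graph = {"a": ["b", "x"], "b": ["c"], "c": ["d"], "d": []}
print(crawl(["a"], graph, max_depth=1))
```

At any moment during the loop, the lengths of `gathered`, `dead` and `waiting` correspond to three of the on-line status counters listed above.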
Introduction |
This system is intended to facilitate the |
day-to-day operation and maintenance |
of a Web site. It divides the documents on |
a Web site into three types: "prototype", |
"production" and "archive". It also provides |
secure sessions and grants different users |
hierarchical privileges. |
Access Control |
System, manager, developer or public user |
Add / delete / modify user |
Manage document |
Insert, delete, change status, update, review, archive, print |
Search |
Public, internal or both |
Users can change their password / time-out |
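
The hierarchical privileges can be pictured as ordered roles, where each action requires a minimum role. The sketch below is only an illustration: the four roles come from the list above, but their ordering and the action-to-role table `REQUIRED` are assumptions, not the system's documented policy.

```python
from enum import IntEnum


class Role(IntEnum):
    # Assumed order: a higher value inherits every lower role's privileges.
    PUBLIC = 0
    DEVELOPER = 1
    MANAGER = 2
    SYSTEM = 3


# Hypothetical minimum role for a few of the listed operations.
REQUIRED = {
    "search_public": Role.PUBLIC,
    "search_internal": Role.DEVELOPER,
    "insert_document": Role.DEVELOPER,
    "change_status": Role.MANAGER,
    "manage_users": Role.SYSTEM,
}


def allowed(role, action):
    """Return True if the role meets the minimum required for the action."""
    return role >= REQUIRED[action]


print(allowed(Role.MANAGER, "change_status"))
print(allowed(Role.PUBLIC, "search_internal"))
```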
Attribute |
. Author |
. User's information |
. Submit date |
. Last modification date |
. Expiration date |
. Reviewer and review date |
. Document status |
Text |
Either in DB or external store |
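
One way to picture a document record with these attributes is the sketch below. The class and field names are hypothetical, as is the rule that exactly one of the in-database text and the external location is set; the foils only say the text is held either in the DB or in an external store.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

STATUSES = ("prototype", "production", "archive")


@dataclass
class Document:
    # Hypothetical layout; the real table definition is not given.
    author: str
    submit_date: date
    status: str = "prototype"
    last_modified: Optional[date] = None
    expiration_date: Optional[date] = None
    reviewer: Optional[str] = None
    review_date: Optional[date] = None
    text: Optional[str] = None            # body stored in the database
    external_path: Optional[str] = None   # or a pointer to an external store

    def __post_init__(self):
        if self.status not in STATUSES:
            raise ValueError("unknown status: " + self.status)
        # Assumed rule: the body lives in exactly one of the two places.
        if (self.text is None) == (self.external_path is None):
            raise ValueError("set exactly one of text or external_path")

    def is_expired(self, today):
        """True once today is past the expiration date, if one is set."""
        return self.expiration_date is not None and today > self.expiration_date


doc = Document(author="Yuping Zhu", submit_date=date(1998, 7, 9),
               expiration_date=date(1999, 7, 9), text="Technical report body")
print(doc.status, doc.is_expired(date(1998, 12, 1)))
```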