Global HTML version of Foils prepared 22 January 1996

Foil 23 Challenges and Issues

From Web Technology Overview CPS616 Basic Information Track for Computational Science -- Winter-Spring Semester 96, by Geoffrey Fox

Data Volume
  • Estimated total Web text size: 0.1 - 1 Terabytes, 5 - 10 million documents (estimate based on the text held on the NPAC web server: 110 MB of text, 36,000 text URLs, avg. 3K/page) - grows daily
  • Requires a more sophisticated search mechanism than browsing and organizing by hyperlinks
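
The per-page average quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming decimal units (1 MB = 10^6 bytes); the function name `extrapolate_bytes` is hypothetical and only illustrates scaling the NPAC average to a given document count.

```python
# Figures quoted on the foil for the NPAC web server.
npac_text_bytes = 110 * 10**6   # 110 MB of text (decimal megabytes assumed)
npac_text_urls = 36_000         # 36,000 text URLs

# Average text size per page, roughly 3 KB as stated on the foil.
avg_page_bytes = npac_text_bytes / npac_text_urls


def extrapolate_bytes(num_documents, bytes_per_page=avg_page_bytes):
    """Scale a per-page average to a given number of documents."""
    return num_documents * bytes_per_page


low = extrapolate_bytes(5 * 10**6)    # 5 million documents
high = extrapolate_bytes(10 * 10**6)  # 10 million documents

print(f"average page: {avg_page_bytes / 1000:.2f} KB")
print(f"5-10 million pages: {low / 1e9:.1f} - {high / 1e9:.1f} GB")
```

At the 3K average alone, 5 - 10 million documents come to tens of gigabytes, so the 0.1 - 1 Terabyte range presumably rests on further assumptions that the foil does not state (for example larger pages elsewhere or more documents).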
Data Diversity
  • WWW - a gigantic distributed database of unstructured, non-relational and hierarchical (multimedia) information entities in various data formats (MIME types): HTML, plain text, PostScript, LaTeX, etc.
  • Web repositories are heterogeneous, inconsistent and incomplete.
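
To make the format diversity concrete, here is a minimal sketch of how an indexer might bucket documents by MIME type before deciding how to process them. It uses Python's standard `mimetypes` module to guess a type from the URL's extension; the URLs and the helper `guess_format` are invented for illustration and are not from the foil.

```python
import mimetypes
from collections import Counter

# Hypothetical URLs, invented for illustration only.
urls = [
    "http://example.org/index.html",
    "http://example.org/notes.txt",
    "http://example.org/paper.ps",
    "http://example.org/report.tex",
    "http://example.org/data",  # no extension, so no type can be guessed
]


def guess_format(url):
    """Return the MIME type guessed from the URL's extension, or 'unknown'."""
    mime_type, _encoding = mimetypes.guess_type(url)
    return mime_type or "unknown"


# Count how many documents fall into each guessed format.
counts = Counter(guess_format(u) for u in urls)
for mime_type, n in sorted(counts.items()):
    print(mime_type, n)
```

Guessing from the extension is only a heuristic. A real crawler would more likely trust the Content-Type header returned by the server, and the "unknown" bucket is one small instance of the incompleteness noted above.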
User Base
  • Different requirements in query patterns, search topics and response time
  • Rapid daily growth in the number of users and search requests


Northeast Parallel Architectures Center, Syracuse University, npac@npac.syr.edu

Page produced by wwwfoil on Tue Feb 18 1997