Global HTML version of Foils prepared February 11, 1996

Foil 46 Challenges and Issues

From IBM Tutorial on Web Technology for HPCC, IBM Poughkeepsie, February 7, 1996, by Geoffrey Fox

Data Volume
  • Estimated total Web text size: 0.1 - 1 Terabytes in 5 - 10 million documents, growing daily (estimate based on the text held on the NPAC web server: 110 MB of text, 36,000 text URLs, avg. 3K/page)
  • Requires a more sophisticated search mechanism than browsing and organizing by hyperlinks
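
The per-page figure in the estimate can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted on the foil and assumes decimal megabytes (1 MB = 10^6 bytes); the variable and function names are illustrative, not from the original tutorial.

```python
# Figures quoted on the foil for the NPAC web server.
NPAC_TEXT_BYTES = 110 * 10**6  # 110 MB of text (decimal megabytes assumed)
NPAC_TEXT_URLS = 36_000        # number of text URLs on the server

# Average text size per page, which the foil rounds to 3K/page.
avg_page_bytes = NPAC_TEXT_BYTES / NPAC_TEXT_URLS


def total_text_bytes(documents, page_bytes=avg_page_bytes):
    """Scale the per-page average up to a given number of documents."""
    return documents * page_bytes


# The foil's range for the number of documents on the Web.
low = total_text_bytes(5_000_000)
high = total_text_bytes(10_000_000)

print(f"average page: {avg_page_bytes / 1000:.2f} KB")
print(f"5 - 10 million documents: {low / 1e9:.1f} - {high / 1e9:.1f} GB")
```

The average comes out at roughly 3 KB per page, matching the foil. Multiplying that average by the document counts alone gives tens of gigabytes, so the 0.1 - 1 Terabyte range presumably rests on further assumptions that the foil does not state.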
Data Diversity
  • WWW - a gigantic distributed database of unstructured, non-relational and hierarchical (multimedia) information entities in various data formats (MIME types): HTML, plain text, PostScript, LaTeX, etc.
  • Web repositories are heterogeneous, inconsistent and incomplete.
User Base
  • Different requirements in query patterns, search topics and response time
  • Rapid daily growth in the number of users and of search requests

Northeast Parallel Architectures Center, Syracuse University

Page produced by wwwfoil on Tue Feb 18 1997