Data Volume
Estimated total Web text size: 0.1 - 1 Terabytes, 5 - 10 million documents (estimate based on text held on the NPAC web server: 110 MB of text, 36,000 text URLs, avg. 3K/page); grows daily
Requires a more sophisticated search mechanism than browsing and organizing by hyperlinks
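
The arithmetic behind the estimate above can be checked with a short sketch. It uses only the figures quoted for the NPAC sample (110 MB, 36,000 text URLs) and the 5 - 10 million document range. The function names and the choice of 1 MB = 10^6 bytes are assumptions made for illustration, not part of the original estimate.

```python
# Figures quoted in the estimate above (NPAC web server sample).
SAMPLE_TEXT_BYTES = 110 * 10**6   # 110 MB; assumes 1 MB = 10^6 bytes
SAMPLE_URLS = 36_000              # number of text URLs in the sample


def avg_page_bytes(total_bytes: float, n_pages: int) -> float:
    """Average text size per page in the sample."""
    return total_bytes / n_pages


def projected_bytes(n_docs: int, page_bytes: float) -> float:
    """Total text size if every document had the sample's average size."""
    return n_docs * page_bytes


avg = avg_page_bytes(SAMPLE_TEXT_BYTES, SAMPLE_URLS)
low = projected_bytes(5_000_000, avg)     # lower document-count bound
high = projected_bytes(10_000_000, avg)   # upper document-count bound

print(f"average page size: {avg / 1000:.2f} KB")
print(f"projected text size: {low / 1e9:.1f} - {high / 1e9:.1f} GB")
```

The sample average comes out at about 3 KB per page, matching the quoted figure. Multiplying that average by 5 - 10 million documents gives roughly 15 - 30 GB, so the 0.1 - 1 Terabyte range presumably rests on further assumptions not stated here.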
Data Diversity
WWW - a gigantic distributed database with unstructured, non-relational and hierarchical (multimedia) information entities with various data formats: MIME -- HTML, plain text, PostScript, LaTeX, etc.
Web repositories are heterogeneous, inconsistent and incomplete.
User Base
Different requirements in query patterns, search topics and response time
Rapid daily growth in the number of users and of search requests