Basic HTML version of Foils prepared 11 May 1997

Foil 20 Web Search Indexes

From Introduction to World Wide Web (WWW) ECS400 Senior Undergraduate Course -- Spring Semester 97, by Nancy J. McCracken

Indexing: the information gathered by the robots is organized into an indexing database at the search server.
  • Keyword indexing is the primary method currently in use - full-text searching is mostly limited to single-site search engines.
  • Key issue is size of resulting database.
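A minimal sketch of keyword indexing, assuming the robot's output is already available as a mapping from URL to page text. The page contents, the example.org URLs, and the function names tokenize and build_index are illustrative assumptions and do not describe any particular search engine.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each keyword to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


# Hypothetical pages standing in for what a robot might have gathered.
pages = {
    "http://example.org/a": "Parallel computing at the center",
    "http://example.org/b": "Web search and parallel indexing",
}
index = build_index(pages)
print(sorted(index["parallel"]))
```

The index stores one entry per distinct keyword rather than the full page text, which is one reason keyword indexing keeps the database smaller than full-text storage would.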
Searching: the indexing database allows (keyword) searches by the user.
  • The user forms a query, and some number of the most highly ranked results are returned.
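A minimal sketch of a ranked keyword search. The foil does not say how results are ranked, so the scoring used here (total occurrences of the query keywords in each document) is only an assumption chosen for simplicity; the document names and the functions build_counts and search are likewise hypothetical.

```python
from collections import Counter, defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_counts(pages):
    """Map each keyword to a Counter of {url: number of occurrences}."""
    index = defaultdict(Counter)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word][url] += 1
    return index


def search(index, query, limit=10):
    """Return up to `limit` URLs, highest score first.

    Assumed scoring: the total number of times the query keywords
    occur in each document. Ties are broken by URL for stable output.
    """
    scores = Counter()
    for word in tokenize(query):
        scores.update(index.get(word, {}))
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [url for url, _ in ranked[:limit]]


# Hypothetical documents used only to exercise the sketch.
pages = {
    "doc1": "web search engines index the web",
    "doc2": "search the index",
    "doc3": "gopher and ftp servers",
}
index = build_counts(pages)
print(search(index, "web search"))
```

Here doc1 scores 3 (two occurrences of "web", one of "search") and doc2 scores 1, so doc1 is listed first; doc3 matches no keyword and is not returned.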
User Interface
  • uniform interface for HTTP, FTP, GOPHER, WAIS, Harvest, Lycos
Challenge of WWW search:
  • estimated total size is 30 gigabytes across 5 million documents; many search engines now take months to crawl the web and update their index files.
  • diversity - huge distributed database, unstructured, non-relational, hierarchical information with many formats.
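As a rough check on the size estimate above, the two figures imply an average document size. The sketch assumes decimal units (1 gigabyte = 10^9 bytes), which the foil does not specify.

```python
# Figures taken from the foil; decimal gigabytes are an assumption.
total_bytes = 30 * 10**9   # 30 gigabytes
documents = 5 * 10**6      # 5 million documents

average_bytes = total_bytes // documents
print(average_bytes)  # 6000 bytes, about 6 kilobytes per document
```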



© Northeast Parallel Architectures Center, Syracuse University, npac@npac.syr.edu


Page produced by wwwfoil on Thu Aug 21 1997