Indexing: the information gathered by the robots is organized into an indexing database at the search server.
  • Keyword indexing is currently the primary method; full-text searching is offered only by some single-site search engines.
  • Key issue is size of resulting database.
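
As an illustration of keyword indexing, here is a minimal sketch of an inverted index that maps each keyword to the documents containing it. The function names (tokenize, build_index) and the example.org pages are hypothetical and stand in for whatever the robots gathered; this is not the method of any particular search engine.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Map each keyword to the set of document URLs that contain it."""
    index = defaultdict(set)
    for url, text in documents.items():
        for word in tokenize(text):
            index[word].add(url)
    return index

# Hypothetical pages, standing in for what a robot gathered.
documents = {
    "http://example.org/a": "Parallel computing at Syracuse",
    "http://example.org/b": "Web search engines and parallel robots",
}
index = build_index(documents)
print(sorted(index["parallel"]))
```

Because every distinct keyword gets its own entry and every occurrence adds a posting, the index grows with both vocabulary and collection size, which is the database-size issue noted above.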
Searching: the indexing database allows (keyword) searches by the user.
  • Queries are formed, and some number of the most highly ranked results are returned.
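
A keyword search with ranking can be sketched as follows. The scoring rule used here (sum of the query keywords' occurrence counts in each document) is an assumption chosen for simplicity, and the names search, build_index and the example.org pages are hypothetical; real engines use more elaborate ranking.

```python
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Map each keyword to a Counter of {url: number of occurrences}."""
    index = defaultdict(Counter)
    for url, text in documents.items():
        for word in tokenize(text):
            index[word][url] += 1
    return index

def search(index, query, limit=10):
    """Return up to `limit` URLs, highest score first.

    Score (an assumed rule): total occurrences of the query keywords.
    Ties are broken alphabetically by URL so results are deterministic.
    """
    scores = Counter()
    for word in tokenize(query):
        for url, count in index.get(word, {}).items():
            scores[url] += count
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [url for url, _ in ranked[:limit]]

# Hypothetical pages for illustration.
documents = {
    "http://example.org/a": "web search and web robots",
    "http://example.org/b": "search engines index the web",
    "http://example.org/c": "gopher and wais servers",
}
index = build_index(documents)
print(search(index, "web search", limit=2))
```

The limit parameter corresponds to returning only "some number" of the top-ranked results rather than every match.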
User Interface
  • Uniform interface for HTTP, FTP, Gopher, WAIS, Harvest, Lycos
Challenge of WWW search:
  • Size: the estimated total is 30 gigabytes in 5 million documents; many search engines now take months to crawl the web and update their index files.
  • Diversity: a huge distributed database of unstructured, non-relational, hierarchical information in many formats.
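
The size estimate can be turned into rough working numbers. The sketch below assumes decimal gigabytes (10^9 bytes) and, purely for illustration, a 60-day crawl; the foil itself says only "months", so the crawl length is an assumption, and the function names are hypothetical.

```python
def average_document_bytes(total_bytes, document_count):
    """Average document size implied by the totals."""
    return total_bytes / document_count

def required_fetch_rate(document_count, crawl_days):
    """Documents per second needed to visit every document once."""
    seconds = crawl_days * 24 * 60 * 60
    return document_count / seconds

TOTAL_BYTES = 30 * 10**9   # 30 gigabytes, assuming decimal units
DOCUMENTS = 5 * 10**6      # 5 million documents
CRAWL_DAYS = 60            # assumed crawl length, not from the foil

print(average_document_bytes(TOTAL_BYTES, DOCUMENTS))  # 6000.0 bytes
print(required_fetch_rate(DOCUMENTS, CRAWL_DAYS))      # documents per second
```

Under these assumptions the average document is about 6 kilobytes, and a single crawl pass needs a sustained rate of roughly one document per second.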

See also color IMAGE Foil 20 Web Search Indexes

From the presentation Introduction to the World Wide Web and Web Technologies (Fall Semester 96) by Nancy McCracken. Foils prepared December 9, 1996.


Northeast Parallel Architectures Center, Syracuse University, npac@npac.syr.edu

