Information gathering and filtering
-
This is done by web "robots": programs which automatically connect to servers, retrieve documents, and follow the links they contain, usually up to a certain "depth" of links, such as 4.
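A minimal sketch of such a depth-limited robot in Python. The breadth-first strategy and the names crawl and index_document are illustrative assumptions, not any particular engine's implementation:

    from collections import deque
    from html.parser import HTMLParser
    from urllib.parse import urljoin
    from urllib.request import urlopen

    class LinkParser(HTMLParser):
        """Collects the href targets of <a> tags."""
        def __init__(self):
            super().__init__()
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    def index_document(url, page):
        """Stub: a real robot would return keywords etc. to the search index."""
        print(f"indexed {url} ({len(page)} characters)")

    def crawl(start_url, max_depth=4):
        """Breadth-first crawl, following links only up to max_depth."""
        visited = set()
        frontier = deque([(start_url, 0)])
        while frontier:
            url, depth = frontier.popleft()
            if url in visited or depth > max_depth:
                continue
            visited.add(url)
            try:
                page = urlopen(url).read().decode("utf-8", errors="replace")
            except OSError:
                continue                      # dead link or unreachable server
            index_document(url, page)
            parser = LinkParser()
            parser.feed(page)
            for link in parser.links:
                frontier.append((urljoin(url, link), depth + 1))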
-
For each document, the robot returns keywords and other information to the search index. For example, Lycos returns: the title, any headings and subheadings, the 100 most "weighty" words, the first 20 lines, the size in bytes, and the number of words.
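A sketch of how such a per-document record might be built. The fields follow the Lycos list above, but scoring "weighty" words by plain frequency is an assumption; Lycos's actual weighting scheme is not given here:

    import re
    from collections import Counter

    def summarize(url, page):
        """Build a Lycos-style summary record for one HTML document."""
        text = re.sub(r"<[^>]+>", " ", page)          # crude tag stripping
        words = re.findall(r"[a-z']+", text.lower())
        title = re.search(r"<title>(.*?)</title>", page, re.I | re.S)
        headings = re.findall(r"<h[1-6][^>]*>(.*?)</h[1-6]>", page, re.I | re.S)
        return {
            "url": url,
            "title": title.group(1).strip() if title else "",
            "headings": [h.strip() for h in headings],
            # frequency as a stand-in for "weight" (assumption)
            "weighty_words": [w for w, _ in Counter(words).most_common(100)],
            "first_lines": text.splitlines()[:20],
            "size_bytes": len(page.encode("utf-8")),
            "word_count": len(words),
        }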
-
Problems with information gathering:
-
Information update: documents change or disappear after they are indexed, so the index goes stale between visits (see the update-check sketch after this list).
-
Information generated dynamically, such as the output of CGI scripts, is not available: the robot can only follow static links, not fill in forms.
-
Resource intensive: robots repeatedly connect to a site, consuming its bandwidth and processing. Informal conventions, such as the Robots Exclusion Protocol (robots.txt) and spacing out requests, try to prevent "rapid fire" requests or a "robot attack" (see the politeness sketch after this list).
-
Preventing robot loops when links are circular (see the loop-prevention sketch below).
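On the update problem: one standard, if partial, remedy is a conditional HTTP request, so a revisit only transfers a page when it has changed since the last crawl. The If-Modified-Since mechanism is standard HTTP; the revisit bookkeeping around it is an illustrative assumption:

    from urllib.error import HTTPError
    from urllib.request import Request, urlopen

    def fetch_if_changed(url, last_seen):
        """Re-fetch url only if it changed since last_seen.

        last_seen is the HTTP date string saved from the previous visit's
        Last-Modified header; returns the new body, or None if unchanged.
        """
        request = Request(url, headers={"If-Modified-Since": last_seen})
        try:
            return urlopen(request).read()
        except HTTPError as err:
            if err.code == 304:     # 304 Not Modified: index entry still valid
                return None
            raise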
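On resource use: a polite robot checks the site's robots.txt exclusion file and spaces out its requests to each host. A sketch using Python's standard robotparser; the user-agent name and the one-second delay are arbitrary illustrative values:

    import time
    from urllib.parse import urlsplit
    from urllib.robotparser import RobotFileParser

    AGENT = "example-robot"     # hypothetical user-agent name
    last_visit = {}             # host -> time of our previous request

    def polite_to_fetch(url, min_delay=1.0):
        """True if robots.txt allows url and we are not rapid-firing the host."""
        parts = urlsplit(url)
        # a real robot would cache the parsed rules per host
        rules = RobotFileParser(f"{parts.scheme}://{parts.netloc}/robots.txt")
        rules.read()
        if not rules.can_fetch(AGENT, url):
            return False
        elapsed = time.time() - last_visit.get(parts.netloc, 0.0)
        if elapsed < min_delay:
            time.sleep(min_delay - elapsed)   # wait instead of "rapid fire"
        last_visit[parts.netloc] = time.time()
        return True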
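On circular links: the usual defence is to normalize each URL and keep a set of pages already visited, so a cycle of links is followed at most once. The normalization rules shown are common choices, not a complete canonicalization:

    from urllib.parse import urlsplit, urlunsplit

    visited = set()

    def normalize(url):
        """Reduce trivially different URLs to one canonical form."""
        parts = urlsplit(url)
        return urlunsplit((
            parts.scheme.lower(),
            parts.netloc.lower(),
            parts.path or "/",
            parts.query,
            "",                  # drop the #fragment: same document
        ))

    def should_visit(url):
        """True the first time a (normalized) URL is seen, False on loops."""
        key = normalize(url)
        if key in visited:
            return False
        visited.add(key)
        return True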