HTML version of Foils prepared December 9 1996

Foil 19 Web Search Engines

From Introduction to the World Wide Web and Web Technologies presentation: Introduction to the WWW and Web Technologies, Fall Semester 96, by Nancy McCracken

Search engines let users look up text documents stored on the Web, usually by one or more keywords that appear in the document.
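As a rough illustration of keyword lookup, the sketch below builds a small inverted index (word to set of documents) and answers queries that require every keyword to appear. The URLs, the `DOCS` data, and the function names `build_index` and `search` are assumptions made for this example; they are not taken from any particular search engine.

```python
from collections import defaultdict

# Hypothetical documents for illustration: URL -> text.
DOCS = {
    "http://example.org/a": "Web robots gather documents",
    "http://example.org/b": "Search engines index documents by keyword",
    "http://example.org/c": "Robots follow links",
}

def build_index(docs):
    """Map each lowercased word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in docs.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, *keywords):
    """Return the sorted URLs that contain every keyword (a simple AND query)."""
    sets = [index.get(k.lower(), set()) for k in keywords]
    return sorted(set.intersection(*sets)) if sets else []

index = build_index(DOCS)
print(search(index, "robots", "documents"))
```

A real engine would also rank the matches; this sketch only returns them in sorted order.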
Information gathering and filtering
  • This is done by web "robots": programs which automatically connect to all servers and search some number of documents, usually up to a certain "depth" of links, such as 4.
  • For each document, the robot returns keywords and other information to the search index. For example, Lycos returns: the title, any headings and subheadings, the 100 most "weighty" words, the first 20 lines, the size in bytes, and the number of words.
  • Problems with information gathering:
    • Information update: the index goes out of date when documents change after the robot's visit.
    • Information resulting from CGI scripts is not available.
    • Resource intensive: robots repeatedly connect to a site; informal protocols try to prevent "rapid fire" or "robot attack".
    • Preventing robot loops when links are circular.
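
The gathering steps above can be sketched as a depth-limited, breadth-first traversal. This is a minimal sketch under stated assumptions, not how any particular robot is implemented: the `PAGES` dictionary is a hypothetical in-memory stand-in for fetching pages over HTTP, the record fields in `summarize` are only loosely modeled on the size and word-count items mentioned for Lycos, and `delay_seconds` is an assumed parameter standing in for the informal rule against rapid-fire requests.

```python
import time
from collections import deque

# Hypothetical in-memory "web": URL -> (text, outgoing links). A real robot
# would fetch pages over HTTP; this stand-in keeps the sketch runnable offline.
PAGES = {
    "a": ("Alpha page about robots", ["b", "c"]),
    "b": ("Beta page about indexes", ["a", "d"]),  # links back to "a": a cycle
    "c": ("Gamma page", []),
    "d": ("Delta page", ["e"]),
    "e": ("Epsilon page", []),
}

def summarize(url, text):
    """Build a small record for the search index (fields are illustrative)."""
    return {
        "url": url,
        "size_bytes": len(text.encode("utf-8")),
        "num_words": len(text.split()),
    }

def crawl(start, max_depth, pages=PAGES, delay_seconds=0.0):
    """Visit pages reachable from start by following at most max_depth links."""
    visited = {start}            # URLs already queued; this breaks circular links
    queue = deque([(start, 0)])
    records = []
    while queue:
        url, depth = queue.popleft()
        time.sleep(delay_seconds)    # pause between requests to avoid rapid fire
        text, links = pages[url]
        records.append(summarize(url, text))
        if depth == max_depth:
            continue                 # do not follow links past the depth limit
        for link in links:
            if link in pages and link not in visited:
                visited.add(link)
                queue.append((link, depth + 1))
    return records

for record in crawl("a", max_depth=2):
    print(record)
```

The `visited` set is what prevents a loop when "b" links back to "a"; without it the queue would never empty. Raising `delay_seconds` trades crawl speed for lighter load on each site.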


Northeast Parallel Architectures Center, Syracuse University, npac@npac.syr.edu


Page produced by wwwfoil on Fri Dec 6 1996