Global HTML version of Foils prepared 22 January 1996

Foil 27 The Indexing Subsystem

From Web Technology Overview CPS616 Basic Information Track for Computational Science -- Winter-Spring Semester 96. by Geoffrey Fox

How the text of web documents/files is internally stored and indexed in the text database to support searching efficiently and effectively
Common approach - 'inverted index'
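
As an illustration of the inverted index idea, here is a minimal Python sketch. It maps each term to the set of documents that contain it, so a query is answered by intersecting posting sets instead of scanning every document. The names `tokenize`, `InvertedIndex`, `add_document` and `search` are hypothetical, chosen for this example only; the foil does not describe any particular implementation.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Toy in-memory inverted index (illustrative assumption, not a real engine)."""

    def __init__(self):
        # term -> set of ids of the documents containing that term
        self.postings = defaultdict(set)

    def add_document(self, doc_id, text):
        """Index one document; can be called at any time to add more."""
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return the sorted ids of documents containing every query term."""
        terms = tokenize(query)
        if not terms:
            return []
        result = None
        for term in terms:
            docs = self.postings.get(term, set())
            result = set(docs) if result is None else result & docs
        return sorted(result)


index = InvertedIndex()
index.add_document(1, "Web robots gather documents")
index.add_document(2, "The search engine indexes web documents")
print(index.search("web documents"))
print(index.search("search engine"))
```

Because `add_document` can be called one document at a time, the same structure can in principle be filled in a single batch pass or updated as new pages arrive.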
Major issues - direct impact on database size and search performance
  • compression scheme to store text and its indexes - minimize space consumption
  • index scheme, tightly coupled with the search engine - speed up search
  • indexing modes - real-time, batch, or incremental indexing
  • high performance web robot - minimize impact on network traffic and database loading
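
The compression issue above can be made concrete with a small Python sketch. One widely used technique for index compression (an assumption here, since the foil does not name a scheme) is to store the gaps between sorted document ids rather than the ids themselves, and to encode each gap in a variable number of bytes. The function names `encode_postings` and `decode_postings` are hypothetical, and the sketch assumes ids are non-negative integers in increasing order.

```python
def encode_postings(doc_ids):
    """Delta-encode sorted doc ids, then variable-byte encode each gap."""
    out = bytearray()
    previous = 0
    for doc_id in doc_ids:
        gap = doc_id - previous
        previous = doc_id
        # Emit 7 bits at a time, low-order bits first.
        # A set high bit means more bytes follow for this gap.
        while gap >= 0x80:
            out.append((gap & 0x7F) | 0x80)
            gap >>= 7
        out.append(gap)
    return bytes(out)


def decode_postings(data):
    """Invert encode_postings: rebuild the sorted list of doc ids."""
    doc_ids = []
    current = 0
    gap = 0
    shift = 0
    for byte in data:
        gap |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7  # continuation byte: keep accumulating this gap
        else:
            current += gap  # last byte of the gap: recover the absolute id
            doc_ids.append(current)
            gap = 0
            shift = 0
    return doc_ids


ids = [3, 7, 130, 100000]
packed = encode_postings(ids)
print(len(packed), decode_postings(packed) == ids)
```

Small gaps fit in a single byte, so posting lists for frequent terms, whose ids lie close together, shrink the most. The price is decoding work at search time, which is one reason the compression scheme and the search engine need to be designed together.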


Northeast Parallel Architectures Center, Syracuse University, npac@npac.syr.edu


Page produced by wwwfoil on Tue Feb 18 1997