NAME

searchindex - Simple website search engine index generator


SYNOPSIS

searchindex [ optional-flags ] vserver-name [ docroot output-file ] [ optional-regex ]


DESCRIPTION

Zeus Technology provides the searchindex tool to create the index files used by the Zeus search module. The search module is a very simple search engine that lets you find information within the plain-text and HTML documents on your websites.

The searchindex program does not fetch pages by making requests to the site; instead, it indexes the files on disk. As such, it is not suitable for indexing dynamically generated sites.


TYPICAL USAGE

The searchindex program is typically invoked in the following way:

searchindex vserver-name

searchindex then loads the virtual server from the runningsites directory ($ZEUSHOME/web/runningsites) and scans it to find all file extensions which match the mime-types text/plain, text/html and text/x-server-parsed-html. It then scans the docroot defined in the virtual server, produces an index, and stores it in the search module's specified location.
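The extension-selection step described above can be sketched roughly as follows. This is only an illustration in Python, not the tool's actual implementation: the names `INDEXABLE_TYPES`, `indexable_extensions` and `example_map` are hypothetical, and the way a real virtual server stores its mime-type mappings is not shown here.

```python
# Hypothetical sketch: pick the file extensions whose mime-type is indexable.
# The three mime-types are the ones named in this manual page.
INDEXABLE_TYPES = {"text/plain", "text/html", "text/x-server-parsed-html"}


def indexable_extensions(mime_map):
    """Return, in sorted order, the extensions that map to an indexable type."""
    return sorted(ext for ext, mtype in mime_map.items()
                  if mtype in INDEXABLE_TYPES)


# Assumed example mapping; a real virtual server defines its own.
example_map = {
    "txt": "text/plain",
    "html": "text/html",
    "shtml": "text/x-server-parsed-html",
    "png": "image/png",
}

print(indexable_extensions(example_map))
```

In this sketch the png entry is skipped because image/png is not one of the three indexable types.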


OPTIONAL FLAGS

--plain-type

USAGE: --plain-type=foo/bar

Using this flag, you can add another mime-type to the list of types which searchindex considers to be synonymous with text/plain.

--plain-ext

USAGE: --plain-ext=foo

Using this flag, you can add another extension to the list of extensions which searchindex considers to be synonymous with text/plain.

--html-type

USAGE: --html-type=foo/bar

Using this flag, you can add another mime-type to the list of types which searchindex considers to be synonymous with text/html.

--html-ext

USAGE: --html-ext=foo

Using this flag, you can add another extension to the list of extensions which searchindex considers to be synonymous with text/html.
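As a rough illustration of how the four flags combine on one command line, the Python sketch below only assembles an argument list; it does not run searchindex. The helper name `flag_args`, the values text/x-log, log and xhtml, and the virtual server name mysite are all made up for the example. Whether a flag may be repeated is not stated here, so the sketch passes each flag at most once.

```python
def flag_args(plain_type=None, plain_ext=None, html_type=None, html_ext=None):
    """Build the optional-flag part of a searchindex command line.

    Hypothetical helper: each argument is passed at most once.
    """
    args = []
    if plain_type is not None:
        args.append(f"--plain-type={plain_type}")
    if plain_ext is not None:
        args.append(f"--plain-ext={plain_ext}")
    if html_type is not None:
        args.append(f"--html-type={html_type}")
    if html_ext is not None:
        args.append(f"--html-ext={html_ext}")
    return args


# Flags come before the virtual server name, as in the SYNOPSIS.
command = (["searchindex"]
           + flag_args(plain_type="text/x-log", plain_ext="log", html_ext="xhtml")
           + ["mysite"])
print(" ".join(command))
```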


vserver-name

The searchindex program needs to be told which virtual server to index.

NOTE: this virtual server must be running for the searchindex program to find it.


docroot

The searchindex program can optionally take the document root to scan.

output-file

If you specify a document root, you must also specify the file to write the index to.
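The pairing rule for docroot and output-file can be sketched in Python as below. The helper `positional_args` and the example paths are hypothetical; the sketch only builds the argument list in SYNOPSIS order and does not invoke searchindex.

```python
def positional_args(vserver_name, docroot=None, output_file=None):
    """Return the positional arguments in SYNOPSIS order.

    Hypothetical helper: docroot and output_file must be given together.
    """
    if (docroot is None) != (output_file is None):
        raise ValueError("docroot and output-file must be given together")
    args = [vserver_name]
    if docroot is not None:
        args += [docroot, output_file]
    return args


# Placeholder name and paths, chosen only for illustration.
print(positional_args("mysite"))
print(positional_args("mysite", "/var/www/mysite", "/tmp/mysite.index"))
```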


optional-regex

You may optionally give the searchindex program a regular expression denoting file names to ignore when scanning the document root for files to index.

For example: \.fr\.html$

The above regex would tell searchindex not to index files which end with '.fr.html'.
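To see which names such a pattern would exclude, here is a small Python sketch. It uses Python's re module, which is not the same regex implementation as the one described in regex(7), although this simple pattern should behave the same way in both. The file names and the helper `files_to_index` are invented for the example, and whether searchindex matches the pattern against the bare file name or the full path is not stated here.

```python
import re

# The example pattern from this manual page.
IGNORE = re.compile(r"\.fr\.html$")


def files_to_index(names):
    """Return the names that do NOT match the ignore pattern."""
    return [name for name in names if not IGNORE.search(name)]


# Invented file names for illustration.
names = ["index.html", "index.fr.html", "about.fr.html.bak", "notes.txt"]
print(files_to_index(names))
```

Only index.fr.html is dropped; about.fr.html.bak is kept because the $ anchors the match to the end of the name.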


SEE ALSO

regex(7)


COPYRIGHT

Copyright (C) 2000-2001 Zeus Technology Limited. All rights reserved.