Search module

Document root Search Engine Support

Overview
The Search module provides simple document indexing and searching facilities for your web site. It is not intended as a replacement for dedicated indexing software, but it does provide fast and efficient search capabilities. The Search module generates an index of all the text documents in your document root, then uses the index to fulfill search requests.

Configuration
To configure the Search module you need to specify a URL for accepting searches and a location for the index file. The URL will be used to submit queries to the Search module. The index file can be stored anywhere that is convenient, provided it can be read by the user ID under which the web server runs.

After you enable the Search module you will need to generate an index file for your document root. This is currently achieved by running a command line program. You will need to regenerate the index file each time you modify your document root. The command line program can be added to a UNIX crontab to automate this process. The index program can be run as follows:

$ZEUSHOME/web/bin/searchindex virtual-server-name
Where virtual-server-name is the name of the Virtual Server to index, and $ZEUSHOME is the installation directory of Zeus Server.
NOTE: The searchindex program must be run after the Virtual Server has been started.
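As a rough illustration of the crontab automation mentioned above, the following sketch wraps the indexer in a small script. Only searchindex and $ZEUSHOME come from this documentation; the default installation path, the script location and the function name reindex_site are assumptions made for the example.

```shell
#!/bin/sh
# Sketch of a cron wrapper around searchindex.
# Assumptions: the default install path below and the script location in the
# crontab comment are hypothetical; adjust them to your installation.
ZEUSHOME="${ZEUSHOME:-/usr/local/zeus}"
SEARCHINDEX="${SEARCHINDEX:-$ZEUSHOME/web/bin/searchindex}"

reindex_site() {
    # $1 is the name of the Virtual Server to index.
    if [ -z "$1" ]; then
        echo "usage: reindex_site virtual-server-name" >&2
        return 2
    fi
    "$SEARCHINDEX" "$1"
}

# Run only when a Virtual Server name is given on the command line.
if [ "$#" -gt 0 ]; then
    reindex_site "$1"
fi

# Example crontab entry, rebuilding the index for "mysite" nightly at 03:00:
# 0 3 * * * /usr/local/bin/reindex-search.sh mysite
```

Because cron runs with a minimal environment, the script sets ZEUSHOME itself rather than relying on the login shell. The Virtual Server still has to be running when the job fires.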
Advanced usage
   searchindex [<Optional switches>] <Virtual Server name> [regex]
   searchindex [<Optional switches>] <Virtual Server name> <docroot> <outputfile> [regex]
The latter form allows for multiple search indexes per web site.
The regex, if supplied, is an extended regular expression specifying which filenames to skip. For example:
   searchindex mysite "(/[Ff]rench/|/private/|\.fhtml$)"
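To preview which filenames an expression like the one above would exclude, it can be tried against sample paths with grep -E, which also uses extended regular expressions. This is only an approximation: it assumes searchindex matches the expression against a path in the same way grep does, and the sample paths and the helper name would_skip are made up for the illustration.

```shell
#!/bin/sh
# The skip expression from the example above.
skip_regex='(/[Ff]rench/|/private/|\.fhtml$)'

would_skip() {
    # Succeeds when the path in $1 matches the skip expression.
    printf '%s\n' "$1" | grep -Eq "$skip_regex"
}

# Hypothetical sample paths.
for f in /index.html /French/index.html /docs/private/notes.txt \
         /forms/order.fhtml /fhtml/intro.html; do
    if would_skip "$f"; then
        echo "skip  $f"
    else
        echo "index $f"
    fi
done
```

Here /fhtml/intro.html is not skipped, because the last alternative only matches names that end in .fhtml.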
If <Virtual Server name> is "-", the configuration file for the Virtual Server is read from stdin instead of being looked up in the $ZEUSHOME/web/runningsites/ directory.

This allows searchindex to be run on an arbitrary machine (instead of the web server). It also allows the searchindex program to index a non-running web site. The master copies of the config files for running web sites live on the adminserver machine in the $ZEUSHOME/webadmin/conf/runningsites/ directory; the config files for non-running web sites live in $ZEUSHOME/webadmin/conf/sites/ on the same machine.
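A minimal sketch of the stdin form follows. It assumes the config file for a site is stored under the site's name in the directories listed above and that searchindex is installed on the machine where the command is run; the helper name index_from_config, the default installation path and the site name "mysite" are hypothetical.

```shell
#!/bin/sh
# Assumed install path; adjust to your installation.
ZEUSHOME="${ZEUSHOME:-/usr/local/zeus}"
SEARCHINDEX="${SEARCHINDEX:-$ZEUSHOME/web/bin/searchindex}"

index_from_config() {
    # $1 is the path to a Virtual Server config file.
    # Passing "-" as the Virtual Server name makes searchindex read
    # the configuration from stdin, which is redirected from that file.
    "$SEARCHINDEX" - < "$1"
}

# Example for a non-running site (the file name "mysite" is hypothetical):
# index_from_config "$ZEUSHOME/webadmin/conf/sites/mysite"
```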

The optional switches are detailed below:

--plain-type
USAGE: --plain-type=<foo/bar>
This switch adds a new mime-type to the list of types which the searchindex program considers to be synonymous with text/plain.
--plain-ext
USAGE: --plain-ext=baz
This switch adds a new file extension to the list of extensions which the searchindex program considers to be of the type text/plain.
--html-type
USAGE: --html-type=<foo/bar>
This switch adds a new mime-type to the list of types which the searchindex program considers to be synonymous with text/html. This means that searchindex will strip HTML-style tags from the text.
--html-ext
USAGE: --html-ext=baz
This switch adds a new file extension to the list of extensions which the searchindex program considers to be of the type text/html. This means that searchindex will strip HTML-style tags from the text.
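The switches above might be combined in a single invocation as sketched below. The extension and MIME type values are purely illustrative, the function name is hypothetical, and the sketch assumes that the angle brackets in the USAGE lines are placeholders rather than literal characters.

```shell
#!/bin/sh
# Assumed install path; adjust to your installation.
ZEUSHOME="${ZEUSHOME:-/usr/local/zeus}"
SEARCHINDEX="${SEARCHINDEX:-$ZEUSHOME/web/bin/searchindex}"

reindex_with_extra_types() {
    # $1 is the name of the Virtual Server to index.
    # Treat *.log files as plain text, and treat *.tmpl files and the
    # application/xhtml+xml type as HTML (tags are stripped before indexing).
    "$SEARCHINDEX" \
        --plain-ext=log \
        --html-ext=tmpl \
        --html-type=application/xhtml+xml \
        "$1"
}

# Example: reindex_with_extra_types mysite
```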

Searches can be performed by accessing the URL you specified in the configuration, or by including the following form in your own search page.

<form method="GET" action="/search"> Query: <input type="text" name="expr" size="20"> <input type="submit" value="Search"> </form>
The action attribute (set to "/search" in the above HTML) should be set to the URL you defined in the configuration page. The size value in the input field can be set to any convenient value.
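Since the form uses the GET method with a single field named expr, the same query can presumably be sent from the command line. The sketch below uses curl for this; the host name www.example.com and the helper name search_site are assumptions, and the search URL must be replaced with the one you configured.

```shell
#!/bin/sh
search_site() {
    # $1 is the full search URL, $2 is the query text.
    # -G sends the data as a GET query string; --data-urlencode
    # URL-encodes the value of the expr field, as a browser would.
    curl -sS -G --data-urlencode "expr=$2" "$1"
}

# Example (hypothetical host):
# search_site http://www.example.com/search "zeus server"
```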