Introduction to the WebWisdom Search Engine

The search engine accesses files stored on the NPAC server. It is implemented with an Oracle database that contains an indexed list of all text documents on the server. Different search pages correspond to different selections of files and offer different options. In the simplest interfaces, the engine reads the text field and selects all documents that contain all of the specified (blank-separated) words. The search is case-insensitive. Note that by default the blank-separated keywords are NOT required to be adjacent in the document, so a search for Java Virtual Machine will find all documents that contain the words Java, Virtual and Machine anywhere in them, in any case. The text below describes how to use the options offered by some of the search interfaces.

Several file sets are offered by the different interfaces.


Query Keywords

The search string can be a sequence of keywords separated by one or more blank spaces. All keywords are treated as case-insensitive. The All words/Any word operator determines how the keywords are combined into the search condition of your query.

For example, if you enter parallel systems and choose All words (the default), the query will find all documents containing both parallel and systems, provided that the matching option is set to Match (the default).
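
The All words/Any word behaviour described above can be illustrated with a minimal Python sketch. It is not the engine's actual implementation (the engine uses an Oracle database index); the function names tokenize and matches are hypothetical and chosen only for this example.

```python
import re

def tokenize(text):
    """Return the set of lower-cased words found in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def matches(document, query, mode="all"):
    """Return True if the document satisfies the query.

    mode="all": every keyword must occur somewhere in the document.
    mode="any": at least one keyword must occur.
    Keywords are blank-separated, case-insensitive, and need not be adjacent.
    """
    keywords = query.lower().split()  # split() handles one or more blanks
    if not keywords:
        return False
    words = tokenize(document)
    check = all if mode == "all" else any
    return check(keyword in words for keyword in keywords)

if __name__ == "__main__":
    doc = "The Java language runs on a virtual MACHINE."
    print(matches(doc, "Java   Virtual Machine"))
    print(matches("Parallel computing", "parallel systems", mode="any"))
```

With All words, a document must contain every keyword; with Any word, one keyword is enough.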

Options

Word search options allow the keyword(s) you entered to be combined and extended.

Match/Not match

This option selects the documents that either match (Match) or do not match (Not match) your keywords (and their extensions, if any).
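
As a rough illustration, Match and Not match can be thought of as the same test with the result inverted. The sketch below assumes a small in-memory dictionary of documents and a hypothetical function select; the real engine evaluates the condition inside the database.

```python
def select(documents, keyword, match=True):
    """Return the names of documents that contain (match=True) or
    do not contain (match=False) the keyword, ignoring case."""
    keyword = keyword.lower()
    selected = []
    for name, text in documents.items():
        found = keyword in text.lower().split()
        if found == match:
            selected.append(name)
    return selected

if __name__ == "__main__":
    docs = {"a.txt": "Parallel systems", "b.txt": "Serial code"}
    print(select(docs, "parallel"))               # Match
    print(select(docs, "parallel", match=False))  # Not match
```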

Keyword Typed

The original keyword(s) you entered in the text field are used directly in the search. They are not extended.

Forward Stemming

This option matches all strings that have the keyword at their beginning. E.g., implement will be expanded to implement, implementation, implemented.
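
Forward stemming as described here amounts to a prefix match. The following sketch assumes a small word list standing in for the indexed vocabulary; the name forward_stem is hypothetical.

```python
def forward_stem(keyword, vocabulary):
    """Return all vocabulary words that begin with the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return sorted(w for w in vocabulary if w.lower().startswith(keyword))

if __name__ == "__main__":
    vocabulary = ["implement", "implementation", "implemented", "element", "improvement"]
    print(forward_stem("implement", vocabulary))
```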

Reverse Stemming

This option matches all strings that have the keyword at their end. E.g., ement will be expanded to element, implement, improvement, etc.
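
Reverse stemming is the mirror image: a suffix match. Again this is only a sketch over an assumed word list, with a hypothetical function name reverse_stem.

```python
def reverse_stem(keyword, vocabulary):
    """Return all vocabulary words that end with the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return sorted(w for w in vocabulary if w.lower().endswith(keyword))

if __name__ == "__main__":
    vocabulary = ["implement", "implementation", "implemented", "element", "improvement"]
    print(reverse_stem("ement", vocabulary))
```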

Fuzzy Match

This option is designed to catch typical data entry errors such as typos and phonetic misspellings. The search is then performed on the 6 closest matching words. E.g., Smith will be expanded to Smit, Smiths, Smooth, etc.
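
The matching algorithm used by the engine is not documented here. As an approximation only, the idea of expanding a keyword to its 6 closest words can be sketched with Python's standard difflib module, which ranks candidates by string similarity. The function name fuzzy_expand, the word list, and the 0.6 similarity cutoff are assumptions made for this example.

```python
import difflib

def fuzzy_expand(keyword, vocabulary, limit=6):
    """Return up to `limit` vocabulary words most similar to the keyword.

    Similarity is difflib's ratio, not the engine's own measure.
    """
    lowered = [w.lower() for w in vocabulary]
    return difflib.get_close_matches(keyword.lower(), lowered, n=limit, cutoff=0.6)

if __name__ == "__main__":
    vocabulary = ["Smit", "Smiths", "Smooth", "Smithy", "Smyth", "Java", "Machine"]
    print(fuzzy_expand("Smith", vocabulary))
```

The search would then be run on the returned words instead of on the keyword alone.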

Soundex Expansion

This option attempts to match words whose correct spelling you do not know to similar-sounding words that have been entered into the database.
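
To show how sound-based matching can work, here is a sketch of the classic American Soundex code, which maps a word to a letter followed by three digits so that similar-sounding words share a code. It is assumed, not confirmed, that the engine's Soundex expansion follows the same rules; the function name soundex is chosen for this example.

```python
# Consonant groups of the classic Soundex scheme.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit

def soundex(word):
    """Return the 4-character Soundex code of a word ('' if it has no letters)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = CODES.get(ch, "")
        if digit:
            if digit != prev:      # skip repeats of the same code
                code += digit
            prev = digit
        elif ch not in "hw":       # vowels separate repeats; h and w do not
            prev = ""
    return (code + "000")[:4]      # pad or truncate to 4 characters

if __name__ == "__main__":
    for name in ("Smith", "Smyth", "Robert", "Rupert"):
        print(name, soundex(name))
```

Two words are treated as sounding alike when their codes are equal, as with Smith and Smyth.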

Hits/Page

The number of documents shown per page can be set using the Hits/Page option. The total number of documents that matched the query is also displayed. Next and Previous are provided to let you navigate through the pages of query results.
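
The paging arithmetic behind Hits/Page can be sketched as follows. The function name paginate and its return shape are hypothetical; the sketch only shows how a page of results, the total count, and the availability of Previous and Next follow from the page number and the Hits/Page setting.

```python
def paginate(hits, hits_per_page, page):
    """Return (items, total, has_previous, has_next) for a 1-based page number."""
    if hits_per_page < 1 or page < 1:
        raise ValueError("hits_per_page and page must be at least 1")
    total = len(hits)
    start = (page - 1) * hits_per_page
    items = hits[start:start + hits_per_page]
    has_previous = page > 1
    has_next = start + hits_per_page < total
    return items, total, has_previous, has_next

if __name__ == "__main__":
    hits = ["doc%d" % i for i in range(1, 26)]  # 25 matching documents
    print(paginate(hits, 10, 3))
```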


This system was developed by Gang Cheng, Tom Pulikal and Piotr Sokilowski at NPAC. Please send comments to gcheng@npac.syr.edu or tapulika@npac.syr.edu.