Gamma Test Distribution
+=======================+

These programs implement the Gamma Test and Meta Back-Prop and are based on
work by Adalbjorn Stefansson, Nenad Koncar and Professor Antonia Jones
(copyright 1997).

Please send all comments or bug reports to "Nick.Green@cs.cf.ac.uk" or
"S.Margetts@cs.cf.ac.uk".

The files in these distributions are listed below (note that only the
relevant files are included in each distribution):

  gamma         UNIX (SUN4) version
  gamma32.exe   PC Win95 version
  gammados.exe  PC DOS version
  gamma.src     An example script file explaining the various options
  mbp           UNIX (SUN4) Meta Back-Prop
  mbp32.exe     PC Win95 version
  mbpdos.exe    PC DOS version
  net           UNIX (SUN4) network interpreter
  net32.exe     PC Win95 version
  netdos.exe    PC DOS version
  hen100.asc    Sample test data used by "gamma.src" and "mbp.src"
  README.TXT    This file

Please note that the DOS versions (e.g. "gammados.exe") are limited to 640K,
and so may have problems with large input files. If at all possible, please
use the Win95 versions.

All programs work from the command line. This means that to run them from
Win95, you must first open a DOS window. Running any program without a
script file will display the version number, copyright message and a summary
of all the options and their default values. As an example, the following
command line will run the example script for the Gamma Test:

  "gamma32 gamma.src"


Gamma Test
+==========+

The Gamma Test is designed to give a data-derived estimate of the Mean
Squared Error of ANY smooth data model from input (vector) to output
(scalar). This is the part of the variance in the output value which cannot
be accounted for by a smooth model from input to output. Given sufficient
data points, the estimate returned by the Gamma Test can be used, for
example, to determine the best performance one can hope to get from a
neural network (which is not overtrained) without undergoing numerous
attempts at training.

Let the data be represented by a set of input-output pairs, say (X, y),
where (X) may be a vector of several inputs. The Gamma Test measures the
MSE of the noise (R) of the input-output mapping, by assuming that

  y = f(X) + R

where (f) is some unknown smooth function of (X).

As a rule of thumb, low Gamma values (the intercept of the regression plot)
are "good", as they indicate that the data has a low level of noise.
Negative Gamma values are better still, and can be taken to indicate very
small levels of noise. The slope of the regression plot gives some guide to
the complexity of (f), the input-output mapping.

The software can also be used to find the set of inputs which best
describes the output - a process sometimes called feature selection. This
is done by searching through the space of possible input "masks" or
"embeddings". The software implements several methods for finding the best
embedding; see the example script file "gamma.src" for examples and
descriptions. This file also lists the various options and gives
explanations of their usage.
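For readers who would like to see the idea behind the statistic, the sketch
below is a minimal, illustrative Python version of the nearest-neighbour
delta/gamma regression on which the Gamma Test is based. It is NOT the code
used by the distributed programs, and the function and variable names are
invented for this example. For each p up to p_max it computes the mean
squared input distance delta(p) and half the mean squared output difference
gamma(p) over p-th nearest neighbours, then regresses gamma on delta; the
intercept is the Gamma statistic (the noise estimate) and the slope relates
to the complexity of (f).

  import numpy as np

  def gamma_test(X, y, p_max=10):
      """Illustrative Gamma Test sketch: returns (intercept, slope) of the
      delta/gamma regression.  The intercept estimates the variance of R in
      y = f(X) + R; the slope gives a rough guide to the complexity of f."""
      X = np.asarray(X, dtype=float)
      y = np.asarray(y, dtype=float)
      if X.ndim == 1:                     # allow a single scalar input
          X = X[:, None]
      M = len(y)

      # Squared Euclidean distances between all pairs of input vectors.
      d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
      np.fill_diagonal(d2, np.inf)        # a point is not its own neighbour
      order = np.argsort(d2, axis=1)      # order[i, p-1] = p-th nearest neighbour of X[i]

      delta = np.empty(p_max)             # mean squared input distance to p-th neighbour
      gamma = np.empty(p_max)             # half mean squared output difference
      for p in range(1, p_max + 1):
          nn = order[:, p - 1]
          delta[p - 1] = np.mean(np.sum((X[nn] - X) ** 2, axis=1))
          gamma[p - 1] = 0.5 * np.mean((y[nn] - y) ** 2)

      # Least-squares line gamma = A * delta + Gamma; Gamma is the noise estimate.
      A, Gamma = np.polyfit(delta, gamma, 1)
      return Gamma, A

  # Example: a noisy sine.  With enough points the intercept should come out
  # near the noise variance (0.01 here).
  rng = np.random.default_rng(0)
  X = rng.uniform(0.0, 2.0 * np.pi, size=(500, 1))
  y = np.sin(X[:, 0]) + rng.normal(0.0, 0.1, size=500)
  print(gamma_test(X, y))

The distributed programs remain the reference implementation; the sketch is
only meant to show what the reported intercept and slope are measuring.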
Meta Back-Prop
+==============+

This program implements a neural-network training scheme which uses the
information from the Gamma Test to guide both the training process and the
architecture of the neural network. It works in much the same manner as the
Gamma Test program, except that the output is a trained neural network.

The neural network is comprised of a number of "hills" (so called because
each is intended to form a hill-shaped lump in the input-output mapping).
The number of hills chosen is guided by the slope of the regression line as
calculated by the Gamma Test. Each hill is a collection of hidden nodes and
a single output node, where each layer has a bias neuron attached.

Most of the options are similar to those used by the Gamma Test. All are
described in more detail in the file "mbp.src".

The program outputs the trained network to a file (specified in the script
file). This network file can then be tested or used with the network
interpreter program "net". This program also works from the command line,
taking inputs from the standard input and writing the result to the
standard output. Example:

  "net test.net"

This will load the network described by "test.net" and wait for inputs.
Note that the network description includes the mask it was trained with;
any inputs not selected by the mask must still be entered. The input/output
redirection symbols "<" and ">" can be useful if using the network
interpreter to test many points.
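For example (the file names "points.txt" and "results.txt" are only
illustrative), a whole file of test points can be pushed through a trained
network with a single command such as:

  "net32 test.net < points.txt > results.txt"

This reads the input points from "points.txt" instead of the keyboard and
writes the corresponding network outputs to "results.txt".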