The Report Panel of GeneXproTools 4.0 keeps a summary of all the relevant information of a run,
and is organized into the following headings:
Comments
This heading is usually used to give a small description of the problem, the objectives, the kind of data used, their source, etc. Both the comments and the author's name are introduced by choosing Properties
on the
File Menu.
Statistics – Training
Under this heading are listed the values of all statistical functions used by
GeneXproTools 4.0 to evaluate the accuracy of your model on the training
set: mean squared error,
root mean squared error,
mean absolute error,
relative squared error,
root relative squared
error, relative absolute
error, correlation
coefficient, and R-square.
Also listed under this heading are more general run parameters such as best fitness, maximum fitness, number of calculation errors (they should always be zero, unless the
seed was modified or the training set was changed after the model had been created), and number of outliers (when the fitness function is one of the following:
Relative with SR,
Relative/Hits, Absolute with
SR, and Absolute/Hits).
A note of warning, though: All the parameters listed here refer to the last evaluated model in the
Predictions
Panel, so make sure that they do refer to the model you are interested in (usually the active model, highlighted in yellow in the History heading).
Statistics – Testing
Under this heading are listed the values of all statistical functions used by
GeneXproTools 4.0 to evaluate the accuracy of your model on the testing set, that is, the efficiency of your model at predicting unknown
behavior: mean squared error,
root mean squared error,
mean absolute error,
relative squared error,
root relative squared
error, relative absolute
error, correlation
coefficient, and R-square.
Also listed under this heading are more general run parameters such as best fitness, maximum fitness, number of calculation errors, and number of outliers (when the fitness function is one of the following:
Relative with SR,
Relative/Hits, Absolute with
SR, and Absolute/Hits).
Note again that all the parameters listed here refer to the last evaluated model in the
Predictions Panel, so make sure that they do refer to the model you are interested in (usually the active model, highlighted in yellow in the History heading).
Data
Under this heading are listed the number of independent variables
after the transformation of the time series, the number of samples in the training
data, and the number of records in the time series.
Program Structure
The program size, the number of literals, and a detailed list of
all used variables and respective weights are shown here. This
information refers to the active model highlighted in yellow in the
History heading.
Function Set
Under this heading are listed all the functions used to create your model, together with the symbols
used to represent them in Karva notation, their weights and
arities.
Note that the highest value of arity in your function set determines the structure of each gene, making the length of the
tail t grow according to the equation:
t = h (n-1) + 1
where h is the length of the head and
n is maximum arity.
Settings
Under this heading there are six entries: General Settings, Time
Series Settings, Complexity Increase, Fitness Function, Genetic
Operators, and Numerical Constants. Under General Settings are listed the number of chromosomes, number of genes, head size,
tail size, Dc size, gene size, and linking function. Under Time Series
Settings are listed the embedding dimension, delay time, prediction
mode, and number of testing predictions. Under
Complexity Increase are listed the number of Generations Without
Change, the Number of Tries, and Max. Complexity. The name of the fitness function and all the relevant parameters for its design are given below
the entry Fitness Function.
Under Genetic Operators are listed the rates of all the genetic operators used to create genetic variation. And, finally,
all the parameters relevant to the manipulation of random numerical
constants are listed under Numerical Constants.
Data Sources
Here is registered the path to the original time series used to
create your model so that its origin could be easily identified.
File Path
Under this heading the path to the gep file that was used to save your model is given so that it can be easily traced for future reference.
History
Under this heading are listed all the best-of-generation models, including obviously the best-of-run model, discovered during the evolutionary process. Note, however, that the notation "best-of-run" and "best-of-generation" refers to the performance of these models on the training set, which could be slightly different on the testing set. This is why
GeneXproTools 4.0 allows you to analyze the performance of all these models on the testing set and, if appropriate, choose the best model based on these results.
GeneXproTools 4.0 gives you the possibility of highlighting in yellow one of these models (the active model) and we advise you to use this feature to highlight the best one. Note also that, for a quick reference,
not only the program structure (program size, number of literals,
and list of used variables and respective weights) but also the fitness and the R-square of all these models both on the training and testing sets are also
given, although, for the testing set, they must be evaluated in the History
Panel.
For brevity sake, all the best-of-generation models are represented in the compact Karva notation, but you can translate any one of them into a more canonical representation by activating any
of these models in the History Panel and then translate it into the language of your choice in the
Model Panel.
|