GeneXproTools 4.0 implements the Relative Absolute Error (RAE) fitness function both with and without parsimony pressure. The version with parsimony pressure puts slight pressure on the size of the evolving solutions, favoring the discovery of more compact models.
The RAE fitness function of GeneXproTools 4.0 is, as expected, based on the standard relative absolute error.
The relative absolute error is very similar to the relative squared error in that it is also measured relative to a simple predictor, namely the average of the actual values. In this case, though, the error is the total absolute error instead of the total squared error. Thus, the relative absolute error takes the total absolute error and normalizes it by dividing by the total absolute error of the simple predictor.
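Mathematically, the relative absolute error Ei of an individual program i is evaluated by the equation:
$$E_i = \frac{\sum_{j=1}^{n} \left| P_{(ij)} - T_j \right|}{\sum_{j=1}^{n} \left| T_j - \bar{T} \right|}$$
where P(ij) is the value predicted by the individual program i for fitness case j (out of n fitness cases); Tj is the target value for fitness case j; and $\bar{T}$, the average of the target values, is given by the formula:
$$\bar{T} = \frac{1}{n} \sum_{j=1}^{n} T_j$$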
For a perfect fit, the numerator is equal to 0 and Ei = 0. So, the RAE index ranges from 0 to infinity, with 0 corresponding to the ideal.
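The definition above can be sketched in a few lines of Python. This is only an illustration of the standard relative absolute error, not GeneXproTools code; the function name `relative_absolute_error` and its parameters are hypothetical.

```python
from typing import Sequence


def relative_absolute_error(predicted: Sequence[float],
                            target: Sequence[float]) -> float:
    """Total absolute error divided by that of the mean predictor."""
    if len(predicted) != len(target) or len(target) == 0:
        raise ValueError("predicted and target must be non-empty and equal in length")

    # Simple predictor: the average of the target values.
    mean_target = sum(target) / len(target)

    # Numerator: total absolute error of the evolved program.
    total_abs_error = sum(abs(p - t) for p, t in zip(predicted, target))

    # Denominator: total absolute error of the simple predictor.
    baseline_abs_error = sum(abs(t - mean_target) for t in target)
    if baseline_abs_error == 0:
        # Assumption of this sketch: constant targets are rejected,
        # because the ratio is undefined in that case.
        raise ValueError("targets are constant; the RAE is undefined")

    return total_abs_error / baseline_abs_error
```

With this sketch, a program that reproduces every target scores 0, and a program that always outputs the target average scores 1, so values below 1 indicate a model better than the simple predictor.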
As it stands, Ei cannot be used directly as fitness since, for fitness proportionate selection, the value of fitness must increase with efficiency. Thus, for evaluating the fitness fi of an individual program i, the following equation is used:
$$f_i = 1000 \cdot \frac{1}{1 + E_i}$$
which ranges from 0 to 1000, with 1000 corresponding to the ideal.
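As a minimal sketch of this conversion, the snippet below assumes the mapping fi = 1000 / (1 + Ei), which is consistent with the stated range (Ei = 0 gives 1000, and fitness approaches 0 as Ei grows). The function name `rae_fitness` and the `max_fitness` parameter are hypothetical.

```python
def rae_fitness(rae: float, max_fitness: float = 1000.0) -> float:
    """Map an RAE value in [0, inf) to a fitness in (0, max_fitness]."""
    if rae < 0:
        raise ValueError("the relative absolute error cannot be negative")
    # Lower error gives higher fitness, as fitness proportionate
    # selection requires.
    return max_fitness / (1.0 + rae)
```

Under this assumption, a program that merely matches the simple predictor (Ei = 1) receives half of the maximum fitness.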
Its counterpart with parsimony pressure uses this fitness measure fi as raw fitness rfi and complements it with a parsimony term. Thus, in this case, raw maximum fitness rfmax = 1000.
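And the overall fitness fppi (that is, fitness with parsimony pressure) is evaluated by the formula:
$$fpp_i = rf_i \cdot \left( 1 + \frac{1}{5000} \cdot \frac{S_{max} - S_i}{S_{max} - S_{min}} \right)$$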
where Si is the size of the program, and Smax and Smin represent, respectively, the maximum and minimum program sizes, which are evaluated by the formulas:
Smax = G (h + t)
Smin = G
where G is the number of genes, and h and t are the head and tail sizes (note that, for simplicity, the linking function was not taken into account). Thus, when rfi = rfmax and Si = Smin (highly improbable, though, as this can only happen for very simple functions, since it means that all the sub-ETs are composed of just one node), fppi = fppmax, with fppmax evaluated by the formula:
$$fpp_{max} = 1.0002 \cdot rf_{max}$$
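The size bounds and the parsimony term can be sketched as follows. This is an illustration under stated assumptions, not the GeneXproTools implementation: the parsimony term is assumed to be scaled by 1/5000, so the smallest possible program gains at most 0.02% over its raw fitness, and the names `program_size_bounds`, `fitness_with_parsimony` and `scale` are hypothetical.

```python
from typing import Tuple


def program_size_bounds(num_genes: int, head_size: int,
                        tail_size: int) -> Tuple[int, int]:
    """Return (Smax, Smin), ignoring the linking function."""
    s_max = num_genes * (head_size + tail_size)  # Smax = G (h + t)
    s_min = num_genes                            # Smin = G, one node per sub-ET
    return s_max, s_min


def fitness_with_parsimony(raw_fitness: float, size: int,
                           s_max: int, s_min: int,
                           scale: float = 1.0 / 5000.0) -> float:
    """Raw fitness plus a small bonus that grows as the program shrinks."""
    if s_max <= s_min:
        raise ValueError("s_max must be greater than s_min")
    if not s_min <= size <= s_max:
        raise ValueError("size must lie between s_min and s_max")
    # Relative compactness: 0 for the largest program, 1 for the smallest.
    compactness = (s_max - size) / (s_max - s_min)
    return raw_fitness * (1.0 + scale * compactness)
```

For example, with 3 genes, a head size of 8 and a tail size of 9 (illustrative values only), the bounds are Smax = 51 and Smin = 3. A program of maximum size keeps its raw fitness unchanged, while a perfect program of minimum size would reach roughly 1000.2 under the assumed scaling.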