Choosing the Fitness Function

Squared Accuracy
 
GeneXproTools 4.0 implements the Squared Accuracy fitness function both with and without parsimony pressure. The version with parsimony pressure applies a slight pressure on the size of the evolving solutions, favoring the discovery of more compact models.

The Squared Accuracy fitness function of GeneXproTools is, as expected, based on the classification accuracy.

The classification accuracy Ai of an individual program i depends on the number of fitness cases correctly classified (true positives plus true negatives) and is evaluated by the formula:

Ai = t / n

where t is the number of sample cases correctly classified, and n is the total number of sample cases.

The fitness fi of an individual program i is expressed by the equation:

fi = 1000*Ai*Ai

and therefore ranges from 0 to 1000, with 1000 corresponding to the ideal.
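As a minimal sketch of the two steps above, the following Python computes the accuracy as the fraction of correctly classified cases and then squares it on the 0 to 1000 scale. The function names and the use of plain lists of class labels are assumptions made for illustration; they are not part of GeneXproTools.

```python
def classification_accuracy(predicted, actual):
    """Return t / n: the fraction of sample cases classified correctly."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    n = len(actual)
    if n == 0:
        raise ValueError("at least one sample case is required")
    # t counts true positives plus true negatives (every matching label).
    t = sum(1 for p, a in zip(predicted, actual) if p == a)
    return t / n


def squared_accuracy_fitness(accuracy):
    """Return 1000 * A * A for an accuracy A in the range [0, 1]."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie between 0 and 1")
    return 1000.0 * accuracy * accuracy


# Hypothetical labels: three of the four cases are classified correctly.
accuracy = classification_accuracy([1, 0, 1, 1], [1, 0, 0, 1])
fitness = squared_accuracy_fitness(accuracy)
```

Because the accuracy is squared, a model at 75% accuracy scores 562.5 rather than 750, so gains near perfect accuracy are rewarded more strongly than the same gains at low accuracy.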

Its counterpart with parsimony pressure uses this fitness measure fi as the raw fitness rfi and complements it with a parsimony term.

Thus, in this case, the maximum raw fitness is rfmax = 1000, and the overall fitness fppi (that is, fitness with parsimony pressure) is evaluated by the formula:

fppi = rfi * (1 + (1/5000) * (Smax - Si) / (Smax - Smin))

where Si is the size of the program, and Smax and Smin are, respectively, the maximum and minimum program sizes, evaluated by the formulas:

Smax = G (h + t)

Smin = G

where G is the number of genes, and h and t are the head and tail sizes (here t denotes the tail size, not the number of correctly classified cases; note also that, for simplicity, the linking function is not taken into account). Thus, when rfi = rfmax and Si = Smin, fppi = fppmax. This is highly improbable, though, as it requires every sub-ET to consist of a single node, which can only happen for very simple functions. The value of fppmax is given by the formula:

fppmax = 1.0002 * rfmax

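The size bounds and the parsimony term can be sketched in Python as follows. This is an illustration under stated assumptions, not GeneXproTools code: it assumes the parsimony term has the form rfi * (1 + (1/5000) * (Smax - Si) / (Smax - Smin)), which is the form commonly given for gene expression programming, and the function names, the `weight` parameter, and the example gene settings are hypothetical.

```python
def program_size_bounds(num_genes, head_size, tail_size):
    """Return (Smax, Smin) = (G * (h + t), G), ignoring the linking function."""
    s_max = num_genes * (head_size + tail_size)
    s_min = num_genes
    return s_max, s_min


def fitness_with_parsimony(raw_fitness, size, s_max, s_min, weight=1 / 5000):
    """Scale the raw fitness by a small bonus that grows as the program shrinks.

    The default weight of 1/5000 is an assumption about the parsimony term.
    """
    if s_max <= s_min:
        raise ValueError("s_max must be greater than s_min")
    if not s_min <= size <= s_max:
        raise ValueError("size must lie between s_min and s_max")
    # The bonus is 0 at the maximum size and equals `weight` at the minimum size.
    bonus = weight * (s_max - size) / (s_max - s_min)
    return raw_fitness * (1.0 + bonus)


# Hypothetical settings: 3 genes, each with head size 8 and tail size 9.
s_max, s_min = program_size_bounds(3, 8, 9)
largest = fitness_with_parsimony(1000.0, s_max, s_max, s_min)
smallest = fitness_with_parsimony(1000.0, s_min, s_max, s_min)
```

With these settings the largest possible program keeps its raw fitness of 1000, while the smallest reaches about 1000.2. The bonus is small enough that it only separates programs of nearly equal raw fitness, so accuracy still dominates selection.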
