Posted
Comments None

The endogenous calculator of GeneXproTools is written in C++, so it makes sense to show the code for all these new classifier functions in C++. But GeneXproTools automatically translates the code of all the models it generates into 17 programming languages (Ada, C, C++, C#, Excel VBA, Fortran, Java, JavaScript, Matlab, Octave, Pascal, Perl, PHP, Python, R, Visual Basic, and VB.Net) through the use of built-in grammars (you can in fact translate the model code into virtually any programming language using the Custom Grammars functionality of GeneXproTools).

So, over the next posts I'll show you the code for all these new functions in all the built-in programming languages of GeneXproTools. I'll be using GeneXproTools itself to generate the code automatically, with a neat trick that writes the code of all the new functions at the same time (more about this trick in a moment; by the way, it also comes in handy for anyone designing their own custom grammars).

Overall, we've ended up with 39 new math functions! These include not only the new classifier functions we've been talking about, but also new Step and Ramp functions of one argument that nicely complement this new set of discrete functions.

The neat trick consists of creating a Karva program by hand (through the Change Seed Window) that includes all 39 new functions. As an illustration, I'm including here a much simpler program with just 3 functions, namely the 3-output classifier functions described in the posts "Function Design: The BUY-SELL-WAIT Function" and "Function Design: More 3-Output Classifier Functions":

from math import *

def gepModel(d):

    y = 0.0

    y = gepCL3A(d[2],d[3])
    y = y + gepCL3B(d[3],d[0])
    y = y + gepCL3C(d[3],d[0])

    return y


def gepCL3A(x, y):
    if ((x > 0.0) and (y < 0.0)):
        return 1.0
    elif ((x < 0.0) and (y > 0.0)):
        return -1.0
    else:
        return 0.0

def gepCL3B(x, y):
    if ((x >= 1.0) and (y >= 1.0)):
        return 1.0
    elif ((x <= -1.0) and (y <= -1.0)):
        return -1.0
    else:
        return 0.0

def gepCL3C(x, y):
    if ((x > 0.0) and (y > 0.0)):
        return 1.0
    elif ((x < 0.0) and (y < 0.0)):
        return -1.0
    else:
        return 0.0

As you can see in this simple Python code, each gene encodes a different function, so this example uses 3 genes to encode 3 different functions. To show all 39 new functions I'll have to use a total of 39 such simple genes with a head size of 1. And I can use the same Karva program again and again to generate the code for all these new functions in all the programming languages of GeneXproTools (I had of course to program them first, but now it's easy; this is also useful to check for bugs when you're creating a Custom Grammar).
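
To see how the one-gene-per-function layout makes bugs easy to spot, here is a minimal Python sketch that evaluates each gene separately before adding them up, the way gepModel does. The short function names, the `gene_outputs` helper, and the sample input row are assumptions made up for this illustration; they are not GeneXproTools output.

```python
# Hypothetical check harness: one function per gene, evaluated separately.

def cl3a(x, y):
    """BUY-SELL-WAIT: +1 if x > 0 > y, -1 if x < 0 < y, otherwise 0."""
    if x > 0.0 and y < 0.0:
        return 1.0
    if x < 0.0 and y > 0.0:
        return -1.0
    return 0.0

def cl3b(x, y):
    """+1 if both arguments are >= 1, -1 if both are <= -1, otherwise 0."""
    if x >= 1.0 and y >= 1.0:
        return 1.0
    if x <= -1.0 and y <= -1.0:
        return -1.0
    return 0.0

def cl3c(x, y):
    """+1 if both arguments are positive, -1 if both are negative, otherwise 0."""
    if x > 0.0 and y > 0.0:
        return 1.0
    if x < 0.0 and y < 0.0:
        return -1.0
    return 0.0

def gene_outputs(d):
    """Return the value of each gene, using the same arguments as gepModel."""
    return [cl3a(d[2], d[3]), cl3b(d[3], d[0]), cl3c(d[3], d[0])]

row = [2.0, 0.0, -0.5, 1.5]    # made-up input row d[0..3]
genes = gene_outputs(row)
model_output = sum(genes)      # the genes are linked by addition
print(genes, model_output)
```

Because each gene holds exactly one function, a wrong value in the list points directly at the function (or grammar rule) that needs fixing.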



Posted

Writing can be a very creative process and I've been fortunate enough to experience this several times: with my PhD thesis, papers, books, ideas journals… and now blog posts.

The new elastic classifier functions are an example of a flash of inspiration that came to me while I was writing the post "Function Design: New 3-6 Output Functions". I had just finished describing how the mapper functions worked when another new class of functions occurred to me almost without conscious effort, and I just kept writing and thinking: "Now I just need to check how they work."

I had to postpone my post and wait for the next day to implement and test these new functions. I ended up designing four new 3-output elastic classifiers, which are elastic versions of the 3-output classifier functions described in the posts "Function Design: The BUY-SELL-WAIT Function" and "Function Design: More 3-Output Classifier Functions".

So, the first elastic classifier function in the series – ECL3A – is a function of 3 arguments and implements an elastic version of the BUY-SELL-WAIT function (implemented as CL3A in GeneXproTools): the first argument replaces the fixed threshold of zero. And it performs even better than the BUY-SELL-WAIT function, with 98% hits vs 96%! Here's the C++ code for this new function:

    // ECL3A(x0,x1,x2): 3-Output Elastic Classifier Function
    if (x[1] > x[0] && x[2] < x[0])
        return 1.0;
    else if (x[1] < x[0] && x[2] > x[0])
        return -1.0;
    else return 0.0;
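
As a quick way to see what "elastic" means here, the following Python sketch puts CL3A and ECL3A side by side. It is a hand-written translation for illustration only (the snake_case names and the sample values are assumptions), and it shows that ECL3A reduces to CL3A when the first argument is zero.

```python
def cl3a(x, y):
    """Inelastic BUY-SELL-WAIT function: the threshold is fixed at 0."""
    if x > 0.0 and y < 0.0:
        return 1.0
    if x < 0.0 and y > 0.0:
        return -1.0
    return 0.0

def ecl3a(x0, x1, x2):
    """Elastic version: x0 plays the role of the threshold."""
    if x1 > x0 and x2 < x0:
        return 1.0
    if x1 < x0 and x2 > x0:
        return -1.0
    return 0.0

# With x0 = 0 the two functions agree on these made-up samples.
samples = [(-2.0, 3.0), (2.0, -3.0), (1.0, 1.0), (0.0, 5.0)]
same_at_zero = all(ecl3a(0.0, a, b) == cl3a(a, b) for a, b in samples)

# Moving the threshold to 10 changes the decision for the pair (12, 7).
print(same_at_zero, cl3a(12.0, 7.0), ecl3a(10.0, 12.0, 7.0))
```

The extra argument lets evolution place the threshold wherever the data needs it, instead of forcing the other two arguments to be shifted around zero.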

The second function in the series – ECL3B – is the elastic counterpart of the CL3C function. Both functions perform quite well, each with a high hit rate of 98%:

    // ECL3B(x0,x1,x2): 3-Output Elastic Classifier Function
    if (x[1] > x[0] && x[2] > x[0])
        return 1.0;
    else if (x[1] < x[0] && x[2] < x[0])
        return -1.0;
    else return 0.0;

The third function in the series – ECL3C – is an elastic implementation of the CL3B function and it performs slightly worse than the inelastic form, with 95% vs 97%:

    // ECL3C(x0,x1,x2): 3-Output Elastic Classifier Function
    if (x[1] >= x[0] && x[2] >= x[0])
        return 1.0;
    else if (x[1] <= -x[0] && x[2] <= -x[0])
        return -1.0;
    else return 0.0;
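
One detail of ECL3C worth checking is what happens when the first argument is negative, since the two thresholds x0 and -x0 then swap sides. The Python sketch below is an illustrative translation (the function name and test values are assumptions): with x0 = 1 it behaves like CL3B, and with a negative x0 the two conditions can both hold, in which case the first branch wins and the output is +1.

```python
def ecl3c(x0, x1, x2):
    """Elastic CL3B: thresholds at x0 and -x0 instead of 1 and -1."""
    if x1 >= x0 and x2 >= x0:
        return 1.0
    if x1 <= -x0 and x2 <= -x0:
        return -1.0
    return 0.0

# x0 = 1 reproduces the fixed thresholds of CL3B.
print(ecl3c(1.0, 1.5, 2.0), ecl3c(1.0, -1.0, -3.0), ecl3c(1.0, 0.5, 2.0))

# x0 = -2: the pair (0, 0) satisfies both conditions; the first one is tested first.
print(ecl3c(-2.0, 0.0, 0.0))

# x0 = -2: -1 is still reachable when the first condition fails.
print(ecl3c(-2.0, -5.0, 0.0))
```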

And finally, the fourth function in the series – ECL3D – is also an elastic version of the CL3B function, with the difference that it uses the first 2 arguments as anchoring points that work as reference for the mapping. This function performs slightly better than the CL3B function (98% vs 97%) and also better than the ECL3C described above (98% vs 95%):

    // ECL3D(x0,x1,x2,x3): 3-Output Elastic Classifier Function
    // evaluate min(x,y) and max(x,y)
    double min = x[0];
    double max = x[1];
    if (min > x[1])
    {
        min = x[1];
        max = x[0];
    }

    if (x[2] >= max && x[3] >= max)
        return 1.0;
    else if (x[2] <= min && x[3] <= min)
        return -1.0;
    else return 0.0;
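
For readers who prefer to experiment outside C++, here is a small Python sketch of ECL3D. It is an assumed translation for illustration, using Python's built-in min and max in place of the explicit swap, and it shows that the order of the two anchoring arguments does not matter.

```python
def ecl3d(x0, x1, x2, x3):
    """+1 if x2 and x3 are both at or above the larger anchor,
    -1 if both are at or below the smaller anchor, otherwise 0."""
    low, high = min(x0, x1), max(x0, x1)
    if x2 >= high and x3 >= high:
        return 1.0
    if x2 <= low and x3 <= low:
        return -1.0
    return 0.0

# Swapping the anchors gives the same answer.
print(ecl3d(-1.0, 1.0, 2.0, 3.0), ecl3d(1.0, -1.0, 2.0, 3.0))
# Both values below the lower anchor, then one value on each side.
print(ecl3d(-1.0, 1.0, -2.0, -1.0), ecl3d(-1.0, 1.0, 0.0, 5.0))
```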

Over the next posts I'll start talking about the implementation of all these new math functions in all the programming languages supported by GeneXproTools.

Posted

The 3-output mapper functions are the simplest of all the new mapper functions, as they need to define only 3 intervals, one for each discrete output. For this series I chose to map the intervals to {-1, 0, +1} instead of {0, 1, 2} to explore the symmetry around zero.

Again, there are three different functions in this series – Map3A, Map3B, and Map3C – with 2, 3, and 4 arguments, respectively. And their performance is also exceptional: all of them performed exactly the same, with 98% hits, which is also the hit rate obtained for both the argmin and argmax functions of 3 arguments.

And here's their complete description in C++:

    // Map3A(x0,x1): 3-Output Mapper Function
    const double SLACK = 10.0;
    double output = 0.0;
    if (x[1] < (x[0] - SLACK))
        output = -1.0;
    else if (x[1] > (x[0] + SLACK))
        output = 1.0;
    return output;
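
The Map3A logic can be tried out with a few values in Python. This is only a sketch: exposing the slack as a keyword argument is an assumption made for convenience (the C++ code hard-codes it as a constant), and the sample inputs are made up.

```python
def map3a(x0, x1, slack=10.0):
    """Map x1 to -1, 0 or +1 depending on where it falls
    relative to the band [x0 - slack, x0 + slack]."""
    if x1 < x0 - slack:
        return -1.0
    if x1 > x0 + slack:
        return 1.0
    return 0.0

# Values exactly on the band edges still map to 0.
print(map3a(0.0, -10.0), map3a(0.0, 10.0))
# Values outside the band map to -1 or +1.
print(map3a(0.0, -10.5), map3a(0.0, 25.0))
# The band moves with x0.
print(map3a(100.0, 95.0))
```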

 

    // Map3B(x0,x1,x2): 3-Output Mapper Function
    // evaluate min(x,y) and max(x,y)
    double min = x[0];
    double max = x[1];
    if (min > x[1])
    {
        min = x[1];
        max = x[0];
    }

    double output = 0.0;
    if (x[2] < min)
        output = -1.0;
    else if (x[2] > max)
        output = 1.0;
    return output;

 

    // Map3C(x0,x1,x2,x3): 3-Output Mapper Function
    // evaluate min(x,y,z) and max(x,y,z)
    //
    // evaluate min(x,y,z)
    double min = x[0];
    if (min > x[1])
        min = x[1];
    if (min > x[2])
        min = x[2];
    // evaluate max(x,y,z)
    double max = x[0];
    if (max < x[1])
        max = x[1];
    if (max < x[2])
        max = x[2];

    double output = 0.0;
    if (x[3] < min)
        output = -1.0;
    else if (x[3] > max)
        output = 1.0;
    return output;
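
Map3B and Map3C share the same structure: the anchoring arguments define an interval, and the last argument is compared against its ends. The Python sketch below makes that explicit with a hypothetical `map3` helper that takes any number of anchors; the helper and its name are assumptions for illustration and are not part of GeneXproTools.

```python
def map3(anchors, value):
    """Return -1 below the smallest anchor, +1 above the largest, else 0."""
    low, high = min(anchors), max(anchors)
    if value < low:
        return -1.0
    if value > high:
        return 1.0
    return 0.0

def map3b(x0, x1, x2):
    """Two anchors (x0, x1), value x2."""
    return map3((x0, x1), x2)

def map3c(x0, x1, x2, x3):
    """Three anchors (x0, x1, x2), value x3."""
    return map3((x0, x1, x2), x3)

print(map3b(5.0, 1.0, 0.0), map3b(5.0, 1.0, 3.0), map3b(5.0, 1.0, 5.0))
print(map3c(2.0, 9.0, 4.0, 10.0), map3c(2.0, 9.0, 4.0, 1.9))
```

Note that with three anchors only the smallest and the largest matter for a 3-output mapper; the middle one has no effect on the result.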

In the next post I'll describe the other new class of discrete classifier functions: the elastic classifier functions introduced for the first time in the post "Function Design: New 3-6 Output Functions".

Posted

Six discrete outputs is as high as I'll go with the mapper functions (I still have to implement all of them in all the programming languages of GeneXproTools). Although they continue to scale up amazingly well, with hardly any loss in performance compared to the mapper functions of 4 and 5 outputs (see the posts "Function Design: 4-Output Mapper Functions" and "Function Design: 5-Output Mapper Functions"), I don't think we'll benefit from higher-order output functions to solve multi-class classification problems with more than 3 classes, unless the problems are really simple, such as the 3-class Iris problem.

I must confess that I couldn't find even a toy problem on which to test these higher-order mappers effectively. For example, not even the Balance Scale data, a well-known toy problem, can be satisfactorily solved in one go with these functions (or with the argmin/argmax functions of 4 arguments) using the current setup. Other tools are obviously needed, such as special fitness functions, linking structures, and sampling schemes, just to name a few.

But these functions are nonetheless interesting and very useful on their own and, meshed with other functions, they can impact positively on the evolution of all kinds of models.

So here it is, the C++ code for the 6-output mapper functions. Again, there are 3 new functions in the series – Map6A, Map6B, and Map6C – with 2, 3, and 4 arguments, respectively. And their performance is again exceptional, with 98% hits for Map6A, 98% for Map6B, and 99% for Map6C, against 96% for the argmax.

    // Map6A(x0,x1): 6-Output Mapper Function
    const double SLACK = 10.0;
    double output = 0.0;
    if (x[1] < (x[0] - SLACK))
        output = 0.0;
    else if (x[1] >= (x[0] - SLACK) && x[1] < (x[0] - SLACK/2.0))
        output = 1.0;
    else if (x[1] >= (x[0] - SLACK/2.0) && x[1] < x[0])
        output = 2.0;
    else if (x[1] >= x[0] && x[1] < (x[0] + SLACK/2.0))
        output = 3.0;
    else if (x[1] >= (x[0] + SLACK/2.0) && x[1] < (x[0] + SLACK))
        output = 4.0;
    else if (x[1] >= (x[0] + SLACK))
        output = 5.0;
    return output;
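
The chain of interval tests in Map6A amounts to counting how many of five cut points lie at or below x1. A compact way to sketch this in Python is with `bisect_right` from the standard library. This is an alternative formulation for illustration, not the way GeneXproTools implements it; the slack keyword argument is an assumption, and NaN inputs are not considered.

```python
from bisect import bisect_right

def map6a(x0, x1, slack=10.0):
    """Return 0..5 according to the interval around x0 that contains x1."""
    cuts = [x0 - slack, x0 - slack / 2.0, x0, x0 + slack / 2.0, x0 + slack]
    # bisect_right counts the cut points <= x1, which matches the
    # ">= lower bound and < upper bound" tests of the if/else chain.
    return float(bisect_right(cuts, x1))

for value in (-11.0, -10.0, -5.0, 0.0, 5.0, 10.0):
    print(value, map6a(0.0, value))
```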


    // Map6B(x0,x1,x2): 6-Output Mapper Function
    // evaluate min(x,y), max(x,y), midrange, midpoint1, midpoint2
    double min = x[0];
    double max = x[1];
    if (min > x[1])
    {
        min = x[1];
        max = x[0];
    }
    double midrange = (min + max)/2.0;
    double midpoint1 = (min + midrange)/2.0;
    double midpoint2 = (midrange + max)/2.0;
    double output = 0.0;
    if (x[2] < min)
        output = 0.0;
    else if (x[2] >= min && x[2] < midpoint1)
        output = 1.0;
    else if (x[2] >= midpoint1 && x[2] < midrange)
        output = 2.0;
    else if (x[2] >= midrange && x[2] < midpoint2)
        output = 3.0;
    else if (x[2] >= midpoint2 && x[2] < max)
        output = 4.0;
    else if (x[2] >= max)
        output = 5.0;
    return output;
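
A worked example may help to picture the five cut points of Map6B. The Python sketch below (an illustrative translation; the `map6b_cuts` helper and the sample anchors 8 and 0 are assumptions) computes the cut points and then counts how many of them lie at or below x2.

```python
def map6b_cuts(x0, x1):
    """Return [min, midpoint1, midrange, midpoint2, max] for the two anchors."""
    low, high = min(x0, x1), max(x0, x1)
    midrange = (low + high) / 2.0
    return [low, (low + midrange) / 2.0, midrange, (midrange + high) / 2.0, high]

def map6b(x0, x1, x2):
    """Return 0..5 according to the interval that contains x2."""
    output = 0.0
    for cut in map6b_cuts(x0, x1):
        if x2 >= cut:
            output += 1.0
    return output

print(map6b_cuts(8.0, 0.0))   # the anchors 8 and 0 split into quarters
for value in (-1.0, 0.0, 3.0, 4.0, 7.9, 8.0):
    print(value, map6b(8.0, 0.0, value))
```

When the two anchors are equal, all five cut points coincide and the function can only return 0 or 5.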

 

    // Map6C(x0,x1,x2,x3): 6-Output Mapper Function
    // evaluate min(x,y,z), max(x,y,z), midleValue(x,y,z), midrange1, midrange2
    //
    // evaluate min(x,y,z)
    double min = x[0];
    int argmin = 0;
    if (min > x[1])
    {
        min = x[1];
        argmin = 1;
    }
    if (min > x[2])
    {
        min = x[2];
        argmin = 2;
    }
    // evaluate max(x,y,z)
    double max = x[0];
    int argmax = 0;
    if (max < x[1])
    {
        max = x[1];
        argmax = 1;
    }
    if (max < x[2])
    {
        max = x[2];
        argmax = 2;
    }
    // evaluate midleValue(x,y,z)
    double midleValue = x[2];
    if (0 != argmin && 0 != argmax)
        midleValue = x[0];
    if (1 != argmin && 1 != argmax)
        midleValue = x[1];
    // evaluate midrange1 and midrange2
    double midrange1 = (min + midleValue)/2.0;
    double midrange2 = (midleValue + max)/2.0;

    double output = 0.0;
    if (x[3] < min)
        output = 0.0;
    else if (x[3] >= min && x[3] < midrange1)
        output = 1.0;
    else if (x[3] >= midrange1 && x[3] < midleValue)
        output = 2.0;
    else if (x[3] >= midleValue && x[3] < midrange2)
        output = 3.0;
    else if (x[3] >= midrange2 && x[3] < max)
        output = 4.0;
    else if (x[3] >= max)
        output = 5.0;
    return output;
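
Most of the Map6C code is bookkeeping to find the smallest, middle, and largest of the three anchors. In a language with a built-in sort this part shrinks considerably, as the Python sketch below suggests. It is an assumed re-expression for illustration only, with made-up sample anchors (10, 0, 4).

```python
def map6c(x0, x1, x2, x3):
    """Return 0..5 according to where x3 falls among the cut points
    min, midrange1, middle value, midrange2 and max of the anchors."""
    low, middle, high = sorted((x0, x1, x2))
    cuts = [low, (low + middle) / 2.0, middle, (middle + high) / 2.0, high]
    # Count the cut points at or below x3.
    return float(sum(x3 >= cut for cut in cuts))

# Anchors 10, 0 and 4 give the cut points 0, 2, 4, 7 and 10.
for value in (-1.0, 1.0, 2.0, 4.0, 6.9, 7.0, 10.0):
    print(value, map6c(10.0, 0.0, 4.0, value))
```

Unlike Map6B, the intervals here are generally not of equal width, because the middle anchor can sit anywhere between the other two.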

In the next post we'll go in the other direction and take a look at the 3-output mapper functions of 2, 3, and 4 arguments, the last in the series of new mapper functions.

 

Posted

The series of 5-output mapper functions explores the same principles as the 4-output mappers, with the difference that we are now defining 5 intervals instead of just 4. And what's interesting is how well these functions scale up (97% hits for Map5A, 98% for Map5B, and 99% for Map5C, against 96% for the argmax), behaving very similarly to their counterparts of 4 outputs. Even with just 2 arguments we can create efficient 5-output mapper functions that show a performance comparable to the argmax function of 4 arguments (it would have been interesting to see how they compare to the argmin/argmax functions of 5 arguments, but that will have to wait for the powers that be at Gepsoft to increase the max arity to 5 in GeneXproTools)!

And as we saw for the Map4A function described in the previous post, the value of the slack was also critical for the Map5A function, in this case with 15 as the best value (I chose 15 because I needed a value that divides neatly by 3; I also tried the values 1.5, 30, 150, and 1500).

As usual, I'm including the C++ code for all three 5-output mapper functions for you to take a look:

    // Map5A(x0,x1): 5-Output Mapper Function
    const double SLACK = 15.0;
    double output = 0.0;
    if (x[1] < (x[0] - SLACK))
        output = 0.0;
    else if (x[1] >= (x[0] - SLACK) && x[1] < (x[0] - SLACK/3.0))
        output = 1.0;
    else if (x[1] >= (x[0] - SLACK/3.0) && x[1] < (x[0] + SLACK/3.0))
        output = 2.0;
    else if (x[1] >= (x[0] + SLACK/3.0) && x[1] < (x[0] + SLACK))
        output = 3.0;
    else if (x[1] >= (x[0] + SLACK))
        output = 4.0;
    return output;
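
To see why a slack that divides neatly by 3 is convenient, the Python sketch below lists the four cut points of Map5A and the widths of the three inner intervals. It is an illustrative translation; the `map5a_cuts` helper and the slack keyword argument are assumptions.

```python
def map5a_cuts(x0, slack=15.0):
    """Return the four cut points around x0."""
    return [x0 - slack, x0 - slack / 3.0, x0 + slack / 3.0, x0 + slack]

def map5a(x0, x1, slack=15.0):
    """Return 0..4 according to the interval that contains x1."""
    return float(sum(x1 >= cut for cut in map5a_cuts(x0, slack)))

cuts = map5a_cuts(0.0)
widths = [right - left for left, right in zip(cuts, cuts[1:])]
print(cuts)     # cut points at -15, -5, 5 and 15
print(widths)   # the three inner intervals are all 10 wide
for value in (-20.0, -15.0, 0.0, 5.0, 15.0):
    print(value, map5a(0.0, value))
```

With SLACK = 15 the three bounded intervals have the same width (10), and the two outer intervals are open-ended.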

 

    // Map5B(x0,x1,x2): 5-Output Mapper Function
    // evaluate min(x,y), max(x,y), midpoint1 and midpoint2
    double min = x[0];
    double max = x[1];
    if (min > x[1])
    {
        min = x[1];
        max = x[0];
    }
    double intervalLength = (max - min)/3.0;
    double midpoint1 = min + intervalLength;
    double midpoint2 = min + 2.0*intervalLength;

    double output = 0.0;
    if (x[2] < min)
        output = 0.0;
    else if (x[2] >= min && x[2] < midpoint1)
        output = 1.0;
    else if (x[2] >= midpoint1 && x[2] < midpoint2)
        output = 2.0;
    else if (x[2] >= midpoint2 && x[2] < max)
        output = 3.0;
    else if (x[2] >= max)
        output = 4.0;
    return output;
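
In contrast to Map5A, Map5B has no fixed slack: the two anchors set the scale. The Python sketch below (an assumed translation for illustration, with made-up anchor values) shows that a value sitting at the same relative position between the anchors gets the same output whether the anchors are 3 apart or 300 apart.

```python
def map5b(x0, x1, x2):
    """Return 0..4 according to where x2 falls relative to the
    interval [min, max] of the anchors, split into thirds."""
    low, high = min(x0, x1), max(x0, x1)
    interval_length = (high - low) / 3.0
    cuts = [low, low + interval_length, low + 2.0 * interval_length, high]
    return float(sum(x2 >= cut for cut in cuts))

# Same relative position (the middle of the range), different scales.
print(map5b(0.0, 3.0, 1.5), map5b(0.0, 300.0, 150.0))
# Below the range, inside the last third, and at the upper anchor.
print(map5b(3.0, 0.0, -0.1), map5b(0.0, 3.0, 2.0), map5b(0.0, 3.0, 3.0))
```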

 

    // Map5C(x0,x1,x2,x3): 5-Output Mapper Function
    // evaluate min(x,y,z), max(x,y,z), midleValue(x,y,z), midrange1, midrange2
    //
    // evaluate min(x,y,z)
    double min = x[0];
    int argmin = 0;
    if (min > x[1])
    {
        min = x[1];
        argmin = 1;
    }
    if (min > x[2])
    {
        min = x[2];
        argmin = 2;
    }
    // evaluate max(x,y,z)
    double max = x[0];
    int argmax = 0;
    if (max < x[1])
    {
        max = x[1];
        argmax = 1;
    }
    if (max < x[2])
    {
        max = x[2];
        argmax = 2;
    }
    // evaluate midleValue(x,y,z)
    double midleValue = x[2];
    if (0 != argmin && 0 != argmax)
        midleValue = x[0];
    if (1 != argmin && 1 != argmax)
        midleValue = x[1];
    // evaluate midrange1 and midrange2
    double midrange1 = (min + midleValue)/2.0;
    double midrange2 = (midleValue + max)/2.0;

    double output = 0.0;
    if (x[3] < min)
        output = 0.0;
    else if (x[3] >= min && x[3] < midrange1)
        output = 1.0;
    else if (x[3] >= midrange1 && x[3] < midrange2)
        output = 2.0;
    else if (x[3] >= midrange2 && x[3] < max)
        output = 3.0;
    else if (x[3] >= max)
        output = 4.0;
    return output;
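
Note that in Map5C the middle value itself is not a cut point; it is only used to compute midrange1 and midrange2. The Python sketch below illustrates this with the made-up anchors 10, 0, and 4. It is an assumed re-expression using sorted(), not the GeneXproTools implementation.

```python
def map5c(x0, x1, x2, x3):
    """Return 0..4 according to where x3 falls among the cut points
    min, midrange1, midrange2 and max of the three anchors."""
    low, middle, high = sorted((x0, x1, x2))
    cuts = [low, (low + middle) / 2.0, (middle + high) / 2.0, high]
    # Count the cut points at or below x3.
    return float(sum(x3 >= cut for cut in cuts))

# Anchors 10, 0 and 4 give the cut points 0, 2, 7 and 10,
# so the middle interval [2, 7) contains the middle anchor 4.
for value in (-1.0, 0.0, 2.0, 4.0, 6.9, 7.0, 10.0):
    print(value, map5c(10.0, 0.0, 4.0, value))
```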

In the next post we'll go a step further and describe the 6-output mapper functions of 2, 3, and 4 arguments.
