Posted
Comments None

I promised you more 3-output classifier functions like the BUY-SELL-WAIT function of the previous post. Designing 3-output classifier functions that worked well turned out not to be very hard, and I ended up choosing two more to add to the built-in math functions of GeneXproTools.

The criteria for their choice included principles such as simplicity, symmetry, whether or not I could find a neutral gene for them and, last but not least, how they performed on the Iris dataset both as normal functions and linking functions.

Here's the C++ code for the new CL3B and CL3C functions:

CL3B Function:

    // +1 if both arguments are >= 1, -1 if both are <= -1, otherwise 0
    if (arg[0] >= 1.0 && arg[1] >= 1.0)
        return 1.0;
    else
        if (arg[0] <= -1.0 && arg[1] <= -1.0)
            return -1.0;
        else return 0.0;

CL3C Function:

    // +1 if both arguments are > 0, -1 if both are < 0, otherwise 0
    if (arg[0] > 0.0 && arg[1] > 0.0)
        return 1.0;
    else
        if (arg[0] < 0.0 && arg[1] < 0.0)
            return -1.0;
        else return 0.0;
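
For readers who want to try the two functions outside GeneXproTools, here is a minimal sketch that wraps the logic above in free functions. The names `cl3b` and `cl3c` and the two-`double` signature are assumptions made for illustration; they are not the GeneXproTools internals.

```cpp
// Illustrative sketch only: cl3b and cl3c are hypothetical free-function
// wrappers around the logic shown above, not GeneXproTools internals.

// CL3B: +1 if both arguments are >= 1, -1 if both are <= -1, otherwise 0.
double cl3b(double a, double b)
{
    if (a >= 1.0 && b >= 1.0) return 1.0;
    if (a <= -1.0 && b <= -1.0) return -1.0;
    return 0.0;
}

// CL3C: +1 if both arguments are > 0, -1 if both are < 0, otherwise 0.
double cl3c(double a, double b)
{
    if (a > 0.0 && b > 0.0) return 1.0;
    if (a < 0.0 && b < 0.0) return -1.0;
    return 0.0;
}
```

The only difference between the two is the threshold: with inputs such as (0.5, 0.5), `cl3b` returns 0 while `cl3c` returns 1, so CL3B has a wider "0" zone around the origin.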

And by the way, do you have a favorite 3-output classifier function that you'd like to see implemented in GeneXproTools? If you do, now is the time to tell us about it if you want to see it in the next mini-release.

As I said earlier, coming up with good 3-output classifier functions was relatively easy and I thought it would be a piece of cake to design more complex ones with 4-6 outputs. I was totally wrong, though, and it took a flash of inspiration to save the day. But I'll tell that story in the next post.

Author

Posted
Comments None

The pseudocode for the BUY-SELL-WAIT function as defined in the "Trading Strategy Mining with Gene Expression Programming" paper is shown below:

        IF (the value of buy-tree of day t > 0)
            AND (the value of sell-tree of day t ≤ 0)
        THEN signal(t)←BUY (+1)
        ELSE IF (the value of buy-tree of day t ≤ 0)
            AND (the value of sell-tree of day t > 0)
        THEN signal(t)←SELL (-1)
        ELSE
            signal(t)←WAIT (0)
        END IF

It's a function of 2 arguments with 3 discrete outputs {-1, 0, +1}. And here's the C++ code for the BSW function:

        if (arg[0] > 0.0 && arg[1] <= 0.0)
            return 1.0;
        else
            if (arg[0] <= 0.0 && arg[1] > 0.0)
                return -1.0;
            else return 0.0;
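
To connect the pseudocode and the C++ body, the sketch below applies the same rule day by day to two series of tree values and returns a series of signals. The `Signal` enum and the names `bsw_signal` and `bsw_signals` are hypothetical, and the input vectors stand in for whatever values the evolved buy-tree and sell-tree produce for each day t.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical names, for illustration only.
enum class Signal { Sell = -1, Wait = 0, Buy = 1 };

// Same rule as the BSW function above, expressed with a named result.
Signal bsw_signal(double buy_tree, double sell_tree)
{
    if (buy_tree > 0.0 && sell_tree <= 0.0) return Signal::Buy;
    if (buy_tree <= 0.0 && sell_tree > 0.0) return Signal::Sell;
    return Signal::Wait;
}

// Applies the rule to each day t; stops at the shorter of the two series.
std::vector<Signal> bsw_signals(const std::vector<double>& buy_tree,
                                const std::vector<double>& sell_tree)
{
    const std::size_t n = buy_tree.size() < sell_tree.size()
                              ? buy_tree.size()
                              : sell_tree.size();
    std::vector<Signal> out;
    out.reserve(n);
    for (std::size_t t = 0; t < n; ++t)
        out.push_back(bsw_signal(buy_tree[t], sell_tree[t]));
    return out;
}
```

Note that when both trees are positive, or both are non-positive, the rule falls through to WAIT.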

I tried this function in GeneXproTools with the Iris dataset and it worked great, both as a linking function linking 2 expression trees and as part of the function set, both in unigenic and multigenic systems. In unigenic systems it usually ended up at the root of the tree, achieving high accuracy on the Iris data.

So, this is a great function and would have been a great addition to the built-in math functions of GeneXproTools. But unfortunately I could not find a neutral gene for this function (remember, we need to be able to add a neutral gene to an existing program in order to keep this functionality in GeneXproTools)! Maybe some math wizard out there can find one? But anyway, I had to modify the BSW function slightly, replacing the non-strict comparisons (<=) with strict ones (<), for it to have a neutral gene so that it could be used as a linking function too:

        if (arg[0] > 0.0 && arg[1] < 0.0)
            return 1.0;
        else
            if (arg[0] < 0.0 && arg[1] > 0.0)
                return -1.0;
            else return 0.0;
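
The two versions differ only when an argument is exactly zero. The sketch below puts them side by side so the difference can be checked directly; `bsw` and `bsw_strict` are hypothetical wrapper names chosen for this illustration.

```cpp
// Hypothetical free-function wrappers, for illustration only.

// Original BSW: one strict and one non-strict comparison per test.
double bsw(double a, double b)
{
    if (a > 0.0 && b <= 0.0) return 1.0;
    if (a <= 0.0 && b > 0.0) return -1.0;
    return 0.0;
}

// Modified version: all comparisons are strict.
double bsw_strict(double a, double b)
{
    if (a > 0.0 && b < 0.0) return 1.0;
    if (a < 0.0 && b > 0.0) return -1.0;
    return 0.0;
}
```

For example, `bsw(1.0, 0.0)` returns 1 whereas `bsw_strict(1.0, 0.0)` returns 0; when neither argument is zero the two functions agree.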

This function also works great with the Iris dataset both as a linking function connecting 2 trees and in single-gene systems connecting the left and right branches of the tree.

This function will be added to the built-in math functions of GeneXproTools and also to the linking functions of Regression, Classification and Logistic Regression (I don't see much use for it as a linking function in Time Series Prediction so we won't implement it there, unless someone thinks it could be useful even there). I'm thinking of representing it as CL3A, which is a more generic name than the catchy BSW name. The "CL" part refers to "Classification"; the "3" indicates the number of discrete outputs; and "A" represents the order. So, yes, there are more functions like this one in store for us. I'll describe them in the next post.

Author

Posted
Comments None

Reading the paper "Trading Strategy Mining with Gene Expression Programming" [Huang et al. Proceedings of the 2013 International Conference on Applied Mathematics and Computational Methods in Engineering] sparked a series of ideas that can be implemented quite easily in GeneXproTools in order to assist in the creation of better trading rules:

  1. Create built-in math functions and linking functions of 2 arguments with 3 discrete outputs (-1, 0, +1 and/or 0, 1, 2) such as the BUY-SELL-WAIT function described in the paper.
  2. We can also create more complex functions with 4-6 discrete outputs for more complex trading decisions.
  3. All the 2-argument n-output functions with a neutral gene (as implemented in GeneXproTools) can be used in a manner as described in the paper, that is, using 2 genes, the first for the BUY-tree and the second for the SELL-tree.
  4. Functions without a neutral gene or functions with more than 2 arguments (I'm still talking about functions with 3-6 discrete outputs) can play a similar role in the creation of trading rules when used in single-gene structures. In this case, when the function is at the root of the tree, the left branch determines the BUY signal and the right the SELL signal.
  5. We can also implement new genetic operators in order to have more control over the root position in the trees. This is not essential for evolving trading rules using the strategy outlined in 4, but is a nice touch and has applications in other domains. The genetic operators I'm thinking about are Fixed-Root Mutation and Conservative Fixed-Root Mutation.
  6. Implications for Classification: These new tools can be used in multi-class classification, at least for simple problems, as more complex problems require decomposing the multi-class problem into n different binary classification problems. This new algorithm will be implemented in the Regression Framework using the fitness functions and visualization/analytics tools available for regression. Particularly interesting are the Hits-based fitness functions as they also give access to the Hits/Outliers statistics that can also be used for model selection.
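
As a rough illustration of item 6, a hits count for a 3-class problem can be sketched as follows. This assumes the classes are coded as -1, 0 and +1 and that a "hit" means the model output lies within a given tolerance of the target; the name `count_hits` and the `tolerance` parameter are hypothetical, and the exact definition used by the Hits-based fitness functions in GeneXproTools may differ.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch: counts the cases where the model output is within
// `tolerance` of the target class code (-1, 0 or +1). Only the first
// min(outputs.size(), targets.size()) cases are compared.
std::size_t count_hits(const std::vector<double>& outputs,
                       const std::vector<double>& targets,
                       double tolerance = 0.0)
{
    const std::size_t n = outputs.size() < targets.size()
                              ? outputs.size()
                              : targets.size();
    std::size_t hits = 0;
    for (std::size_t i = 0; i < n; ++i)
        if (std::fabs(outputs[i] - targets[i]) <= tolerance)
            ++hits;
    return hits;
}
```

Because a 3-output function at the root already produces exactly -1, 0 or +1, a tolerance of 0 makes the hit count equal to the number of correctly classified cases.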

There are a few minor spin-offs of this project, but we'll get to them one at a time.

I hope you traders out there get inspired by the paper too and contribute to this project with your own ideas for implementing even more powerful tools.

So please tune in!

Author

Posted
Comments 1

The creation of GEPSODY has been a dream of mine almost as old as GEP itself. I would dream of putting all my ideas in a blog instead of writing them in my GEP Journal.

But then there was the problem of publishing and I had to keep my ideas to myself until I had published them in papers or in books. But blogging about them months or years after the fact is just not the same as when I was feeling all the enthusiasm and excitement of each new idea, big and small. So the GEPSODY idea continued to live only in my journal.

Then Gepsoft came along and I also started to keep a GeneXproTools Journal with all my ideas for new features and algorithms. But soon enough I no longer bothered to separate the GEP ideas from the GeneXproTools ideas and I would mark them !!GEP!! or !!GXPT!! in my journal.

But again there were restrictions about what could be shared. Now the restrictions existed because I had to wait for the implementation and subsequent release of all my ideas in GeneXproTools. And due to the relatively long development cycles of software, by the time all the ideas and features were implemented and released I was already thinking about and working on other new ideas.

So again I had to put GEPSODY aside and lament the fact that I would only ever be able to blog about boring old ideas, transcribing/translating them from my ideas journals when nobody, myself included, cared about them.

But then I had a brilliant idea only a few days ago and it allows me to break free of almost all of these constraints!

It all began when we decided to transition all our software to subscriptions later in September 2013. The subscriptions system is very dynamic and works best when several mini-releases are launched between main releases, keeping developers and users very engaged. As it turns out, subscriptions create the perfect ground for blogging about fresh new ideas, exactly when they are being implemented. And what's more, much of the old enthusiasm returns because we usually make new discoveries and gain new insights during the implementation process!

So we are excited about starting this new endeavor with you, with a more engaging development cycle punctuated by several mini-releases where we openly blog about the new features and algorithms we are implementing, letting you know why we are so excited about them and giving you also the opportunity to participate in the process through comments to the blog posts or through invited posts.

I hope this will mark the beginning of a truly dynamic and creative relationship between innovators, developers and users!

Stay tuned!

Author
