Debugging is as much a part of software development as is software design, programming or testing. But there are bugs that leave us scratching our heads and make us look under the hood of all our code components for the elusive cause. And sometimes we get lucky and find not only the bug but also new ways of making the code more efficient. We had one such instance happening during this project "New Project: Multi-class Classification & Trading Strategies".
The crash was a terrible one and was happening during Ensemble Deployment to Excel using the new math functions: Excel would crash, GeneXproTools would crash and we had to take them both down with the Task Manager. It only happened with model ensembles: when the models were deployed individually, the system wouldn't crash. Also interesting was that it only happened with unigenic systems using the new math functions with discrete outputs like {-1, 0, 1} or {0, 1, 2}; outputs like {3, 4, 5} worked just fine. And curiouser and curiouser, the crash disappeared if we added a neutral gene to all the models of the ensemble before deploying to Excel!
What the heck was going on? It couldn't be the Excel VBA code, right? Otherwise the individual models would also have crashed! And the same for the ensembles with a neutral gene! But they don't crash! So the culprit must be somewhere else. But where? (That's where out of desperation we went looking for it where it couldn't possibly be and found new ways of improving our code. And in this case we found a new way of speeding up the deployment of ensembles and models to Excel by optimizing the way Excel calculations were done.)
Well, after much head scratching we finally found the culprit: Minus zero in Excel calculations! A bug not even acknowledged by Microsoft (they say it's fixed and solved as it's just a minor inconvenience for anyone formatting cell colors according to the sign; see the discussion here).
But there's another problem with negative zero in Excel and that's the crash it causes in median calculations! Excel crashes completely and you have to close it down with the Task Manager. And it happens every time. I'm just surprised such a bug exists in Excel, but maybe medians are not as widely used as one would have thought.
Well, we could not wait for Microsoft to solve the problem of minus zero in their median algorithm (and elsewhere) and had to make a decision: either remove the calculations for the Median Model in the Excel worksheet (GeneXproTools evaluates the Median Model and the Average Model for the model ensembles deployed to Excel) or find a workaround to fix the minus zero problem in our Excel code.
We decided on the workaround as it does not corrupt the code much and solves the problem: So we now add zero to the model output in all Excel calculations. It looks a little silly if you don't know why it's there, but it prevents the crash of the median calculations in Excel with minus zero.
Comments
There are currently no comments on this article.
Comment
your_ip_is_blacklisted_by sbl.spamhaus.org