Sunday, September 24, 2006

CDK Bug Squash Party - Day 5

Day 5 was formally the last day of the Chemistry Development Kit Bug Squash Party (BSP); see also the summaries of day 1, day 2 and day 3/4. Miguel uploaded the last bits of his roundtripping functionality, from CDK PDBPolymer to CML and back to CDK PDBPolymer, closing a bug and a feature request in one go. I have not tested this firsthand yet, but am looking forward to playing with this bit of code. Kia continued to work on the more difficult bits of the code refactoring, resulting in fewer but more comprehensive commits. Stefan fixed another bug in JChemPaint: the rendering of implicit hydrogens.

Regarding that last fix: the Renderer2D needs a serious overhaul. That is, a complete rewrite in proper Java2D, which can use affine transformations for zooming, scaling and fixing the coordinate system. The current code is ancient and predates Java2D. Rich's code might be a good starting point. I would love to do this rewrite, but lack the resources... anyone in need of some open source fame?
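To give an idea of what such a rewrite could build on, here is a minimal sketch of a model-to-screen mapping with java.awt.geom.AffineTransform. This is not Renderer2D code: the class name, method name and parameters are made up for illustration, and it assumes model coordinates with the y axis pointing up and screen coordinates with the origin in the top-left corner.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

// Hypothetical helper, not part of the CDK.
class ModelToScreen {

    /**
     * Builds one transform that zooms and flips the y axis, so that
     * model coordinates (y up) land on screen coordinates (y down).
     */
    static AffineTransform modelToScreen(double zoom, double screenHeight) {
        AffineTransform t = new AffineTransform();
        // Calls are concatenated, so points are scaled first, then translated.
        t.translate(0, screenHeight); // put the model origin at the bottom-left
        t.scale(zoom, -zoom);         // zoom, and flip y so "up" stays up
        return t;
    }

    public static void main(String[] args) {
        AffineTransform t = modelToScreen(20.0, 400.0);
        Point2D atom = new Point2D.Double(1.5, 2.0);     // model coordinates
        Point2D onScreen = t.transform(atom, null);      // screen coordinates
        System.out.println(onScreen.getX() + " " + onScreen.getY()); // 30.0 360.0
    }
}
```

With a transform like this set on the Graphics2D object, zooming would come down to changing one number instead of recalculating every coordinate by hand.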

I worked on atom typing, which is still largely untested and often tightly integrated with other bits of code. Yesterday I uploaded some first patches, which I wrote on the train ride back to the Netherlands.

Now, what can be concluded from this BSP? The participant count was below what I had hoped for, but those who did participate worked hard (and with pleasure, I hope :) The total number of JUnit tests has increased:
And so has the number of failing tests:

These plots were made with R from data created with two custom scripts, both found in cdk/tools: and extractBugCountPlotData.bsh. Note that 96.86% of the tests do not fail!

The bump in failing tests seems to be due to commits 7010-7011, which have to do with SMILES parsing. Yes, the bond order resolving is still not solved. I can't seem to get Todd's patch for this working, but I'm not giving up either. The bump is so large because quite a few JUnit tests use the SmilesParser as a quick tool to get a configured connection table. However, these tests should be replaced by explicit CDK models, which is easily done with the CDKSourceCodeWriter. I'll blog about how to use that soon.